# HG changeset patch
# User iuc
# Date 1522370389 14400
# Node ID 5a29ab10dba66a8c59a147564815e07c2873a73e
# Parent bfa6c1b8a03c9017cbd27c72d1dbc8f89bbfd404
planemo upload for repository https://github.com/galaxyproject/tools-iuc/tree/master/tool_collections/snpeff commit a69e15a4016b3088ec937d6f2349be091c6b1b71

diff -r bfa6c1b8a03c -r 5a29ab10dba6 snpEff.xml
--- a/snpEff.xml Tue Mar 27 14:56:29 2018 -0400
+++ b/snpEff.xml Thu Mar 29 20:39:49 2018 -0400
@@ -1,4 +1,4 @@
-
+
 annotate variants
 snpEff_macros.xml
@@ -386,7 +386,7 @@
 By genetic variant we mean difference between a genome and a "reference" genome. As an example, imagine we are sequencing a "sample". Here "sample" can mean anything that you are interested in studying, from a cell culture, to a mouse or a cancer patient. It is a standard procedure to compare your sample sequences against the corresponding "reference genome". For instance you may compare the cancer patient genome against the "reference genome".
-In a typical sequencing experiment, you will find many places in the genome where your sample differs from the reference genome. These are called "genomic variants" or just "variants".
+In a typical sequencing experiment, you will find many places in the genome where your sample differs from the reference genome. These are called "genomic variants" or just "variants".
 Typically, variants are categorized as follows:
  - SNP (Single-Nucleotide Polymorphism) Reference = 'A', Sample = 'C'
@@ -397,7 +397,7 @@
 This is not a comprehensive list, it is just to give you an idea.
-Suppose you have a huge file describing all the differences between your sample and the reference genome. But you want to know more about these variants than just their genetic coordinates. E.g.: Are they in a gene? In an exon? Do they change protein coding? Do they cause premature stop codons? SnpEff can help you answer all these questions. The process of adding this information about the variants is called "Annotation".
+Suppose you have a huge file describing all the differences between your sample and the reference genome. But you want to know more about these variants than just their genetic coordinates. E.g.: Are they in a gene? In an exon? Do they change protein coding? Do they cause premature stop codons? SnpEff can help you answer all these questions. The process of adding this information about the variants is called "Annotation".
 SnpEff provides several degrees of annotations, from simple (e.g. which gene is each variant affecting) to extremely complex annotations (e.g. will this non-coding variant affect the expression of a gene?). It should be noted that the more complex the annotations, the more it relies in computational predictions. Such computational predictions can be incorrect, so results from SnpEff (or any prediction algorithm) cannot be trusted blindly, they must be analyzed and independently validated by corresponding wet-lab experiments.
 @snpeff_in_galaxy_info@
diff -r bfa6c1b8a03c -r 5a29ab10dba6 snpEff_create_db.xml
--- a/snpEff_create_db.xml Tue Mar 27 14:56:29 2018 -0400
+++ b/snpEff_create_db.xml Thu Mar 29 20:39:49 2018 -0400
@@ -1,4 +1,4 @@
-
+
 database from Genbank record
 snpEff_macros.xml
@@ -23,11 +23,11 @@
 #end if
 mkdir -p '${snpeff_output.files_path}'/'${genome_version}' &&
-
+
 ln -s '${input_gbk}' '${snpeff_output.files_path}'/'${genome_version}'/genes.gbk &&
- snpEff @java_options@ build -v
- -configOption '${genome_version}'.genome='${genome_version}'
+ snpEff @java_options@ build -v
+ -configOption '${genome_version}'.genome='${genome_version}'
  -configOption '${genome_version}'.codonTable='${codon_table}'
  -genbank
  -dataDir '$snpeff_output.files_path' '$genome_version'
diff -r bfa6c1b8a03c -r 5a29ab10dba6 snpEff_databases.xml
--- a/snpEff_databases.xml Tue Mar 27 14:56:29 2018 -0400
+++ b/snpEff_databases.xml Thu Mar 29 20:39:49 2018 -0400
@@ -1,4 +1,4 @@
-
+
 list available databases
 snpEff_macros.xml
@@ -10,18 +10,18 @@
 '${snpeff_dbs}'
-
+
 ]]>
@@ -69,7 +69,7 @@
 mm10 Mouse http://downloads.sourceforge.net/project/snpeff/databases/v4_3/snpEff_v4_3_mm10.zip
 mm9 Mouse http://downloads.sourceforge.net/project/snpeff/databases/v4_3/snpEff_v4_3_mm9.zip
-This means that there two available snpEff databases for mouse genome versions mm9 and mm10. In order to download these databases you should use identifier from the first column (e.g., mm9 or mm10 in this case).
+This means that there two available snpEff databases for mouse genome versions mm9 and mm10. In order to download these databases you should use identifier from the first column (e.g., mm9 or mm10 in this case).
 -------
@@ -80,7 +80,7 @@
 There are two ways to use names of databases obtained with this tool in Galaxy's version on snpEff:
  #. Use **SnpEff download** tool. It will download the database to the history and you will be able to use it in **SnpEff eff** tool using *Downloaded snpEff database in your history* option of the **Genome source** parameter.
- #. Use *Download on demand* option of the **SnpEff eff** tool (again, **Genome source** parameter). In this case snpEff will download the database before performing annotation.
+ #. Use *Download on demand* option of the **SnpEff eff** tool (again, **Genome source** parameter). In this case snpEff will download the database before performing annotation.
 @snpeff_in_galaxy_info@
 @external_documentation@
diff -r bfa6c1b8a03c -r 5a29ab10dba6 snpEff_download.xml
--- a/snpEff_download.xml Tue Mar 27 14:56:29 2018 -0400
+++ b/snpEff_download.xml Thu Mar 29 20:39:49 2018 -0400
@@ -1,4 +1,4 @@
-
+
 download a pre-built database
 snpEff_macros.xml
@@ -34,7 +34,7 @@
 **What it does**
-This tool downloads a specified database from @snpeff_database_url@. It deposits it into the history.
+This tool downloads a specified database from @snpeff_database_url@. It deposits it into the history.
 -------
@@ -46,7 +46,7 @@
  #. Download mm10 snpEff database by typing *mm10* into **Select the annotation database...** text box.
  #. Use **SnpEff eff** by choosing the downloaded database from the history using *Downloaded snpEff database in your history* option of the **Genome source** parameter.
-
+
 @snpeff_in_galaxy_info@
 @external_documentation@
 ]]>
diff -r bfa6c1b8a03c -r 5a29ab10dba6 snpEff_macros.xml
--- a/snpEff_macros.xml Tue Mar 27 14:56:29 2018 -0400
+++ b/snpEff_macros.xml Thu Mar 29 20:39:49 2018 -0400
@@ -13,7 +13,7 @@
 snpEff -version
 ]]>
- 4.3.1t
+ 4.3+T
 SnpEff4.3
 https://sourceforge.net/projects/snpeff/files/databases/v4_3/
  -Xmx\${GALAXY_MEMORY_MB:-8192}m
@@ -36,7 +36,7 @@
 **Pre-cached databases**
-Many standard (e.g., human, mouse, *Drosophila*) databases are likely pre-cached within a given Galaxy instance. You should be able to see them listed in **Genome** drop-down of **SbpEff eff** tool.
+Many standard (e.g., human, mouse, *Drosophila*) databases are likely pre-cached within a given Galaxy instance. You should be able to see them listed in **Genome** drop-down of **SbpEff eff** tool.
 In you *do not see them* keep reading...
@@ -48,17 +48,17 @@
  #. Use **SnpEff download** tool to download the database.
  #. Finally, use **SnpEff eff** by choosing the downloaded database from the history using *Downloaded snpEff database in your history* option of the **Genome source** parameter.
-Alternatively, you can specify the name of the database directly in **SnpEff eff** using the *Download on demand* option (again, **Genome source** parameter). In this case snpEff will download the database before performing annotation.
+Alternatively, you can specify the name of the database directly in **SnpEff eff** using the *Download on demand* option (again, **Genome source** parameter). In this case snpEff will download the database before performing annotation.
 **Create your own database**
 In cases when you are dealing with bacterial or viral (or, frankly, any other) genomes it may be easier to create database yourself. For this you need:
- #. Download Genbank record corresponding to your genome of interest from from NCBI.
- #. Use **SnpEff build** to create the database.
+ #. Download Genbank record corresponding to your genome of interest from NCBI.
+ #. Use **SnpEff build** to create the database.
  #. Use the database in **SnpEff eff** (using *Custom* option for **Genome source** parameter).
-Creating custom database has one benefit. The **SnpEff build** tool normally produces two outputs: (1) a SnpEff database and (2) FASTA file containing sequences from the Genbank file. If you are performing your experiment from the beginning by mapping reads against a genome and finding variants before annotating them with SnpEff you can use **this FASTA file** as a reference to map your reads against. This will guarantee that you will not have any issues related to reference sequence naming -- the most common source of SnpEff errors.
+Creating custom database has one benefit. The **SnpEff build** tool normally produces two outputs: (1) a SnpEff database and (2) FASTA file containing sequences from the Genbank file. If you are performing your experiment from the beginning by mapping reads against a genome and finding variants before annotating them with SnpEff you can use **this FASTA file** as a reference to map your reads against. This will guarantee that you will not have any issues related to reference sequence naming -- the most common source of SnpEff errors.
@@ -70,4 +70,4 @@
-
\ No newline at end of file
+