diff snpEff/snpEff.xml @ 0:481a95ca5339

Migrated tool version 0.9 from old tool shed archive to new tool shed repository
author pcingola
date Tue, 07 Jun 2011 17:08:59 -0400
parents
children
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/snpEff/snpEff.xml	Tue Jun 07 17:08:59 2011 -0400
@@ -0,0 +1,99 @@
+<tool id="snpEff" name="Compute SNP effect" version="0.9">
+	<description>for each SNP in a file</description>
+	<command>java -jar /usr/local/snpEff/snpEff.jar -c /usr/local/snpEff/snpEff.config $genomeVersion $input > $output</command>
+	<!-- <command>java -Xmx2048M -jar /usr/local/snpEff/snpEff.jar -c /usr/local/snpEff/snpEff.config $genomeVersion $input > $output</command> -->
+	<inputs>
+		<param format="interval" name="input" type="data" label="Source file"/>
+		<param name="genomeVersion" type="select" label="Genome">
+			<option value="hg37">Human (hg37)</option>
+			<option value="mm37">Mouse (mm37)</option>
+		</param>
+	</inputs>
+	<outputs>
+		<data format="tabular" name="output" />
+	</outputs>
+
+	<help>
+
+This tool computes the effect of SNPs. Current predictions include:
+
+	* GENE : ENSEMBL gene ID, gene name and bio-type
+	* TRANSCRIPT : ENSEMBL transcript ID
+	* INTRON : The SNP hits a transcript, but no exons (or UTRs)
+	* EXON : ENSEMBL exon ID
+		* SYNONYMOUS_CODING : The SNP changes the DNA sequence in a way that produces the same amino acid
+		* NON_SYNONYMOUS_CODING : The SNP changes the DNA sequence in a way that produces a different amino acid
+		* STOP_GAINED : The SNP creates a new STOP codon
+		* STOP_LOST : The SNP changes a STOP codon into an amino acid
+	* UPSTREAM : The SNP is within 2K bases upstream of a transcript (before 5 prime UTR)
+	* DOWNSTREAM : The SNP is within 2K bases downstream of a transcript (after 3 prime UTR)
+	* 5PRIME_UTR : The SNP is in the 5 prime UTR region
+	* 3PRIME_UTR : The SNP is in the 3 prime UTR region
+	* INTERGENIC : The SNP does not hit any known gene or up/downstream region
+
+-----
+
+.. class:: infomark
+
+**File format**
+
+The input file must be tab-separated and contain five columns that correspond to:
+
+	* chromosome_name
+	* chromosome_start_position
+	* chromosome_end_position
+	* allele: "base_Ori / base_Snp"
+	* strand: {+,-}
+
+.. class:: warningmark 
+
+**WARNING** Insertions and deletions are not supported, so chromosome_start_position must be equal to chromosome_end_position.
+ 
+-----
+
+.. class:: infomark
+
+**Input file format example**
+
+This is an example of an input file::
+
+    5   140532    140532    T/C   +
+    12  1017956   1017956   T/A   +
+    2   946507    946507    G/C   +
+    14  19584687  19584687  C/T   -
+    19  66520     66520     G/A   +
+    8   150029    150029    A/T   +
+
+-----
+
+.. class:: infomark
+
+**Output file format example**
+
+The output file consists of one line per SNP effect, so you usually get more than one line per SNP. The format is tab-separated, consisting of two columns:
+
+	* SNP : chr:position_baseOri/baseSnp
+	* Effect : Effect[, EXON:ExonId, TRANSCRIPT:TranscriptId, GENE:GeneId (geneName bioType)]
+
+This is an example of an output file::
+
+    chr2:946507_G/C        UPSTREAM, ENST00000452177
+    chr2:946507_G/C        UPSTREAM, ENST00000450962
+    chr2:946507_G/C        UPSTREAM, ENST00000308624
+    chr2:946507_G/C        UPSTREAM, ENST00000407292
+    chr5:140532_T/C        NON_SYNONYMOUS_CODING (V/A), EXON:ENSE00001319336, TRANSCRIPT:ENST00000283426, GENE:ENSG00000153404(PLEKHG4B protein_coding)
+    chr5:140532_T/C        NON_SYNONYMOUS_CODING (V/A), EXON:ENSE00001319336, TRANSCRIPT:ENST00000398036, GENE:ENSG00000153404(PLEKHG4B protein_coding)
+    chr5:140532_T/C        DOWNSTREAM, ENST00000398036
+    chr8:150029_A/T        NON_SYNONYMOUS_CODING (Y/N), EXON:ENSE00001913609, TRANSCRIPT:ENST00000490482, GENE:ENSG00000223508(RPL23AP53 protein_coding)
+    chr12:1017956_T/A      STOP_LOST (*/K), EXON:ENSE00001527897, TRANSCRIPT:ENST00000340908, GENE:ENSG00000060237(WNK1 protein_coding)
+    chr12:1017956_T/A      STOP_LOST (*/K), EXON:ENSE00001527897, TRANSCRIPT:ENST00000252477, GENE:ENSG00000060237(WNK1 protein_coding)
+    chr12:1017956_T/A      STOP_LOST (*/K), EXON:ENSE00001527897, TRANSCRIPT:ENST00000315939, GENE:ENSG00000060237(WNK1 protein_coding)
+    chr12:1017956_T/A      UPSTREAM, ENST00000340908
+    chr14:19584687_C/T     3PRIME_UTR, ENSE00001583193, TRANSCRIPT:ENST00000409832, GENE:ENSG00000222036(POTEG protein_coding)
+
+.. class:: warningmark 
+
+**WARNING** You may get the same effect on one exon repeated, because the exon can belong to several transcripts and the effect is reported once per transcript.
+	</help>
+</tool>
+
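The input and output formats described in the help text above can be checked and parsed with a few lines of Python. The following is a minimal sketch, not part of snpEff or of this changeset: the names `SnpRecord`, `EffectRecord`, `parse_input_line`, `parse_output_line` and `group_effects` are hypothetical helpers introduced only for illustration. It assumes input columns are separated by tabs, as the help text states, and that the first run of whitespace in an output line separates the SNP column from the effect column.

```python
from collections import defaultdict
from typing import Dict, Iterable, List, NamedTuple


class SnpRecord(NamedTuple):
    """One row of the five-column input file (hypothetical helper type)."""
    chrom: str
    start: int
    end: int
    ref: str     # base_Ori in the help text
    alt: str     # base_Snp in the help text
    strand: str


class EffectRecord(NamedTuple):
    """One row of the two-column output file (hypothetical helper type)."""
    chrom: str
    pos: int
    ref: str
    alt: str
    effect: str  # effect name only, e.g. "UPSTREAM" or "STOP_LOST"
    detail: str  # the full, unparsed effect column


def parse_input_line(line: str) -> SnpRecord:
    """Parse one input line and enforce the SNP-only restriction."""
    fields = [f.strip() for f in line.rstrip("\n").split("\t")]
    if len(fields) != 5:
        raise ValueError(f"expected 5 tab-separated columns, got {len(fields)}")
    chrom, start, end, allele, strand = fields
    ref, sep, alt = (part.strip() for part in allele.partition("/"))
    if not sep or len(ref) != 1 or len(alt) != 1:
        raise ValueError(f"allele must look like 'T/C', got {allele!r}")
    if int(start) != int(end):
        # Insertions and deletions are not supported by the tool.
        raise ValueError("start and end positions must be equal")
    if strand not in ("+", "-"):
        raise ValueError(f"strand must be '+' or '-', got {strand!r}")
    return SnpRecord(chrom, int(start), int(end), ref, alt, strand)


def parse_output_line(line: str) -> EffectRecord:
    """Split 'chr:position_baseOri/baseSnp <whitespace> Effect[, ...]'."""
    snp, detail = line.strip().split(None, 1)
    locus, alleles = snp.rsplit("_", 1)
    chrom, pos = locus.rsplit(":", 1)
    ref, alt = alleles.split("/")
    # Drop the optional "(V/A)" amino-acid annotation and the trailing IDs.
    effect = detail.split(",", 1)[0].split("(", 1)[0].strip()
    return EffectRecord(chrom, int(pos), ref, alt, effect, detail)


def group_effects(lines: Iterable[str]) -> Dict[str, List[str]]:
    """Collect effect names per SNP, since one SNP usually yields several lines."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for line in lines:
        if line.strip():
            rec = parse_output_line(line)
            grouped[f"{rec.chrom}:{rec.pos}"].append(rec.effect)
    return dict(grouped)
```

Grouping by SNP makes the repetition mentioned in the last warning visible: the same effect name appears once for each transcript it was reported against.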