annotate snpSift_filter.xml @ 9:937367efb1da default tip

Change tool dependency to package_snpeff_3_2, now uses environment variable: SNPEFF_JAR_PATH for the location of snpeff jar files.
author Jim Johnson <jj@umn.edu>
date Wed, 18 Sep 2013 10:49:56 -0500
parents 13b6ad2ddace
children
<tool id="snpSift_filter" name="SnpSift Filter" version="3.2">
    <options sanitize="False" />
    <description>Filter variants using arbitrary expressions</description>
    <requirements>
        <requirement type="package" version="3.2">snpEff</requirement>
    </requirements>
    <command>
        java -Xmx6G -jar \$SNPEFF_JAR_PATH/SnpSift.jar filter -f $input -e $exprFile $inverse $pass
        #if $filterId and len($filterId.__str__.strip()) > 0:
            --filterId = "$filterId"
        #end if
        #if $addFilter and len($addFilter.__str__.strip()) > 0:
            --addFilter = "$addFilter"
        #end if
        #if $rmFilter and len($rmFilter.__str__.strip()) > 0:
            --rmFilter = "$rmFilter"
        #end if
        > $output
    </command>
    <inputs>
        <param format="vcf" name="input" type="data" label="VCF input"/>
        <param name="expr" type="text" label="Expression" size="120"/>
        <param name="inverse" type="boolean" truevalue="--inverse" falsevalue="" checked="false" label="Inverse. Show lines that do not match filter expression"/>
        <param name="pass" type="boolean" truevalue="--pass" falsevalue="" checked="false" label="Use 'PASS' field instead of filtering out VCF entries"/>
        <param name="filterId" type="text" value="" optional="true" label="ID for this filter (##FILTER tag in header and FILTER VCF field)." size="10"/>
        <param name="addFilter" type="text" value="" optional="true" label="Add a string to FILTER VCF field if 'expression' is true." size="10"/>
        <param name="rmFilter" type="text" value="" optional="true" label="Remove a string from FILTER VCF field if 'expression' is true (and 'str' is in the field)." size="10"/>
    </inputs>
    <configfiles>
        <configfile name="exprFile">
$expr
        </configfile>
    </configfiles>

    <outputs>
        <data format="vcf" name="output" />
    </outputs>
    <stdio>
        <exit_code range=":-1" level="fatal" description="Error: Cannot open file" />
        <exit_code range="1:" level="fatal" description="Error" />
    </stdio>

    <tests>

        <test>
            <param name="input" ftype="vcf" value="test01.vcf"/>
            <param name="expr" value="QUAL >= 50"/>
            <output name="output">
                <assert_contents>
                    <has_text text="28837706" />
                    <not_has_text text="NT_166464" />
                </assert_contents>
            </output>
        </test>

        <test>
            <param name="input" ftype="vcf" value="test01.vcf"/>
            <param name="expr" value="(CHROM = '19')"/>
            <output name="output">
                <assert_contents>
                    <has_text text="3205820" />
                    <not_has_text text="NT_16" />
                </assert_contents>
            </output>
        </test>

        <test>
            <param name="input" ftype="vcf" value="test01.vcf"/>
            <param name="expr" value="(POS >= 20175) &amp; (POS &lt;= 35549)"/>
            <output name="output">
                <assert_contents>
                    <has_text text="20175" />
                    <has_text text="35549" />
                    <has_text text="22256" />
                    <not_has_text text="18933" />
                    <not_has_text text="37567" />
                </assert_contents>
            </output>
        </test>

        <test>
            <param name="input" ftype="vcf" value="test01.vcf"/>
            <param name="expr" value="( DP >= 5 )"/>
            <output name="output">
                <assert_contents>
                    <has_text text="DP=5;" />
                    <has_text text="DP=6;" />
                    <not_has_text text="DP=1;" />
                </assert_contents>
            </output>
        </test>

    </tests>

    <help>

**SnpSift filter**

You can filter a VCF file using arbitrary expressions, for instance "(QUAL > 30) | (exists INDEL) | ( countHet() > 2 )". Expressions can be quite complex, which allows for a lot of flexibility.

Some examples:

  - *I want to filter out samples with quality less than 30*:

    * **( QUAL &gt; 30 )**

  - *...but we also want InDels that have quality 20 or more*:

    * **(( exists INDEL ) &amp; (QUAL >= 20)) | (QUAL >= 30 )**

  - *...or any homozygous variant present in more than 3 samples*:

    * **(countHom() > 3) | (( exists INDEL ) &amp; (QUAL >= 20)) | (QUAL >= 30 )**

  - *...or any heterozygous sample with coverage 25 or more*:

    * **((countHet() > 0) &amp; (DP >= 25)) | (countHom() > 3) | (( exists INDEL ) &amp; (QUAL >= 20)) | (QUAL >= 30 )**

  - *I want to keep samples where the genotype for the first sample is homozygous variant and the genotype for the second sample is reference*:

    * **isHom( GEN[0] ) &amp; isVariant( GEN[0] ) &amp; isRef( GEN[1] )**
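
As a rough illustration of how the second example above is evaluated (InDels with quality 20 or more, or any record with quality 30 or more), the following Python sketch applies the same logic to a few made-up VCF records. It is not part of SnpSift: the helper names ``parse_info`` and ``keep`` and the sample records are hypothetical, and it assumes that QUAL is numeric and that ``exists INDEL`` is true when the INDEL key appears in the INFO column.

```python
def parse_info(info):
    """Split a VCF INFO column into a dict; flag entries map to True."""
    fields = {}
    for item in info.split(";"):
        if not item or item == ".":
            continue
        key, sep, value = item.partition("=")
        fields[key] = value if sep else True
    return fields


def keep(vcf_line):
    """Hypothetical stand-in for: (INDEL exists and QUAL >= 20) or QUAL >= 30."""
    cols = vcf_line.rstrip("\n").split("\t")
    qual = float(cols[5])          # column 6 is QUAL (assumed numeric here)
    info = parse_info(cols[7])     # column 8 is INFO
    return ("INDEL" in info and qual >= 20) or qual >= 30


# Made-up records: CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO
records = [
    "1\t100\t.\tA\tG\t35.0\t.\tDP=12",
    "1\t200\t.\tAT\tA\t22.0\t.\tINDEL;DP=8",
    "1\t300\t.\tC\tT\t22.0\t.\tDP=8",
]
kept = [r.split("\t")[1] for r in records if keep(r)]
print(kept)
```

Here the records at positions 100 and 200 are kept: the first passes on quality alone, the second only because it is flagged as an InDel. The record at position 300 has the same quality as the InDel but no INDEL flag, so it is dropped.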


For complete details about this tool and the expressions that can be used, please go to http://snpeff.sourceforge.net/SnpSift.html#filter

    </help>
</tool>