<tool id="poisson2test" name="Poisson two-sample test" version="1.0.0">
  <description></description>
  <requirements>
    <requirement type="package">taxonomy</requirement>
  </requirements>
  <command interpreter="python">poisson2test.py $input1 $input2 $input3 $input4 $input5 $output1 2>/dev/null </command>
  <inputs>
    <param name="input1" format="tabular" type="data" label="Input File"/>
    <param name="input2" type="integer" size="5" value="2" label="First Column"/>
    <param name="input3" type="integer" size="5" value="3" label="Second Column"/>
    <param name="input4" type="float" size="5" value="1" label="D value"/>
    <param name="input5" type="select" label="correction method">
      <option value="0">Bonferroni</option>
      <option value="1">FDR</option>
    </param>
  </inputs>
  <outputs>
    <data format="tabular" name="output1" />
  </outputs>
  <tests>
    <test>
      <param name="input1" value="poisson2test1.tabular" ftype="tabular"/>
      <param name="input2" value="2" />
      <param name="input3" value="3" />
      <param name="input4" value="0.44" />
      <param name="input5" value="0" />
      <output name="output1" file="poisson2test1_out.tabular" />
    </test>
    <test>
      <param name="input1" value="poisson2test2.tabular" ftype="tabular"/>
      <param name="input2" value="2" />
      <param name="input3" value="3" />
      <param name="input4" value="0.44" />
      <param name="input5" value="0" />
      <output name="output1" file="poisson2test2_out.tabular" />
    </test>
  </tests>
  <help>

**What it does**

Suppose you have metagenomic samples from two different locations and have assigned the reads to various taxa. You now want to test whether the number of reads that fall in a particular taxon at location 1 differs from the number that fall in the same taxon at location 2.
This utility performs that test. It assumes that the data come from a Poisson process and calculates two Z scores: Z1, based on Shiue and Bain (1982), and Z2, based on Huffman (1984).

-----

**Z score formula**

Equation 1:

.. image:: ./static/images/poisson2test_eqn1.png


Equation 2:

.. image:: ./static/images/poisson2test_eqn2.png


X = number of reads falling in a particular taxon in location 1

Y = number of reads falling in the same taxon in location 2

d = correction factor that accounts for biases in sample collection, DNA concentration, read numbers, etc. between the two locations.

The utility also reports the p-value for each Z score and a corrected p-value, using either the Bonferroni or the False Discovery Rate (FDR) correction. It takes four inputs: a tab-delimited file with three or more columns (taxon/category, read counts in location 1, read counts in location 2), the two columns to compare, the d value, and the correction method, 0 (Bonferroni) or 1 (FDR).


-----

**Example**

- Input File: phylum, read count in location-1, read count in location-2::

    Annelida         36      2
    Apicomplexa      17      8
    Arthropoda       1964    928
    Ascomycota       436     49
    Basidiomycota    77      55

- Arguments to be supplied by the user::

    col_i    col_j    d-value    correction-method

    2        3        0.44       Bonferroni

- Output File: phylum, readcount1, readcount2, z1, z2, p1, p2, corrected p1, corrected p2::

    Annelida         36      2      3.385     4.276    0.000356    0.000010    0.00463    0.00012
    Apicomplexa      17      8      -0.157    -0.156   0.437707    0.438103    1.00000    1.00000
    Arthropoda       1964    928    -1.790    -1.777   0.036755    0.037744    0.47782    0.49067
    Ascomycota       436     49     9.778     11.418   0.000000    0.000000    0.00000    0.00000
    Basidiomycota    77      55     -2.771    -2.659   0.002792    0.003916    0.03629    0.05091

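
The Z scores and uncorrected p-values in the example output can be checked with a short script. The sketch below is not taken from poisson2test.py: the two formulas are inferred from the example rows shown above (they reproduce the listed z1, z2 and p1 values) and may be written differently in the equation images, and the helper names ``z_scores`` and ``upper_tail_p`` are hypothetical.

```python
import math


def z_scores(x, y, d):
    """Return (z1, z2) for read counts x and y and correction factor d.

    Assumed forms, inferred from the example output rather than
    from the tool's source or its equation images.
    """
    z1 = (d * x - y) / math.sqrt(d * (x + y))
    z2 = 2.0 * (math.sqrt(d * x) - math.sqrt(y)) / math.sqrt(1.0 + d)
    return z1, z2


def upper_tail_p(z):
    """One-sided standard normal tail probability beyond |z|."""
    return 0.5 * math.erfc(abs(z) / math.sqrt(2.0))


# Three rows from the example input, with d = 0.44.
rows = [("Annelida", 36, 2), ("Apicomplexa", 17, 8), ("Ascomycota", 436, 49)]
for taxon, x, y in rows:
    z1, z2 = z_scores(x, y, 0.44)
    print("%s\t%.3f\t%.3f\t%.6f" % (taxon, z1, z2, upper_tail_p(z1)))
```

For Annelida this gives z1 = 3.385 and z2 = 4.276, matching the output file. The corrected values in the example are consistent with multiplying each p-value by 13 and capping at 1, which suggests the rows shown are an excerpt of a 13-row input; that is an inference from the numbers, not something stated by the tool.
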

-----

**Note**

- The input file must be tab-delimited.
- The first column index (i) must be smaller than the second column index (j).
- d cannot be 0.
- The correction method (k) is either Bonferroni or FDR.

-----

**References**

- Shiue, W. and Bain, L. (1982). Experiment Size and Power Comparisons for Two-Sample Poisson Tests. Applied Statistics 31, 130-134.

- Huffman, M. D. (1984). An Improved Approximate Two-Sample Poisson Test. Applied Statistics 33, 224-226.

  </help>
</tool>