comparison tools/rgenetics/rgQQ.xml @ 0:9071e359b9a3
Uploaded
author | xuebing |
---|---|
date | Fri, 09 Mar 2012 19:37:19 -0500 |
parents | |
children |
-1:000000000000 | 0:9071e359b9a3 |
---|---|
1 <tool id="rgQQ1" name="QQ Plots:"> | |
2 <code file="rgQQ_code.py"/> | |
3 | |
4 <description>for p values from an analysis </description> | |
5 | |
6 <command interpreter="python"> | |
7 rgQQ.py "$input1" "$title" "$sample" "$cols" "$allqq" "$height" "$width" "$logtrans" "$allqq.id" "$__new_file_path__" | |
8 </command> | |
9 | |
10 <inputs> | |
11 <page> | |
12 <param name="input1" type="data" label="Choose the History dataset containing p values to QQ plot" | |
13 size="80" format="tabular" help="Dataset missing? See Tip below" /> | |
14 <param name="title" type="text" size="80" label = "Descriptive title for QQ plot" value="QQ" /> | |
15 | |
16 <param name="logtrans" type="boolean" label = "Use a log scale - recommended for p values in range 0-1.0" | |
17 truevalue="true" falsevalue="false"/> | |
18 <param name="sample" type="float" label="Random sample fraction - set to 1.0 for all data points" value="0.01" | |
19 help="If you have a million values, the QQ plots will be huge - a random sample of 1% will be fine" /> | |
20 <param name="height" type="integer" label="PDF image height (inches)" value="6" /> | |
21 <param name="width" type="integer" label="PDF image width (inches)" value="6" /> | |
22 </page> | |
23 <page> | |
24 <param name="cols" type="select" display="checkboxes" multiple="True" | |
25 help="Choose from these numeric columns in the data file to make a quantile-quantile plot against a uniform distribution" | |
26 label="Columns (e.g. p values in the range 0-1) to make QQ plots" dynamic_options="get_columns( input1 )" /> | |
27 </page> | |
28 </inputs> | |
29 | |
30 <outputs> | |
31 <data format="pdf" name="allqq" label="${title}.pdf"/> | |
32 </outputs> | |
33 | |
34 <tests> | |
35 <test> | |
36 <param name='input1' value='tinywga.pphe' /> | |
37 <param name='title' value="rgQQtest1" /> | |
38 <param name='logtrans' value="false" /> | |
39 <param name='sample' value='1.0' /> | |
40 <param name='height' value='8' /> | |
41 <param name='width' value='10' /> | |
42 <param name='cols' value='3' /> | |
43 <output name='allqq' file='rgQQtest1.pdf' ftype='binary' compare="diff" lines_diff="29"/> | |
44 </test> | |
45 </tests> | |
46 | |
47 <help> | |
48 | |
49 .. class:: infomark | |
50 | |
51 **Explanation** | |
52 | |
53 A quantile-quantile (QQ) plot is a good way to see systematic departures from the null expectation of uniform p-values | |
54 from a genomic analysis. If the QQ plot shows departure from the null (i.e. a uniform 0-1 distribution), you hope that this will be | |
55 in the very smallest p-values, suggesting that there might be some interesting results to look at. A log scale makes departures | |
56 from the null at low p values clearer. | |
57 | |
58 ----- | |
59 | |
60 .. class:: infomark | |
61 | |
62 **Syntax** | |
63 | |
64 This tool has 2 pages. On the first page you choose the data set and output options; on the second page, the | |
65 column names are shown so you can choose those containing the p values you wish to plot. | |
66 | |
67 - **History data** is one of your history tabular data sets | |
68 - **Descriptive Title** is the text to appear in the output file names to remind you what the plots are! | |
69 - **Use a Log scale** is recommended for p values in the range 0-1 as it highlights departures from the null at small p values | |
70 - **Random Sample Fraction** is the fraction of points to randomly sample - highly recommended for >5k or so values | |
71 - **Height and Width** will determine the scale of the pdf images | |
72 | |
73 | |
74 ----- | |
75 | |
76 .. class:: infomark | |
77 | |
78 **Summary** | |
79 | |
80 Generate a uniform QQ plot for any large number of p values from an analysis. | |
81 Essentially a plot of n ranked p values against their rank expressed as a fraction, i.e. rank/n. | |
82 | |
83 Works well where you have a column containing p values from | |
84 a statistical test of some sort. These will be plotted against the values expected under the null. Departure | |
85 from the diagonal suggests one distribution is more extreme than the other. You hope your p values are | |
86 smaller than expected under the null. | |
87 | |
88 The sampling fraction will help cut down the size of the pdfs. If there are fewer than 5k points on any plot, all will be shown. | |
89 Otherwise the number of points plotted is the sampling fraction of the total or 5k, whichever is larger. | |
90 | |
91 Note that the use of a log scale is ill-advised if your p values are already log transformed, because the | |
92 uniform distribution chosen for the QQ plot is always 0-1 and the log transformation is applied if required. | |
93 The most useful plots for p values are log QQ plots of untransformed p values in the range 0-1. | |
94 | |
95 Originally designed and written for family-based data from the CAMP Illumina run of 2007 by | |
96 Ross Lazarus (ross.lazarus@gmail.com) | |
97 | |
98 </help> | |
99 </tool> |
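
The help text above describes the plot as n ranked p values against rank/n, with an optional log scale and a sampling rule that keeps at least 5k points. The following is a minimal Python sketch of that description only. It is not taken from rgQQ.py, whose contents are not shown here; the function name `qq_points`, its parameters, and the choice to drop p values outside (0, 1] are assumptions made for illustration.

```python
import math
import random


def qq_points(pvalues, sample_fraction=1.0, min_points=5000, log_scale=True, seed=None):
    """Return (expected, observed) pairs for a uniform QQ plot.

    Hypothetical helper sketched from the tool's help text, not from rgQQ.py.
    """
    # Assumption: values outside (0, 1] are dropped so the log transform is defined.
    observed = sorted(p for p in pvalues if 0.0 < p <= 1.0)
    n = len(observed)

    # Under a uniform 0-1 null, the expected value for the i-th smallest
    # p value is taken to be rank/n, as the help text describes.
    points = [(rank / n, p) for rank, p in enumerate(observed, start=1)]

    # Sampling rule from the help text: with fewer than min_points values,
    # keep everything; otherwise keep the larger of min_points and
    # sample_fraction * n (never more than n).
    if n > min_points:
        keep = min(n, max(min_points, int(round(sample_fraction * n))))
        rng = random.Random(seed)
        points = sorted(rng.sample(points, keep))

    # Optional -log10 transform of both axes, which stretches the small p values.
    if log_scale:
        points = [(-math.log10(e), -math.log10(o)) for e, o in points]
    return points
```

Plotting the returned pairs with expected values on the x axis and observed values on the y axis gives the QQ plot; points above the diagonal on the log scale correspond to p values smaller than expected under the null.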