comparison tools/human_genome_variation/gpass.xml @ 0:9071e359b9a3
Uploaded
author | xuebing |
---|---|
date | Fri, 09 Mar 2012 19:37:19 -0500 |
comparison
-1:000000000000 | 0:9071e359b9a3 |
---|---|
1 <tool id="hgv_gpass" name="GPASS" version="1.0.0"> | |
2 <description>significant single-SNP associations in case-control studies</description> | |
3 | |
4 <command interpreter="perl"> | |
5 gpass.pl ${input1.extra_files_path}/${input1.metadata.base_name}.map ${input1.extra_files_path}/${input1.metadata.base_name}.ped $output $fdr | |
6 </command> | |
7 | |
8 <inputs> | |
9 <param name="input1" type="data" format="lped" label="Dataset"/> | |
10 <param name="fdr" type="float" value="0.05" label="FDR"/> | |
11 </inputs> | |
12 | |
13 <outputs> | |
14 <data name="output" format="tabular" /> | |
15 </outputs> | |
16 | |
17 <requirements> | |
18 <requirement type="package">gpass</requirement> | |
19 </requirements> | |
20 | |
21 <!-- we need to be able to set the seed for the random number generator | |
22 <tests> | |
23 <test> | |
24 <param name='input1' value='gpass_and_beam_input' ftype='lped' > | |
25 <metadata name='base_name' value='gpass_and_beam_input' /> | |
26 <composite_data value='gpass_and_beam_input.ped' /> | |
27 <composite_data value='gpass_and_beam_input.map' /> | |
28 <edit_attributes type='name' value='gpass_and_beam_input' /> | |
29 </param> | |
30 <param name="fdr" value="0.05" /> | |
31 <output name="output" file="gpass_output.txt" /> | |
32 </test> | |
33 </tests> | |
34 --> | |
35 | |
36 <help> | |
37 **Dataset formats** | |
38 | |
39 The input dataset must be in lped_ format, and the output is tabular_. | |
40 (`Dataset missing?`_) | |
41 | |
42 .. _lped: ./static/formatHelp.html#lped | |
43 .. _tabular: ./static/formatHelp.html#tab | |
44 .. _Dataset missing?: ./static/formatHelp.html | |
45 | |
46 ----- | |
47 | |
48 **What it does** | |
49 | |
50 GPASS (Genome-wide Poisson Approximation for Statistical Significance) | |
51 detects significant single-SNP associations in case-control studies at a user-specified FDR. Unlike previous methods, this tool can accurately approximate the genome-wide significance and FDR of SNP associations, while adjusting for millions of multiple comparisons, within seconds or minutes. | |
52 | |
53 The program has two main functionalities: | |
54 | |
55 1. Detect significant single-SNP associations at a user-specified false | |
56 discovery rate (FDR). | |
57 | |
58 *Note*: a "typical" definition of FDR could be | |
59 FDR = E(# of false positive SNPs / # of significant SNPs) | |
60 | |
61 This definition, however, is poorly suited to association mapping, since SNPs are | |
62 highly correlated. Our FDR is | |
63 defined differently to account for SNP correlations, and thus yields | |
64 a proper FDR in terms of the "proportion of false positive loci". | |
65 | |
66 2. Approximate the significance of a list of candidate SNPs, adjusting for | |
67 multiple comparisons. If you have isolated a few SNPs of interest and want | |
68 to know their significance in a GWAS, you can supply the GWAS data and let | |
69 the program specifically test those SNPs. | |
70 | |
71 | |
72 *Also note*: the SNPs in a study cannot be both too few and, at the same | |
73 time, too clustered in a local region. A few hundred SNPs, or tens of SNPs | |
74 spread across different regions, will be fine. The sample size cannot be too small | |
75 either; around 100 or more individuals (cases and controls combined) will be fine. | |
76 Otherwise, use a permutation test instead. | |
77 | |
78 ----- | |
79 | |
80 **Example** | |
81 | |
82 - input map file:: | |
83 | |
84 1 rs0 0 738547 | |
85 1 rs1 0 5597094 | |
86 1 rs2 0 9424115 | |
87 etc. | |
88 | |
89 - input ped file:: | |
90 | |
91 1 1 0 0 1 1 G G A A A A A A A A A G A A G G G G A A G G G G G G A A A A A G A A G G A G A G A A G G A A G G A A G G A G A A G G A A G G A A A G A G G G A G G G G G A A A G A A G G G G G G G G A G A A A A A A A A | |
92 1 1 0 0 1 1 G G A G G G A A A A A G A A G G G G G G A A G G A G A G G G G G A G G G A G A A G G A G G G A A G G G G A G A G G G A G A A A A G G G G A G A G G G A G A A A A A G G G A G G G A G G G G G A A G G A G | |
93 etc. | |
94 | |
95 - output dataset, showing significant SNPs and their p-values and FDR:: | |
96 | |
97 #ID chr position Statistics adj-Pvalue FDR | |
98 rs35 chr1 136606952 4.890849 0.991562 0.682138 | |
99 rs36 chr1 137748344 4.931934 0.991562 0.795827 | |
100 rs44 chr2 14423047 7.712832 0.665086 0.218776 | |
101 etc. | |
102 | |
103 ----- | |
104 | |
105 **Reference** | |
106 | |
107 Zhang Y, Liu JS. (2010) | |
108 Fast and accurate significance approximation for genome-wide association studies. | |
109 Submitted. | |
110 | |
111 </help> | |
112 </tool> |