comparison tools/human_genome_variation/gpass.xml @ 0:9071e359b9a3

Uploaded
author xuebing
date Fri, 09 Mar 2012 19:37:19 -0500
1 <tool id="hgv_gpass" name="GPASS" version="1.0.0">
2 <description>significant single-SNP associations in case-control studies</description>
3
4 <command interpreter="perl">
5 gpass.pl ${input1.extra_files_path}/${input1.metadata.base_name}.map ${input1.extra_files_path}/${input1.metadata.base_name}.ped $output $fdr
6 </command>
7
8 <inputs>
9 <param name="input1" type="data" format="lped" label="Dataset"/>
10 <param name="fdr" type="float" value="0.05" label="FDR"/>
11 </inputs>
12
13 <outputs>
14 <data name="output" format="tabular" />
15 </outputs>
16
17 <requirements>
18 <requirement type="package">gpass</requirement>
19 </requirements>
20
21 <!-- we need to be able to set the seed for the random number generator
22 <tests>
23 <test>
24 <param name='input1' value='gpass_and_beam_input' ftype='lped' >
25 <metadata name='base_name' value='gpass_and_beam_input' />
26 <composite_data value='gpass_and_beam_input.ped' />
27 <composite_data value='gpass_and_beam_input.map' />
28 <edit_attributes type='name' value='gpass_and_beam_input' />
29 </param>
30 <param name="fdr" value="0.05" />
31 <output name="output" file="gpass_output.txt" />
32 </test>
33 </tests>
34 -->
35
36 <help>
37 **Dataset formats**
38
39 The input dataset must be in lped_ format, and the output is tabular_.
40 (`Dataset missing?`_)
41
42 .. _lped: ./static/formatHelp.html#lped
43 .. _tabular: ./static/formatHelp.html#tab
44 .. _Dataset missing?: ./static/formatHelp.html
45
46 -----
47
48 **What it does**
49
50 GPASS (Genome-wide Poisson Approximation for Statistical Significance)
51 detects significant single-SNP associations in case-control studies at a user-specified FDR. Unlike previous methods, this tool can accurately approximate the genome-wide significance and FDR of SNP associations, while adjusting for millions of multiple comparisons, within seconds or minutes.
52
53 The program has two main functionalities:
54
55 1. Detect significant single-SNP associations at a user-specified false
56 discovery rate (FDR).
57
58 *Note*: a "typical" definition of FDR could be
59 FDR = E(# of false positive SNPs / # of significant SNPs)
60
61 However, this definition is inappropriate for association mapping, because SNPs are
62 highly correlated. Our FDR is
63 defined differently to account for SNP correlations, and thus yields
64 a proper FDR in terms of the "proportion of false positive loci".
65
66 2. Approximate the significance of a list of candidate SNPs, adjusting for
67 multiple comparisons. If you have isolated a few SNPs of interest and want
68 to know their significance in a GWAS, you can supply the GWAS data and let
69 the program specifically test those SNPs.
70
71
72 *Also note*: the SNPs in a study cannot be both too few and, at the same
73 time, too clustered in a local region. A few hundred SNPs, or tens of SNPs
74 spread across different regions, will be fine. The sample size cannot be too small
75 either; around 100 or more individuals (cases and controls combined) will be fine.
76 Otherwise, use a permutation test.
77
78 -----
79
80 **Example**
81
82 - input map file::
83
84 1 rs0 0 738547
85 1 rs1 0 5597094
86 1 rs2 0 9424115
87 etc.
88
89 - input ped file::
90
91 1 1 0 0 1 1 G G A A A A A A A A A G A A G G G G A A G G G G G G A A A A A G A A G G A G A G A A G G A A G G A A G G A G A A G G A A G G A A A G A G G G A G G G G G A A A G A A G G G G G G G G A G A A A A A A A A
92 1 1 0 0 1 1 G G A G G G A A A A A G A A G G G G G G A A G G A G A G G G G G A G G G A G A A G G A G G G A A G G G G A G A G G G A G A A A A G G G G A G A G G G A G A A A A A G G G A G G G A G G G G G A A G G A G
93 etc.
94
95 - output dataset, showing significant SNPs with their test statistics, adjusted p-values, and FDR::
96
97 #ID chr position Statistics adj-Pvalue FDR
98 rs35 chr1 136606952 4.890849 0.991562 0.682138
99 rs36 chr1 137748344 4.931934 0.991562 0.795827
100 rs44 chr2 14423047 7.712832 0.665086 0.218776
101 etc.
102
103 -----
104
105 **Reference**
106
107 Zhang Y, Liu JS. (2010)
108 Fast and accurate significance approximation for genome-wide association studies.
109 Submitted.
110
111 </help>
112 </tool>