<tool id="hgv_beam" name="BEAM" version="1.0.0">
  <description>significant single- and multi-locus SNP associations in case-control studies</description>

  <command interpreter="bash">
    BEAM2_wrapper.sh map=${input.extra_files_path}/${input.metadata.base_name}.map ped=${input.extra_files_path}/${input.metadata.base_name}.ped $burnin $mcmc $pvalue significance=$significance posterior=$posterior
  </command>

  <inputs>
    <param format="lped" name="input" type="data" label="Dataset"/>
    <param name="burnin" label="Number of MCMC burn-in steps" type="integer" value="200" />
    <param name="mcmc" label="Number of MCMC sampling steps" type="integer" value="200" />
    <param name="pvalue" label="Significance cutoff (after Bonferroni adjustment)" type="float" value="0.05" />
  </inputs>

  <outputs>
    <data format="tabular" name="significance" />
    <data format="tabular" name="posterior" />
  </outputs>

  <requirements>
    <requirement type="package">beam</requirement>
    <requirement type="binary">mv</requirement>
    <requirement type="binary">rm</requirement>
  </requirements>

  <!-- broken. will be fixed soon.
  <tests>
    <test>
      <param name='input' value='gpass_and_beam_input' ftype='lped' >
        <metadata name='base_name' value='gpass_and_beam_input' />
        <composite_data value='gpass_and_beam_input.ped' />
        <composite_data value='gpass_and_beam_input.map' />
        <edit_attributes type='name' value='gpass_and_beam_input' />
      </param>
      <param name="burnin" value="200"/>
      <param name="mcmc" value="200"/>
      <param name="pvalue" value="0.05"/>
      <output name="significance" file="beam_output1.tab"/>
      <output name="posterior" file="beam_output2.tab"/>
    </test>
  </tests>
  -->

  <help>
.. class:: infomark

This tool can take a long time to run, depending on the number of SNPs, the
sample size, and the number of MCMC steps specified. If you have hundreds
of thousands of SNPs, it may take over a day. The main tasks that slow down
this tool are searching for interactions and dynamically partitioning the
SNPs into blocks. Optimization is possible but has not been done
yet. **If your only interest is to detect SNPs with primary effects (i.e.,
single-SNP associations), please use the GPASS tool instead.**

-----

**Dataset formats**

The input dataset must be in lped_ format. The output datasets are both tabular_.
(`Dataset missing?`_)

.. _lped: ./static/formatHelp.html#lped
.. _tabular: ./static/formatHelp.html#tabular
.. _Dataset missing?: ./static/formatHelp.html

-----

**What it does**

BEAM (Bayesian Epistasis Association Mapping) uses a Markov Chain Monte Carlo
(MCMC) method to infer SNP block structures and detect both single-marker
and interaction effects from case-control SNP data. This tool also partitions
SNPs into blocks based on linkage disequilibrium (LD). Because the method is
Bayesian, the outputs are posterior probabilities of association, along with
block partitions. An advantage of this approach is that it provides
uncertainty measures for the associations and block partitions, and it scales
well from small to large sample sizes. It is powerful in detecting gene-gene
interactions, although slow for large datasets.

-----

**Example**

- input map file::

    1 rs0 0 738547
    1 rs1 0 5597094
    1 rs2 0 9424115
    etc.

- input ped file::

    1 1 0 0 1 1 G G A A A A A A A A A G A A G G G G A A G G G G G G A A A A A G A A G G A G A G A A G G A A G G A A G G A G A A G G A A G G A A A G A G G G A G G G G G A A A G A A G G G G G G G G A G A A A A A A A A
    1 1 0 0 1 1 G G A G G G A A A A A G A A G G G G G G A A G G A G A G G G G G A G G G A G A A G G A G G G A A G G G G A G A G G G A G A A A A G G G G A G A G G G A G A A A A A G G G A G G G A G G G G G A A G G A G
    etc.

- first output file, significance.txt::

    ID chr position results
    rs0 chr1 738547 10 20 score= 45.101397 , df= 8 , p= 0.000431 , N=1225

- second output file, posterior.txt::

    id: chr position marginal + interaction = total posterior
    0: 1 738547 0.0000 + 0.0000 = 0.0000
    1: 1 5597094 0.0000 + 0.0000 = 0.0000
    2: 1 9424115 0.0000 + 0.0000 = 0.0000
    3: 1 13879818 0.0000 + 0.0000 = 0.0000
    4: 1 13934751 0.0000 + 0.0000 = 0.0000
    5: 1 16803491 0.0000 + 0.0000 = 0.0000
    6: 1 17236854 0.0000 + 0.0000 = 0.0000
    7: 1 18445387 0.0000 + 0.0000 = 0.0000
    8: 1 21222571 0.0000 + 0.0000 = 0.0000
    etc.

    id: chr position block_boundary | allele counts in cases and controls
    0: 1 738547 1.000 | 156 93 251 | 169 83 248
    1: 1 5597094 1.000 | 323 19 158 | 328 16 156
    2: 1 9424115 1.000 | 366 6 128 | 369 11 120
    3: 1 13879818 1.000 | 252 31 217 | 278 32 190
    4: 1 13934751 1.000 | 246 64 190 | 224 58 218
    5: 1 16803491 1.000 | 91 160 249 | 91 174 235
    6: 1 17236854 1.000 | 252 43 205 | 249 44 207
    7: 1 18445387 1.000 | 205 66 229 | 217 56 227
    8: 1 21222571 1.000 | 353 9 138 | 352 8 140
    etc.

The "id" field is an internally used index.
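
As a rough illustration of how the posterior output can be post-processed, the Python sketch below parses rows of the first table in the example above (id, chr, position, marginal, interaction, total) and keeps those whose total posterior reaches a cutoff. It is not part of BEAM or of this tool. The names ``parse_posterior_rows`` and ``top_hits`` and the 0.5 cutoff are hypothetical, and the row layout is assumed from the example rather than from a format specification.

```python
import re

# Assumed row layout, taken from the example: "3: 1 13879818 0.0000 + 0.0000 = 0.0000"
ROW = re.compile(
    r"^\s*(\d+):\s+(\S+)\s+(\d+)\s+([\d.]+)\s+\+\s+([\d.]+)\s+=\s+([\d.]+)\s*$"
)


def parse_posterior_rows(lines):
    """Return one dict per row of the posterior probability table."""
    rows = []
    for line in lines:
        match = ROW.match(line)
        if match is None:
            # Header, blank line, "etc.", or a row of the block-boundary table.
            continue
        idx, chrom, pos, marginal, interaction, total = match.groups()
        rows.append({
            "id": int(idx),
            "chr": chrom,
            "position": int(pos),
            "marginal": float(marginal),
            "interaction": float(interaction),
            "total": float(total),
        })
    return rows


def top_hits(rows, min_total=0.5):
    """Keep rows whose total posterior is at least min_total (hypothetical cutoff)."""
    return [row for row in rows if row["total"] >= min_total]
```

For instance, after downloading the second output dataset to a local file, ``top_hits(parse_posterior_rows(open("posterior.txt")))`` would list the SNPs with the strongest posterior support. The file name here simply follows the example above.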

-----

**References**

Zhang Y, Liu JS. (2007)
Bayesian inference of epistatic interactions in case-control studies.
Nat Genet. 39(9):1167-73. Epub 2007 Aug 26.

Zhang Y, Zhang J, Liu JS. (2010)
Block-based Bayesian epistasis association mapping with application to WTCCC type 1 diabetes data.
Submitted.

  </help>
</tool>