<tool id="hgv_ldtools" name="LD" version="1.0.0">
  <description>linkage disequilibrium and tag SNPs</description>

  <command interpreter="bash">
    ldtools_wrapper.sh rsquare=$rsquare freq=$freq input=$input output=$output
  </command>

  <inputs>
    <param format="tabular" name="input" type="data" label="Dataset"/>
    <param name="rsquare" label="r&lt;sup&gt;2&lt;/sup&gt; threshold" type="float" value="0.64">
      <validator type="in_range" message="rsquare must be in range [0.00, 1.00]" min="0.00" max="1.00" />
    </param>
    <param name="freq" label="Minimum allele frequency threshold" type="float" value="0.00">
      <validator type="in_range" message="freq must be in range [0.00, 0.50]" min="0.00" max="0.50" />
    </param>
  </inputs>

  <outputs>
    <data format="tabular" name="output" />
  </outputs>

  <tests>
    <test>
      <param name="input" value="ldInput1.txt" />
      <param name="rsquare" value="0.64" />
      <param name="freq" value="0.00" />
      <output name="output" file="ldOutput1.txt" />
    </test>
  </tests>

  <help>
**Dataset formats**

The input and output datasets are tabular_.
(`Dataset missing?`_)

.. _tabular: ./static/formatHelp.html#tab
.. _Dataset missing?: ./static/formatHelp.html

-----

**What it does**

This tool can be used to analyze the patterns of linkage disequilibrium
(LD) between polymorphic sites in a locus. SNPs are grouped based on the
threshold level of LD as measured by r\ :sup:`2` (regardless of genomic
position), and a representative "tag SNP" is reported for each group.
The other SNPs in the group are in LD with the tag SNP, but not necessarily
with each other.
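
The grouping described above can be pictured as a greedy binning pass. The
Python sketch below is an illustration only, not this tool's code: it assumes
the pairwise r\ :sup:`2` values are already computed and passed in as a
dictionary keyed by SNP pairs, and the name ``greedy_tag_snps``, the tie-breaking
rule, and the reporting of ungrouped SNPs as their own tag are all assumptions
made for the example.

```python
def greedy_tag_snps(snps, r2, threshold=0.64):
    """Greedy binning sketch: repeatedly pick the SNP linked to the most
    remaining SNPs, report it as the tag SNP, and remove its whole group.

    snps      -- list of SNP IDs (input order is used to break ties)
    r2        -- dict mapping frozenset({a, b}) to the pairwise r-squared
                 value; missing pairs are treated as 0.0 (an assumption)
    threshold -- r-squared threshold for placing two SNPs in one group
    """
    remaining = list(snps)
    groups = []
    while remaining:
        # Partners of each remaining SNP among the other remaining SNPs.
        partners = {
            s: [t for t in remaining
                if t != s and r2.get(frozenset((s, t)), 0.0) >= threshold]
            for s in remaining
        }
        # max() returns the first maximal element, so ties go to the
        # SNP that appears earliest in the input order.
        tag = max(remaining, key=lambda s: len(partners[s]))
        groups.append((tag, partners[tag]))
        taken = set(partners[tag]) | {tag}
        remaining = [s for s in remaining if s not in taken]
    return groups


# Hypothetical r-squared values for four made-up SNP IDs.
example_r2 = {
    frozenset(("rsA", "rsB")): 0.90,
    frozenset(("rsA", "rsC")): 0.80,
    frozenset(("rsB", "rsC")): 0.30,
}
result = greedy_tag_snps(["rsA", "rsB", "rsC", "rsD"], example_r2)
print(result)
```

In this made-up example ``rsA`` tags both ``rsB`` and ``rsC`` even though
those two are only weakly correlated with each other, which is the situation
the last sentence above describes.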

The underlying algorithm is the same as the one used in ldSelect (Carlson
et al. 2004). However, this tool is implemented to be much faster and more
efficient than ldSelect.

The input is a tabular file with genotype information for each individual
at each SNP site, in exactly four columns: site ID, sample ID, and the
two allele nucleotides.
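
As a rough illustration of the four-column layout, the Python sketch below
reads such rows and derives a minor allele frequency per site. It is not part
of this tool: the function name ``minor_allele_frequencies`` is hypothetical,
rows are split on any whitespace, sites are assumed to be biallelic, and
missing-genotype codes are not handled. How the tool itself evaluates the
frequency threshold is not specified here.

```python
from collections import Counter, defaultdict


def minor_allele_frequencies(lines):
    """Return {site_id: minor allele frequency} from four-column rows.

    Each row is expected to hold: site ID, sample ID, allele 1, allele 2.
    Rows are split on any whitespace (an assumption; the real input is
    tabular). Blank lines are skipped.
    """
    counts = defaultdict(Counter)
    for line in lines:
        fields = line.split()
        if not fields:
            continue
        if len(fields) != 4:
            raise ValueError("expected 4 columns, got %d" % len(fields))
        site, _sample, allele1, allele2 = fields
        counts[site].update([allele1, allele2])

    freqs = {}
    for site, alleles in counts.items():
        total = sum(alleles.values())
        # A site with a single observed allele is monomorphic: MAF is 0.
        if len(alleles) == 1:
            freqs[site] = 0.0
        else:
            freqs[site] = min(alleles.values()) / total
    return freqs


rows = [
    "rs2334386 NA20364 G T",
    "rs2334386 NA20363 G G",
    "rs2334386 NA20360 G G",
    "rs2334386 NA20359 G G",
]
print(minor_allele_frequencies(rows))
```

With these four rows the site has seven G alleles and one T allele, so the
sketch reports a minor allele frequency of 1/8 for ``rs2334386``.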

-----

**Example**

- input file::

    rs2334386 NA20364 G T
    rs2334386 NA20363 G G
    rs2334386 NA20360 G G
    rs2334386 NA20359 G G
    rs2334386 NA20358 G G
    rs2334386 NA20356 G G
    rs2334386 NA20357 G G
    rs2334386 NA20350 G G
    rs2334386 NA20349 G G
    rs2334386 NA20348 G G
    rs2334386 NA20347 G G
    rs2334386 NA20346 G G
    rs2334386 NA20345 G G
    rs2334386 NA20344 G G
    rs2334386 NA20342 G G
    etc.

- output file::

    rs2238748 rs2793064,rs6518516,rs6518517,rs2283641,rs5993533,rs715590,rs2072123,rs2105421,rs2800954,rs1557847,rs807750,rs807753,rs5993488,rs8138035,rs2800980,rs2525079,rs5992353,rs712966,rs2525036,rs807743,rs1034727,rs807744,rs2074003
    rs2871023 rs1210715,rs1210711,rs5748189,rs1210709,rs3788298,rs7284649,rs9306217,rs9604954,rs1210703,rs5748179,rs5746727,rs5748190,rs5993603,rs2238766,rs885981,rs2238763,rs5748165,rs9605996,rs9606001,rs5992398
    rs7292006 rs13447232,rs5993665,rs2073733,rs1057457,rs756658,rs5992395,rs2073760,rs739369,rs9606017,rs739370,rs4493360,rs2073736
    rs2518840 rs1061325,rs2283646,rs362148,rs1340958,rs361956,rs361991,rs2073754,rs2040771,rs2073740,rs2282684
    rs2073775 rs10160,rs2800981,rs807751,rs5993492,rs2189490,rs5747997,rs2238743
    rs5747263 rs12159924,rs2300688,rs4239846,rs3747025,rs3747024,rs3747023,rs2300691
    rs433576 rs9605439,rs1109052,rs400509,rs401099,rs396012,rs410456,rs385105
    rs2106145 rs5748131,rs2013516,rs1210684,rs1210685,rs2238767,rs2277837
    rs2587082 rs2257083,rs2109659,rs2587081,rs5747306,rs2535704,rs2535694
    rs807667 rs2800974,rs756651,rs762523,rs2800973,rs1018764
    rs2518866 rs1206542,rs807467,rs807464,rs807462,rs712950
    rs1110661 rs1110660,rs7286607,rs1110659,rs5992917,rs1110662
    rs759076 rs5748760,rs5748755,rs5748752,rs4819925,rs933461
    rs5746487 rs5992895,rs2034113,rs2075455,rs1867353
    rs5748212 rs5746736,rs4141527,rs5748147,rs5748202
    etc.

-----

**Reference**

Carlson CS, Eberle MA, Rieder MJ, Yi Q, Kruglyak L, Nickerson DA. (2004)
Selecting a maximally informative set of single-nucleotide polymorphisms for
association analyses using linkage disequilibrium.
Am J Hum Genet. 74(1):106-20. Epub 2003 Dec 15.

  </help>
</tool>