0
|
1 <tool id="rgCaCo1" name="Case Control:">
|
|
2 <description>for unrelated subjects</description>
|
|
3 <command interpreter="python">
|
|
4 rgCaCo.py '$i.extra_files_path/$i.metadata.base_name' "$title" '$out_file1' '$logf' '$logf.files_path' '$gffout'
|
|
5 </command>
|
|
6 <inputs>
|
|
7 <param name="i" type="data" label="RGenetics genotype data from your current history"
|
|
8 format="pbed" />
|
|
9 <param name='title' type='text' size="132" value='CaseControl' label="Title for this job"/>
|
|
10
|
|
11 </inputs>
|
|
12
|
|
13 <outputs>
|
|
14 <data format="tabular" name="out_file1" label="${title}_rgCaCo.xls" />
|
|
15 <data format="txt" name="logf" label="${title}_rgCaCo.log"/>
|
|
16 <data format="gff" name="gffout" label="${title}_rgCaCoTop.gff" />
|
|
17 </outputs>
|
|
18 <tests>
|
|
19 <test>
|
|
20 <param name='i' value='tinywga' ftype='pbed' >
|
|
21 <metadata name='base_name' value='tinywga' />
|
|
22 <composite_data value='tinywga.bim' />
|
|
23 <composite_data value='tinywga.bed' />
|
|
24 <composite_data value='tinywga.fam' />
|
|
25 <edit_attributes type='name' value='tinywga' />
|
|
26 </param>
|
|
27 <param name='title' value='rgCaCotest1' />
|
|
28 <output name='out_file1' file='rgCaCotest1_CaCo.xls' ftype='tabular' compare='diff' />
|
|
29 <output name='logf' file='rgCaCotest1_CaCo_log.txt' ftype='txt' compare='diff' lines_diff='20' />
|
|
30 <output name='gffout' file='rgCaCotest1_CaCo_topTable.gff' ftype='gff' compare='diff' />
|
|
31 </test>
|
|
32 </tests>
|
|
33 <help>
|
|
34
|
|
35 .. class:: infomark
|
|
36
|
|
37 **Syntax**
|
|
38
|
|
39 - **Genotype file** is the input case control data, chosen from the available Plink binary files
|
|
40 - **Map file** is the linkage format .map file corresponding to the genotypes in the Genotype file
|
|
41 - **Type of test** is the kind of test statistic to report, such as the Armitage trend test or the genotype test
|
|
42 - **Format** determines how your data will be returned to your Galaxy workspace
|
|
43
|
|
44 -----
|
|
45
|
|
46 **Summary**
|
|
47
|
|
48 This tool performs standard statistical tests comparing subjects designated as
|
|
49 affected (cases) with unaffected subjects (controls). To avoid bias, any control who
|
|
50 had become affected must have been eligible for sampling as a case. This may seem
|
|
51 odd, but it simply means that cases and controls must be drawn from the same sampling frame.
|
|
52
|
|
53 The Armitage trend test is robust to departures from Hardy-Weinberg equilibrium (HWE) and so is very attractive. After all, a real disease
|
|
54 mutation may well distort HWE, at least in cases. All the other tests are susceptible to
|
|
55 bias in the presence of HWE departures.
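
As a rough illustration of what a trend test computes, here is a minimal Python sketch of the Cochran-Armitage trend statistic for one marker. It is not the code Plink runs. The function name ``armitage_trend_test``, the genotype count layout and the additive weights (0, 1, 2) are assumptions made for this example only.

```python
import math
from itertools import combinations


def armitage_trend_test(cases, controls, weights=(0, 1, 2)):
    """Cochran-Armitage trend test on genotype counts for a single marker.

    cases and controls hold counts for the three genotypes (AA, AB, BB).
    weights encode an additive model (assumed here, other choices exist).
    Returns a tuple: (chi-square statistic on 1 df, p-value).
    """
    n_cases, n_controls = sum(cases), sum(controls)
    if n_cases == 0 or n_controls == 0:
        raise ValueError("both cases and controls are required")
    n = n_cases + n_controls
    totals = [r + s for r, s in zip(cases, controls)]
    # Trend statistic: weighted difference between case and control counts.
    t = sum(w * (r * n_controls - s * n_cases)
            for w, r, s in zip(weights, cases, controls))
    # Variance of the statistic under the null hypothesis of no association.
    var = sum(w * w * m * (n - m) for w, m in zip(weights, totals))
    var -= 2 * sum(weights[i] * weights[j] * totals[i] * totals[j]
                   for i, j in combinations(range(len(weights)), 2))
    var *= n_cases * n_controls / n
    if var == 0:
        raise ValueError("marker is monomorphic, the test is undefined")
    chi2 = t * t / var
    # Upper tail of a chi-square distribution with 1 degree of freedom.
    p_value = math.erfc(math.sqrt(chi2 / 2))
    return chi2, p_value


# Made-up counts: the B allele is enriched in cases.
chi2, p_value = armitage_trend_test(cases=(10, 20, 30), controls=(30, 20, 10))
print(chi2, p_value)
```

The statistic depends only on the genotype counts in cases and controls, not on expected genotype frequencies under HWE, which is one way to see why the trend test tolerates HWE departures.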
|
|
56
|
|
57 All of these tests are exquisitely sensitive to population stratification that differs between cases
|
|
58 and controls, and this must be tested before believing any results here. Use the PCA method for
|
|
59 100k markers or more.
|
|
60
|
|
61 If you don't see the genotype data set you want here, it can be imported using one of the methods available from
|
|
62 the Galaxy Get Data tool page.
|
|
63
|
|
64 Output format can be UCSC .bed if you want to see your
|
|
65 results as a fully fledged UCSC track. A map file containing the chromosome and offset for each marker is required for
|
|
66 writing this kind of output.
|
|
67 Alternatively you can use .gg for the UCSC Genome Graphs tool which has all of the advantages
|
|
68 of the .bed track, plus a neat, visual front end that displays a lot of useful clues.
|
|
69 Either of these is a very useful way of quickly getting a look
|
|
70 at your data in full genomic context.
|
|
71
|
|
72 Finally, if you can't live without
|
|
73 spreadsheet data, choose the .xls tab delimited format. It's not a stupid binary Excel file, just a plain old tab delimited
|
|
74 one with a header. Fortunately Excel is dumb enough to open these without much protest.
|
|
75
|
|
76
|
|
77 -----
|
|
78
|
|
79 .. class:: infomark
|
|
80
|
|
81 **Attribution**
|
|
82
|
|
83 This Galaxy tool relies on Plink (see Plinksrc_) to test Case Control association models.
|
|
84
|
|
85 We therefore rely on the Plink author (Shaun Purcell) for the documentation specific to those settings, which is very thorough. See
|
|
86 DOC_
|
|
87
|
|
88 Tool and Galaxy datatypes originally designed and written for the Rgenetics
|
|
89 series of whole genome scale statistical genetics tools by Ross Lazarus (ross.lazarus@gmail.com)
|
|
90
|
|
91 Copyright Ross Lazarus March 2007
|
|
92 This Galaxy wrapper is released under the LGPL_ but is about as useful as a chocolate teapot without Plink, which is GPL.
|
|
93
|
|
94 I'm no lawyer, but it looks like you got GPL if you use this software. Good luck.
|
|
95
|
|
96 .. _Plinksrc: http://pngu.mgh.harvard.edu/~purcell/plink/
|
|
97
|
|
98 .. _LGPL: http://www.gnu.org/copyleft/lesser.html
|
|
99
|
|
100 .. _DOC: http://pngu.mgh.harvard.edu/~purcell/plink/anal.shtml#cc
|
|
101
|
|
102 </help>
|
|
103 </tool>
|