comparison gemini_stats.xml @ 0:ac761838cdaf draft

planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/gemini commit 4bbfca6f0e9cae9a8f263aad4eab7304c96358c4
author iuc
date Thu, 18 Feb 2016 08:55:47 -0500
parents
children ee894347fcd6
-1:000000000000 0:ac761838cdaf
1 <tool id="gemini_@BINARY@" name="GEMINI @BINARY@" version="@VERSION@.0">
2 <description>Compute useful variant statistics</description>
3 <macros>
4 <import>gemini_macros.xml</import>
5 <token name="@BINARY@">stats</token>
6 </macros>
7 <expand macro="requirements" />
8 <expand macro="stdio" />
9 <expand macro="version_command" />
10 <command>
11 <![CDATA[
12 gemini @BINARY@
13 $stats_type
14
15 #if $gt_filter.strip():
16 --gt-filter "${gt_filter}"
17 #end if
18
19 #if $summarize.strip():
20 --summarize "${summarize}"
21 #end if
22
23 "${ infile }"
24 > "${ outfile }"
25 ]]>
26 </command>
27 <inputs>
28 <expand macro="infile" />
29
30 <param name="stats_type" type="select" label="Studying ..." help="">
31 <option value="--tstv">Compute the transition and transversion ratios for the snps (--tstv)</option>
32 <option value="--tstv-coding">Compute the transition/transversion ratios for the snps in the coding regions (--tstv-coding)</option>
33 <option value="--tstv-noncoding">Compute the transition/transversion ratios for the snps in the non-coding regions (--tstv-noncoding)</option>
34 <option value="--snp-counts">Compute the type and count of the snps (--snp-counts)</option>
35 <option value="--sfs">Calculate the site frequency spectrum of the variants (--sfs)</option>
36 <option value="--mds">Compute the pair-wise genetic distance between each sample (--mds)</option>
37 <option value="--vars-by-sample">Return the total variants per sample, sum of homozygous and heterozygous variants (--vars-by-sample)</option>
38 <option value="--gts-by-sample">Return the count of each genotype class observed per sample (--gts-by-sample)</option>
39 </param>
40
41 <param name="gt_filter" type="text" area="True" size="5x50" label="Restrictions to apply to genotype values" help="(--gt-filter)">
42 <expand macro="sanitize_query" />
43 </param>
44
45 <param name="summarize" type="text" area="True" size="5x50" label="The query to be issued to the database to summarize" help="(--summarize)">
46 <expand macro="sanitize_query" />
47 </param>
48
49 </inputs>
50 <outputs>
51 <data name="outfile" format="tabular" />
52 </outputs>
53 <tests>
54 <test>
55 <param name="infile" value="gemini_burden_input.db" ftype="gemini.sqlite" />
56 <param name="stats_type" value="--vars-by-sample" />
57 <output name="outfile" file="gemini_stats_result.tabular" />
58 </test>
59 </tests>
60 <help><![CDATA[
61 **What it does**
62
63 The stats tool computes useful variant statistics for a GEMINI database,
64 such as the transition/transversion ratios for the SNPs.
65
66 **Settings and examples**
67
68 --tstv-coding:
69 Compute the transition/transversion ratios for the snps in the coding regions.
70
71 --tstv-noncoding:
72 Compute the transition/transversion ratios for the snps in the non-coding regions.
73
74 EXAMPLE Compute the type and count of the snps; --snp-counts::
75
76 type count
77 A->G 2
78 C->T 1
79 G->A 1
80
81 EXAMPLE Calculate the site frequency spectrum of the variants; --sfs::
82
83 aaf count
84 0.125 2
85 0.375 1
86
87 EXAMPLE Compute the pair-wise genetic distance between each sample; --mds::
88
89 sample1 sample2 distance
90 M10500 M10500 0.0
91 M10475 M10478 1.25
92 M10500 M10475 2.0
93 M10500 M10478 0.5714
94
95 EXAMPLE Return a count of the types of genotypes per sample; --gts-by-sample::
96
97 sample num_hom_ref num_het num_hom_alt num_unknown total
98 M10475 4 1 3 1 9
99 M10478 2 2 4 1 9
100
101
102
103 EXAMPLE Return the total variants per sample (sum of homozygous and heterozygous variants); --vars-by-sample::
104
105 sample total
106 M10475 4
107 M10478 6
108
109 **Final solution**
110
111 --summarize:
112 If none of the statistics above is exactly what you want, you can summarize the variants per sample for an arbitrary query using the --summarize flag.
113
114 EXAMPLE Count, for each sample, the variants on chromosome 1 that are also in dbSNP; --summarize "select * from variants where in_dbsnp=1 and chrom='chr1'"::
115
116 sample total num_het num_hom_alt
117 M10475 1 1 0
118 M128215 1 1 0
119 M10478 2 2 0
120 M10500 2 1 1
121
122
123 ]]></help>
124 <expand macro="citations"/>
125 </tool>
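
The command template in this tool assembles a `gemini stats` call from the selected statistic and two optional text parameters. The following is a minimal Python sketch of that assembly logic, not part of the wrapper or of GEMINI. It assumes that a non-empty `summarize` value is meant to be passed as `--summarize` and a non-empty `gt_filter` value as `--gt-filter`. The helper names `build_stats_command` and `to_shell` are hypothetical and exist only for illustration.

```python
import shlex


def build_stats_command(stats_type, infile, gt_filter="", summarize=""):
    """Hypothetical helper mirroring the Cheetah command template.

    Returns the argument list for a `gemini stats` call. Optional
    parameters are added only when they contain non-whitespace text,
    matching the `#if $param.strip():` guards in the template.
    """
    argv = ["gemini", "stats", stats_type]
    if gt_filter.strip():
        argv += ["--gt-filter", gt_filter]
    if summarize.strip():
        # Assumption: the summarize query belongs with --summarize.
        argv += ["--summarize", summarize]
    argv.append(infile)
    return argv


def to_shell(argv, outfile):
    """Render the argument list as a shell line that redirects stdout."""
    quoted = " ".join(shlex.quote(arg) for arg in argv)
    return quoted + " > " + shlex.quote(outfile)


if __name__ == "__main__":
    cmd = build_stats_command(
        "--vars-by-sample",
        "gemini_burden_input.db",
        summarize="select * from variants where in_dbsnp=1 and chrom='chr1'",
    )
    print(to_shell(cmd, "outfile.tabular"))
```

Building an argument list first and quoting it in one place keeps queries that contain quotes or spaces intact, which is the same concern the template addresses by wrapping each value in double quotes.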