annotate aggregate_gd_indivs.xml @ 28:184d14e4270d

Update to Miller Lab devshed revision 4ede22dd5500
author Richard Burhans <burhans@bx.psu.edu>
date Wed, 17 Jul 2013 12:46:46 -0400
parents 8997f2ca8c7a
children a631c2f6d913
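Changesets contributing lines to this revision:
rev 13  fdb4240fb565  Uploaded Miller Lab Devshed version a51c894f5bed (miller-lab); lines 2-4, 21-23, 39-42, 44-57, 60-61, 63-69, 83-109
rev 26  91e835060ad2  Updates to Admixture, Aggregate Individuals, and Restore Attributes to support gd_genotype (Richard Burhans <burhans@bx.psu.edu>, parent 22); lines 1, 15, 17, 19, 24-38, 43, 58-59, 62, 70-82
rev 27  8997f2ca8c7a  Update to Miller Lab devshed revision bae0d3306d3b (Richard Burhans <burhans@bx.psu.edu>, parent 26); lines 5-14, 16, 18, 20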
<tool id="gd_sum_gd_snp" name="Aggregate Individuals" version="1.1.0">
  <description>: Append summary columns for a population</description>

  <command interpreter="python">
    #import json
    #import base64
    #import zlib
    #set $ind_names = $input.dataset.metadata.individual_names
    #set $ind_colms = $input.dataset.metadata.individual_columns
    #set $ind_dict = dict(zip($ind_names, $ind_colms))
    #set $ind_json = json.dumps($ind_dict, separators=(',',':'))
    #set $ind_comp = zlib.compress($ind_json, 9)
    #set $ind_arg = base64.b64encode($ind_comp)
    aggregate_gd_indivs.py '$input' '$p1_input' '$output'
    #if $input_type.choice == '0'
      'gd_snp'
    #else if $input_type.choice == '1'
      'gd_genotype'
    #end if
    '$ind_arg'
  </command>
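<!--
A minimal sketch, in Python 3, of what the #set lines in the command block
appear to do: build a name-to-column dictionary, serialize it as compact JSON,
compress it with zlib, and base64-encode the result so that it can travel as a
single command-line argument. The template itself ran under Python 2. The
function names, the sample names and column numbers (borrowed from the help
example), and the decoding step are assumptions for illustration; they are not
taken from aggregate_gd_indivs.py.

```python
import base64
import json
import zlib


def encode_individuals(names, columns):
    """Pack a name-to-column mapping into one shell-safe string."""
    mapping = dict(zip(names, columns))
    text = json.dumps(mapping, separators=(",", ":"))
    # Python 3 needs bytes here; the original template ran under Python 2.
    packed = zlib.compress(text.encode("utf-8"), 9)
    return base64.b64encode(packed).decode("ascii")


def decode_individuals(arg):
    """Hypothetical inverse, as a receiving script might apply it."""
    text = zlib.decompress(base64.b64decode(arg)).decode("utf-8")
    return json.loads(text)


# Sample values are assumptions borrowed from the help example.
arg = encode_individuals(["PB1", "PB2", "PB3"], [9, 13, 17])
print(decode_individuals(arg))
```

Base64 output contains only letters, digits, "+", "/" and "=", so the value
can sit inside the single quotes of the command line without further escaping.
-->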

  <inputs>

    <conditional name="input_type">
      <param name="choice" type="select" format="integer" label="Input format">
        <option value="0" selected="true">gd_snp</option>
        <option value="1">gd_genotype</option>
      </param>

      <when value="0">
        <param name="input" type="data" format="gd_snp" label="SNP dataset" />
      </when>
      <when value="1">
        <param name="input" type="data" format="gd_genotype" label="Genotype dataset" />
      </when>
    </conditional>

    <param name="p1_input" type="data" format="gd_indivs" label="Population individuals" />
  </inputs>

  <outputs>
    <data name="output" format="input" format_source="input" metadata_source="input" />
  </outputs>

  <tests>
    <test>
      <param name="input" value="test_in/sample.gd_snp" ftype="gd_snp" />
      <param name="p1_input" value="test_in/a.gd_indivs" ftype="gd_indivs" />
      <output name="output" file="test_out/modify_snp_table/modify.gd_snp" />
    </test>
  </tests>

  <help>

**Dataset formats**

The input datasets are in gd_snp_, gd_genotype_, and gd_indivs_ formats.
The output dataset is in gd_snp_ or gd_genotype_ format. (`Dataset missing?`_)

.. _gd_snp: ./static/formatHelp.html#gd_snp
.. _gd_genotype: ./static/formatHelp.html#gd_genotype
.. _gd_indivs: ./static/formatHelp.html#gd_indivs
.. _Dataset missing?: ./static/formatHelp.html

-----

**What it does**

The user designates a subset of the individuals in a gd_snp or gd_genotype
dataset as a "population" by supplying a list previously created with the
Specify Individuals tool. The program appends a new "entity" (a set of four
columns for a gd_snp table, or one column for a gd_genotype table),
analogous to the column(s) for an individual but containing summary data
for the population as a whole. For a gd_snp table, the four columns give
the total counts of each of the two alleles, the "genotype" of the
population, and the maximum quality value, each taken over all individuals
in the population. If every defined genotype in the population is 2
(agrees with the reference), the population's genotype is 2; likewise, if
every defined genotype is 0, it is 0. Otherwise the genotype is 1, unless
all individuals have an undefined genotype, in which case it is -1.
For a gd_genotype table, only the aggregate genotype is appended.

-----

**Example**

- input gd_snp::

    Contig161_chr1_4641264_4641879 115 C T 73.5 chr1 4641382 C 6 0 2 45 8 0 2 51 15 0 2 72 5 0 2 42 6 0 2 45 10 0 2 57 Y 54 0.323 0
    Contig48_chr1_10150253_10151311 11 A G 94.3 chr1 10150264 A 1 0 2 30 1 0 2 30 1 0 2 30 3 0 2 36 1 0 2 30 1 0 2 30 Y 22 +99. 0
    Contig20_chr1_21313469_21313570 66 C T 54.0 chr1 21313534 C 4 0 2 39 4 0 2 39 5 0 2 42 4 0 2 39 4 0 2 39 5 0 2 42 N 1 +99. 0
    etc.

- input individuals::

    9 PB1
    13 PB2
    17 PB3

- output::

    Contig161_chr1_4641264_4641879 115 C T 73.5 chr1 4641382 C 6 0 2 45 8 0 2 51 15 0 2 72 5 0 2 42 6 0 2 45 10 0 2 57 Y 54 0.323 0 29 0 2 72
    Contig48_chr1_10150253_10151311 11 A G 94.3 chr1 10150264 A 1 0 2 30 1 0 2 30 1 0 2 30 3 0 2 36 1 0 2 30 1 0 2 30 Y 22 +99. 0 3 0 2 30
    Contig20_chr1_21313469_21313570 66 C T 54.0 chr1 21313534 C 4 0 2 39 4 0 2 39 5 0 2 42 4 0 2 39 4 0 2 39 5 0 2 42 N 1 +99. 0 13 0 2 42
    etc.

  </help>
</tool>
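<!--
The aggregation rule described under "What it does" can be sketched in
Python as follows. This is an illustration written from the help text and
its example, not the code of aggregate_gd_indivs.py. It assumes that each
number in the gd_indivs list is the 1-based column of an individual's first
allele count, that an undefined genotype is stored as -1, and that allele
counts and quality values are non-negative integers. The function name is
hypothetical.

```python
def aggregate_snp_row(fields, first_columns):
    """Return the four summary values for one gd_snp row.

    fields: the row split into strings.
    first_columns: 1-based column of each individual's first allele count
    (an assumption drawn from the help example).
    """
    count1 = count2 = 0
    max_quality = 0
    defined = []
    for col in first_columns:
        a1, a2, genotype, quality = (int(v) for v in fields[col - 1:col + 3])
        count1 += a1
        count2 += a2
        max_quality = max(max_quality, quality)
        if genotype != -1:  # assumption: -1 marks an undefined genotype
            defined.append(genotype)
    if not defined:
        pop_genotype = -1
    elif all(g == 2 for g in defined):
        pop_genotype = 2
    elif all(g == 0 for g in defined):
        pop_genotype = 0
    else:
        pop_genotype = 1
    return [count1, count2, pop_genotype, max_quality]


# First row of the help example, with individuals at columns 9, 13 and 17.
row = ("Contig161_chr1_4641264_4641879 115 C T 73.5 chr1 4641382 C "
       "6 0 2 45 8 0 2 51 15 0 2 72 5 0 2 42 6 0 2 45 10 0 2 57 "
       "Y 54 0.323 0").split()
summary = aggregate_snp_row(row, [9, 13, 17])
print(summary)  # [29, 0, 2, 72]
```

The printed values match the four columns appended to the first output row
of the example.
-->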