Mercurial > repos > miller-lab > genome_diversity
comparison prepare_population_structure.xml @ 18:f04f40a36cc8
Latest changes from Belinda and Cathy. Webb's updates to the Fst tools.
author | Richard Burhans <burhans@bx.psu.edu> |
---|---|
date | Tue, 23 Oct 2012 12:41:52 -0400 |
parents | 8ae67e9fb6ff |
children | 248b06e86022 |
17:a3af29edcce2 | 18:f04f40a36cc8 |
---|---|
17 #end for | 17 #end for |
18 </command> | 18 </command> |
19 | 19 |
20 <inputs> | 20 <inputs> |
21 <param name="input" type="data" format="gd_snp" label="SNP dataset" /> | 21 <param name="input" type="data" format="gd_snp" label="SNP dataset" /> |
22 <param name="min_reads" type="integer" min="0" value="0" label="Minimum reads covering a SNP, per individual" /> | |
23 <param name="min_qual" type="integer" min="0" value="0" label="Minimum quality value, per individual" /> | |
24 <param name="min_spacing" type="integer" min="0" value="0" label="Minimum spacing between SNPs on the same scaffold" /> | |
25 <conditional name="individuals"> | 22 <conditional name="individuals"> |
26 <param name="choice" type="select" label="Individuals"> | 23 <param name="choice" type="select" label="Individuals"> |
27 <option value="0" selected="true">All</option> | 24 <option value="0" selected="true">All individuals</option> |
28 <option value="1">Choose</option> | 25 <option value="1">Specified populations</option> |
29 </param> | 26 </param> |
30 <when value="0" /> | 27 <when value="0" /> |
31 <when value="1"> | 28 <when value="1"> |
32 <repeat name="populations" title="Population" min="1"> | 29 <repeat name="populations" title="Population" min="1"> |
33 <param name="p_input" type="data" format="gd_indivs" label="Individuals" /> | 30 <param name="p_input" type="data" format="gd_indivs" label="Individuals" /> |
34 </repeat> | 31 </repeat> |
35 </when> | 32 </when> |
36 </conditional> | 33 </conditional> |
34 <param name="min_reads" type="integer" min="0" value="0" label="Minimum SNP coverage" /> | |
35 <param name="min_qual" type="integer" min="0" value="0" label="Minimum SNP quality" /> | |
36 <param name="min_spacing" type="integer" min="0" value="0" label="Minimum spacing between SNPs" /> | |
37 </inputs> | 37 </inputs> |
38 | 38 |
39 <outputs> | 39 <outputs> |
40 <data name="output" format="gd_ped"> | 40 <data name="output" format="gd_ped"> |
41 <actions> | 41 <actions> |
60 | 60 |
61 <help> | 61 <help> |
62 | 62 |
63 **Dataset formats** | 63 **Dataset formats** |
64 | 64 |
65 The input datasets are in gd_snp_ and gd_indivs_ formats. It is important | 65 The input datasets are in gd_snp_ and gd_indivs_ formats. |
66 for the Individuals datasets to have unique names; rename them if | 66 The output dataset is in gd_ped_ format. (`Dataset missing?`_) |
67 necessary to make them unique. These names are used by the later tools in | |
68 the graphical displays. | |
69 The output dataset is gd_ped_. (`Dataset missing?`_) | |
70 | 67 |
71 .. _gd_snp: ./static/formatHelp.html#gd_snp | 68 .. _gd_snp: ./static/formatHelp.html#gd_snp |
72 .. _gd_indivs: ./static/formatHelp.html#gd_indivs | 69 .. _gd_indivs: ./static/formatHelp.html#gd_indivs |
73 .. _gd_ped: ./static/formatHelp.html#gd_ped | 70 .. _gd_ped: ./static/formatHelp.html#gd_ped |
74 .. _Dataset missing?: ./static/formatHelp.html | 71 .. _Dataset missing?: ./static/formatHelp.html |
75 | 72 |
76 ----- | 73 ----- |
77 | 74 |
78 **What it does** | 75 **What it does** |
79 | 76 |
80 The tool converts a gd_snp dataset into two tables, called "admix.map" and | 77 This tool converts a gd_snp dataset into the format needed for estimating |
81 "admix.ped", needed for estimating the population structure. The user | 78 the population structure. You can select the individuals to be included, |
82 can read or download those files, or simply pass this tool's output on to | 79 by using "population" datasets created via the Specify Individuals tool. |
83 other programs. The user imposes conditions on which SNPs to consider, | 80 (It is important for these population datasets to have distinguishable names, |
84 such as the minimum coverage and/or quality value for every individual, | 81 since they will be stored in the output's metadata so that subsequent tools |
85 or the distance to the closest SNP in the same contig (as named in the | 82 can use them as labels. If necessary, rename the datasets to give them |
86 first column of the SNP table). A useful piece of information produced | 83 distinct and meaningful names before running this tool.) |
87 by the tool is the number of SNPs meeting those conditions, which can | 84 |
88 be found by clicking on the eye icon in the history panel after the program | 85 You can also filter the SNPs, based on criteria such as minimum coverage |
89 runs. | 86 (a qualifying SNP must have at least this many reads for every included |
87 individual), minimum quality score (for every included individual), and/or | |
88 minimum spacing (SNPs that are too close together on the same chromosome or | |
89 scaffold are discarded). In addition to producing the filtered and formatted | |
90 .map and .ped files for subsequent analysis, the tool reports the number of | |
91 SNPs meeting these conditions, which can be seen by clicking on the eye icon | |
92 in the history panel after the program runs. | |
90 | 93 |
91 ----- | 94 ----- |
92 | 95 |
93 **Example** | 96 **Example** |
94 | 97 |
95 - input:: | 98 - input:: |
96 | 99 |
97 Contig161_chr1_4641264_4641879 115 C T 73.5 chr1 4641382 C 6 0 2 45 8 0 2 51 15 0 2 72 5 0 2 42 6 0 2 45 10 0 2 57 Y 54 0.323 0 | 100 Contig161_chr1_4641264_4641879 115 C T 73.5 chr1 4641382 C 6 0 2 45 8 0 2 51 15 0 2 72 5 0 2 42 6 0 2 45 10 0 2 57 Y 54 0.323 0 |
98 Contig48_chr1_10150253_10151311 11 A G 94.3 chr1 10150264 A 1 0 2 30 1 0 2 30 1 0 2 30 3 0 2 36 1 0 2 30 1 0 2 30 Y 22 +99. 0 | 101 Contig48_chr1_10150253_10151311 11 A G 94.3 chr1 10150264 A 1 0 2 30 1 0 2 30 1 0 2 30 3 0 2 36 1 0 2 30 1 0 2 30 Y 22 +99. 0 |
99 Contig20_chr1_21313469_21313570 66 C T 54.0 chr1 21313534 C 4 0 2 39 4 0 2 39 5 0 2 42 4 0 2 39 4 0 2 39 5 0 2 42 N 1 +99. 0 | 102 Contig20_chr1_21313469_21313570 66 C T 54.0 chr1 21313534 C 4 0 2 39 4 0 2 39 5 0 2 42 4 0 2 39 4 0 2 39 5 0 2 42 N 1 +99. 0 |
100 etc. | 103 etc. |
101 | 104 |
102 - output map file:: | 105 - output cover page:: |
103 | 106 |
104 1 snp1 0 2 | 107 Prepare to look for population structure Galaxy Composite Dataset |
105 1 snp3 0 4 | 108 Output completed: 2012-10-01 04:09:36 PM |
106 1 snp4 0 5 | |
107 1 snp5 0 6 | |
108 1 snp6 0 7 | |
109 1 snp7 0 8 | |
110 1 snp8 0 9 | |
111 1 snp9 0 10 | |
112 | 109 |
113 - output ped file:: | 110 Outputs |
111 * admix.ped (link) | |
112 * admix.map (link) | |
113 * Using 222 of 400 SNPs | |
114 | 114 |
115 PB1 1 0 0 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 2 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 1 | 115 Inputs |
116 * Minimum reads covering a SNP, per individual: 6 | |
117 * Minimum quality value, per individual: 0 | |
118 * Minimum spacing between SNPs on the same scaffold: 0 | |
119 | |
120 Populations | |
121 * Pop. A | |
122 1. PB1 | |
123 2. PB2 | |
124 * Pop. B | |
125 1. PB3 | |
126 2. PB4 | |
127 * Pop. C | |
128 1. PB6 | |
129 2. PB8 | |
116 | 130 |
117 </help> | 131 </help> |
118 </tool> | 132 </tool> |
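
The revised help text describes three SNP filters: minimum coverage and minimum quality for every included individual, and minimum spacing on the same scaffold. The sketch below is one possible reading of those rules and is not the tool's actual implementation. It assumes a column layout inferred only from the example rows above: eight leading columns, then four columns per individual (two read counts, a genotype, a quality value). It also assumes that coverage is the sum of the two read counts and that spacing is measured against the last SNP kept on the same scaffold. The names `filter_snps` and `snp_passes` are hypothetical.

```python
from typing import Dict, Iterable, List, Sequence, Tuple

# ASSUMPTION: layout inferred from the example rows, not from the gd_snp spec.
FIRST_INDIV_COL = 8   # columns 0-7 describe the SNP itself
COLS_PER_INDIV = 4    # reads for allele 1, reads for allele 2, genotype, quality

EXAMPLE_ROWS = [
    "Contig161_chr1_4641264_4641879 115 C T 73.5 chr1 4641382 C "
    "6 0 2 45 8 0 2 51 15 0 2 72 5 0 2 42 6 0 2 45 10 0 2 57 Y 54 0.323 0",
    "Contig48_chr1_10150253_10151311 11 A G 94.3 chr1 10150264 A "
    "1 0 2 30 1 0 2 30 1 0 2 30 3 0 2 36 1 0 2 30 1 0 2 30 Y 22 +99. 0",
    "Contig20_chr1_21313469_21313570 66 C T 54.0 chr1 21313534 C "
    "4 0 2 39 4 0 2 39 5 0 2 42 4 0 2 39 4 0 2 39 5 0 2 42 N 1 +99. 0",
]


def snp_passes(fields: Sequence[str], individuals: Iterable[int],
               min_reads: int, min_qual: int) -> bool:
    """True if every included individual meets both thresholds."""
    for i in individuals:
        base = FIRST_INDIV_COL + COLS_PER_INDIV * i
        # ASSUMPTION: coverage is the sum of the two per-allele read counts.
        reads = int(fields[base]) + int(fields[base + 1])
        qual = int(fields[base + 3])
        if reads < min_reads or qual < min_qual:
            return False
    return True


def filter_snps(lines: Iterable[str], individuals: Sequence[int],
                min_reads: int = 0, min_qual: int = 0,
                min_spacing: int = 0) -> Tuple[List[str], int]:
    """Return (kept lines, total SNPs seen)."""
    kept: List[str] = []
    last_kept_pos: Dict[str, int] = {}
    total = 0
    for line in lines:
        fields = line.split()
        if not fields:
            continue
        total += 1
        if not snp_passes(fields, individuals, min_reads, min_qual):
            continue
        scaffold, pos = fields[0], int(fields[1])
        prev = last_kept_pos.get(scaffold)
        # ASSUMPTION: a SNP is dropped when it lies closer than min_spacing
        # to the previously kept SNP on the same scaffold.
        if prev is not None and abs(pos - prev) < min_spacing:
            continue
        last_kept_pos[scaffold] = pos
        kept.append(line)
    return kept, total


if __name__ == "__main__":
    kept, total = filter_snps(EXAMPLE_ROWS, range(6), min_reads=4)
    print(f"Using {len(kept)} of {total} SNPs")
```

Restricting `individuals` to a subset corresponds to the "Specified populations" choice: a SNP only has to meet the thresholds for the individuals that are actually included.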