comparison restore_attributes.xml @ 22:95a05c1ef5d5
update to devshed revision aaece207bd01
author | Richard Burhans <burhans@bx.psu.edu> |
---|---|
date | Mon, 11 Mar 2013 11:28:06 -0400 |
parents | |
children | 91e835060ad2 |
comparing 21:d6b961721037 with 22:95a05c1ef5d5
<tool id="gd_restore_attributes" name="Restore Attributes" version="1.0.0">
  <description>: Fill in missing properties for a gd_snp dataset</description>

  <command interpreter="python">
    cp.py "$dst" "$output"
  </command>

  <inputs>
    <param name="src" type="data" format="gd_snp" label="SNP dataset to copy attributes from" />
    <param name="dst" type="data" format="gd_snp" label="SNP dataset to receive attributes" />
  </inputs>

  <outputs>
    <data name="output" format="gd_snp" metadata_source="src" />
  </outputs>

  <help>

**Dataset formats**

All of the input and output datasets are in gd_snp_ format. (`Dataset missing?`_)

.. _gd_snp: ./static/formatHelp.html#gd_snp
.. _Dataset missing?: ./static/formatHelp.html

-----

**What it does**

This tool copies metadata from one SNP dataset to another, leaving the SNP
data itself unchanged. Datasets in gd_snp format have a number of "extra"
properties associated with them, such as the focus species (which may be
different from the reference assembly), the names of individuals, and the
numbers of the columns that contain certain data fields. These values are
stored in the dataset's metadata, in addition to the more usual attributes
like dataset name and assembly build. You can see some of them by clicking
on the pencil icon for the dataset.

The Genome Diversity tools need this information to perform their tasks.
However, these additional attributes may be lost if the datatype is changed.
For example, suppose you want to see which SNPs overlap some other dataset in
your history, like coding regions or TAL1 binding sites. The Intersect tool
only works on datasets that are in interval format, so you might use the Compute
tool to append a new column with the End position of the SNP (= Start + 1),
then use the pencil icon to change the datatype to "interval". This works
well for the intersection, but if you then want to run one of the Genome
Diversity tools on the resulting SNPs, there's a problem: you can change the
datatype back to gd_snp easily enough, but the extra attributes were lost
in the conversion to interval.

As long as the correct values of the lost attributes have not changed, this
tool can restore them by copying from the old gd_snp dataset in your history.
In the above example, appending a column does not change the numbering of the
earlier columns, and deleting rows via Intersect does not affect the extra
attributes either. Note that all of the metadata is copied, not just the extra
attributes specific to gd_snp (though standard items like the assembly build,
the number of lines, and the name for the output dataset are updated
automatically by the Galaxy framework).

  </help>
</tool>
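
The cp.py script named in the command block is not part of this changeset, so its contents are not shown here. The sketch below is only an illustration of what such a helper might look like, assuming it does nothing more than copy the "dst" dataset byte for byte to the output path; the function names copy_dataset and main are hypothetical. Under that assumption the script never touches metadata: the attributes come from the metadata_source="src" setting on the output, as the help text describes.

```python
import shutil


def copy_dataset(dst_path, output_path):
    """Copy the SNP data unchanged (hypothetical stand-in for cp.py)."""
    # Only the file contents are copied; no metadata is read or written here.
    shutil.copyfile(dst_path, output_path)
    return output_path


def main(argv):
    """Mirror the command line: cp.py "$dst" "$output"."""
    if len(argv) != 3:
        raise SystemExit("usage: cp.py <dst> <output>")
    copy_dataset(argv[1], argv[2])
```

A script built this way would end by calling main(sys.argv) after importing sys. Keeping the copy step separate from the metadata step means the tool needs no knowledge of the gd_snp attributes themselves.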