diff restore_attributes.xml @ 22:95a05c1ef5d5

update to devshed revision aaece207bd01
author Richard Burhans <burhans@bx.psu.edu>
date Mon, 11 Mar 2013 11:28:06 -0400
parents
children 91e835060ad2
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/restore_attributes.xml	Mon Mar 11 11:28:06 2013 -0400
@@ -0,0 +1,61 @@
+<tool id="gd_restore_attributes" name="Restore Attributes" version="1.0.0">
+  <description>: Fill in missing properties for a gd_snp dataset</description>
+
+  <command interpreter="python">
+    cp.py "$dst" "$output"
+  </command>
+
+  <inputs>
+    <param name="src" type="data" format="gd_snp" label="SNP dataset to copy attributes from" />
+    <param name="dst" type="data" format="gd_snp" label="SNP dataset to receive attributes" />
+  </inputs>
+
+  <outputs>
+    <data name="output" format="gd_snp" metadata_source="src" />
+  </outputs>
+
+  <help>
+
+**Dataset formats**
+
+All of the input and output datasets are in gd_snp_ format.  (`Dataset missing?`_)
+
+.. _gd_snp: ./static/formatHelp.html#gd_snp
+.. _Dataset missing?: ./static/formatHelp.html
+
+-----
+
+**What it does**
+
+This tool copies metadata from one SNP dataset to another, leaving the SNP
+data itself unchanged.  Datasets in gd_snp format have a number
+of "extra" properties associated with them, such as the focus species (which
+may be different from the reference assembly), names of individuals, column
+numbers containing certain data fields, etc.  These values are stored in the
+dataset's metadata, in addition to the more usual attributes like dataset name,
+assembly build, and so forth.  You can see some of these by clicking on the
+pencil icon for the dataset.
+
+The Genome Diversity tools need this information to perform their tasks.
+However, these additional attributes may be lost if the datatype is changed.
+For example, suppose you want to see which SNPs overlap some other dataset in
+your history, like coding regions or TAL1 binding sites.  The Intersect tool
+only works on datasets that are in interval format, so you might use the Compute
+tool to append a new column with the End position of the SNP (= Start + 1),
+then use the pencil icon to change the datatype to "interval".  This works
+great for doing the intersection, but if you then want to run one of the Genome
+Diversity tools on the resulting SNPs, there's a problem: you can change the
+datatype back to gd_snp easily enough, but the extra attributes have been lost
+in the conversion to interval.
+
+As long as the proper values of the lost attributes have not changed, this tool
+can restore them by copying from the old gd_snp dataset in your history.
+In the above example, appending a column does not change the numbering of the
+earlier columns, and deleting rows via Intersect does not affect the extra
+attributes either.  Note that all of the metadata is copied, not just the extra
+attributes specific to gd_snp (though standard items like the assembly build,
+the number of lines, and the name for the output dataset are updated
+automatically by the Galaxy framework).
+
+  </help>
+</tool>
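
The `cp.py` helper invoked by the `<command>` block is not part of this diff, so its contents are unknown. As a rough sketch only, a helper of this kind might do nothing more than copy the file contents of `dst` to the output path, leaving the metadata transfer to the `metadata_source="src"` attribute on the output. The function name `copy_dataset` and the argument handling are assumptions made for illustration, not the actual script.

```python
#!/usr/bin/env python
"""Hypothetical stand-in for cp.py; the real script is not shown in this diff."""
import shutil
import sys


def copy_dataset(input_path, output_path):
    """Copy file contents only; no Galaxy metadata is read or written here."""
    shutil.copyfile(input_path, output_path)
    return output_path


# Assumed invocation, mirroring the <command> block: cp.py "$dst" "$output"
if __name__ == "__main__" and len(sys.argv) == 3:
    copy_dataset(sys.argv[1], sys.argv[2])
```

Under this assumption the SNP rows in the output come from `dst`, and the attributes come from `src` without the script handling them itself.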