Mercurial > repos > miller-lab > snp_analysis_conversion
diff dividePgSnpAlleles.xml @ 2:35c20b109be5
Retrying upload with "bare" tarball (i.e. one without a top containing directory).
author | cathy |
---|---|
date | Tue, 28 May 2013 17:54:02 -0400 |
parents | |
children |
line wrap: on
line diff
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/dividePgSnpAlleles.xml Tue May 28 17:54:02 2013 -0400 @@ -0,0 +1,78 @@ +<tool id="dividePgSnp" name="Separate pgSnp Alleles" version="1.1.0" hidden="false"> + <description>: Split allele info into separate columns</description> + <command interpreter="perl"> + #if $refcol.ref == "yes" #dividePgSnpAlleles.pl -ref=$refcol.ref_column $input1 > $out_file1 + #else #dividePgSnpAlleles.pl $input1 > $out_file1 + #end if + </command> + <inputs> + <param format="interval" name="input1" type="data" label="pgSnp dataset" /> + <conditional name="refcol"> + <param name="ref" type="select" label="Dataset has a column with the reference allele"> + <option value="yes">yes</option> + <option value="no" selected="true">no</option> + </param> + <when value="yes"> + <param name="ref_column" type="data_column" data_ref="input1" label="Column with reference allele" /> + </when> + <when value="no"> <!-- do nothing --> + </when> + </conditional> + </inputs> + <outputs> + <data format="interval" name="out_file1" /> + </outputs> + <tests> + <test> + <param name='input1' value='dividePgSnp_input.pgSnp' ftype='interval' /> + <param name='refcol' value='no' /> + <output name="output" file="dividePgSnp_output.txt" /> + </test> + </tests> + + <help> + +**Dataset formats** + +The input dataset is of Galaxy datatype interval_, with the additional columns +required for pgSnp_ format. +Any further columns beyond those defined for pgSnp will be appended to the output. +The output dataset is in interval_ format. (`Dataset missing?`_) + +.. _interval: ./static/formatHelp.html#interval +.. _pgSnp: ./static/formatHelp.html#pgSnp +.. _Dataset missing?: ./static/formatHelp.html + +----- + +**What it does** + +This separates the alleles from a pgSnp dataset into separate columns, +and also the frequencies and scores that go with the alleles. It will skip +any positions with more than 2 alleles. If only a single allele is given then "N" +will be used for the second, with a frequency and score of zero. Or, if a +column with reference alleles is provided, +the value in that column will be used in place of the "N". + +----- + +**Example** + +- input pgSnp file:: + + chr1 256 257 A/C 2 3,4 10,20 + chr1 56100 56101 A 1 5 30 + chr1 77052 77053 A/G 2 6,7 40,50 + chr1 110904 110905 A 1 8 60 + etc. + +- output:: + + chr1 256 257 A 3 10 C 4 20 + chr1 56100 56101 A 5 30 N 0 0 + chr1 77052 77053 A 6 40 G 7 50 + chr1 110904 110905 A 8 60 N 0 0 + etc. + +</help> +</tool>