view dividePgSnpAlleles.xml @ 0:de839553d2a2

Uploaded tool tarball.
author devteam
date Wed, 25 Sep 2013 10:26:17 -0400
parents
children f74fdfc17143
line wrap: on
line source

<tool id="dividePgSnp" version="1.0.0" name="Separate pgSnp alleles" hidden="false">
  <description>into columns</description>
  <command interpreter="perl">
    #if $refcol.ref == "yes" #dividePgSnpAlleles.pl -ref=$refcol.ref_column $input1 > $out_file1
    #else #dividePgSnpAlleles.pl $input1 > $out_file1
    #end if
  </command>
  <inputs>
    <param format="interval" version="1.0.0" name="input1" type="data" label="pgSnp dataset" />
    <conditional version="1.0.0" name="refcol">
      <param version="1.0.0" name="ref" type="select" label="Dataset has a column with the reference allele:">
        <option value="yes">yes</option>
        <option value="no" selected="true">no</option>
      </param>
      <when value="yes">
      <param version="1.0.0" name="ref_column" type="data_column" data_ref="input1" label="Column with reference allele" />
      </when>
      <when value="no"> <!-- do nothing -->
      </when>
    </conditional>
  </inputs>
  <outputs>
  <data format="interval" version="1.0.0" name="out_file1" />
  </outputs>
  <tests>
    <test>
      <param name='input1' value='dividePgSnp_input.pgSnp' ftype='interval' />
      <param name='ref' value='no' />
      <output version="1.0.0" name="output" file="dividePgSnp_output.txt" />
    </test>
  </tests>

  <help>
**Dataset formats**

The input dataset is of Galaxy datatype interval_ with the columns specified
for pgSnp_.
Any additional columns beyond the pgSnp defined columns will be appended to
the output.
The output dataset is in interval_ format.  (`Dataset missing?`_)

.. _interval: ./static/formatHelp.html#interval
.. _Dataset missing?: ./static/formatHelp.html
.. _pgSnp: ./static/formatHelp.html#pgSnp

**What it does**

This separates the alleles from a pgSnp dataset into separate columns,
as well as the frequencies and scores that go with the alleles.  It will skip
any positions with more than 2 alleles.  If only a single allele is given then "N"
will be used for the second, with a frequency and score of zero.  Or, if a 
column with reference alleles is provided, 
the value in that column will be used in place of the "N" for single alleles.

-----

**Examples**

- input pgSnp file::

   chr1    256     257     A/C     2       3,4     10,20
   chr1    56100   56101   A       1       5       30
   chr1    77052   77053   A/G     2       6,7     40,50
   chr1    110904  110905  A       1       8       60
   etc.

- output::

   chr1    256     257     A       3       10       C       4       20
   chr1    56100   56101   A       5       30       N       0       0
   chr1    77052   77053   A       6       40       G       7       50
   chr1    110904  110905  A       8       60       N       0       0
   etc.

</help>
</tool>