view vcf2pgSnp.xml @ 3:edf12470a1a6 default tip

Bugfix from Belinda, in vcf2pgSnp.pl
author Cathy Riemer <cathy+hg@bx.psu.edu>
date Thu, 19 Mar 2015 12:06:34 -0400
parents 35c20b109be5
children
line wrap: on
line source

<tool id="vcf2pgSnp" name="VCF to pgSnp" version="1.1.0" hidden="false">
  <description>: Convert from VCF to pgSnp format</description>
  <command interpreter="perl">
    #if $inType.how == "all" #vcf2pgSnp.pl all $input1 > $out_file1
    #else #vcf2pgSnp.pl $inType.ind_column $input1 > $out_file1
    #end if
  </command>
  <inputs>
    <param format="vcf" name="input1" type="data" label="VCF dataset" />
    <conditional name="inType">
      <param name="how" type="select" label="How to treat individuals">
        <option value="all">Group all as a population</option>
        <option value="one">Do just one individual</option>
      </param>
      <when value="one">
        <param name="ind_column" type="data_column" data_ref="input1" label="Column to convert" value="10" />
      </when>
      <when value="all">
          <!-- do nothing -->
      </when>
    </conditional>
  </inputs>
  <outputs>
  <data format="interval" name="out_file1" />
  </outputs>
  <tests>
    <test>
      <param name="input1" value="vcf2pgSnp_input.vcf" ftype="vcf" />
      <param name="how" value="all" />
      <output name="output" file="vcf2pgSnp_output.pgSnp" />
    </test>
  </tests>

  <help>

**Dataset formats**

The input dataset is in VCF_ format.
The output dataset is pgSnp_.  (`Dataset missing?`_)

.. _VCF: ./static/formatHelp.html#vcf
.. _pgSnp: ./static/formatHelp.html#pgSnp
.. _Dataset missing?: ./static/formatHelp.html

-----

**What it does**

This converts a VCF dataset to pgSnp format, using chromosome counts for
the allele frequencies.  If there is more than one column (individual)
of SNP data, it can either combine all columns as a population, or just
convert a single selected column.

-----

**Examples**

- input::

   1       13327   rs144762171     G       C       100     PASS    VT=SNP;SNPSOURCE=LOWCOV     GT:DS:GL        0|0:0.000:-0.03,-1.11,-5.00     0|1:1.000:-1.97,-0.01,-2.51     0|0:0.050:-0.01,-1.69,-5.00     0|0:0.100:-0.48,-0.48,-0.48
   1       13980   rs151276478     T       C       100     PASS    VT=SNP;SNPSOURCE=LOWCOV     GT:DS:GL        0|0:0.100:-0.48,-0.48,-0.48     0|1:0.950:-0.48,-0.48,-0.48     0|0:0.050:-0.48,-0.48,-0.48     0|0:0.050:-0.48,-0.48,-0.48
   1       30923   rs140337953     G       T       100     PASS    VT=SNP;SNPSOURCE=LOWCOV     GT:DS:GL        1|1:1.950:-5.00,-0.61,-0.12     0|0:0.450:-0.10,-0.69,-2.81     0|0:0.450:-0.11,-0.64,-3.49     1|1:1.500:-0.48,-0.48,-0.48
   etc.

- output as a population::

   chr1    13326   13327   G/C     2       7,1     0,0
   chr1    13979   13980   T/C     2       7,1     0,0
   chr1    30922   30923   G/T     2       4,4     0,0
   etc.

- output for column 10::

   chr1    13326   13327   G       1       2       0
   chr1    13979   13980   T       1       2       0
   chr1    30922   30923   T       1       2       0
   etc.

</help>
</tool>