view master2pg.xml @ 3:edf12470a1a6 default tip

Bugfix from Belinda, in vcf2pgSnp.pl
author Cathy Riemer <cathy+hg@bx.psu.edu>
date Thu, 19 Mar 2015 12:06:34 -0400
parents 35c20b109be5
children
line wrap: on
line source

<tool id="master2pgSnp" name="MasterVar to pgSnp" version="1.1.0" hidden="false">
  <description>: Convert from MasterVar to pgSnp format</description>
  <command interpreter="perl">
    master2pg.pl $indel $addRef $input1 > $out_file1
  </command>
  <inputs>
    <param format="tabular" name="input1" type="data" label="Complete Genomics MasterVar dataset" />
    <param name="indel" type="select" label="Convert indels">
      <option value="" selected="true">no</option>
      <option value="indel">yes</option>
    </param>
    <param name="addRef" type="select" label="Add column with reference nt">
      <option value="" selected="true">no</option>
      <option value="addRef">yes</option>
    </param>
  </inputs>
  <outputs>
  <data format="interval" name="out_file1" />
  </outputs>
  <tests>
    <test>
      <param name='input1' value='masterVarTest.txt' ftype='tab' />
      <param name='indel' value="" />
      <output name="output" file="masterVar_output.txt" />
    </test>
  </tests>

  <help>

**Dataset formats**

The input dataset is in the MasterVar_ format provided by the Complete Genomics 
analysis process (Galaxy considers this to be tabular_, but it must have the 
columns specified for MasterVar).
The output dataset is in pgSnp_ format.  (`Dataset missing?`_)

.. _MasterVar: ./static/formatHelp.html#mastervar
.. _tabular: ./static/formatHelp.html#tab
.. _pgSnp: ./static/formatHelp.html#pgSnp
.. _Dataset missing?: ./static/formatHelp.html

-----

**What it does**

This converts a Complete Genomics MasterVar file to pgSnp format, 
so it can be viewed in genome browsers or used with the Phenotype Association
and Operate on Genomic Intervals tools.
Positions homozygous for the reference allele are skipped.
You can optionally add the reference nt as an extra column at the end.  This
will make the file not work in genome browsers (the extra column could be 
removed later) but will work with other Galaxy tools.

-----

**Example**

- input MasterVar file::

   934     2       chr1    41980   41981   hom     snp     A       G       G       76      97                      dbsnp.86:rs806721       425     1       1       1       2       -170                            ERVL-E-int:ERVL:47.4    2       1.17    N
   935     2       chr1    41981   42198   hom     ref     =       =       =                                                                                       -170                                            1.17    N
   1102    2       chr1    53205   53206   het-ref snp     G       C       G       93      127                     dbsnp.100:rs2854676     477     7       30      0       37      -127                                    2       1.17    N
   etc.

- output::

   chr1    41980   41981   G       1       1       76
   chr1    51672   51673   C       1       1       53
   chr1    52237   52238   G       1       7       63
   chr1    53205   53206   C/G     2       7,30    93,127
   etc.

</help>
</tool>