Mercurial > repos > miller-lab > snp_analysis_conversion
view dividePgSnpAlleles.xml @ 3:edf12470a1a6 default tip
Bugfix from Belinda, in vcf2pgSnp.pl
author | Cathy Riemer <cathy+hg@bx.psu.edu> |
---|---|
date | Thu, 19 Mar 2015 12:06:34 -0400 |
parents | 35c20b109be5 |
children |
line wrap: on
line source
<tool id="dividePgSnp" name="Separate pgSnp Alleles" version="1.1.0" hidden="false"> <description>: Split allele info into separate columns</description> <command interpreter="perl"> #if $refcol.ref == "yes" #dividePgSnpAlleles.pl -ref=$refcol.ref_column $input1 > $out_file1 #else #dividePgSnpAlleles.pl $input1 > $out_file1 #end if </command> <inputs> <param format="interval" name="input1" type="data" label="pgSnp dataset" /> <conditional name="refcol"> <param name="ref" type="select" label="Dataset has a column with the reference allele"> <option value="yes">yes</option> <option value="no" selected="true">no</option> </param> <when value="yes"> <param name="ref_column" type="data_column" data_ref="input1" label="Column with reference allele" /> </when> <when value="no"> <!-- do nothing --> </when> </conditional> </inputs> <outputs> <data format="interval" name="out_file1" /> </outputs> <tests> <test> <param name='input1' value='dividePgSnp_input.pgSnp' ftype='interval' /> <param name='refcol' value='no' /> <output name="output" file="dividePgSnp_output.txt" /> </test> </tests> <help> **Dataset formats** The input dataset is of Galaxy datatype interval_, with the additional columns required for pgSnp_ format. Any further columns beyond those defined for pgSnp will be appended to the output. The output dataset is in interval_ format. (`Dataset missing?`_) .. _interval: ./static/formatHelp.html#interval .. _pgSnp: ./static/formatHelp.html#pgSnp .. _Dataset missing?: ./static/formatHelp.html ----- **What it does** This separates the alleles from a pgSnp dataset into separate columns, and also the frequencies and scores that go with the alleles. It will skip any positions with more than 2 alleles. If only a single allele is given then "N" will be used for the second, with a frequency and score of zero. Or, if a column with reference alleles is provided, the value in that column will be used in place of the "N". ----- **Example** - input pgSnp file:: chr1 256 257 A/C 2 3,4 10,20 chr1 56100 56101 A 1 5 30 chr1 77052 77053 A/G 2 6,7 40,50 chr1 110904 110905 A 1 8 60 etc. - output:: chr1 256 257 A 3 10 C 4 20 chr1 56100 56101 A 5 30 N 0 0 chr1 77052 77053 A 6 40 G 7 50 chr1 110904 110905 A 8 60 N 0 0 etc. </help> </tool>