view modify_snp_table.xml @ 2:41ef7e57c2fa

data -> galaxy.datatypes.data
author Richard Burhans <burhans@bx.psu.edu>
date Mon, 09 Apr 2012 15:27:28 -0400
parents 2c498d40ecde
children e29f4d801bb0

<tool id="gd_modify_snp_table" name="Modify" version="1.0.0">
  <description>a SNP table</description>

  <command interpreter="python">
    modify_snp_table.py "$input" "$p1_input" "$output"
    #if $limit_coverage.choice == "0"
        "-1" "-1" "-1" "-1"
    #else
        "${limit_coverage.lo_coverage}" "${limit_coverage.hi_coverage}" "${limit_coverage.low_ind_cov}" "${limit_coverage.lo_quality}"
    #end if
    #for $individual, $individual_col in zip($input.dataset.metadata.individual_names, $input.dataset.metadata.individual_columns)
        #set $arg = '%s:%s' % ($individual_col, $individual)
        "$arg"
    #end for
  </command>

  <inputs>
    <param name="input" type="data" format="wsf" label="SNP table" />
    <param name="p1_input" type="data" format="ind" label="Population individuals" />
    <conditional name="limit_coverage">
      <param name="choice" type="select" label="Option">
        <option value="0" selected="true">add columns to the SNP table</option>
        <option value="1">discard some SNPs</option>
      </param>
      <when value="0" />
      <when value="1">
        <param name="lo_coverage" type="integer" min="0" value="0" label="Lower bound on total coverage" />
        <param name="hi_coverage" type="integer" min="0" value="1000" label="Upper bound on total coverage" />
        <param name="low_ind_cov" type="integer" min="0" value="0" label="Lower bound on individual coverage" />
        <param name="lo_quality" type="integer" min="0" value="0" label="Lower bound on individual quality values" />
      </when>
    </conditional>
  </inputs>

  <outputs>
    <data name="output" format="wsf" metadata_source="input" />
  </outputs>

  <tests>
    <test>
      <param name="input" value="test_in/sample.wsf" ftype="wsf" />
      <param name="p1_input" value="test_in/a.ind" ftype="ind" />
      <param name="choice" value="1" />
      <param name="lo_coverage" value="0" />
      <param name="hi_coverage" value="1000" />
      <param name="low_ind_cov" value="3" />
      <param name="lo_quality" value="30" />
      <output name="output" file="test_out/modify_snp_table/modify.wsf" />
    </test>
  </tests>

  <help>
**What it does**

The user specifies that some of the individuals in the selected SNP table
form a "population" that has been previously defined using the Galaxy tool to
select individuals from a SNP table.  One option is for the program to append
four columns to the table, giving the total counts for the two alleles, the
"genotype" for the population and the maximum quality value, taken over all
individuals in the population.  If all defined genotypes in the population
are 2 (agree with the reference), the population's genotype is 2; similarly
for 0; otherwise the genotype is 1 (unless all individuals have undefined
genotype, in which case it is -1).  The other option is to remove rows from
the table for which the total coverage for the population is either too low
or too high, and/or for which the individual coverage or quality value is too low.
  </help>
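  <!--
  A minimal sketch of the population genotype rule described in the help text,
  kept in a comment so it does not affect the tool. It assumes each individual's
  genotype is encoded as 2, 1, 0, or -1 (undefined). The function name
  population_genotype and the per-individual -1 encoding are illustrative
  assumptions, not taken from modify_snp_table.py.

```python
def population_genotype(genotypes):
    """Combine per-individual genotypes into one population genotype.

    Assumed encoding (hypothetical): 2 = agrees with the reference,
    0 = homozygous for the other allele, 1 = mixed, -1 = undefined.
    """
    # Ignore individuals whose genotype is undefined.
    defined = [g for g in genotypes if g != -1]
    if not defined:
        # Every individual is undefined, so the population is undefined too.
        return -1
    if all(g == 2 for g in defined):
        return 2
    if all(g == 0 for g in defined):
        return 0
    # Any other combination of defined genotypes is reported as 1.
    return 1
```

  For example, under these assumptions a population with genotypes 2, 2 and
  undefined would be reported as 2, while one with genotypes 2 and 0 would be
  reported as 1.
  -->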
</tool>