view false_discovery_rate.xml @ 2:cf0d72c7b482 draft

Update.
author galaxyp
date Fri, 10 May 2013 17:31:05 -0400

<tool id="openms_false_discvoerage_rate" version="0.1.0" name="False Discovery Rate">
  <description>
    Estimates the false discovery rate at the peptide and protein level.
  </description>
  <macros>
    <import>macros.xml</import>
  </macros>
  <expand macro="stdio" />
  <expand macro="requires" />
  <command interpreter="python">
    openms_wrapper.py --executable 'FalseDiscoveryRate' --config $config
  </command>
  <configfiles>
    <configfile name="config">[simple_options]
#if $input.type == "combined"
in=$input.input1
algorithm!decoy_string=$input.decoy_string
#else
fwd_in=$input.forward_input
rev_in=$input.reverse_input
#end if
out=$output
algorithm!q_value=$q_value
algorithm!use_all_hits=$use_all_hits
algorithm!split_charge_variants=$split_charge_variants
algorithm!add_decoy_peptides=$add_decoy_peptides
</configfile>
  </configfiles>
  <inputs>
    <conditional name="input">
      <param name="type" label="Input Type" type="select">
        <option value="combined">One search against combined target/decoy database</option>
        <option value="separate">Two searches against separate target and decoy databases</option>
      </param>
      <when value="combined">
        <param name="input1" label="Identification Input" type="data" format="idxml" />
        <param name="decoy_string" type="text" value="_rev" label="Decoy string"/>
      </when>
      <when value="separate">
        <param name="forward_input" label="Forward Identification Input" type="data" format="idxml" />
        <param name="reverse_input" label="Reverse Identification Input" type="data" format="idxml" />
      </when>
    </conditional>
    <param name="q_value" label="Compute Q-Value" type="boolean" truevalue="true" falsevalue="false" checked="true" help="If 'true', the q-values will be calculated instead of the FDRs." />
    <param name="use_all_hits" label="Use All Hits" type="boolean" truevalue="true" falsevalue="false" help="If 'true', all hits are used, not only the first hit (peptides only)." />
    <param name="split_charge_variants" label="Split Charge Variants" type="boolean" truevalue="true" falsevalue="false" help="If 'true', charge variants are treated separately (for peptides of combined target/decoy searches only)." checked="false" />
    <param name="add_decoy_peptides" type="boolean" label="Add Decoy Peptides" help="If 'true', decoy peptides are also written to the output file. The q-value is set to the closest target score." truevalue="true" falsevalue="false" checked="false" />

  </inputs>
  <outputs>
    <data format="idxml" name="output" />
  </outputs>
  <help>
**What it does**

This TOPP tool calculates the false discovery rate (FDR) from a forward and a reverse search, or from a single search against a combined target/decoy database. It is most useful at the protein level, but it can also be applied to peptides.

The false discovery rate is defined as the number of false discoveries (the hits in the reversed search) over the number of false and correct discoveries (the hits in both databases) with a score better than a given threshold.
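
As a minimal sketch of this definition, write T(s) for the number of target (forward) hits and D(s) for the number of decoy (reversed) hits whose score is at least as good as a threshold s. These symbols are notation assumed here for illustration; they are not option names of the tool.

.. math::

   \mathrm{FDR}(s) = \frac{D(s)}{T(s) + D(s)}

For example, if 95 target hits and 5 decoy hits pass the threshold, the estimate is 5 / (95 + 5) = 0.05. Under the usual definition, the q-value of a hit is the smallest FDR over all thresholds that still accept that hit.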

Prerequisites:

When searching against a combined target/decoy database (and thus running only one search per ID engine), first use PeptideIndexer to index the idXML file generated by the search engine adapter, e.g. MascotAdapter. This annotation lets the tool tell which peptides come from the target database and which from the decoy database.

If no decoy hits are found, the tool emits a warning similar to:
"FalseDiscoveryRate: #decoy sequences is zero! Setting all target sequences to q-value/FDR 0!"
Treat this as a serious problem: it most likely means that the target/decoy annotation in a previous step used a misconfigured database (see PeptideIndexer).

**Citation**

For the underlying tool, please cite ``Marc Sturm, Andreas Bertsch, Clemens Gröpl, Andreas Hildebrandt, Rene Hussong, Eva Lange, Nico Pfeifer, Ole Schulz-Trieglaff, Alexandra Zerck, Knut Reinert, and Oliver Kohlbacher, 2008. OpenMS – an Open-Source Software Framework for Mass Spectrometry. BMC Bioinformatics 9: 163. doi:10.1186/1471-2105-9-163.``

If you use this tool in Galaxy, please cite Chilton J, et al. https://bitbucket.org/galaxyp/galaxyp-toolshed-openms
  </help>
</tool>