annotate protein_prophet.xml @ 3:255b5b6ec617 draft default tip

Better defaults for omssa
author Ira Cooke <iracooke@gmail.com>
date Mon, 23 Jul 2012 14:48:31 +1000
parents 418f42b34049
children
rev   line source
2
418f42b34049 Reuploading
iracooke
parents:
diff changeset
<tool id="proteomics_search_protein_prophet_1" name="Protein Prophet" version="1.0.0">
  <requirements><requirement type="package">protk</requirement></requirements>
  <description>Calculate Protein Prophet statistics on search results</description>

  <!-- Note: the input file is assumed to be the first argument -->
  <command>protein_prophet.rb --galaxy $input_file -r $iproph $nooccam $groupwts $normprotlen $logprobs $confem $allpeps $unmapped $instances $delude --minprob=$minprob --minindep=$minindep </command>
  <inputs>

    <param name="input_file" type="data" format="pepxml" multiple="false" label="Peptide Prophet Results" help="This file will typically be the output of Peptide Prophet or InterProphet"/>

    <param name="iproph" selected="true" type="boolean" label="Inputs are from iProphet" truevalue="--iprophet-input" falsevalue=""/>
    <param name="nooccam" type="boolean" label="Don't apply Occam's razor" help="When selected, no attempt will be made to derive the simplest protein list explaining the observed peptides" truevalue="--no-occam" falsevalue=""/>
    <param name="groupwts" type="boolean" label="Use group weights" help="Check the peptide's total weight (rather than its actual weight) in the Protein Group against the threshold" truevalue="--group-wts" falsevalue=""/>
    <param name="normprotlen" type="boolean" label="Normalize NSP using Protein Length" truevalue="--norm-protlen" falsevalue=""/>
    <param name="logprobs" type="boolean" label="Use the log of probability in the confidence calculations" truevalue="--log-prob" falsevalue=""/>
    <param name="confem" type="boolean" label="Use the EM to compute probability given the confidence" truevalue="--confem" falsevalue=""/>
    <param name="allpeps" type="boolean" label="Consider all possible peptides in the database in the confidence model" truevalue="--allpeps" falsevalue=""/>
    <param name="unmapped" type="boolean" label="Report results for unmapped proteins" truevalue="--unmapped" falsevalue=""/>
    <param name="instances" type="boolean" label="Use Expected Number of Ion Instances to adjust the peptide probabilities prior to NSP adjustment" truevalue="--instances" falsevalue=""/>
    <param name="delude" type="boolean" label="Do NOT use peptide degeneracy information when assessing proteins" truevalue="--delude" falsevalue=""/>

    <param name="minprob" type="text" label="Minimum peptide prophet probability for peptides to be considered" value="0.05"/>
    <param name="minindep" type="text" label="Minimum percentage of independent peptides required for a protein" value="0"/>

  </inputs>
  <outputs>
    <data format="protxml" name="output" metadata_source="input_file" label="protein_prophet.${input_file.display_name}.protXML" from_work_dir="protein_prophet_results.prot.xml"/>
  </outputs>

  <!--NOPLOT: do not generate plot png file
  NOOCCAM: non-conservative maximum protein list
  GROUPWTS: check peptide's total weight in the Protein Group against the threshold (default: check peptide's actual weight against threshold)
  NORMPROTLEN: Normalize NSP using Protein Length
  LOGPROBS: Use the log of the probabilities in the Confidence calculations
  CONFEM: Use the EM to compute probability given the confidence
  ALLPEPS: Consider all possible peptides in the database in the confidence model
  UNMAPPED: Report results for UNMAPPED proteins
  INSTANCES: Use Expected Number of Ion Instances to adjust the peptide probabilities prior to NSP adjustment
  DELUDE: do NOT use peptide degeneracy information when assessing proteins

  MINPROB: peptideProphet probability threshold (default=0.05)
  MININDEP: minimum percentage of independent peptides required for a protein (default=0)

  -->

  <help>

**What it does**

Given a set of peptide assignments from MS/MS spectra in the form of a pepXML file, this tool estimates probabilities at the protein level. As output, the tool produces a protXML file, which lists proteins along with the estimated probability that each protein was present. Probabilities are estimated using a statistical model based on the number of peptides corresponding to a protein and the confidence that each of those peptides was assigned correctly. The model accounts for the fact that a peptide may correspond to more than one protein.

----

**Citation**

If you use this tool, please read and cite the paper describing the statistical model implemented by Protein Prophet:

Nesvizhskii A., et al. “A Statistical Model for Identifying Proteins by Tandem Mass Spectrometry” *Anal. Chem.* 75, 4646-4658 (2003).

  </help>

</tool>
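
The `<command>` template in the tool definition can be read as a simple substitution: each boolean parameter contributes its `truevalue` flag when checked and its empty `falsevalue` otherwise, and the two text parameters are appended as `--minprob=` and `--minindep=`. The Python sketch below only illustrates that expansion. It is not how Galaxy itself evaluates templates, and the function name `build_command` and the sample file name `interact.pep.xml` are hypothetical.

```python
# Sketch only: mimics the expansion of this tool's <command> template.
# build_command and the sample file name are hypothetical, for illustration.

# (parameter name, flag emitted when the box is checked), in template order.
BOOLEAN_FLAGS = [
    ("iproph", "--iprophet-input"),
    ("nooccam", "--no-occam"),
    ("groupwts", "--group-wts"),
    ("normprotlen", "--norm-protlen"),
    ("logprobs", "--log-prob"),
    ("confem", "--confem"),
    ("allpeps", "--allpeps"),
    ("unmapped", "--unmapped"),
    ("instances", "--instances"),
    ("delude", "--delude"),
]


def build_command(input_file, checked=(), minprob="0.05", minindep="0"):
    """Return the argument list the template would expand to."""
    args = ["protein_prophet.rb", "--galaxy", input_file, "-r"]
    # An unchecked boolean substitutes its falsevalue (""), i.e. nothing.
    args += [flag for name, flag in BOOLEAN_FLAGS if name in checked]
    args += [f"--minprob={minprob}", f"--minindep={minindep}"]
    return args


if __name__ == "__main__":
    cmd = build_command("interact.pep.xml", checked={"iproph", "nooccam"})
    print(" ".join(cmd))
```

Flags always appear in template order regardless of the order in which the boxes were checked, which matches how the placeholders are laid out in the `<command>` line.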