comparison pmd_fdr.xml @ 1:460edeedeb7d draft default tip

"planemo upload for repository https://github.com/galaxyproteomics/tools-galaxyp/tree/master/tools/pmd_fdr commit d87c44ec8fd5bc03d7bb06f65a1a6ec247daee36"
author galaxyp
date Thu, 10 Oct 2019 17:29:37 -0400
parents 5cc0c32d05a2
children
0:5cc0c32d05a2 1:460edeedeb7d
1 <tool id="pmd_fdr" name="PMD FDR" version="0.1.0"> 1 <tool id="pmd_fdr" name="PMD FDR" version="1.4.0">
2 <description>recalculate FDR from precursor mass discrepancy</description> 2 <description>recalculate FDR from precursor mass discrepancy</description>
3 <requirements> 3 <requirements>
4 <requirement type="package" version="3.5.1">r-base</requirement> 4 <requirement type="package" version="3.5.1">r-base</requirement>
5 <requirement type="package" version="1.4.0">r-stringr</requirement> 5 <requirement type="package" version="1.4.0">r-stringr</requirement>
6 <requirement type="package" version="0.4">r-argparser</requirement> 6 <requirement type="package" version="0.4">r-argparser</requirement>
22 <param argument="--psm_report" type="data" format="tabular" label="PSM report (Peptide Spectrum Match)"/> 22 <param argument="--psm_report" type="data" format="tabular" label="PSM report (Peptide Spectrum Match)"/>
23 <param argument="--psm_report_1_percent" type="data" format="tabular" label="PSM report at 1% FDR (Optional)" optional="true"/> 23 <param argument="--psm_report_1_percent" type="data" format="tabular" label="PSM report at 1% FDR (Optional)" optional="true"/>
24 <param argument="--input_file_type" type="select" label="Input file type"> 24 <param argument="--input_file_type" type="select" label="Input file type">
25 <option value="PSM_Report" selected="true">PeptideShaker PSM_Report</option> 25 <option value="PSM_Report" selected="true">PeptideShaker PSM_Report</option>
26 <option value="PMD_FDR_input_file">PMD_FDR_input_file</option> 26 <option value="PMD_FDR_input_file">PMD_FDR_input_file</option>
27 <option value="MaxQuant_Evidence">MaxQuant_Evidence</option>
27 </param> 28 </param>
28 </inputs> 29 </inputs>
29 <outputs> 30 <outputs>
30 <data name="output_g_fdr" format="tabular" label="${tool.name} on ${on_string} output_g_fdr"/> 31 <data name="output_g_fdr" format="tabular" label="${tool.name} on ${on_string} output_g_fdr"/>
31 <data name="output_i_fdr" format="tabular" label="${tool.name} on ${on_string} output_i_fdr"/> 32 <data name="output_i_fdr" format="tabular" label="${tool.name} on ${on_string} output_i_fdr"/>
38 <output name="output_i_fdr" file="output_i_fdr.tabular" /> 39 <output name="output_i_fdr" file="output_i_fdr.tabular" />
39 <output name="output_densities" file="output_densities.tabular" /> 40 <output name="output_densities" file="output_densities.tabular" />
40 </test> 41 </test>
41 </tests> 42 </tests>
42 <help><![CDATA[ 43 <help><![CDATA[
43 Computes individual and global False Discovery Rate (FDR) using Precursor Mass Discrepancy (PMD) from a Peptide Spectrum Match (PSM) report. 44 =======
45 PMD FDR
46 =======
47 PMD FDR calculates individual and global False Discovery Rate (FDR) using Precursor Mass Discrepancy (PMD) from a Peptide Spectrum Match (PSM) report. See the `PMD-FDR-for-Galaxy-P home page <https://github.com/slhubler/PMD-FDR-for-Galaxy-P/blob/master/README.md#pmd-fdr-for-galaxy-p>`_ for details.
44 48
45 The PSM report must be from PeptideShaker (--input_file_type PSM_Report), 49 ----------
46 or otherwise (--input_file_type PMD_FDR_input_file) have columns with header names: 50 **Inputs**
51 ----------
47 52
48 - PMD_FDR_input_score 53 **Primary Input file** is a PSM report; three formats are currently accepted:
49 - PMD_FDR_pmd 54
50 - PMD_FDR_spectrum_file 55 * PeptideShaker PSM report (--input_file_type PSM_Report)
51 - PMD_FDR_proteins 56 This format is the output of PeptideShaker. We expect it to be a tab-delimited file with the first row being the column labels.
52 - PMD_FDR_spectrum_title 57 The following fields are required for a file to be correctly processed:
53 - PMD_FDR_sequence 58
54 - PMD_FDR_decoy 59 - Confidence [%]
60 - Precursor m/z Error [ppm]
61 - Spectrum File
62 - Protein(s)
63 - Spectrum Title
64 - Sequence
65 - Decoy
66
67 * MaxQuant evidence.txt (--input_file_type MaxQuant_Evidence)
68 The following fields are required for correct processing (others can exist):
69
70 - PEP
71 - Mass error [ppm]
72 - Proteins
73 - Retention time
74 - Sequence
75 - Reverse
76
77 * Custom generated (--input_file_type PMD_FDR_input_file)
78 having columns with header names:
79
80 - PMD_FDR_input_score
81 - PMD_FDR_pmd
82 - PMD_FDR_spectrum_file
83 - PMD_FDR_proteins
84 - PMD_FDR_spectrum_title
85 - PMD_FDR_sequence
86 - PMD_FDR_decoy
87
88 **Optional PSM 1% file**
89
90 At present, this only supports PSM_Report format. This is simply the PSM_Report from PeptideShaker using a 1% FDR. This file is matched against the Primary Input file, using the spectrum file and title to match records. If they agree, the record (in the original input) is marked by setting is_one_percent_FDR to TRUE.
55 91
56 92
57 PMD_FDR_package_for_Galaxy.R 93 -----------
94 **Outputs**
95 -----------
58 96
59 usage: PMD_FDR_package_for_Galaxy.R [--] [--help] [--opts OPTS] [--psm_report PSM_REPORT] [--psm_report_1_percent PSM_REPORT_1_PERCENT] [--output_i_fdr OUTPUT_I_FDR] [--output_g_fdr OUTPUT_G_FDR] [--output_densities OUTPUT_DENSITIES] [--score_field_name SCORE_FIELD_NAME] [--input_file_type INPUT_FILE_TYPE] 97 Score ranges are used in all three files. An example score range: 060_099_ge_lt
60 98
61 arguments: 99 The structure of the score range field is *aaa_bbb_cc_dd* where:
62 -p, --psm_report PSM_REPORT full name and path to the PSM report 100
63 --psm_report_1_percent PSM_REPORT_1_PERCENT full name and path to the PSM report for 1% FDR [default: ] 101 * *aaa* is the lower bound of the score range
64 -i, --input_file_type INPUT_FILE_TYPE type of input file (currently supports: PSM_Report) [default: PMD_FDR_input_file] 102 * *bbb* is the upper bound of the score range
65 -o, --output_i_fdr OUTPUT_I_FDR full name and path to the i-FDR output file [default: ] 103 * *cc* and *dd* describe the lower bound and upper bound comparison operator:
66 --output_g_fdr OUTPUT_G_FDR full name and path to the g-FDR output file [default: ] 104
67 --output_densities OUTPUT_DENSITIES full name and path to the densities output file [default: ] 105 - eq - "equal to"
106 - ge - "greater than or equal to"
107 - gt - "greater than"
108 - le - "less than or equal to"
109 - lt - "less than"
110
111
112
113 **output_densities**
114 File contains a normalized version of the density function applied to up to 13 subsets of the data. Every column name except "x" identifies the subset it describes, so each column other than x should sum to 1:
115
116 - x *center of a range of normalized PMD interval*
117 - t *(estimated) relative abundance of True Hits*
118 - f *(estimated) relative abundance of False Hits*
119 - aaa_bbb_cc_dd *relative abundance of score range; see above for definition of score*
120 - decoy *relative abundance of decoys (superset of f)*
121
122 Example:
123
124 ::
125
126 x t f 000_060_ge_lt 060_099_ge_lt 099_100_ge_lt 100_100_eq_eq decoy
127 -16.9133785470534 4.16622917525113e-18 8.93652665157669e-05 0.000309818912433651 0 3.74804907374386e-18 4.16622917525113e-18 8.93652665157669e-05
128 -16.8519045481908 4.23773371666637e-18 9.74050812738847e-05 0.000317480777369488 0 5.16725222326666e-18 4.23773371666637e-18 9.74050812738847e-05
129 ...
130
131 **output_g_fdr**
132
133 File contains the groupwise FDR (gFDR) for each score range:
134
135 - *group_of_interest* name of score group
136 - *alpha* gFDR
137
138 *Alpha should be a number between 0 and 1. However, it is generated after excluding the training data and is based on the resulting (random) peak height of each distribution. This means that the gFDR can be greater than 1 in practice.*
139
140 Example::
141
142 group_of_interest alpha
143 000_060_ge_lt 0.953202547009365
144 060_099_ge_lt 0.477777669534645
145 099_100_ge_lt 0.381269718674806
146 100_100_eq_eq 0
147 decoy 1
148
149
150 **output_i_fdr**
151
152 This file contains the individual FDR (iFDR) for each PSM, with supporting evidence:
153
154 - PMD_FDR_spectrum_title *Unique identifier concatenating PMD_FDR_spectrum_file and PMD_FDR_spectrum_index*
155 - value *Identical to PMD_FDR_pmd. Implemented this way to allow future alterations that would use input variables other than PMD*
156 - PMD_FDR_decoy *Input variable - 1 for decoy, 0 for other*
157 - median_of_group_index *median PMD for good-training records with the same group_index as the current record*
158 - value_norm *normalized value (value minus median_of_group_index)*
159 - used_to_find_middle *logical variable reflecting the following statement: was this record used to identify the median_of_group_index? (These records MUST be excluded in any summary statistics.)*
160 - PMD_FDR_input_score *The score used to separate data*
161 - PMD_FDR_pmd *Precursor Mass Discrepancy*
162 - PMD_FDR_peptide_length *Peptide length of identified peptide*
163 - PMD_FDR_spectrum_file *Name of file containing spectrum*
164 - PMD_FDR_spectrum_index *Spectrum number within file*
165 - PMD_FDR_proteins *Protein name*
166 - group_input_score *Grouping by score*
167 - group_pmd *Grouping by PMD (approx 20 groups)*
168 - group_peptide_length *Grouping by peptide length*
169 - group_training_class *Grouping by training class (see notes)*
170 - group_proteins *Grouping by Protein groups (see notes)*
171 - group_spectrum_file *Same as PMD_FDR_spectrum_file*
172 - group_spectrum_index *Contiguous groups of spectra (see notes)*
173 - group_proteins *Grouping by species (however, see notes)*
174 - group_decoy_input_score *decoy version of group_input_score*
175 - group_decoy_pmd *decoy version of group_pmd*
176 - group_decoy_peptide_length *decoy version of group_peptide_length*
177 - group_decoy_spectrum_file *decoy version of group_spectrum_file*
178 - group_decoy_spectrum_index *decoy version of group_spectrum_index*
179 - group_decoy_proteins *decoy version of group_proteins*
180 - is_in_1percent *PSM in 1% FDR file (if it exists)*
181 - value_of_interest *Defunct column (used during processing)*
182 - group_of_interest *Defunct column (used during processing)*
183 - interpolated_groupwise_FDR *estimated gFDR, interpolated from gFDR derived from group_decoy_input_score*
184 - t *density of t at PMD of record*
185 - f *density of f at PMD of record*
186 - alpha *same as interpolated_groupwise_FDR*
187 - i_fdr *iFDR (alpha*f / (alpha*f + (1-alpha)*t))*
188
68 189
69 ]]></help> 190 ]]></help>
70 </tool> 191 </tool>
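
The help text in this revision describes two things that are easy to misread: the *aaa_bbb_cc_dd* score range labels and the i_fdr column. The following Python sketch is an illustration only, not part of PMD_FDR_package_for_Galaxy.R. The names `ScoreRange`, `parse_score_range`, and `i_fdr` are hypothetical. The sketch assumes that *cc* applies to the lower bound and *dd* to the upper bound, as the help text states, and it reads the i_fdr description as alpha*f / (alpha*f + (1-alpha)*t).

```python
import operator
from typing import NamedTuple

# Comparison operator codes listed in the help text.
_OPS = {
    "eq": operator.eq,
    "ge": operator.ge,
    "gt": operator.gt,
    "le": operator.le,
    "lt": operator.lt,
}


class ScoreRange(NamedTuple):
    """Hypothetical container for one aaa_bbb_cc_dd label."""
    lower: float
    upper: float
    lower_op: str
    upper_op: str

    def contains(self, score: float) -> bool:
        # Assumption: cc compares the score with the lower bound,
        # dd compares the score with the upper bound.
        return (_OPS[self.lower_op](score, self.lower)
                and _OPS[self.upper_op](score, self.upper))


def parse_score_range(label: str) -> ScoreRange:
    """Split a label such as '060_099_ge_lt' into bounds and operators."""
    lower, upper, lower_op, upper_op = label.split("_")
    if lower_op not in _OPS or upper_op not in _OPS:
        raise ValueError(f"unknown comparison operator in {label!r}")
    return ScoreRange(float(lower), float(upper), lower_op, upper_op)


def i_fdr(alpha: float, t: float, f: float) -> float:
    """Individual FDR, read as alpha*f / (alpha*f + (1-alpha)*t)."""
    denominator = alpha * f + (1.0 - alpha) * t
    if denominator == 0:
        return float("nan")  # both densities vanish at this PMD
    return alpha * f / denominator
```

Under these assumptions, `parse_score_range("060_099_ge_lt").contains(60)` is true and `.contains(99)` is false, while `100_100_eq_eq` matches only a score of exactly 100. With alpha = 0 the i_fdr is 0, and with alpha = 1 it is 1 wherever f is non-zero, which agrees with the `100_100_eq_eq` and `decoy` rows of the output_g_fdr example.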