comparison profia_config.xml @ 0:39ccace77270 draft
planemo upload for repository https://github.com/workflow4metabolomics/profia.git commit 2757590af8c7ba9833ba3bebd7da7f96b20d1128-dirty
author | ethevenot |
---|---|
date | Sun, 26 Mar 2017 17:37:12 -0400 |
parents | |
children | 4753e64cf694 |
-1:000000000000 | 0:39ccace77270 |
---|---|
1 <tool id="profia" name="proFIA" version="3.0.0"> | |
2 <description>Preprocessing of FIA-HRMS data</description> | |
3 | |
4 <requirements> | |
5 <requirement type="package">r-batch</requirement> | |
6 <requirement type="package">r-FNN</requirement> | |
7 <requirement type="package">r-maxLik</requirement> | |
8 <requirement type="package">r-minpack.lm</requirement> | |
9 <requirement type="package">r-pracma</requirement> | |
10 <requirement type="package">bioconductor-proFIA</requirement> | |
11 </requirements> | |
12 | |
13 <stdio> | |
14 <exit_code range="1:" level="fatal" /> | |
15 </stdio> | |
16 | |
17 <command><![CDATA[ | |
18 Rscript $__tool_directory__/profia_wrapper.R | |
19 | |
20 #if $inputs.input == "lib": | |
21 library $__app__.config.user_library_import_dir/$__user_email__/$inputs.library | |
22 #elif $inputs.input == "zip_file": | |
23 zipfile $inputs.zip_file | |
24 #end if | |
25 | |
26 ppmN "$ppmN" | |
27 ppmGroupN "$ppmGroupN" | |
28 fracGroupN "$fracGroupN" | |
29 kI "$kI" | |
30 | |
31 dataMatrix_out "$dataMatrix_out" | |
32 sampleMetadata_out "$sampleMetadata_out" | |
33 variableMetadata_out "$variableMetadata_out" | |
34 figure "$figure" | |
35 information "$information" | |
36 ]]></command> | |
37 | |
38 <inputs> | |
39 <conditional name="inputs"> | |
40 <param name="input" type="select" label="Choose your input method" > | |
41 <option value="zip_file" selected="true">Zip file from your history containing your raw files</option> | |
42 <option value="lib" >Library directory name</option> | |
43 </param> | |
44 <when value="zip_file"> | |
45 <param name="zip_file" type="data" format="no_unzip.zip,zip" label="Zip file" /> | |
46 </when> | |
47 <when value="lib"> | |
48 <param name="library" type="text" size="40" label="Library directory name" help="The name of your directory containing all your data" > | |
49 <validator type="empty_field"/> | |
50 </param> | |
51 </when> | |
52 </conditional> | |
53 | |
54 <param name="ppmN" label="Maximum deviation between centroids during band detection (in ppm)" type="text" value = "5" help="[ppm]" /> | |
55 <param name="ppmGroupN" label="Accuracy of the mass spectrometer to be used during feature alignment (in ppm)" type="text" value = "5" help="[ppmGroup] Should be less than or equal to the deviation parameter above." /> | |
56 <param name="fracGroupN" label="Minimum fraction of samples in which a peak should be detected in at least one class to be kept during feature alignment" type="text" value = "0.5" help="[fracGroup]" /> | |
57 <param name="kI" label="Number of neighbour features to be used for imputation (select 0 to skip the imputation step)" type="text" value = "5" help="[k]" /> | |
58 </inputs> | |
59 | |
60 <outputs> | |
61 <data name="dataMatrix_out" label="${tool.name}_dataMatrix.tsv" format="tabular" ></data> | |
62 <data name="sampleMetadata_out" label="${tool.name}_sampleMetadata.tsv" format="tabular" ></data> | |
63 <data name="variableMetadata_out" label="${tool.name}_variableMetadata.tsv" format="tabular" ></data> | |
64 <data name="figure" label="${tool.name}_figure.pdf" format="pdf"/> | |
65 <data name="information" label="${tool.name}_information.txt" format="txt"/> | |
66 </outputs> | |
67 | |
68 <tests> | |
69 <test> | |
70 <param name="inputs|input" value="zip_file" /> | |
71 <param name="inputs|zip_file" value="input-plasFIA.zip" ftype="zip" /> | |
72 <param name="ppmN" value="2"/> | |
73 <param name="ppmGroupN" value="1"/> | |
74 <param name="fracGroupN" value="0.1"/> | |
75 <param name="kI" value="2"/> | |
76 <output name="dataMatrix_out" file="output-dataMatrix.tsv"/> | |
77 </test> | |
78 </tests> | |
79 | |
80 <help> | |
81 | |
82 .. class:: infomark | |
83 | |
84 **Author** Alexis Delabriere and Etienne Thevenot (CEA, LIST, MetaboHUB Paris, etienne.thevenot@cea.fr) | |
85 | |
86 --------------------------------------------------- | |
87 | |
88 .. class:: infomark | |
89 | |
90 **Please cite** | |
91 | |
92 Delabriere A., Hohenester U., Junot C. and Thevenot E.A. *proFIA*: A data preprocessing workflow for Flow Injection Analysis coupled to High-Resolution Mass Spectrometry. *submitted*. | |
93 | |
94 --------------------------------------------------- | |
95 | |
96 .. class:: infomark | |
97 | |
98 **R package** | |
99 | |
100 The **proFIA** package is available from the bioconductor repository `http://bioconductor.org/packages/proFIA <http://bioconductor.org/packages/proFIA>`_ | |
101 | |
102 --------------------------------------------------- | |
103 | |
104 .. class:: infomark | |
105 | |
106 **Tool updates** | |
107 | |
108 See the **NEWS** section at the bottom of this page | |
109 | |
110 --------------------------------------------------- | |
111 | |
112 ========================================================== | |
113 *proFIA*: Preprocessing workflow for FIA-HRMS data | |
114 ========================================================== | |
115 | |
116 ----------- | |
117 Description | |
118 ----------- | |
119 | |
120 **Flow Injection Analysis coupled to High-Resolution Mass Spectrometry (FIA-HRMS)** is a promising approach for **high-throughput metabolomics** (Madalinski *et al.*, 2008; Fuhrer *et al.*, 2011; Draper *et al.*, 2013). FIA-HRMS data, however, cannot be preprocessed with current software tools, which either rely on liquid chromatography separation or handle low-resolution data only. | |
121 | |
122 The **proFIA module is a workflow** that preprocesses FIA-HRMS raw data in **centroid** mode and open format (netCDF, mzData, mzXML, and mzML) and generates the table of peak intensities (**peak table**). The workflow consists of **peak detection and quantification** within individual sample files, followed by **alignment** between files in the m/z dimension, and **imputation** of the missing values in the final peak table (Delabriere *et al.*, submitted). For each ion, the graph representing the intensity as a function of time is called a **flowgram**. A flowgram can be modeled as I = kP + ME(P) + B + e, where k is the response factor (corresponding to the ionization properties of the analyte), P is the **sample peak** (normalized profile which is common to all analytes from a sample and depends on the flow injection conditions only), ME is the **matrix effect**, B is the **solvent baseline**, and e is the heteroscedastic noise. | |
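
The flowgram model above can be illustrated with a small simulation. This is only a sketch and not the proFIA implementation: the Gaussian shape of the sample peak, the quadratic form chosen for ME(P), the noise model (standard deviation proportional to the signal), and all function and parameter names are assumptions made for illustration.

```python
import math
import random

def sample_peak(n_scans, center, width):
    """Normalized sample peak P (apex = 1); a Gaussian shape is assumed here."""
    return [math.exp(-0.5 * ((t - center) / width) ** 2) for t in range(n_scans)]

def flowgram(P, k, baseline, suppression=0.0, noise_cv=0.0, rng=None):
    """Simulate I = k*P + ME(P) + B + e for one ion.

    Assumptions: ME(P) = -suppression * P**2 (hypothetical ion suppression),
    and the noise standard deviation is proportional to the noiseless signal,
    which makes the noise heteroscedastic.
    """
    rng = rng or random.Random(0)
    intensities = []
    for p in P:
        signal = k * p - suppression * p ** 2 + baseline
        noise = rng.gauss(0.0, noise_cv * signal) if noise_cv > 0 else 0.0
        intensities.append(signal + noise)
    return intensities

P = sample_peak(n_scans=100, center=40, width=8)
clean = flowgram(P, k=1e5, baseline=500.0)
suppressed = flowgram(P, k=1e5, baseline=500.0, suppression=6e4)
noisy = flowgram(P, k=1e5, baseline=500.0, noise_cv=0.05, rng=random.Random(1))
```

With these hypothetical values, the suppressed flowgram has a lower intensity at the apex of the sample peak than the clean one, which is the kind of distortion that lowers the correlation between a feature flowgram and the sample peak.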
123 | |
124 The generated peak table is available in the '3 table' W4M tabular format (**dataMatrix**, **sampleMetadata**, and **variableMetadata**) for downstream statistical analysis and annotation with W4M modules. | |
125 | |
126 A figure provides **diagnostics** and visualization of the preprocessed data set. | |
127 | |
128 --------------------------------------------------- | |
129 | |
130 .. class:: infomark | |
131 | |
132 **References** | |
133 | |
134 | Delabriere A., Hohenester U., Junot C. and Thevenot E.A. proFIA: A data preprocessing workflow for Flow Injection Analysis coupled to High-Resolution Mass Spectrometry. *submitted*. | |
135 | Draper J., Lloyd A., Goodacre R. and Beckmann M. (2013). Flow infusion electrospray ionisation mass spectrometry for high throughput, non-targeted metabolite fingerprinting: a review. *Metabolomics* 9, 4-29. | |
136 | Fuhrer T., Dominik H., Boris B. and Zamboni N. (2011). High-throughput, accurate mass metabolome profiling of cellular extracts by flow injection-time-of-flight mass spectrometry. *Analytical Chemistry* 83, 7074-7080. | |
137 | Madalinski G., Godat E., Alves S., Lesage D., Genin E., Levi P., Labarre J., Tabet J., Ezan E. and Junot, C. (2008). Direct introduction of biological samples into a LTQ-orbitrap hybrid mass spectrometer as a tool for fast metabolome analysis. *Analytical Chemistry* 80, 3291-3303. | |
138 | |
139 --------------------------------------------------- | |
140 | |
141 ----------------- | |
142 Workflow position | |
143 ----------------- | |
144 | |
145 .. image:: profia_workflowPositionImage.png | |
146 :width: 600 | |
147 | |
148 ----------- | |
149 Input files | |
150 ----------- | |
151 | |
152 +---------------------------+------------+ | |
153 | Parameter : num + label | Format | | |
154 +===========================+============+ | |
155 | 1 : Choose your inputs | zip | | |
156 +---------------------------+------------+ | |
157 | |
158 | |
159 You have two methods for your inputs: | |
160 | Zip file (recommended): Upload a zip file containing your inputs, e.g. myinputs.zip, with all your conditions as sub-directories. | |
161 | Library folder: You must specify the name of your "library" (folder) created within your project space (for example: /projet/externe/institut/login/galaxylibrary/yourlibrary). Your library must contain all your conditions as sub-directories. | |
162 | |
163 **Steps for creating the zip file** | |
164 | |
165 **Step 1: Create your directory and organize the subdirectories** | |
166 | |
167 .. class:: warningmark | |
168 | |
169 VERY IMPORTANT: If you zip your files under Windows, you must use the **7Zip** software (http://www.7-zip.org/); otherwise your zip file will not be unzipped correctly on the W4M platform (zip corrupted bug). | |
170 Your zip should contain all your conditions as sub-directories. For example, two conditions (mutant and wild): | |
171 arabidopsis/wild/01.raw | |
172 arabidopsis/mutant/01.raw | |
173 | |
174 **Step 2: Create a zip file** | |
175 Create your zip file (e.g.: arabidopsis.zip). | |
176 | |
177 **Step 3: Upload it to our Galaxy server** | |
178 If your zip file is smaller than 2 GB, you can use the Get Data tool to upload it. | |
179 Otherwise, if your zip file is larger than 2 GB, please refer to the HOWTO on workflow4metabolomics.org (http://application.sb-roscoff.fr/download/w4m/howto/galaxy_upload_up_2Go.pdf). | |
180 For more information, do not hesitate to send us an email at supportATworkflow4metabolomics.org. | |
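
The directory layout described in the steps above (one sub-directory per condition under a common root) can also be zipped programmatically. The following is a minimal sketch using only the Python standard library; the function name `zip_conditions` and the file names in the usage comment are hypothetical.

```python
import zipfile
from pathlib import Path

def zip_conditions(root_dir, zip_path):
    """Zip root_dir so that entries keep the root/condition/file layout."""
    root = Path(root_dir)
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        for path in sorted(root.rglob("*")):
            if path.is_file():
                # Entry names are relative to the parent of the root folder,
                # so the root folder name (e.g. arabidopsis/) is preserved.
                archive.write(path, path.relative_to(root.parent).as_posix())
    return zip_path

# Hypothetical usage, assuming ./arabidopsis/wild and ./arabidopsis/mutant exist:
# zip_conditions("arabidopsis", "arabidopsis.zip")
```

Entry names are written with forward slashes, so an archive built this way has the same internal layout whichever operating system it was created on.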
181 | |
182 **Advice for converting your files for the XCMS input** | |
183 | |
184 .. class:: warningmark | |
185 | |
186 VERY IMPORTANT: your data must be in **centroid** mode. In addition, we recommend converting your raw files to mzXML. | |
187 | |
188 We recommend the following parameters: | |
189 | |
190 Use Filtering: **True** | |
191 Use Peak Picking: **True** | |
192 Peak Picking - Apply to MS Levels: **All Levels (1-)** : Centroid Mode | |
193 Use zlib: **64** | |
194 Binary Encoding: **64** | |
195 m/z Encoding: **64** | |
196 Intensity Encoding: **64** | |
197 | |
198 ---------- | |
199 Parameters | |
200 ---------- | |
201 | |
202 Maximum deviation between centroids during band detection; in ppm (default = 5) | |
203 | m/z tolerance of centroids corresponding to the same ion from one scan to the other. | |
204 | | |
205 | |
206 Accuracy of the mass spectrometer to be used during feature alignment; in ppm (default = 5) | |
207 | Should be less than or equal to the deviation parameter above. | |
208 | | |
209 | |
210 Minimum fraction of samples in which a peak should be detected in at least one class to be kept during feature alignment (default = 0.5) | |
211 | Identical to the corresponding parameter in XCMS. | |
212 | | |
213 | |
214 Number of neighbour features to be used for imputation (default = 5) | |
215 | Select 0 to skip the imputation step. | |
216 | | |
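
To make the ppm and minimum-fraction parameters above more concrete, here is a small sketch of how such thresholds are commonly evaluated. It is not taken from the proFIA source code: the choice of reference m/z in the ppm computation, the use of "greater than or equal" for the fraction test, and all function names are assumptions.

```python
def ppm_deviation(mz_a, mz_b):
    """Relative deviation between two m/z values, in parts per million.

    Assumption: the deviation is expressed relative to mz_b.
    """
    return abs(mz_a - mz_b) / mz_b * 1e6

def within_ppm(mz_a, mz_b, ppm=5.0):
    """True when two centroids are close enough to belong to the same ion."""
    return ppm_deviation(mz_a, mz_b) <= ppm

def keep_feature(detected_by_class, frac_group=0.5):
    """Keep a feature if it is detected in at least frac_group of the samples
    of at least one class.

    detected_by_class maps a class name to a list of booleans, one per sample.
    """
    return any(
        sum(flags) / len(flags) >= frac_group
        for flags in detected_by_class.values()
        if flags
    )

# At m/z 200, a difference of 0.0008 is about 4 ppm and 0.0012 is about 6 ppm.
same_ion = within_ppm(200.0008, 200.0, ppm=5.0)
other_ion = within_ppm(200.0012, 200.0, ppm=5.0)

detections = {
    "wild": [True, False, False, False],
    "mutant": [True, True, False, False],
}
kept = keep_feature(detections, frac_group=0.5)
```

In this example the feature is kept because it is detected in half of the "mutant" samples, even though it is found in only a quarter of the "wild" samples.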
217 | |
218 | |
219 ------------ | |
220 Output files | |
221 ------------ | |
222 | |
223 dataMatrix.tabular | |
224 | **dataMatrix** tab-separated file with the variables as rows and samples as columns. Missing values are indicated as 'NA' (i.e. when the signal was not significantly different from noise). | |
225 | | |
226 | |
227 sampleMetadata.tabular | |
228 | **sampleMetadata** tab-separated file containing the sample metadata as columns. | |
229 | | |
230 | |
231 variableMetadata.tabular | |
232 | **variableMetadata** tab-separated file containing the variable metadata as columns. The **timeShifted** flag is set to 1 when the flowgram is time shifted compared to the sample peak (probably due to liquid retention in the FI tube). The **corSampPeakMean** metric is the correlation between the feature flowgram and the sample peak (values are in [-1, 1]). A value below 0.2 suggests that the feature signal is affected by a strong matrix effect. The **meanSolvent** is the mean baseline signal in the feature flowgrams. The **signalOverSolventPvalueMean** is the mean p-value of the tests discriminating between signal and baseline solvent. | |
233 | | |
234 | |
235 figure.pdf | |
236 | Visualization and diagnostics of the preprocessed data set; **Feature quality**: Number of detected features per sample for each of the three categories: 'Well-behaved' features have a peak shape close to the sample peak (optimal FIA acquisition is achieved when the majority of the features fall into this category); 'Shifted' indicates a time shift compared to the sample peak, and probably results from retention in the FI tube; 'Significant Matrix Effect' corresponds to a correlation between the feature and the sample peak of less than 0.2, which is usually caused by a strong matrix effect; **Sample peaks**: Visualization of the peak model for each sample; the shapes should be similar when the FIA conditions are similar; **m/z density**: may help to detect a missing m/z value, and in turn, suggest that the *ppm* parameter should be modified; **PCA score plot** of the log10 intensities to detect sample outliers. | |
237 | | |
238 | |
239 information.txt | |
240 | Text file with all messages and warnings generated during the computation. | |
241 | | |
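
The feature categories described above can be recomputed from the variableMetadata table for downstream filtering. The sketch below reads a tab-separated table and assigns a category from the **timeShifted** and **corSampPeakMean** columns. The `feature` identifier column, the demo values, and the precedence of 'Shifted' over 'Significant Matrix Effect' are assumptions, not documented behaviour of the tool.

```python
import csv
import io

def read_tsv(handle):
    """Read a tab-separated table into a list of dicts; 'NA' becomes None."""
    return [
        {key: (None if value == "NA" else value) for key, value in row.items()}
        for row in csv.DictReader(handle, delimiter="\t")
    ]

def feature_category(row, cor_threshold=0.2):
    """Assign one of the three categories shown in the diagnostic figure.

    Assumption: a time-shifted feature is reported as 'Shifted' regardless of
    its correlation with the sample peak. Returns None when the correlation
    is missing.
    """
    if row["timeShifted"] == "1":
        return "Shifted"
    cor = row["corSampPeakMean"]
    if cor is None:
        return None
    if float(cor) < cor_threshold:
        return "Significant Matrix Effect"
    return "Well-behaved"

# Hypothetical excerpt of a variableMetadata file (values are made up).
demo = io.StringIO(
    "feature\ttimeShifted\tcorSampPeakMean\n"
    "M100\t0\t0.95\n"
    "M200\t1\t0.80\n"
    "M300\t0\t0.10\n"
    "M400\t0\tNA\n"
)
rows = read_tsv(demo)
categories = [feature_category(row) for row in rows]
```

For a real file, the same functions can be applied to `open(path, newline="")` instead of the in-memory demo table.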
242 | |
243 --------------------------------------------------- | |
244 | |
245 --------------- | |
246 Working example | |
247 --------------- | |
248 | |
249 Figure output | |
250 ============= | |
251 | |
252 .. image:: profia_workingExampleImage.png | |
253 :width: 600 | |
254 | |
255 --------------------------------------------------- | |
256 | |
257 ---- | |
258 NEWS | |
259 ---- | |
260 | |
261 CHANGES IN VERSION 3.0.0 | |
262 ======================== | |
263 | |
264 NEW FEATURE | |
265 | |
266 Creation of the tool | |
267 | |
268 </help> | |
269 | |
270 <citations> | |
271 <citation type="bibtex">@Article{DelabriereSubmitted, | |
272 Title = {proFIA: A data preprocessing workflow for Flow Injection Analysis coupled to High-Resolution Mass Spectrometry}, | |
273 Author = {Delabriere, Alexis and Hohenester, Ulli and Junot, Christophe and Thevenot, Etienne A}, | |
274 Journal = {submitted}, | |
275 Year = {submitted}, | |
276 Pages = {--}, | |
277 Volume = {}, | |
278 Doi = {} | |
279 }</citation> | |
280 <citation type="doi">10.1093/bioinformatics/btu813</citation> | |
281 </citations> | |
282 | |
283 </tool> |