comparison w4mclstrpeakpics.xml @ 0:caf0ce49b638 draft

planemo upload for repository https://github.com/HegemanLab/w4mclstrpeakpics_galaxy_wrapper/tree/master commit 7ff527179eab946dd860be6b6a3e2c01b77d7dee
author eschen42
date Sun, 18 Jun 2017 23:25:58 -0400
parents
children 17356c5fe43e
-1:000000000000 0:caf0ce49b638
1 <tool id="w4mclstrpeakpics" name="View Cluster_Peaks" version="0.98.1">
2 <description>Visualize sample-cluster peaks</description>
3
4 <requirements>
5 <requirement type="package" version="3.3.1">r-base</requirement>
6 <requirement type="package" version="1.1_4">r-batch</requirement>
7 <requirement type="package" version="1.4.2">r-reshape2</requirement>
8 <requirement type="package" version="0.4_10">r-sqldf</requirement>
9 </requirements>
10
11 <stdio>
12 <exit_code range="1:" level="fatal" />
13 </stdio>
14
15
16 <command detect_errors="exit_code"><![CDATA[
17 Rscript $__tool_directory__/w4mclstrpeakpics_wrapper.R
18 data_matrix_path '$dataMatrix_in'
19 variable_metadata_path '$variableMetadata_in'
20 sample_metadata_path '$sampleMetadata_in'
21 sample_selector_column_name '$sampleSelectorColumnName'
22 sample_selector_value '$sampleSelectorValue'
23 output_pdf '$outputFigure'
24 output_tsv '$outputTable'
25 output_rdata '$outputRdata'
26 r_package_archive $__tool_directory__/w4mclstrpeakpics_0.98.1.tar.gz
27
28 ]]></command>
29
30 <inputs>
31 <param name="dataMatrix_in" label="Data matrix file" type="data" format="tabular" help="variable x sample, decimal: '.', missing: NA, mode: numerical, separator: tab" />
32 <param name="sampleMetadata_in" label="Sample metadata file" type="data" format="tabular" help="sample x metadata columns, separator: tab" />
33 <param name="variableMetadata_in" label="Variable metadata file" type="data" format="tabular" help="variable x metadata columns, separator: tab" />
34 <param name="sampleSelectorColumnName" label="Selector-column name" type="text" value = "sampleType" help="Name of column containing selector values." />
35 <param name="sampleSelectorValue" label="Selector value" type="text" value = "pool" help="Value to match in sample-selector column." />
36 </inputs>
37 <outputs>
38 <data name="outputTable" label="${tool.name}_${sampleSelectorValue}_table" format="tabular" ></data>
39 <data name="outputFigure" label="${tool.name}_${sampleSelectorValue}_figure" format="pdf" ></data>
40 <data name="outputRdata" label="${tool.name}_${sampleSelectorValue}_RData" format="rdata" ></data>
41 </outputs>
42
43 <tests>
44 <test>
45 <param name="dataMatrix_in" value="input_dataMatrix.tsv"/>
46 <param name="sampleMetadata_in" value="input_sampleMetadata.tsv"/>
47 <param name="variableMetadata_in" value="input_variableMetadata.tsv"/>
48 <param name="sampleSelectorColumnName" value="sampleType"/>
49 <param name="sampleSelectorValue" value="pool"/>
50 <output name="outputTable">
51 <assert_contents>
52 <has_text text="0.464572" />
53 <has_text text="0.114304" />
54 <has_text text="0.131016" />
55 <has_text text="0.0701871" />
56 <has_text text="0.0661764" />
57 <has_text text="0.0384358" />
58 <has_text text="0.0614973" />
59 <has_text text="0.0260695" />
60 <has_text text="0.0193850" />
61 <has_text text="0.00835561" />
62 </assert_contents>
63 </output>
64 </test>
65 </tests>
66
67
68
69 <help>
70 <![CDATA[
71
72 **Author** Arthur Eschenlauer (University of Minnesota, esch0041@umn.edu)
73
74 --------------------------------------------------------------------------
75
76
77 **R package**
78
79 The *w4mclstrpeakpics* package is available from the Hegeman lab GitHub repository (https://github.com/HegemanLab/w4mclstrpeakpics/releases).
80
81 -----------------------------------------------------------------------------------------------------------------------------------------
82
83
84 **Tool updates**
85
86 See the **NEWS** section at the bottom of this page
87
88 ---------------------------------------------------
89
90 ==================================
91 Visualize W4M sample-cluster peaks
92 ==================================
93
94 -----------
95 Description
96 -----------
97
98 Produce a figure to assess the similarities and differences among peaks in a cluster of samples using XCMS-preprocessed data files as input.
99
100 -----------------
101 Workflow Position
102 -----------------
103
104 - Upstream tool category: Preprocessing
105 - Downstream tool categories: (Not applicable.)
106
107 ----------
108 Motivation
109 ----------
110
111 GC-MS1 and LC-MS1 experiments seek to resolve chemicals as features that have distinct chromatographic behavior and (after ionization) mass-to-charge ratio.
112 Data for a sample are collected as MS intensities, each of which is associated with a position on a 2D plane with dimensions of m/z ratio and chromatographic retention time.
113 Ideally, features would be sufficiently reproducible from sample-run to sample-run to identify features that are common among samples and those that differ.
114 However, the chromatographic retention time for a chemical can vary from one run to another.
115 In the Workflow4Metabolomics (W4M, [Giacomoni *et al.*, 2014]) "flavor" of Galaxy, the XCMS [Smith *et al.*, 2006] preprocessing tools provide for "retention time correction" to align features among samples.
116
117 In a typical GC-MS or LC-MS experiment, a pooled sample is repeatedly injected for quality control.
118 Since this is the same mixture for each injection, the identified peaks would be expected to be identical for all injections.
119 However, the parameters for peak-picking can exacerbate injection-to-injection variation in performance of the experimental apparatus.
120 This tool exists to assess the variation in the peaks picked among pooled samples, replicates, or indeed any cluster of samples.
121
122 -----------
123 Input files
124 -----------
125
126 +---------------------------+------------+
127 | File | Format |
128 +===========================+============+
129 | Data matrix | tabular |
130 +---------------------------+------------+
131 | Sample metadata | tabular |
132 +---------------------------+------------+
133 | Variable metadata | tabular |
134 +---------------------------+------------+
135
136
137 ----------
138 Parameters
139 ----------
140
141 Data matrix file
142 | variable x sample **dataMatrix** (tab-separated values) file of the numeric data matrix, with . as decimal and NA for missing values; the table must not contain metadata apart from row and column names; the row and column names must be identical to the row names of the variable and sample metadata, respectively (see below)
143 |
144
145 Sample metadata file
146 | sample x metadata **sampleMetadata** (tab-separated values) file of the numeric and/or character sample metadata, with . as decimal and NA for missing values
147 |
148
149 Variable metadata file
150 | variable x metadata **variableMetadata** (tab-separated values) file of the numeric and/or character variable metadata, with . as decimal and NA for missing values
151 |
152
153 Selector-column name (default = "sampleType")
154 | column of W4M/XCMS sampleMetadata holding selector string values.
155 |
156
157 Selector value (default = "pool")
158 | value within selector column to identify samples for assessment.
159 |
160
161 Include named classes (default = filter-out)
162 | *filter-in* - include only the named sample classes
163 | *filter-out* - exclude only the named sample classes
164 |
165
166
167 Column that names the sample-class (default = 'class')
168 | name of the column in sample metadata that has the values to be tested against the 'classes' input parameter
169 |
170
171 Column that names the sample (default = 'sampleMetadata')
172 | name of the column in sample metadata that has the name of the sample
173 |
174
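As a rough illustration of how the selector-column name and selector value pick out samples, the following Python sketch reads tab-separated text, selects the matching samples, and pulls their columns from the data matrix. It is not the tool's implementation (the tool delegates to the R package); the function names, the inline example data, and the assumption that the first column of the sample metadata holds the sample name are all hypothetical.

```python
import csv
import io


def read_tsv(text):
    """Parse tab-separated text into (header, rows); every cell is a string."""
    rows = list(csv.reader(io.StringIO(text), delimiter="\t"))
    return rows[0], rows[1:]


def select_samples(sample_metadata_text, column_name="sampleType", value="pool"):
    """Return the names of samples whose selector column equals the selector value."""
    header, rows = read_tsv(sample_metadata_text)
    if column_name not in header:
        raise ValueError(f"column {column_name!r} not found in sample metadata")
    col = header.index(column_name)
    # Assumption: the first column of the sample metadata holds the sample name.
    return [row[0] for row in rows if row[col] == value]


def matrix_columns_for(data_matrix_text, sample_names):
    """Return {sample: [intensities]} for the selected samples; 'NA' becomes None."""
    header, rows = read_tsv(data_matrix_text)
    missing = [s for s in sample_names if s not in header[1:]]
    if missing:
        raise ValueError(f"samples absent from data matrix: {missing}")
    result = {}
    for sample in sample_names:
        j = header.index(sample)
        result[sample] = [None if r[j] == "NA" else float(r[j]) for r in rows]
    return result


# Made-up example data (variable x sample matrix, '.' decimal, NA for missing).
sample_metadata = "sampleMetadata\tsampleType\nQC1\tpool\nS1\tsample\nQC2\tpool\n"
data_matrix = (
    "dataMatrix\tQC1\tS1\tQC2\n"
    "M100T50\t10.5\t3.0\tNA\n"
    "M200T90\t7.0\tNA\t8.0\n"
)

selected = select_samples(sample_metadata)
intensities = matrix_columns_for(data_matrix, selected)
print(selected)
print(intensities)
```

The check in `matrix_columns_for` mirrors the requirement that sample names in the data matrix agree with those in the sample metadata; a mismatch is reported rather than silently ignored.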
175
176 ------------
177 Output files
178 ------------
179
180
181 View Cluster_Peaks_<SelectorValue>_figure
182 | (PDF) the output figure, described below.
183 |
184
185 View Cluster_Peaks_<SelectorValue>_table
186 | (tab-separated values) table of the values plotted in the upper-left graph in the output figure.
187 |
188
189 View Cluster_Peaks_<SelectorValue>_RData
190 | (R data) RData containing all processing and plotting intermediates.
191 |
192
193
194 ------
195 Graphs
196 ------
197
198
199 **The "Feature Number and Likelihood" graph**
200
201 The upper left graph in the output figure shows the following:
202
203 * The X axis reflects the number of samples in which a given feature is present (“the prevalence of a feature among the samples”).
204 * For open circles, the Y axis reflects the number of features having the number of samples reflected on the X axis.
205 * For solid triangles, the Y axis reflects the relative likelihood of features having the number of samples reflected on the X axis, calculated as
206
207 .. math:: \frac{(number\hspace{1 mm}of\hspace{1 mm}features) (number\hspace{1 mm}of\hspace{1 mm}samples\hspace{1 mm}per\hspace{1 mm}feature)}{maximum(number\hspace{1 mm}of\hspace{1 mm}samples\hspace{1 mm}per\hspace{1 mm}feature)}
208
209 Ideally, there would be an upward trend from left to right; if not, XCMS peak-picking parameters may need to be adjusted to suppress low-intensity “noise” peaks or to address peak-splitting.
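The two quantities plotted in this graph can be sketched in Python as follows. This is only an illustration of the formula as stated above, not the package's code; the function name and the example data are hypothetical, and treating "present" as "has a non-missing intensity" is an assumption.

```python
from collections import Counter


def feature_number_and_likelihood(presence):
    """Summarize prevalence for a dict mapping feature -> list of booleans.

    Each boolean says whether the feature is present in one sample
    (assumption: present means a non-missing intensity).
    Returns {prevalence: (number_of_features, relative_likelihood)}.
    """
    # Prevalence: the number of samples in which each feature is present.
    prevalence = {feature: sum(flags) for feature, flags in presence.items()}
    # Number of features observed at each prevalence (open circles).
    counts = Counter(k for k in prevalence.values() if k > 0)
    if not counts:
        raise ValueError("no feature is present in any sample")
    max_k = max(counts)
    # Relative likelihood (solid triangles): count * prevalence / max prevalence.
    return {k: (counts[k], counts[k] * k / max_k) for k in sorted(counts)}


# Made-up example: four features across three pooled-sample injections.
presence = {
    "M100T50": [True, True, True],
    "M200T90": [True, True, True],
    "M300T10": [True, False, False],
    "M400T70": [False, True, True],
}
summary = feature_number_and_likelihood(presence)
for k, (n_features, likelihood) in summary.items():
    print(k, n_features, round(likelihood, 3))
```

In this made-up case both values rise with prevalence, which is the upward trend described above as the ideal.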
210
211
212 **The "Peak Intensity" graph**
213
214 The lower left graph in the output figure presents the data in the upper figure without summarization, so that “the eye” can do the interpretation. It shows the following:
215
216 * The X axis reflects the number of samples in which a given feature is present (“the prevalence of a feature among the samples”).
217 * The Y axis reflects the intensity of each sample for each feature having the number of samples reflected on the X axis.
218
219 Ideally, there would be an upward trend from left to right, with more points on the right; if not, XCMS peak-picking parameters may need to be adjusted to suppress low-intensity “noise” peaks or to address peak-splitting.
220
221
222 **The "Symbol area/intensity reflect ion intensity" graph**
223
224 The upper right graph in the output figure shows the following:
225
226 * The X axis reflects the corrected retention time for each feature shown.
227 * The Y axis reflects the m/z for each feature shown.
228 * Symbol area reflects intensity for a feature for one sample.
229 * Overlapping symbols make the overlapping area darker, so the area and darkness reflect the aggregated intensity of a feature. (It is doubtful that a densitometer would be able to recover aggregate intensities accurately from this graph, but philosophically that is how this graph is designed.)
230 * The graph is not designed to communicate prevalence of a feature among the samples, but the hue of the symbol reflects the prevalence, albeit subtly.
231
232 Consequently, if the graph has a lot of large, dark, blueish symbols for repeated runs of a pooled sample, there is strong evidence that the XCMS peak-picking parameters need adjustment to make peak-picking more consistent.
233
234
235 **The "Symbol size/shape reflects prevalence" graph**
236
237 By contrast with the area/intensity graph, the lower right graph’s primary purpose is to communicate prevalence of a feature among the samples. It shows the following:
238
239 * The X axis reflects the corrected retention time for each feature shown.
240 * The Y axis reflects the m/z for each feature shown.
241 * Symbol size and shape reflect the prevalence of a feature among the samples.
242 * “Vividness” of color reflects the aggregate intensity across all samples for a feature, in an attempt to draw attention to the more intense features.
243
244 Consequently, if the graph has a lot of small, vivid symbols for repeated runs of a pooled sample, there is strong evidence that the XCMS peak-picking parameters need adjustment to make peak-picking more consistent.
245
246
247 ---------------
248 Working example
249 ---------------
250
251 **Input files**
252
253 +-------------------+----------------------------------------------------------------------------------------------------------------+
254 | Input File | Download from URL |
255 +===================+================================================================================================================+
256 | Data matrix | https://raw.githubusercontent.com/HegemanLab/w4mclstrpeakpics/master/tests/testthat/input_dataMatrix.tsv |
257 +-------------------+----------------------------------------------------------------------------------------------------------------+
258 | Sample metadata | https://raw.githubusercontent.com/HegemanLab/w4mclstrpeakpics/master/tests/testthat/input_sampleMetadata.tsv |
259 +-------------------+----------------------------------------------------------------------------------------------------------------+
260 | Variable metadata | https://raw.githubusercontent.com/HegemanLab/w4mclstrpeakpics/master/tests/testthat/input_variableMetadata.tsv |
261 +-------------------+----------------------------------------------------------------------------------------------------------------+
262
263 **Other input parameters**
264
265 +------------------------------------+-----------------+
266 | Input Parameter | Value |
267 +====================================+=================+
268 | Selector-column name | sampleType |
269 +------------------------------------+-----------------+
270 | Selector value | pool |
271 +------------------------------------+-----------------+
272
273
274
275 ]]>
276 </help>
277 <citations>
278 <citation type="doi">10.1021/ac051437y</citation>
279 <citation type="doi">10.1093/bioinformatics/btu813</citation>
280 </citations>
281 <!--
282 vim:noet:sw=4:ts=4
283 --> </tool>