comparison w4mclassfilter.xml @ 3:191a720488ce draft

planemo upload for repository https://github.com/HegemanLab/w4mclassfilter_galaxy_wrapper/tree/master commit dacc1bae8e60340a431907fe8a8b490357e1bfc0
author eschen42
date Thu, 11 May 2017 00:05:51 -0400
parents 23c6d271def9
children 499c7ecfa834
2:23c6d271def9 3:191a720488ce
1 <tool id="W4MClassFilter" name="W4MClassFilter" version="0.98.1"> 1 <tool id="w4mclassfilter" name="Sample_Subset" version="0.98.1">
2 <description>Filter W4M data by sample class</description> 2 <description>Filter W4M data by sample class</description>
3 3
4 <requirements> 4 <requirements>
5 <requirement type="package" version="3.3.1">r-base</requirement> 5 <requirement type="package" version="3.3.1">r-base</requirement>
6 <requirement type="package" version="1.1_4">r-batch</requirement> 6 <requirement type="package" version="1.1_4">r-batch</requirement>
191 191
192 ----------- 192 -----------
193 Description 193 Description
194 ----------- 194 -----------
195 195
196 Filter set of retention-corrected W4M files (dataMatrix, sampleMetadata, variableMetadata) by sample class 196 Filter a set of retention-corrected W4M files (dataMatrix, sampleMetadata, variableMetadata) by sample-class
197 197
198 -------- 198 -----------------
199 Comments 199 Workflow Position
200 -------- 200 -----------------
201 201
202 The *inclusive* parameter indicates: 202 - Upstream tool category: Preprocessing
203 - when 'filter-in', that only the sample-classes named should be included 203 - Downstream tool categories: Normalisation, Statistical Analysis, Quality Control
204 - when 'filter-out', that all sample-classes should be included excepting the sample-classes named 204
205 ----------
206 Motivation
207 ----------
208
209 GC-MS1 and LC-MS1 experiments seek to resolve chemicals as features that have distinct chromatographic behavior and (after ionization) mass-to-charge ratio.
210 Data for a sample are collected as MS intensities, each of which is associated with a position on a 2D plane with dimensions of m/z ratio and chromatographic retention time.
211 Ideally, features would be sufficiently reproducible from sample-run to sample-run to identify features that are common among samples and those that differ.
212 However, the chromatographic retention time for a chemical can vary from one run to another.
213 In the Workflow4Metabolomics (W4M, [Giacomoni *et al.*, 2014]) "flavor" of Galaxy, the XCMS [Smith *et al.*, 2006] preprocessing tools provide for "retention time correction" to align features among samples, but features may be better aligned if pooled samples and blanks are included.
214
215 Multivariate statistical techniques may be used to discover clusters of similar samples, and sometimes it is desirable to apply clustering iteratively to smaller and smaller subsets of samples until observable separation of clusters is no longer significant.
216 Once feature-alignment has been achieved among samples in GC-MS and LC-MS datasets, however, the presence of pools and blanks may confound identification and separation of clusters.
217 Multivariate statistical algorithms also may be impacted by missing values or dimensions that have zero variance.
218
219 The w4mclassfilter tool provides a way to choose subsets of samples for further analysis.
220 The tool takes as input the data matrix, sample metadata, and variable metadata Galaxy datasets produced by W4M and produces the same trio of datasets with data only for the selected samples.
221 The tool uses a "sample-class" column in the sample metadata as the basis for including or eliminating samples for further analysis.
222 Class-values to be considered are provided by the user as a comma-separated list.
223 The user also provides an indication whether the list specifies classes to be included in further analysis ("filter-in") or rather to be excluded from it ("filter-out").
224 Next, missing and negative intensities for features of the remaining samples are imputed to zero.
225 Finally, samples or features with zero variance are eliminated.
226
205 227
206 ----------- 228 -----------
207 Input files 229 Input files
208 ----------- 230 -----------
209 231
329 CHANGES IN VERSION 0.98.1 351 CHANGES IN VERSION 0.98.1
330 ========================= 352 =========================
331 353
332 NEW FEATURES 354 NEW FEATURES
333 355
334 First release - R package that implements filtering of W4M data matrix, variable metadata, and sample metadata by class of sample. 356 First release - Wrap the w4mclassfilter R package that implements filtering of W4M data matrix, variable metadata, and sample metadata by class of sample.
335 357
336 *dataMatrix* *is* modified by the tool, so it *does* appear as an output file 358 *dataMatrix* *is* modified by the tool, so it *does* appear as an output file
359 *sampleMetadata* *is* modified by the tool, so it *does* appear as an output file
360 *variableMetadata* *is* modified by the tool, so it *does* appear as an output file
337 361
338 INTERNAL MODIFICATIONS 362 INTERNAL MODIFICATIONS
339 363
340 none 364 none
365
341 ]]> 366 ]]>
342 </help> 367 </help>
343 <citations> 368 <citations>
369 <citation type="doi">10.1021/ac051437y</citation>
370 <citation type="doi">10.1093/bioinformatics/btu813</citation>
344 </citations> 371 </citations>
345 </tool> 372 <!--
346 <!-- vim: noet sw=4 ts=4 : 373 vim:noet:sw=4:ts=4
347 --> 374 --> </tool>
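
The Motivation text added in this revision describes three steps: select samples by class ("filter-in" or "filter-out"), impute missing and negative intensities to zero, then eliminate samples or features with zero variance. A minimal Python sketch of those steps follows. It is not the w4mclassfilter R package's code or API. The function name `filter_samples`, the nested-dict layout, the use of `None` for a missing value, and the order of the two variance checks (features first, then samples) are all assumptions made for illustration.

```python
from statistics import pvariance


def filter_samples(data_matrix, sample_classes, class_list, mode="filter-in"):
    """Hypothetical helper illustrating the described steps (not the real API).

    data_matrix:    {feature: {sample: intensity, or None if missing}}
    sample_classes: {sample: class label}
    class_list:     comma-separated class names, e.g. "blank,pool"
    mode:           "filter-in" keeps the named classes, "filter-out" drops them
    """
    named = {c.strip() for c in class_list.split(",") if c.strip()}
    keep_named = mode == "filter-in"

    # Step 1: select samples by class.
    samples = [s for s, c in sample_classes.items() if (c in named) == keep_named]

    # Step 2: impute missing (None) and negative intensities to zero.
    imputed = {
        feature: {s: max(row.get(s) or 0.0, 0.0) for s in samples}
        for feature, row in data_matrix.items()
    }

    # Step 3a (assumed order): drop features with zero variance across samples.
    imputed = {
        feature: row
        for feature, row in imputed.items()
        if row and pvariance(list(row.values())) > 0
    }

    # Step 3b: drop samples with zero variance across the remaining features.
    samples = [
        s for s in samples
        if imputed and pvariance([row[s] for row in imputed.values()]) > 0
    ]

    filtered = {f: {s: row[s] for s in samples} for f, row in imputed.items()}
    return filtered, samples


# Example: exclude the "blank" class from a three-sample matrix.
example_matrix = {
    "M1": {"a": 1.0, "b": 3.0, "blank": 0.0},
    "M2": {"a": None, "b": -2.0, "blank": 5.0},
    "M3": {"a": 2.0, "b": 4.0, "blank": 1.0},
}
example_classes = {"a": "ctrl", "b": "treated", "blank": "blank"}
filtered, kept = filter_samples(example_matrix, example_classes, "blank", "filter-out")
```

In this example, feature M2 becomes all zeros after imputation, so it has zero variance and is dropped along with the excluded blank sample.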