changeset 0:201a15633354 draft default tip
Initial commit.
author | galaxyp |
---|---|
date | Fri, 10 May 2013 17:28:02 -0400 |
parents | |
children | |
files | export.xml extract_background_ions.xml extract_mass_chromatograms.xml macros.xml mzmatch_2.0.jar mzmatch_wrapper.py noise_filter.xml |
diffstat | 7 files changed, 241 insertions(+), 0 deletions(-) |
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/export.xml Fri May 10 17:28:02 2013 -0400
@@ -0,0 +1,32 @@
+<tool id="mzmatch_export" version="0.1.0" name="Peakml Export">
+  <description>
+  </description>
+  <macros>
+    <import>macros.xml</import>
+  </macros>
+  <expand macro="stdio" />
+  <expand macro="requires" />
+  <command interpreter="python">
+    mzmatch_wrapper.py --executable 'convert.ConvertToText' --config $config
+  </command>
+  <configfiles>
+    <configfile name="config">-i $input1 -o $output</configfile>
+  </configfiles>
+  <inputs>
+    <param name="input1" label="Input" type="data" format="peakml" />
+  </inputs>
+  <outputs>
+    <data format="tabular" name="output" />
+  </outputs>
+  <help>
+**What it does**
+
+Converts the contents of a PeakML file to a tab-separated text file. In order to make the file useable in a spreadsheet environment, only the top-level data for each measurement is given in the file.
+
+**Citation**
+
+For the underlying tool, please cite ``PeakML/mzMatch: A File Format, Java Library, R Library, and Tool-Chain for Mass Spectrometry Data Analysis. Richard A. Scheltema, Andris Jankevics, Ritsert C. Jansen, Morris A. Swertz, and Rainer Breitling. Analytical Chemistry 2011 83 (7), pp 2786-2793``
+
+If you use this tool in Galaxy, please cite Chilton J, et al. https://bitbucket.org/galaxyp/galaxyp-toolshed-mzmatch
+  </help>
+</tool>
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/extract_background_ions.xml Fri May 10 17:28:02 2013 -0400
@@ -0,0 +1,39 @@
+<tool id="mzmatch_extract_background_ions" version="0.1.0" name="Extract Background Ions">
+  <description>
+  </description>
+  <macros>
+    <import>macros.xml</import>
+  </macros>
+  <expand macro="stdio" />
+  <expand macro="requires" />
+  <command interpreter="python">
+    mzmatch_wrapper.py
+    --executable '__SHELL__' --config $setup
+    --executable 'ExtractBackgroundIons' --config $config
+  </command>
+  <configfiles>
+    <configfile name="setup">ln -s $input1 input.${input1.ext}</configfile>
+    <configfile name="config">-ppm $ppm -i input.${input1.ext} -o $output -threshold $threshold</configfile>
+  </configfiles>
+  <inputs>
+    <param name="input1" label="Input" type="data" format="mzml,mzxml" />
+    <param name="ppm" label="ppm" value="3" type="float" help="The accuracy of the measurement in parts-per-million. This value is used for the collection of the data-points belonging to a background ion and needs to be reasonable for the equipment used to make the measurement (the LTQ-Orbitrap manages approximately 3 ppm)." />
+    <param name="threshold" type="float" value="0.02" label="Threshold" help="The percentage (a value between 0 and 1) of scans that minimally need to contain a measurement." />
+  </inputs>
+  <outputs>
+    <data format="peakml" name="output" />
+  </outputs>
+  <help>
+**What it does**
+
+Extracts background ions (x-axis: RT; y-axis: Intensity) from 2D mass spectrometry data (LC/MS or GC/MS). The raw data is loaded from the open standard file formats (mzML, mzXML or mzData) and all of the individual mass traces (M/Z +/- ppm over the whole scan range) are retrieved. A mass trace is retained when it is present in more than the given percentage of scans (option 'threshold').
+
+The resulting output file is in PeakML-format, containing a list of all the extracted background ions.
+
+**Citation**
+
+For the underlying tool, please cite ``PeakML/mzMatch: A File Format, Java Library, R Library, and Tool-Chain for Mass Spectrometry Data Analysis. Richard A. Scheltema, Andris Jankevics, Ritsert C. Jansen, Morris A. Swertz, and Rainer Breitling. Analytical Chemistry 2011 83 (7), pp 2786-2793``
+
+If you use this tool in Galaxy, please cite Chilton J, et al. https://bitbucket.org/galaxyp/galaxyp-toolshed-mzmatch
+  </help>
+</tool>
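The `ppm` and `threshold` parameters of `extract_background_ions.xml` can be illustrated with a small Python 3 sketch. It is not code from this changeset and not mzMatch's implementation: `ppm_window` and `is_background_ion` are hypothetical helpers, and representing each scan as a plain list of m/z values is an assumption made only for the example.

```python
def ppm_window(mz, ppm):
    """Return the (low, high) m/z bounds for a tolerance given in parts-per-million."""
    delta = mz * ppm / 1e6
    return mz - delta, mz + delta


def is_background_ion(scans, mz, ppm, threshold):
    """Hypothetical check: is `mz` present in more than `threshold` of the scans?

    `scans` is assumed to be a list of scans, each a list of measured m/z values.
    `threshold` is a fraction between 0 and 1, as in the tool's help text.
    """
    if not scans:
        return False
    low, high = ppm_window(mz, ppm)
    # Count the scans that contain at least one data point inside the window.
    hits = sum(1 for scan in scans if any(low <= value <= high for value in scan))
    return hits / len(scans) > threshold


if __name__ == "__main__":
    scans = [[500.0001], [500.0004, 300.0], [250.0], [500.0]]
    print(ppm_window(500.0, 3.0))
    print(is_background_ion(scans, 500.0, 3.0, 0.5))
    print(is_background_ion(scans, 300.0, 3.0, 0.5))
```

At 3 ppm the window around m/z 500 is only 0.0015 wide on each side, which is why the help text asks for a value that matches the instrument's actual accuracy.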
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/extract_mass_chromatograms.xml Fri May 10 17:28:02 2013 -0400
@@ -0,0 +1,48 @@
+<tool id="mzmatch_extract_mass_chromatograms" version="0.1.0" name="Extract Mass Chromatograms">
+  <description>
+  </description>
+  <macros>
+    <import>macros.xml</import>
+  </macros>
+  <expand macro="stdio" />
+  <expand macro="requires" />
+  <command interpreter="python">
+    mzmatch_wrapper.py
+    --executable '__SHELL__' --config $setup
+    --executable 'ExtractMassChromatograms' --config $config
+  </command>
+  <configfiles>
+    <configfile name="setup">ln -s $input1 input.${input1.ext}</configfile>
+    <configfile name="config">-ppm $ppm -i input.${input1.ext} -o $output #if $threshold.specify# -threshold $threshold.threshold #end if#</configfile>
+  </configfiles>
+  <inputs>
+    <param name="input1" label="Input" type="data" format="mzml,mzxml" />
+    <param name="ppm" label="ppm" value="3" type="float" help="The accuracy of the measurement in parts-per-million. This value is used for the collection of the data-points belonging to a mass trace and needs to be reasonable for the equipment used to make the measurement (the LTQ-Orbitrap manages approximately 3 ppm)." />
+    <conditional name="threshold">
+      <param name="specify" type="boolean" label="Specify Threshold" />
+      <when value="false">
+      </when>
+      <when value="true">
+        <param name="threshold" type="float" value="0.02" label="Threshold" help="The percentage threshold value for breaking the mass traces up, as a percentage of the most intense portion of a mass trace. The threshold value is a percentage and required to be between 0 and 1." />
+      </when>
+    </conditional>
+  </inputs>
+  <outputs>
+    <data format="peakml" name="output" />
+  </outputs>
+  <help>
+**What it does**
+
+Extracts mass chromatograms (x-axis: RT; y-axis: Intensity) from 2D mass spectrometry data (LC/MS or GC/MS). The raw data is loaded from the open standard file formats (mzML, mzXML or mzData) and all of the individual mass traces (M/Z +/- ppm over the whole scan range) are retrieved. When the option 'threshold' is defined, the individual mass traces are broken up into individual mass chromatograms (i.e. the isomers are separated). This is achieved by cutting peaks out of the mass trace where the threshold is reached (as a percentage of the most intense portion of the mass trace). This is an iterative process, where the sides are then analyzed in the same fashion.
+
+The method employed here for retrieving mass chromatograms is greedy and extracts everything (although a modicum of noise reduction is applied to reduce the amount of fragments from broken up mass chromatograms). In order to reduce the resulting noise patterns, tools like 'mzmatch.filter.NoiseFilter' and 'mzmatch.filter.RSDFilter' can be employed.
+
+The resulting output file is in PeakML-format, containing a list of all the extracted mass chromatograms. When the cutoff value has been selected, one can also specify a file (option 'masstraces') where the extracted mass chromatograms are overlaid on the mass traces they have been cut from.
+
+**Citation**
+
+For the underlying tool, please cite ``PeakML/mzMatch: A File Format, Java Library, R Library, and Tool-Chain for Mass Spectrometry Data Analysis. Richard A. Scheltema, Andris Jankevics, Ritsert C. Jansen, Morris A. Swertz, and Rainer Breitling. Analytical Chemistry 2011 83 (7), pp 2786-2793``
+
+If you use this tool in Galaxy, please cite Chilton J, et al. https://bitbucket.org/galaxyp/galaxyp-toolshed-mzmatch
+  </help>
+</tool>
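The threshold-based splitting described in the help text of `extract_mass_chromatograms.xml` can be sketched in Python 3. This is one possible reading of that description, not mzMatch's actual algorithm: `split_trace` is a hypothetical function, the trace is assumed to be a plain list of intensities, and the noise reduction mentioned in the help text is left out.

```python
def split_trace(intensities, threshold, offset=0):
    """Split a mass trace into (start, end) index ranges, one per cut-out peak.

    Assumed interpretation: take the most intense point, grow the segment to
    both sides while the intensity stays at or above `threshold` times that
    maximum, cut the segment out, then treat the left and right remainders
    the same way. Indices are inclusive and refer to the original trace.
    """
    if not intensities or max(intensities) <= 0:
        return []
    # Index of the most intense point (first one if there are ties).
    peak = max(range(len(intensities)), key=intensities.__getitem__)
    cutoff = threshold * intensities[peak]
    left = peak
    while left > 0 and intensities[left - 1] >= cutoff:
        left -= 1
    right = peak
    while right < len(intensities) - 1 and intensities[right + 1] >= cutoff:
        right += 1
    # Recurse on both sides so the result stays ordered by position.
    segments = split_trace(intensities[:left], threshold, offset)
    segments.append((offset + left, offset + right))
    segments += split_trace(intensities[right + 1:], threshold, offset + right + 1)
    return segments


if __name__ == "__main__":
    trace = [0, 1, 10, 1, 0, 0, 2, 8, 2, 0]
    print(split_trace(trace, 0.05))  # [(1, 3), (6, 8)]
```

With a higher threshold the same trace breaks into more and smaller fragments, which fits the help text's remark that the extraction is greedy and benefits from later noise filtering.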
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/macros.xml Fri May 10 17:28:02 2013 -0400
@@ -0,0 +1,13 @@
+<macros>
+  <macro name="stdio">
+    <stdio>
+      <exit_code range="1:" level="fatal" description="Error running mzmatch tool." />
+    </stdio>
+  </macro>
+  <macro name="requires">
+    <requirements>
+      <requirement type="package">mzmatch</requirement>
+      <yield />
+    </requirements>
+  </macro>
+</macros>
\ No newline at end of file
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/mzmatch_wrapper.py Fri May 10 17:28:02 2013 -0400
@@ -0,0 +1,63 @@
+from os.path import dirname, join, abspath
+import sys
+from optparse import OptionParser
+from ConfigParser import SafeConfigParser
+import subprocess
+
+DEBUG = False
+
+
+def main():
+    (options, args) = _parse_args()
+    for executable, config_path in zip(options.executables, options.configs):
+        command_handler = COMMAND_HANDLERS.get(executable, _run_mzmatch)
+        command_handler(executable, config_path)
+
+
+def _run_shell(executable, config_path):
+    command = open(config_path, "r").read().strip()
+    if DEBUG:
+        print "Running shell command %s" % command
+    _exec(command)
+
+
+def _run_mzmatch(executable, config_path):
+    command_prefix = "java -cp %s" % _jar_path()
+    java_class = "mzmatch.ipeak.%s" % executable
+    args = open(config_path, "r").read().strip()
+    _exec("%s %s %s" % (command_prefix, java_class, args))
+
+
+def _jar_path():
+    py_path = __file__
+    jar_path = join(dirname(py_path), "mzmatch_2.0.jar")
+    return jar_path
+
+COMMAND_HANDLERS = {
+    "__SHELL__": _run_shell,
+}
+
+
+def _exec(command):
+    proc = subprocess.Popen(args=command, shell=True)
+    return_code = proc.wait()
+    if return_code != 0:
+        print "Error executing command [%s], return code is %d" % (command, return_code)
+        sys.exit(return_code)
+
+
+def _parse_args():
+    parser = OptionParser()
+    parser.add_option("-e", "--executable", dest="executables", default=[], action="append")
+    parser.add_option("-c", "--config", dest="configs", default=[], action="append")
+    return parser.parse_args()
+
+
+def _load_options(config_path):
+    config_parser = SafeConfigParser()
+    config_parser.optionxform = str
+    config_parser.read(config_path)
+    return config_parser
+
+if __name__ == "__main__":
+    main()
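To make the dispatch in `mzmatch_wrapper.py` easier to follow, here is a Python 3 sketch (the committed wrapper itself is Python 2) of how each `--executable`/`--config` pair becomes a command line. `plan_commands` is a hypothetical helper that does not appear in the changeset. As a simplification it takes the config file contents directly instead of file paths, and it only builds the command strings without running them.

```python
import posixpath

# Marker the tool XML files use to request a plain shell command.
SHELL_MARKER = "__SHELL__"


def plan_commands(executables, config_texts, jar_dir="."):
    """Return the command lines the wrapper would run, in order.

    `executables` and `config_texts` are paired positionally, mirroring the
    zip() over options.executables and options.configs in the wrapper.
    """
    jar_path = posixpath.join(jar_dir, "mzmatch_2.0.jar")
    commands = []
    for executable, text in zip(executables, config_texts):
        text = text.strip()
        if executable == SHELL_MARKER:
            # Shell steps run the config file contents verbatim.
            commands.append(text)
        else:
            # Anything else is treated as a class under mzmatch.ipeak.
            commands.append(
                "java -cp %s mzmatch.ipeak.%s %s" % (jar_path, executable, text)
            )
    return commands


if __name__ == "__main__":
    for command in plan_commands(
        [SHELL_MARKER, "ExtractBackgroundIons"],
        ["ln -s dataset_1.dat input.mzml\n", "-ppm 3 -i input.mzml -o out.peakml -threshold 0.02"],
        jar_dir="/tools/mzmatch",
    ):
        print(command)
```

The file names and the `/tools/mzmatch` directory in the usage example are made up. Because the wrapper passes these strings to a shell, dataset paths containing spaces or shell metacharacters would need quoting in the config templates.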
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/noise_filter.xml Fri May 10 17:28:02 2013 -0400
@@ -0,0 +1,46 @@
+<tool id="mzmatch_noise_filter" version="0.1.0" name="Noise Filter">
+  <description>
+  </description>
+  <macros>
+    <import>macros.xml</import>
+  </macros>
+  <expand macro="stdio" />
+  <expand macro="requires" />
+  <command interpreter="python">
+    mzmatch_wrapper.py --executable 'filter.NoiseFilter' --config $config
+  </command>
+  <configfiles>
+    <configfile name="config">-i $input1 -o $output -rejected $output_rejected #if $codadw.specify # -codadw $codadw.codadw #end if#</configfile>
+  </configfiles>
+  <inputs>
+    <param name="input1" label="Input" type="data" format="peakml" />
+    <conditional name="codadw">
+      <param name="specify" label="Specify CoDA-DW Filter" type="boolean" />
+      <when value="true">
+        <param name="codadw" type="float" label="CoDA-DW Threshold" value="0.8" />
+      </when>
+      <when value="false">
+      </when>
+    </conditional>
+  </inputs>
+  <outputs>
+    <data format="peakml" name="output" label="Noise Filtered ${on_string}" />
+    <data format="peakml" name="output_rejected" label="Noise Filtered Rejected ${on_string}" />
+  </outputs>
+  <help>
+**What it does**
+
+Filters noise from PeakML files, containing mass chromatograms at the lowest level. When the file contains a list of sets of mass chromatograms, the maximum score for the used method is calculated and compared to the given threshold. This is the best approach, as we expect that high quality can match up to low quality signals at the same mass and retention time. Only those entries scoring above the given threshold are stored in the output file. The rejected can be stored in a separate file (option 'rejected') for inspection or recovery.
+
+The option 'codadw' can be used to set the threshold for the CoDA Durbin-Watson noise filtering approach. Normally the Durbin-Watson criterion results in a value between 0 and 4, where higher means a large amount of periodicity in the signal and lower vice versa. For mass chromatograms we expect little periodicity in the signal, thus a lower value is preferable. However, in order to preserve unity in our quality scores the CoDA-DW score is scaled between 0..1, where higher is better (less periodicity in the signal). As a general rule-of-thumb, for high quality mass chromatograms a score >0.8 is expected.
+
+Remarks:
+- CoDA-DW is scaled between 0..1, where higher is better mass chromatogram quality.
+
+**Citation**
+
+For the underlying tool, please cite ``PeakML/mzMatch: A File Format, Java Library, R Library, and Tool-Chain for Mass Spectrometry Data Analysis. Richard A. Scheltema, Andris Jankevics, Ritsert C. Jansen, Morris A. Swertz, and Rainer Breitling. Analytical Chemistry 2011 83 (7), pp 2786-2793``
+
+If you use this tool in Galaxy, please cite Chilton J, et al. https://bitbucket.org/galaxyp/galaxyp-toolshed-mzmatch
+  </help>
+</tool>
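For readers unfamiliar with the Durbin-Watson criterion named in the `noise_filter.xml` help text, the following Python 3 sketch computes the textbook statistic, the sum of squared successive differences divided by the sum of squares, which lies between 0 and 4. It is only an illustration: `durbin_watson` is a hypothetical helper applied to the values as given, and the help text does not say how mzMatch preprocesses a chromatogram or rescales the result to 0..1, so neither step is shown.

```python
def durbin_watson(values):
    """Textbook Durbin-Watson statistic of a sequence of numbers.

    d = sum((v[t] - v[t-1])**2 for t >= 1) / sum(v[t]**2)
    Smooth sequences give values near 0; rapidly alternating ones approach 4.
    """
    denominator = sum(v * v for v in values)
    if denominator == 0:
        raise ValueError("Durbin-Watson is undefined for an all-zero sequence")
    numerator = sum((b - a) ** 2 for a, b in zip(values, values[1:]))
    return numerator / denominator


if __name__ == "__main__":
    smooth_peak = [1, 2, 3, 4, 5, 4, 3, 2, 1]
    alternating = [1, -1, 1, -1]
    print(durbin_watson(smooth_peak))  # small value for a smooth peak shape
    print(durbin_watson(alternating))  # large value for a jagged signal
```

A smooth, peak-shaped sequence scores low on the raw statistic, which matches the help text's statement that a lower raw value is preferable before the score is rescaled so that higher means better.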