annotate text_exporter.xml @ 2:cf0d72c7b482 draft

Update.
author galaxyp
date Fri, 10 May 2013 17:31:05 -0400
parents
children
rev   line source
2
cf0d72c7b482 Update.
galaxyp
parents:
diff changeset
<tool id="openms_text_exporter" version="0.1.0" name="Text Exporter">
  <description>
  </description>
  <macros>
    <import>macros.xml</import>
  </macros>
  <expand macro="stdio" />
  <expand macro="requires" />
  <command interpreter="python">
    openms_wrapper.py \
    --executable '__SHELL__' --config $link \
    --executable 'TextExporter' --config $config
  </command>
  <configfiles>
    <configfile name="link">ln -s '${type.input}' 'input.${type.input.ext}'</configfile>
    <configfile name="config">[simple_options]
in=input.${type.input.ext}
out=${out}
#set $input_type = str($type.input_type)
#if $input_type == "featurexml"
feature!minimal=${type.minimal}
#end if
no_ids=${no_ids}
</configfile>
  </configfiles>
  <inputs>
    <conditional name="type">
      <param name="input_type" type="select" label="Input Type">
        <option value="featurexml">Features (FeatureXML)</option>
        <option value="consensusxml">Consensus (ConsensusXML)</option>
        <option value="idxml">Identifications (IdXML)</option>
        <option value="mzml">Peak List (mzML)</option>
      </param>
      <when value="mzml">
        <param format="mzml" name="input" type="data" label="Input Peak List"/>
      </when>
      <when value="featurexml">
        <param format="featurexml" name="input" type="data" label="Input Features"/>
        <param name="minimal" type="boolean" label="Minimal Output" help="Set this flag to write only three attributes: RT, m/z, and intensity." truevalue="true" falsevalue="false" />
      </when>
      <when value="consensusxml">
        <param format="consensusxml" name="input" type="data" label="Input Consensus"/>
      </when>
      <when value="idxml">
        <param format="idxml" name="input" type="data" label="Input Identifications"/>
      </when>
    </conditional>
    <param name="no_ids" type="boolean" label="Suppress IDs" help="Suppresses output of identification data." truevalue="true" falsevalue="false" />
  </inputs>
  <outputs>
    <data format="txt" name="out" />
  </outputs>
  <help>
**What it does**

The goal of this tool is to create output in a table format that is easily readable in Excel or OpenOffice. Lines in the output correspond to rows in the table.

Output files begin with comment lines, starting with the special character "#". The last such line(s) will be a header with column names, but this may be preceded by more general comments.

Because the OpenMS XML formats contain different kinds of data in a hierarchical structure, TextExporter produces somewhat unusual TSV/CSV files for many inputs: different lines in the output may belong to different types of data, and the number of columns and the meanings of the individual fields depend on the type. In such cases, the first column always contains an indicator (in capital letters) for the data type of the current line. In addition, some lines have to be understood relative to a previous line, if there is a hierarchical relationship in the data. (See below for details and examples.)

Missing values are represented by "-1" or "nan" in numeric fields and by blanks in character/text fields.
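
As a rough illustration of the conventions above, the following Python sketch groups data rows by the indicator in their first column and maps the missing-value markers to None. It assumes tab-separated output; the function names and the sample rows are made up for this example and are not part of TextExporter or this wrapper.

```python
import csv
import io
import math


def read_rows_by_type(handle, delimiter="\t"):
    """Group data rows by the indicator found in their first column."""
    rows = {}
    for fields in csv.reader(handle, delimiter=delimiter):
        # Skip empty lines and comment/header lines, which start with "#".
        if not fields or fields[0].startswith("#"):
            continue
        rows.setdefault(fields[0], []).append(fields[1:])
    return rows


def to_number(text):
    """Parse a numeric field; return None for the markers -1 and nan."""
    value = float(text)
    if math.isnan(value) or value == -1:
        return None
    return value


# Made-up sample with only a few columns per row (assumption, not real output).
sample = (
    "#FEATURE\trt\tmz\tintensity\n"
    "FEATURE\t100.5\t500.25\t1000\n"
    "PEPTIDE\t100.4\t500.2\t-1\n"
    "FEATURE\t200\t600\tnan\n"
)
rows = read_rows_by_type(io.StringIO(sample))
print(sorted(rows))
print([to_number(field) for field in rows["FEATURE"][1]])
```

Because the number and meaning of columns depend on the row type, keeping one list per indicator avoids forcing all rows into a single table.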

Depending on the input and the parameters, the output contains the following columns:

featureXML input:

first column: RUN / PROTEIN / UNASSIGNEDPEPTIDE / FEATURE / PEPTIDE (indicator for the type of data in the current row)
a RUN line contains information about a protein identification run; further columns: run_id, score_type, score_direction, date_time, search_engine_version, parameters
a PROTEIN line contains data of a protein identified in the previously listed run; further columns: score, rank, accession, coverage, sequence
an UNASSIGNEDPEPTIDE line contains data of a peptide hit that was not assigned to any feature; further columns: rt, mz, score, rank, sequence, charge, aa_before, aa_after, score_type, search_identifier, accessions
a FEATURE line contains data of a single feature; further columns: rt, mz, intensity, charge, width, quality, rt_quality, mz_quality, rt_start, rt_end
a PEPTIDE line contains data of a peptide hit annotated to the previous feature; further columns: same as for UNASSIGNEDPEPTIDE
With the no_ids flag, only FEATURE lines (without the FEATURE indicator) are written.

With the feature:minimal flag, only the rt, mz, and intensity columns of FEATURE lines are written.
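
The hierarchical relationship between FEATURE and PEPTIDE lines can be resolved while reading the file, as in the Python sketch below. It assumes tab-separated output written without the no_ids flag, so that the indicator column is present; the function name and the shortened sample rows are hypothetical.

```python
import csv
import io


def features_with_peptides(handle, delimiter="\t"):
    """Attach each PEPTIDE line to the FEATURE line that precedes it."""
    features = []
    for fields in csv.reader(handle, delimiter=delimiter):
        # Skip empty lines and comment/header lines, which start with "#".
        if not fields or fields[0].startswith("#"):
            continue
        kind, values = fields[0], fields[1:]
        if kind == "FEATURE":
            features.append({"feature": values, "peptides": []})
        elif kind == "PEPTIDE" and features:
            # A PEPTIDE line refers to the most recent FEATURE line.
            features[-1]["peptides"].append(values)
        # RUN, PROTEIN and UNASSIGNEDPEPTIDE lines are ignored in this sketch.
    return features


# Made-up rows with far fewer columns than real output (assumption).
sample = (
    "RUN\trun1\n"
    "UNASSIGNEDPEPTIDE\t30\t600\tCCK\n"
    "FEATURE\t10\t400\n"
    "PEPTIDE\t10\t400\tAAK\n"
    "PEPTIDE\t10\t400\tAAR\n"
    "FEATURE\t20\t500\n"
)
result = features_with_peptides(io.StringIO(sample))
print(len(result), len(result[0]["peptides"]))
```

The same pattern would apply to PROTEIN lines, which refer back to the previously listed RUN line.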

consensusXML input:

Output format produced for the out parameter:

first column: MAP / RUN / PROTEIN / UNASSIGNEDPEPTIDE / CONSENSUS / PEPTIDE (indicator for the type of data in the current row)
a MAP line contains information about a sub-map; further columns: id, filename, label, size (potentially followed by further columns containing meta data, depending on the input)
a CONSENSUS line contains data of a single consensus feature; further columns: rt_cf, mz_cf, intensity_cf, charge_cf, width_cf, quality_cf, rt_X0, mz_X0, ..., rt_X1, mz_X1, ...
"..._cf" columns refer to the consensus feature itself, "..._Xi" columns refer to a sub-feature from the map with ID "Xi" (no quality column in this case); missing sub-features are indicated by "nan" values
see above for the formats of RUN, PROTEIN, UNASSIGNEDPEPTIDE, PEPTIDE lines
With the no_ids flag, only MAP and CONSENSUS lines are written.

Output format produced for the consensus_centroids parameter:

one line per consensus centroid
columns: rt, mz, intensity, charge, width, quality

Output format produced for the consensus_elements parameter:

one line per sub-feature (element) of a consensus feature
first column: H / L (indicator for new/repeated element)
H indicates a new element, L indicates the replication of the first element of the current consensus feature (for plotting)
further columns: rt, mz, intensity, charge, width, rt_cf, mz_cf, intensity_cf, charge_cf, width_cf, quality_cf
"..._cf" columns refer to the consensus feature, the other columns refer to the sub-feature

Output format produced for the consensus_features parameter:

one line per consensus feature (suitable for processing with e.g. R)
columns: same as for a CONSENSUS line above, followed by additional columns for identification data
additional columns: peptide_N0, n_diff_peptides_N0, protein_N0, n_diff_proteins_N0, peptide_N1, ...
"..._Ni" columns refer to the identification run with index "Ni", n_diff_... stands for "number of different ..."; different peptides/proteins in one column are separated by "/"
With the no_ids flag, the additional columns are not included.

idXML input:

first column: RUN / PROTEIN / PEPTIDE (indicator for the type of data in the current row)
see above for the formats of RUN, PROTEIN, PEPTIDE lines
additional column for PEPTIDE lines: predicted_rt
With the id:proteins_only flag, only RUN and PROTEIN lines are written.

With the id:peptides_only flag, only PEPTIDE lines (without the PEPTIDE indicator) are written.

With the id:first_dim_rt flag, the additional columns rt_first_dim and predicted_rt_first_dim are included for PEPTIDE lines.

**Citation**

For the underlying tool, please cite ``Marc Sturm, Andreas Bertsch, Clemens Gröpl, Andreas Hildebrandt, Rene Hussong, Eva Lange, Nico Pfeifer, Ole Schulz-Trieglaff, Alexandra Zerck, Knut Reinert, and Oliver Kohlbacher, 2008. OpenMS – an Open-Source Software Framework for Mass Spectrometry. BMC Bioinformatics 9: 163. doi:10.1186/1471-2105-9-163.``

If you use this tool in Galaxy, please cite Chilton J, et al. https://bitbucket.org/galaxyp/galaxyp-toolshed-openms
  </help>
</tool>