comparison readme.md @ 9:7b5df538705e draft

"planemo upload for repository https://github.com/galaxyproteomics/tools-galaxyp/tree/master/tools/openms commit ddf41e8bda1ba065f5cdec98e93dee8165ffc1b9"
author galaxyp
date Thu, 03 Sep 2020 16:24:57 +0000
parents 399fae38a523
children 75ccdbc2475d
8:da8c663a5564 9:7b5df538705e
6 OpenMS is free software available under the three clause BSD license and runs under Windows, MacOSX and Linux. 6 OpenMS is free software available under the three clause BSD license and runs under Windows, MacOSX and Linux.
7 7
8 More information is available at: 8 More information is available at:
9 9
10 * https://github.com/OpenMS/OpenMS 10 * https://github.com/OpenMS/OpenMS
11 * http://open-ms.sourceforge.net 11 * https://www.openms.de/
12 12
13 The wrappers for these tools and most of their tests are automatically
14 generated using the `generate.sh` script. The generation of the tools is
15 based on the CTDConverter (https://github.com/WorkflowConversion/CTDConverter)
16 which can be fine-tuned via the `hardcoded_params.json` file. This file makes
17 it possible to blacklist and hardcode parameters and to modify or set arbitrary
18 CTD/XML attributes.
19
20 Note that, due to its size, the test data is excluded from this repository. To
21 generate the test data, call `test-data.sh`.
22
23 Manual updates should only be done to
24
25 - the `@GALAXY_VERSION@` token in `macros.xml`
26 - and the manually contributed tests in `macros_test.xml` (The goal is that all
27 tools that do not have an automatically generated test are covered here)
28 - the `hardcoded_params.json` files
29
30 In a few cases patches may be acceptable.
31
32 Installation
33 ============
34
35 The Galaxy OpenMS tools can be installed from the toolshed. While most tools
36 will work out of the box, some need attention since their requirements cannot be
37 fulfilled via Conda:
38
39 Not yet in Conda are:
40
41 - SpectraST (http://tools.proteomecenter.org/wiki/index.php?title=SpectraST)
42 - MaRaCluster (https://github.com/statisticalbiotechnology/maracluster)
43
44 Binaries for these tools can easily be obtained via:
45
46 ```
47 VERSION=....
48 git clone -b release/$VERSION.0 https://github.com/OpenMS/OpenMS.git OpenMS$VERSION.0-git
49 git -C OpenMS$VERSION.0-git submodule init
50 git -C OpenMS$VERSION.0-git submodule update
51 ```
52
53 They are located in `OpenMS$VERSION.0-git/THIRDPARTY/`.
54
55 Not in Conda due to licensing restrictions:
56
57 - Mascot http://www.matrixscience.com/
58 - MSFragger https://github.com/Nesvilab/MSFragger
59 - Novor http://www.rapidnovor.org/novor
60
61 There are multiple ways to enable the Galaxy tools to use these binaries.
62
63 - Just copy them to the `bin` path within Galaxy's conda environment
64 - Put them in any other path that is included in PATH
65 - Edit the corresponding tools: In the command line part search for the parameters `-executable`, `-maracluster_executable`, or `-mascot_directory` and edit them appropriately.
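
Whichever option is chosen, it can help to confirm that the binaries are actually visible from the environment the tools run in. The following is a minimal sketch, assuming a POSIX shell; the function name `check_executables` is hypothetical and not part of the wrappers.

```bash
# Hypothetical helper: report which of the given executables are on PATH.
# Returns 0 if all were found and 1 if at least one is missing.
check_executables() {
    missing=0
    for exe in "$@"; do
        if command -v "$exe" >/dev/null 2>&1; then
            echo "found: $exe"
        else
            echo "missing: $exe"
            missing=1
        fi
    done
    return "$missing"
}
```

It would be called with the names of the executables to look for, for example `check_executables NAME1 NAME2`, where the names are whatever the third-party binaries are called on the local system.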
66
67 Working
68 =======
69
70 The tools work by:
71
72 Preprocessing:
73
74 - For each input / output data set parameter a directory is created (named after
75 the parameter)
76 - For input data set parameters, links to the actual location of the data
77 sets are created
78
79 Main:
80
81 - The Galaxy wrapper creates two JSON config files: one containing the
82 parameters and the values chosen by the user and the other the values of
83 hardcoded parameters.
84 - With `OpenMSTool -write_ctd ./` a CTD file (named `OpenMSTool.ctd`) is
85 generated that contains the default values.
86 - A call to `fill_ctd.py` fills the values from the JSON config files into
87 the CTD file
88 - The actual tool is called with `OpenMSTool -ini OpenMSTool.ctd`, and all input
89 and output parameters are additionally given on the command line.
90
91 Postprocessing:
92
93 - output data sets are moved to the final locations
94
95 Note: The reason for handling data sets on the command line (and not specifying
96 them in the CTD file) is mainly that all files in Galaxy have the extension
97 `.dat` and OpenMS tools require an appropriate extension. But this may change
98 in the future.
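
The preprocessing step and the extension issue from the note can be illustrated with a small sketch. This is not the code generated for the wrappers; the function name `stage_input`, the link naming scheme, and the file names are assumptions chosen only for illustration.

```bash
# Illustration only: stage one data set for an input parameter so that the
# tool sees a file name with a meaningful extension instead of `.dat`.
# Arguments (all hypothetical): parameter name, path of the .dat file, extension.
stage_input() {
    param=$1
    dataset=$2
    ext=$3
    # one directory per parameter, named after the parameter
    mkdir -p "$param"
    # link to the actual location of the data set, using the given extension
    ln -s "$dataset" "$param/$(basename "$dataset" .dat).$ext"
}

# Example with a throwaway file standing in for a Galaxy data set.
workdir=$(mktemp -d)
cd "$workdir"
touch dataset_1.dat
stage_input "in" "$workdir/dataset_1.dat" mzML
ls -l "in"
```

The link, rather than the `.dat` file itself, is then what would be passed to the tool on the command line.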
13 99
14 Generating OpenMS wrappers 100 Generating OpenMS wrappers
15 ========================== 101 ==========================
16 102
17 * install OpenMS (you can do this automatically through Conda) 103 1. remove old test data: `rm -rf $(ls -d test-data/* | egrep -v "random|\.loc")`
18 * create a folder called CTD 104 2. `./generate.sh`
19 * if you installed openms as a binary in a specific directory, execute the following command in the `openms/bin` directory:
20
21 ```bash
22 for binary in `ls`; do ./$binary -write_ctd /PATH/TO/YOUR/CTD; done;
23 ```
24
25 * if there is no binary release (e.g. as with version 2.2), download and unpack the Conda package, find the `bin` folder and create a list of the tools as follows:
26
27 ```bash
28 ls >> tools.txt
29 ```
30
31 * search for the `bin` folder of your conda environment containing OpenMS and do:
32
33 ```bash
34 while read p; do
35 ./PATH/TO/BIN/$p -write_ctd /PATH/TO/YOUR/CTD;
36 done <tools.txt
37 ```
38
39 * You should have all CTD files now. `MetaProSIP.ctd` includes an unsupported character: To use it, search for `²` and replace it (e.g. with `^2`).
40 105
41 * clone or install CTDopts 106 What's happening:
42 107
43 ```bash 108 1. The binaries of the OpenMS package can generate a CTD file that describes
44 git clone https://github.com/genericworkflownodes/CTDopts 109 the parameters. These CTD files are converted to xml Galaxy tool descriptions
45 ``` 110 using the `CTDConverter`.
46 111
47 * add CTDopts to your `$PYTHONPATH` 112 2. The CI testing framework of OpenMS contains command lines and test data
113 (https://github.com/OpenMS/OpenMS/tree/develop/src/tests/topp). These tests
114 are described in two CMake files.
48 115
49 ```bash 116 - From these CMake files Galaxy tests are auto generated and stored in `macros_autotest.xml`
50 export PYTHONPATH=/home/user/CTDopts/ 117 - The command lines are stored in `prepare_test_data.sh` for regeneration of test data
51 ```
52 118
53 * clone or install CTD2Galaxy 119 More details can be found in the comments of the shell script.
54 120
55 ```bash 121 Open problems
56 git clone https://github.com/WorkflowConversion/CTDConverter.git 122 =============
57 ```
58
59 * If you have CTDopts and CTDConverter installed you are ready to generate Galaxy Tools from CTD definitions. Change the following command according to your needs, especially the `/PATH/TO` parts. The default files are provided in this repository. You might have to install `libxslt` and `lxml` to run it. Further information can be found on the CTDConverter page.
60 123
61 ```bash 124 Some tools stall in CI testing using `--biocontainers` which is why the OpenMS
62 python convert.py galaxy \ 125 tools are currently listed in `.tt_biocontainer_skip`. This is
63 -i /PATH/TO/YOUR/CTD/*.ctd \
64 -o ./PATH/TO/YOUR/WRAPPERS/ -t tool.conf \
65 -d datatypes_conf.xml -g openms \
66 -b version log debug test no_progress threads \
67 in_type executable myrimatch_executable \
68 fido_executable fidocp_executable \
69 omssa_executable pepnovo_executable \
70 xtandem_executable param_model_directory \
71 java_executable java_memory java_permgen \
72 r_executable rt_concat_trafo_out param_id_pool \
73 -f /PATH/TO/filetypes.txt -m /PATH/TO/macros.xml \
74 -s PATH/TO/tools_blacklist.txt
75 ```
76 126
127 - AssayGeneratorMetabo and SiriusAdapter (both depend on sirius)
128 - OMSSAAdapter
77 129
78 * As a last step you need to manually change the binary names of all external binaries you want to use in OpenMS. Some of these tools might already be deprecated and the files might not exist: 130 Using `docker -t` seems to solve the problem (see
79 131 https://github.com/galaxyproject/galaxy/issues/10153).
80 ```
81 sed -i '13 a\-fido_executable Fido' wrappers/FidoAdapter.xml
82 sed -i '13 a\-fidocp_executable FidoChooseParameters' wrappers/FidoAdapter.xml
83 sed -i '13 a\-myrimatch_executable myrimatch' wrappers/MyriMatchAdapter.xml
84 sed -i '13 a\-omssa_executable omssa' wrappers/OMSSAAdapter.xml
85 sed -i '13 a\-xtandem_executable xtandem' wrappers/XTandemAdapter.xml
86 ```
87
88 * For some tools, additional work has to be done. In `MSGFPlusAdapter.xml` the following is needed in the command section at the beginning (check your file to know what to copy where):
89
90 ```
91 <command><![CDATA[
92
93 ## check input file type
94 #set $in_type = $param_in.ext
95
96 ## create the symlinks to set the proper file extension, since msgf uses them to choose how to handle the input files
97 ln -s '$param_in' 'param_in.${in_type}' &&
98 ln -s '$param_database' param_database.fasta &&
99 ## find location of the MSGFPlus.jar file of the msgf_plus conda package
100 MSGF_JAR=\$(msgf_plus -get_jar_path) &&
101
102 MSGFPlusAdapter
103 -executable \$MSGF_JAR
104 #if $param_in:
105 -in 'param_in.${in_type}'
106 #end if
107 #if $param_out:
108 -out $param_out
109 #end if
110 #if $param_mzid_out:
111 -mzid_out $param_mzid_out
112 #end if
113 #if $param_database:
114 -database param_database.fasta
115 #end if
116
117 [...]
118 ]]>
119 ```
120
121 * In Xtandem Converter and probably in others:
122
123 ```
124 #if str($param_missed_cleavages) != '':
125 ```
126 This is because integers need to be compared as strings; otherwise `0` becomes `false`.
127
128 * In `MetaProSIP.xml` add `R` as a requirement:
129
130 ```
131 <expand macro="requirements">
132 <requirement type="package" version="3.3.1">r-base</requirement>
133 </expand>
134 ```
135
136 * In `IDFileConverter.xml` the following is needed in the command section at the beginning (check your file to know what to copy where):
137
138 ```
139 <command><![CDATA[
140
141 ## check input file type
142 #set $in_type = $param_in.ext
143
144 ## create the symlinks to set the proper file extension, since IDFileConverter uses them to choose how to handle the input files
145 ln -s '$param_in' 'param_in.${in_type}' &&
146
147 IDFileConverter
148
149 #if $param_in:
150 -in 'param_in.${in_type}'
151 #end if
152
153 [...]
154 ]]>
155 ```
156
157 * In `IDFileConverter.xml` and `FileConverter.xml` add `auto_format="true"` to the output, e.g.:
158
159 - `<data name="param_out" auto_format="true"/>`
160 - `<data name="param_out" metadata_source="param_in" auto_format="true"/>`
161
162 * To add an example test case to `DecoyDatabase.xml` add the following after the output section. If standard settings change you might have to adjust the options and/or the test files.
163
164 ```
165 <tests>
166 <test>
167 <param name="param_in" value="DecoyDatabase_input.fasta"/>
168 <output name="param_out" file="DecoyDatabase_output.fasta"/>
169 </test>
170 </tests>
171 ```
172
173 * Additionally, because of missing dependencies, the following adapters have been removed in `SKIP_TOOLS_FILES.txt` as well:
174 * OMSSAAdapter
175 * MyrimatchAdapter
176
177 * Additionally, because of a problematic parameter (-model_directory), the following adapter has been removed:
178 * PepNovoAdapter
179
180 132
181 Licence (MIT) 133 Licence (MIT)
182 ============= 134 =============
183 135
184 Permission is hereby granted, free of charge, to any person obtaining a copy 136 Permission is hereby granted, free of charge, to any person obtaining a copy