comparison README.md @ 0:59bc96331073 draft default tip

planemo upload for repository https://github.com/geraldinepascal/FROGS-wrappers/tree/v3.1.0 commit 08296fc88e3e938c482c631bd515b3b7a0499647
author frogs
date Thu, 28 Feb 2019 10:14:49 -0500
-1:000000000000 0:59bc96331073
1 <img src="static/images/frogs_images/FROGS_logo.png" width="20%" style="display: block; margin: auto;"/> <img src="galaxy_project_logo_square.png" width="20%" style="display: block; margin: auto;"/>
2
3
4
5 Visit our website: http://frogs.toulouse.inra.fr/
6
7 [![Release](https://img.shields.io/badge/release-3.1.0-blue.svg)![Date](https://img.shields.io/badge/date-February%202018-red.svg)](https://github.com/geraldinepascal/FROGS-wrappers/releases)[<img src="https://www.podcastscience.fm/wp-content/uploads/2017/12/deezer.png" width="5%" style="display: block; margin: auto;"/>](https://www.deezer.com/fr/playlist/5233843102?utm_source=deezer&utm_content=playlist-5233843102&utm_term=18632989_1545296531&utm_medium=web)
8
9
10
11 # Description
12
13 FROGS is a workflow designed to produce an OTU count matrix from high depth sequencing amplicon data.
14
15 FROGS-wrappers lets you add FROGS to a Galaxy instance.
16
17 # Table of contents
18
19 * [Installing FROGS\-wrappers](#installing-frogs-wrappers)
20     * [Simplest way](#simplest-way)
21     * [From sources](#from-sources)
22         * [Prerequisites](#prerequisites)
23         * [FROGS\-wrappers installation](#frogs-wrappers-installation)
24 * [Use PEAR as reads merge software in preprocess](#use-pear-as-reads-merge-software-in-preprocess)
25 * [Upload and configure the databanks](#upload-and-configure-the-databanks)
26 * [Galaxy configuration](#galaxy-configuration)
27     * [Setup Galaxy environment variables](#setup-galaxy-environment-variables)
28     * [Install python packages inside virtual env](#install-python-packages-inside-virtual-env)
29     * [Avoid FROGS HTML report sanitization](#avoid-frogs-html-report-sanitization)
30     * [Set memory and parallelisation settings](#set-memory-and-parallelisation-settings)
31 * [License](#license)
32 * [Copyright](#copyright)
33 * [Citation](#citation)
34 * [Contact](#contact)
35
36 # Installing FROGS-wrappers
37
38 ## Simplest way
39
40 FROGS is available on the Toolshed: https://toolshed.g2.bx.psu.edu/repository?repository_id=525e78406276b403&changeset_revision=76c750c5f0d1
41
42 Installing it from the Toolshed installs FROGS through [conda](https://anaconda.org/bioconda/frogs), downloads all the XML tool wrappers and configures them in your Galaxy instance.
43
44 The 22 FROGS tools will appear in no particular order in your tool panel. You can control that order by editing `integrated_tool_panel.xml`.
45
46 The example below assumes that you installed FROGS in a specific section named `FROGS` (update the FROGS version if necessary).
47
48 ```
49 <section id="FROGS" name="FROGS" version="">
50
51 <label text="OTUs reconstruction" id="FROGS_OTU" />
52
53 <tool id="toolshed.g2.bx.psu.edu/repos/frogs/frogs_3_1_0/FROGS_demultiplex/3.1" />
54 <tool id="toolshed.g2.bx.psu.edu/repos/frogs/frogs_3_1_0/FROGS_preprocess/3.1" />
55 <tool id="toolshed.g2.bx.psu.edu/repos/frogs/frogs_3_1_0/FROGS_clustering/3.1" />
56 <tool id="toolshed.g2.bx.psu.edu/repos/frogs/frogs_3_1_0/FROGS_remove_chimera/3.1" />
57 <tool id="toolshed.g2.bx.psu.edu/repos/frogs/frogs_3_1_0/FROGS_filters/3.1" />
58 <tool id="toolshed.g2.bx.psu.edu/repos/frogs/frogs_3_1_0/FROGS_itsx/3.1" />
59 <tool id="toolshed.g2.bx.psu.edu/repos/frogs/frogs_3_1_0/FROGS_affiliation_OTU/3.1" />
60 <tool id="toolshed.g2.bx.psu.edu/repos/frogs/frogs_3_1_0/FROGS_affiliation_postprocess/3.1" />
61 <tool id="toolshed.g2.bx.psu.edu/repos/frogs/frogs_3_1_0/FROGS_normalisation/3.1" />
62 <tool id="toolshed.g2.bx.psu.edu/repos/frogs/frogs_3_1_0/FROGS_Tree/3.1" />
63 <tool id="toolshed.g2.bx.psu.edu/repos/frogs/frogs_3_1_0/FROGS_clusters_stat/3.1" />
64 <tool id="toolshed.g2.bx.psu.edu/repos/frogs/frogs_3_1_0/FROGS_affiliations_stat/3.1" />
65 <tool id="toolshed.g2.bx.psu.edu/repos/frogs/frogs_3_1_0/FROGS_biom_to_stdBiom/3.1" />
66 <tool id="toolshed.g2.bx.psu.edu/repos/frogs/frogs_3_1_0/FROGS_biom_to_tsv/3.1" />
67 <tool id="toolshed.g2.bx.psu.edu/repos/frogs/frogs_3_1_0/FROGS_tsv_to_biom/3.1" />
68
69 <label text="OTUs structure and composition analysis" id="FROGSSTAT_Phyloseq" />
70
71 <tool id="toolshed.g2.bx.psu.edu/repos/frogs/frogs_3_1_0/FROGSSTAT_Phyloseq_Import_Data/3.1" />
72 <tool id="toolshed.g2.bx.psu.edu/repos/frogs/frogs_3_1_0/FROGSSTAT_Phyloseq_Composition_Visualisation/3.1" />
73 <tool id="toolshed.g2.bx.psu.edu/repos/frogs/frogs_3_1_0/FROGSSTAT_Phyloseq_Alpha_Diversity/3.1" />
74 <tool id="toolshed.g2.bx.psu.edu/repos/frogs/frogs_3_1_0/FROGSSTAT_Phyloseq_Beta_Diversity/3.1" />
75 <tool id="toolshed.g2.bx.psu.edu/repos/frogs/frogs_3_1_0/FROGSSTAT_Phyloseq_Sample_Clustering/3.1" />
76 <tool id="toolshed.g2.bx.psu.edu/repos/frogs/frogs_3_1_0/FROGSSTAT_Phyloseq_Structure_Visualisation/3.1" />
77 <tool id="toolshed.g2.bx.psu.edu/repos/frogs/frogs_3_1_0/FROGSSTAT_Phyloseq_Multivariate_Analysis_Of_Variance/3.1" />
78 </section>
79 ```
80
81
82
83 ## From sources
84
85 #### Prerequisites
86
87 Start by installing [FROGS](https://github.com/geraldinepascal/FROGS) (note that FROGS can now be installed via [conda](https://anaconda.org/bioconda/frogs)).
88
89 #### FROGS-wrappers installation
90
91 1. <u>Download wrapper</u>
92
93 Download the latest release of FROGS-wrappers: https://github.com/geraldinepascal/FROGS-wrappers/releases
94
95 Extract the release archive into your `<Galaxy_Dir>/tools` directory.
96
97 Create (or replace) the link to the new directory like this:
98
99 `ln -s <Galaxy_Dir>/tools/FROGS-wrappers-<Release_Number> <Galaxy_Dir>/tools/FROGS`
100
101 2. <u>Add tools to Galaxy</u>
102
103 Add the tools to `<Galaxy_Dir>/config/tool_conf.xml`:
104
105 ```
106 <section id="FROGS_wrappers" name="FROGS">
107 <label text="OTUs reconstruction" id="FROGS_OTU" />
108 <tool file="FROGS/demultiplex.xml" />
109 <tool file="FROGS/preprocess.xml" />
110 <tool file="FROGS/clustering.xml" />
111 <tool file="FROGS/remove_chimera.xml" />
112 <tool file="FROGS/filters.xml" />
113 <tool file="FROGS/itsx.xml" />
114 <tool file="FROGS/affiliation_OTU.xml" />
115 <tool file="FROGS/affiliation_postprocess.xml" />
116 <tool file="FROGS/normalisation.xml" />
117 <tool file="FROGS/tree.xml" />
118 <tool file="FROGS/clusters_stat.xml" />
119 <tool file="FROGS/affiliations_stat.xml" />
120 <tool file="FROGS/biom_to_stdBiom.xml" />
121 <tool file="FROGS/biom_to_tsv.xml" />
122 <tool file="FROGS/tsv_to_biom.xml" />
123 <label text="OTUs structure and composition analysis" id="FROGSSTAT_Phyloseq" />
124 <tool file="FROGS/r_import_data.xml" />
125 <tool file="FROGS/r_composition.xml" />
126 <tool file="FROGS/r_alpha_diversity.xml" />
127 <tool file="FROGS/r_beta_diversity.xml" />
128 <tool file="FROGS/r_structure.xml" />
129 <tool file="FROGS/r_clustering.xml" />
130 <tool file="FROGS/r_manova.xml" />
131 </section>
132 ```
133 NB: If you used a previous version of FROGS, you must remove the `app` directory name from the tool paths.
134
135 3. <u>Add images</u>
136
137 Copy the FROGS-wrappers images into the `<Galaxy_Dir>/static/images` directory:
138
139 `cp -r <Galaxy_Dir>/tools/FROGS/static/images/frogs_images/ <Galaxy_Dir>/static/images/.`
140
141
142 # Use PEAR as reads merge software in preprocess
143 [PEAR](https://cme.h-its.org/exelixis/web/software/pear/) is one of the most effective tools for read pair merging, but as its licence is not free for private use, we cannot distribute it with FROGS.
144 If you work in an academic lab on a private Galaxy server, or if you have paid for a licence, you can use PEAR in FROGS preprocess.
145 To do so, you need to:
146
147 * have PEAR in your PATH or in the FROGS libexec directory
148
149 * add PEAR in the FROGS preprocess Galaxy wrapper (`<FROGS_DIR>/tools/preprocess/preprocess.xml`):
150
151 :warning: there are two places where the `merge_software` list is defined; add pear in both of them!
152
153 Add the `pear` value to the `merge_software` list:
154 ```
155 <conditional name="merge_software_type">
156 <param name="merge_software" type="select" label="Merge software" help="Select the software to merge paired-end reads.">
157 <option value="vsearch" selected="true">Vsearch</option>
158 <option value="flash">Flash</option>
159 <option value="pear">PEAR</option>
160 </param>
161 <when value="flash">
162 <param name="expected_amplicon_size" type="integer" label="Expected amplicon size" help="Maximum amplicon length expected in approximately 90% of the amplicons." value="" />
163 </when>
164 </conditional>
165 ```
166
167 :warning: remember, there are two places where the `merge_software` list is defined; add pear in both of them!
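
The two requirements above (PEAR reachable on the PATH, and `pear` present in every `merge_software` list) can be checked with a short script. This is only a sketch under assumptions: the executable name `pear` and both function names are illustrative, and the FROGS libexec alternative is not checked.

```python
import shutil
import xml.etree.ElementTree as ET


def merge_software_lists_missing(xml_text, value="pear"):
    """Count the `merge_software` select lists that lack an option with `value`."""
    root = ET.fromstring(xml_text)
    missing = 0
    for param in root.iter("param"):
        if param.get("name") != "merge_software":
            continue
        # Collect the values offered by this particular select list.
        values = {opt.get("value") for opt in param.findall("option")}
        if value not in values:
            missing += 1
    return missing


def pear_on_path():
    """True when an executable named `pear` (assumed name) is on the PATH."""
    return shutil.which("pear") is not None
```

Passing the content of `preprocess.xml` to `merge_software_lists_missing` should return `0` once both lists have been edited; any other value means at least one list still lacks the `pear` option.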
168
169 # Upload and configure the databanks
170
171 Databanks are declared in `loc` files, and `loc` files are declared in the Galaxy datatables.
172
173 * Manual installation:
174
175 * datatables: `<Galaxy_Dir>/config/tool_data_table_conf.xml`, example: `<Galaxy_Dir>/tools/FROGS/tool_data_table_conf.xml.sample`
176
177 Add the FROGS-wrappers datatables to the Galaxy datatables, replacing `{__HERE__}` with `tools/FROGS`.
178
179 * loc files example: `<Galaxy_Dir>/tools/FROGS/tool-data/`
180
181 Copy and rename them as indicated in the tool_data_table.
182
183 Then add entries as indicated in each loc file.
184
185 * Toolshed installation:
186
187 * datatables: `<Galaxy_Dir>/config/shed_tool_data_table_conf.xml` (nothing to modify, FROGS datatables should be added automatically)
188 * loc files to fill in: `tool-data/toolshed.g2.bx.psu.edu/repos/frogs/frogs_<VERSION>/<RANDOM>/`
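
For the manual installation, the `{__HERE__}` substitution can be scripted instead of edited by hand. The sketch below is illustrative only: `resolve_here` is a hypothetical helper, and the XML fragment is a made-up example rather than the actual content of the sample file.

```python
def resolve_here(sample_text, tool_dir="tools/FROGS"):
    """Replace every `{__HERE__}` placeholder with the FROGS tool directory."""
    return sample_text.replace("{__HERE__}", tool_dir)


# Hypothetical fragment; the real sample file ships with FROGS-wrappers.
fragment = '<file path="{__HERE__}/tool-data/frogs_db.loc" />'
print(resolve_here(fragment))
```

The resolved text can then be pasted into `<Galaxy_Dir>/config/tool_data_table_conf.xml`.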
189
190
191
192 We provide databanks for each of these 3 data tables; simply download them and add them to the corresponding `loc` files.
193
194 - Assignation databank for the affiliation_OTU tool
195
196 URL: http://genoweb.toulouse.inra.fr/frogs_databanks/assignation
197
198 loc file: `frogs_db.loc`
199
200 - Contaminant databank for the filters tool
201
202 URL: http://genoweb.toulouse.inra.fr/frogs_databanks/contaminants
203
204 loc file: `phiX_db.loc`
205
206 - Hyper variable in length amplicon databank for the affiliation_postprocess tool
207
208 URL: http://genoweb.toulouse.inra.fr/frogs_databanks/HVL
209
210 loc file: `HVL.loc`
211
212
213 # Galaxy configuration
214
215 ## Setup Galaxy environment variables
216
217 FROGS Python programs need to be available in the PATH. If you installed FROGS from source or via conda, add the `<FROGS_PATH>/app` directory to the Galaxy PATH environment variable (see the [environment-setup-file parameter](https://docs.galaxyproject.org/en/latest/admin/config.html#environment-setup-file)).
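
To confirm that the directory really ended up in the PATH seen by Galaxy jobs, a small check such as the following can be run from that environment. It is a sketch only: `dir_on_path` is a hypothetical helper and the example directory stands in for your own `<FROGS_PATH>/app`.

```python
import os


def dir_on_path(directory, path_value=None):
    """True when `directory` is one of the entries of a PATH-style string."""
    if path_value is None:
        path_value = os.environ.get("PATH", "")
    target = os.path.normpath(directory)
    # Normalise each entry so that trailing slashes do not matter.
    entries = [os.path.normpath(p) for p in path_value.split(os.pathsep) if p]
    return target in entries


if __name__ == "__main__":
    # Placeholder location; replace it with your own <FROGS_PATH>/app.
    print(dir_on_path("/opt/FROGS/app"))
```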
218
219 ## Install python packages inside virtual env
220
221 Galaxy runs in its own virtual environment. To let the FROGS clusters_stat tool access the Python scipy library, you need to (re)install scipy inside the Galaxy virtual environment:
222 ```
223 cd <Galaxy_Dir>
224 source .venv/bin/activate
225 pip install scipy
226 deactivate
227 ```
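
After the installation, you can verify that scipy is visible to the interpreter of the Galaxy virtual environment. The snippet below is a minimal sketch (`module_available` is an illustrative name) and is meant to be run with `.venv/bin/python` from `<Galaxy_Dir>`.

```python
import importlib.util
import sys


def module_available(name):
    """True when the top-level module `name` can be found by this interpreter."""
    return importlib.util.find_spec(name) is not None


if __name__ == "__main__":
    # Show which interpreter is being tested, then the result for scipy.
    print("interpreter:", sys.executable)
    print("scipy available:", module_available("scipy"))
```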
228
229 ## Avoid FROGS HTML report sanitization
230
231 By default, Galaxy sanitizes HTML outputs to prevent XSS attacks.
232 Almost all FROGS tools produce a report in HTML format. To display these reports inside Galaxy, the FROGS tools must be exempted from sanitization.
233 Uncomment the `sanitize_whitelist_file` line in `<Galaxy_Dir>/config/galaxy.ini`, create the corresponding `<Galaxy_Dir>/config/sanitize_whitelist.txt` file if it does not exist yet, and add the following lines to it:
234 ```
235 FROGSSTAT_Phyloseq_Alpha_Diversity
236 FROGSSTAT_Phyloseq_Beta_Diversity
237 FROGSSTAT_Phyloseq_Composition_Visualisation
238 FROGSSTAT_Phyloseq_Import_Data
239 FROGSSTAT_Phyloseq_Multivariate_Analysis_Of_Variance
240 FROGSSTAT_Phyloseq_Sample_Clustering
241 FROGSSTAT_Phyloseq_Structure_Visualisation
242 FROGS_Tree
243 FROGS_affiliation_OTU
244 FROGS_affiliations_stat
245 FROGS_clustering
246 FROGS_clusters_stat
247 FROGS_filters
248 FROGS_itsx
249 FROGS_normalisation
250 FROGS_preprocess
251 FROGS_remove_chimera
252 ```
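
If the whitelist file already contains entries for other tools, the FROGS IDs can be appended without creating duplicates. The helper below is a hedged sketch, not part of FROGS or Galaxy: `add_to_whitelist` is a hypothetical function that adds only the IDs missing from the file.

```python
from pathlib import Path


def add_to_whitelist(whitelist_path, tool_ids):
    """Append the tool IDs not already listed; return the IDs that were added."""
    path = Path(whitelist_path)
    existing = path.read_text().splitlines() if path.exists() else []
    present = {line.strip() for line in existing}
    added = []
    for tool_id in tool_ids:
        if tool_id not in present:
            present.add(tool_id)
            added.append(tool_id)
    if added:
        # Rewrite the file with the previous lines followed by the new IDs.
        path.write_text("\n".join(existing + added) + "\n")
    return added
```

Calling it a second time with the same list returns an empty list and leaves the file untouched, so it is safe to rerun after a FROGS update.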
253
254 ## Set memory and parallelisation settings
255
256 If you have more than one CPU, it is recommended to increase the number of CPUs used by tools.
257
258 All CPUs must be on the same computer/node.
259
260
261 * Specifications
262
263 | Tool | RAM per CPU | Minimal RAM | Configuration example |
264 | :-------------------: | :---------: | :---------: | :-------------------: |
265 | preprocess | 8 GB | - | 12 CPUs and 96 GB |
266 | clustering | - | 10 GB | 16 CPUs and 60 GB |
267 | ITSx / remove_chimera | 3 GB | 5 GB | 12 CPUs and 36 GB |
268 | affiliation_OTU | - | 20 GB | 30 CPUs and 300 GB |
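
For the tools with a per-CPU requirement, the configuration examples in the table follow from a simple multiplication. The sketch below assumes that the "Minimal RAM" column acts as a lower bound; `required_ram_gb` is an illustrative name, and the rows without a per-CPU value (clustering, affiliation_OTU) are not covered by this rule.

```python
def required_ram_gb(cpus, ram_per_cpu_gb, minimal_ram_gb=0):
    """RAM to request: per-CPU need times CPU count, never below the minimum."""
    return max(cpus * ram_per_cpu_gb, minimal_ram_gb)


# preprocess: 12 CPUs at 8 GB each
print(required_ram_gb(12, 8))      # 96
# ITSx / remove_chimera: 12 CPUs at 3 GB each, at least 5 GB
print(required_ram_gb(12, 3, 5))   # 36
```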
269
270 * Galaxy configuration
271
272 You need to add `destination` sections (one per tool) in `<Galaxy_Dir>/config/job_conf.xml`.
273 Example for the SGE scheduler:
274
275
276 ```
277 <destinations>
278 ...
279 <destination id="FROGS_preprocess_job" runner="drmaa">
280 <param id="galaxy_external_runjob_script">scripts/drmaa_external_runner.py</param>
281 <param id="galaxy_external_killjob_script">scripts/drmaa_external_killer.py</param>
282 <param id="galaxy_external_chown_script">scripts/external_chown_script.py</param>
283 <param id="nativeSpecification">-clear -q galaxyq -l mem=5G -l h_vmem=13G -pe parallel_smp 12</param>
284 </destination>
285 <destination id="FROGS_clustering_job" runner="drmaa">
286 <param id="galaxy_external_runjob_script">scripts/drmaa_external_runner.py</param>
287 <param id="galaxy_external_killjob_script">scripts/drmaa_external_killer.py</param>
288 <param id="galaxy_external_chown_script">scripts/external_chown_script.py</param>
289 <param id="nativeSpecification">-clear -q galaxyq -l mem=3G -l h_vmem=10G -pe parallel_smp 16</param>
290 </destination>
291 <destination id="FROGS_remove_chimera_job" runner="drmaa">
292 <param id="galaxy_external_runjob_script">scripts/drmaa_external_runner.py</param>
293 <param id="galaxy_external_killjob_script">scripts/drmaa_external_killer.py</param>
294 <param id="galaxy_external_chown_script">scripts/external_chown_script.py</param>
295 <param id="nativeSpecification">-clear -q galaxyq -l mem=3G -l h_vmem=4G -pe parallel_smp 12</param>
296 </destination>
297 <destination id="FROGS_itsx_job" runner="drmaa">
298 <param id="galaxy_external_runjob_script">scripts/drmaa_external_runner.py</param>
299 <param id="galaxy_external_killjob_script">scripts/drmaa_external_killer.py</param>
300 <param id="galaxy_external_chown_script">scripts/external_chown_script.py</param>
301 <param id="nativeSpecification">-clear -q galaxyq -l mem=3G -l h_vmem=4G -pe parallel_smp 12</param>
302 </destination>
303 <destination id="FROGS_affiliation_OTU_job" runner="drmaa">
304 <param id="galaxy_external_runjob_script">scripts/drmaa_external_runner.py</param>
305 <param id="galaxy_external_killjob_script">scripts/drmaa_external_killer.py</param>
306 <param id="galaxy_external_chown_script">scripts/external_chown_script.py</param>
307 <param id="nativeSpecification">-clear -q galaxyq -l mem=7G -l h_vmem=10G -pe parallel_smp 30</param>
308 </destination>
309 </destinations>
310 <tools>
311 ...
312 <tool id="FROGS_preprocess" destination="FROGS_preprocess_job"/>
313 <tool id="FROGS_clustering" destination="FROGS_clustering_job"/>
314 <tool id="FROGS_remove_chimera" destination="FROGS_remove_chimera_job"/>
315 <tool id="FROGS_itsx" destination="FROGS_itsx_job"/>
316 <tool id="FROGS_affiliation_OTU" destination="FROGS_affiliation_OTU_job"/>
317 </tools>
318 ```
319
320 # License
321 GNU GPL v3
322
323 # Copyright
324 2018 INRA
325
326 # Citation
327 Please cite the **FROGS article**: *Escudie F., et al. Bioinformatics, 2018. FROGS: Find, Rapidly, OTUs with Galaxy Solution.*
328
329 # Contact
330 frogs@inra.fr
331