diff README.md @ 0:59bc96331073 draft default tip
planemo upload for repository https://github.com/geraldinepascal/FROGS-wrappers/tree/v3.1.0 commit 08296fc88e3e938c482c631bd515b3b7a0499647
author | frogs |
---|---|
date | Thu, 28 Feb 2019 10:14:49 -0500 |
parents | |
children |
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/README.md Thu Feb 28 10:14:49 2019 -0500
@@ -0,0 +1,331 @@
+<img src="static/images/frogs_images/FROGS_logo.png" width="20%" style="display: block; margin: auto;"/> <img src="galaxy_project_logo_square.png" width="20%" style="display: block; margin: auto;"/>
+
+
+
+Visit our web site: http://frogs.toulouse.inra.fr/
+
+[![Release](https://img.shields.io/badge/release-3.1.0-blue.svg)![Date](https://img.shields.io/badge/date-February%202018-red.svg)](https://github.com/geraldinepascal/FROGS-wrappers/releases)[<img src="https://www.podcastscience.fm/wp-content/uploads/2017/12/deezer.png" width="5%" style="display: block; margin: auto;"/>](https://www.deezer.com/fr/playlist/5233843102?utm_source=deezer&utm_content=playlist-5233843102&utm_term=18632989_1545296531&utm_medium=web)
+
+
+
+# Description
+
+FROGS is a workflow designed to produce an OTU count matrix from high-depth sequencing amplicon data.
+
+FROGS-wrappers allows you to add FROGS to a Galaxy instance.
+
+# Table of contents
+
+* [Installing FROGS\-wrappers](#installing-frogs-wrappers)
+  * [Simplest way](#simplest-way)
+  * [From sources](#from-sources)
+    * [Prerequisites](#prerequisites)
+    * [FROGS\-wrappers installation](#frogs-wrappers-installation)
+* [Use PEAR as reads merge software in preprocess](#use-pear-as-reads-merge-software-in-preprocess)
+* [Upload and configure the databanks](#upload-and-configure-the-databanks)
+* [Galaxy configuration](#galaxy-configuration)
+  * [Setup Galaxy environment variables](#setup-galaxy-environment-variables)
+  * [Install python packages inside virtual env](#install-python-packages-inside-virtual-env)
+  * [Avoid FROGS HTML report sanitization](#avoid-FROGS-HTML-report-sanitization)
+  * [Set memory and parallelisation settings](#set-memory-and-parallelisation-settings)
+* [License](#license)
+* [Copyright](#copyright)
+* [Citation](#citation)
+* [Contact](#contact)
+
+# Installing FROGS-wrappers
+
+## Simplest way
+
+FROGS is available on the Toolshed: https://toolshed.g2.bx.psu.edu/repository?repository_id=525e78406276b403&changeset_revision=76c750c5f0d1
+
+The Toolshed installs FROGS through [conda](https://anaconda.org/bioconda/frogs), downloads all the XML tool wrappers, and configures them in your Galaxy instance.
+
+The 22 FROGS tools will appear in random order in your tools panel. You can control that order by editing `integrated_tool_panel.xml`.
+
+The example below assumes that you installed FROGS in a specific section named `FROGS` (update the FROGS version if necessary).
+
+```
+<section id="FROGS" name="FROGS" version="">
+
+    <label text="OTUs reconstruction" id="FROGS_OTU" />
+
+    <tool id="toolshed.g2.bx.psu.edu/repos/frogs/frogs_3_1_0/FROGS_demultiplex/3.1" />
+    <tool id="toolshed.g2.bx.psu.edu/repos/frogs/frogs_3_1_0/FROGS_preprocess/3.1" />
+    <tool id="toolshed.g2.bx.psu.edu/repos/frogs/frogs_3_1_0/FROGS_clustering/3.1" />
+    <tool id="toolshed.g2.bx.psu.edu/repos/frogs/frogs_3_1_0/FROGS_remove_chimera/3.1" />
+    <tool id="toolshed.g2.bx.psu.edu/repos/frogs/frogs_3_1_0/FROGS_filters/3.1" />
+    <tool id="toolshed.g2.bx.psu.edu/repos/frogs/frogs_3_1_0/FROGS_itsx/3.1" />
+    <tool id="toolshed.g2.bx.psu.edu/repos/frogs/frogs_3_1_0/FROGS_affiliation_OTU/3.1" />
+    <tool id="toolshed.g2.bx.psu.edu/repos/frogs/frogs_3_1_0/FROGS_affiliation_postprocess/3.1" />
+    <tool id="toolshed.g2.bx.psu.edu/repos/frogs/frogs_3_1_0/FROGS_normalisation/3.1" />
+    <tool id="toolshed.g2.bx.psu.edu/repos/frogs/frogs_3_1_0/FROGS_Tree/3.1" />
+    <tool id="toolshed.g2.bx.psu.edu/repos/frogs/frogs_3_1_0/FROGS_clusters_stat/3.1" />
+    <tool id="toolshed.g2.bx.psu.edu/repos/frogs/frogs_3_1_0/FROGS_affiliations_stat/3.1" />
+    <tool id="toolshed.g2.bx.psu.edu/repos/frogs/frogs_3_1_0/FROGS_biom_to_stdBiom/3.1" />
+    <tool id="toolshed.g2.bx.psu.edu/repos/frogs/frogs_3_1_0/FROGS_biom_to_tsv/3.1" />
+    <tool id="toolshed.g2.bx.psu.edu/repos/frogs/frogs_3_1_0/FROGS_tsv_to_biom/3.1" />
+
+    <label text="OTUs structure and composition analysis" id="FROGSSTAT_Phyloseq" />
+
+    <tool id="toolshed.g2.bx.psu.edu/repos/frogs/frogs_3_1_0/FROGSSTAT_Phyloseq_Import_Data/3.1" />
+    <tool id="toolshed.g2.bx.psu.edu/repos/frogs/frogs_3_1_0/FROGSSTAT_Phyloseq_Composition_Visualisation/3.1" />
+    <tool id="toolshed.g2.bx.psu.edu/repos/frogs/frogs_3_1_0/FROGSSTAT_Phyloseq_Alpha_Diversity/3.1" />
+    <tool id="toolshed.g2.bx.psu.edu/repos/frogs/frogs_3_1_0/FROGSSTAT_Phyloseq_Beta_Diversity/3.1" />
+    <tool id="toolshed.g2.bx.psu.edu/repos/frogs/frogs_3_1_0/FROGSSTAT_Phyloseq_Sample_Clustering/3.1" />
+    <tool id="toolshed.g2.bx.psu.edu/repos/frogs/frogs_3_1_0/FROGSSTAT_Phyloseq_Structure_Visualisation/3.1" />
+    <tool id="toolshed.g2.bx.psu.edu/repos/frogs/frogs_3_1_0/FROGSSTAT_Phyloseq_Multivariate_Analysis_Of_Variance/3.1" />
+</section>
+```
+
+
+
+## From sources
+
+#### Prerequisites
+
+You should start by installing [FROGS](https://github.com/geraldinepascal/FROGS) (remember, FROGS is now installable via [conda](https://anaconda.org/bioconda/frogs)).
+
+#### FROGS-wrappers installation
+
+1. <u>Download wrapper</u>
+
+   Download the latest released version of FROGS-wrappers: https://github.com/geraldinepascal/FROGS-wrappers/releases
+
+   Uncompress and unarchive the release in your `<Galaxy_Dir>/tools` directory.
+
+   Create (or replace) the link to the new directory like this:
+
+   `ln -s <Galaxy_Dir>/tools/FROGS-wrappers-<Release_Number> <Galaxy_Dir>/tools/FROGS`
+
+2. <u>Add tools in Galaxy</u>
+
+   Add the tools in `<Galaxy_Dir>/config/tool_conf.xml`:
+
+   ```
+   <section id="FROGS_wrappers" name="FROGS">
+       <label text="OTUs reconstruction" id="FROGS_OTU" />
+       <tool file="FROGS/demultiplex.xml" />
+       <tool file="FROGS/preprocess.xml" />
+       <tool file="FROGS/clustering.xml" />
+       <tool file="FROGS/remove_chimera.xml" />
+       <tool file="FROGS/filters.xml" />
+       <tool file="FROGS/itsx.xml" />
+       <tool file="FROGS/affiliation_OTU.xml" />
+       <tool file="FROGS/affiliation_postprocess.xml" />
+       <tool file="FROGS/normalisation.xml" />
+       <tool file="FROGS/tree.xml" />
+       <tool file="FROGS/clusters_stat.xml" />
+       <tool file="FROGS/affiliations_stat.xml" />
+       <tool file="FROGS/biom_to_stdBiom.xml" />
+       <tool file="FROGS/biom_to_tsv.xml" />
+       <tool file="FROGS/tsv_to_biom.xml" />
+       <label text="OTUs structure and composition analysis" id="FROGSSTAT_Phyloseq" />
+       <tool file="FROGS/r_import_data.xml" />
+       <tool file="FROGS/r_composition.xml" />
+       <tool file="FROGS/r_alpha_diversity.xml" />
+       <tool file="FROGS/r_beta_diversity.xml" />
+       <tool file="FROGS/r_structure.xml" />
+       <tool file="FROGS/r_clustering.xml" />
+       <tool file="FROGS/r_manova.xml" />
+   </section>
+   ```
+   NB: If you used a previous version of FROGS, you must remove the `app` directory name from the paths.
+
+3. <u>Add images</u>
+
+   Add the FROGS-wrappers images in the `<Galaxy_Dir>/static/images` directory:
+
+   `cp -r <Galaxy_Dir>/tools/FROGS/static/images/frogs_images/ <Galaxy_Dir>/static/images/.`
+
+
+# Use PEAR as reads merge software in preprocess
+[PEAR](https://cme.h-its.org/exelixis/web/software/pear/) is one of the most effective tools for read pair merging, but as its licence is not free for private use, we cannot distribute it with FROGS.
+If you work in an academic lab on a private Galaxy server, or if you have paid for a licence, you can use PEAR in FROGS preprocess.
+For that, you need to:
+
+* have PEAR in your PATH or in the FROGS libexec directory
+
+* add PEAR in the FROGS preprocess Galaxy wrapper (`<FROGS_DIR>/tools/preprocess/preprocess.xml`):
+
+  :warning: there are two places where the `merge_software` list is defined, add pear in both of them!
+
+  Add the `pear` value to the `merge_software` list:
+```
+    <conditional name="merge_software_type">
+        <param name="merge_software" type="select" label="Merge software" help="Select the software to merge paired-end reads.">
+            <option value="vsearch" selected="true">Vsearch</option>
+            <option value="flash">Flash</option>
+            <option value="pear">PEAR</option>
+        </param>
+        <when value="flash">
+            <param name="expected_amplicon_size" type="integer" label="Expected amplicon size" help="Maximum amplicon length expected in approximately 90% of the amplicons." value="" />
+        </when>
+    </conditional>
+```
+
+:warning: remember, there are two places where the `merge_software` list is defined, add pear in both of them!
+
+# Upload and configure the databanks
+
+Databanks are declared in `loc` files, and `loc` files are declared in the Galaxy datatables.
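The link between a datatable and its `loc` file can be sketched as follows. This is a hypothetical fragment, not copied from the FROGS sample: the table name `frogs_db`, the column list, and the exact `loc` path are assumptions, and the authoritative entries are the ones in `tool_data_table_conf.xml.sample`.

```xml
<!-- Illustrative sketch only: table name, column list and loc path are assumptions;
     copy the real entries from tools/FROGS/tool_data_table_conf.xml.sample. -->
<tables>
    <!-- One <table> per databank type; lines of the loc file starting with comment_char are skipped -->
    <table name="frogs_db" comment_char="#">
        <!-- Column names map, in order, to the tab-separated fields of each loc line -->
        <columns>value, name, path</columns>
        <!-- Location of the loc file, with {__HERE__} already replaced by tools/FROGS -->
        <file path="tools/FROGS/tool-data/frogs_db.loc" />
    </table>
</tables>
```

Each non-comment line of the `loc` file then supplies one tab-separated value per declared column, which is how a tool's databank selector resolves a name to a file on disk.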
+
+* Manual installation:
+
+  * datatables: `<Galaxy_Dir>/config/tool_data_table_conf.xml`, example: `<Galaxy_Dir>/tools/FROGS/tool_data_table_conf.xml.sample`
+
+    Add the FROGS-wrappers datatables to the Galaxy datatables, but replace `{__HERE__}` by `tools/FROGS`.
+
+  * loc files example: `<Galaxy_Dir>/tools/FROGS/tool-data/`
+
+    Copy and rename them as indicated in the tool_data_table.
+
+    Then add entries as indicated in each loc file.
+
+* Toolshed installation:
+
+  * datatables: `<Galaxy_Dir>/config/shed_tool_data_table_conf.xml` (nothing to modify, FROGS datatables should automatically be added)
+  * loc files to fill in: `tool-data/toolshed.g2.bx.psu.edu/repos/frogs/frogs_<VERSION>/<RANDOM>/`
+
+
+
+We provide some databanks for each of these 3 data tables; you simply need to download them and add them to the corresponding `loc` files.
+
+- Assignation databank for the affiliation_OTU tool
+
+  URL: http://genoweb.toulouse.inra.fr/frogs_databanks/assignation
+
+  loc file: `frogs_db.loc`
+
+- Contaminant databank for the filters tool
+
+  URL: http://genoweb.toulouse.inra.fr/frogs_databanks/contaminants
+
+  loc file: `phiX_db.loc`
+
+- Hyper variable in length (HVL) amplicon databank for the affiliation_postprocess tool
+
+  URL: http://genoweb.toulouse.inra.fr/frogs_databanks/HVL
+
+  loc file: `HVL.loc`
+
+
+# Galaxy configuration
+
+## Setup Galaxy environment variables
+
+FROGS Python programs need to be available in the PATH. If you installed FROGS from source or via conda, add the `<FROGS_PATH>/app` directory to the Galaxy PATH environment variable (see the [environment-setup-file parameter](https://docs.galaxyproject.org/en/latest/admin/config.html#environment-setup-file)).
+
+## Install python packages inside virtual env
+
+Galaxy runs in a specific virtual env.
+To allow the FROGS clusters_stat tool to access the Python scipy library, you need to (re)install it inside the Galaxy virtual env:
+```
+cd <Galaxy_Dir>
+source .venv/bin/activate
+pip install scipy
+deactivate
+```
+
+## Avoid FROGS HTML report sanitization
+
+By default Galaxy sanitizes HTML outputs to prevent XSS attacks.
+Almost all FROGS tools output a report in HTML format. To allow these reports to be displayed inside Galaxy, their tools must be exempted from the Galaxy sanitization.
+You need to uncomment the `sanitize_whitelist_file` line in `<Galaxy_Dir>/config/galaxy.ini`, create the corresponding `<Galaxy_Dir>/config/sanitize_whitelist.txt` file if it does not already exist, and add the following lines to it.
+```
+FROGSSTAT_Phyloseq_Alpha_Diversity
+FROGSSTAT_Phyloseq_Beta_Diversity
+FROGSSTAT_Phyloseq_Composition_Visualisation
+FROGSSTAT_Phyloseq_Import_Data
+FROGSSTAT_Phyloseq_Multivariate_Analysis_Of_Variance
+FROGSSTAT_Phyloseq_Sample_Clustering
+FROGSSTAT_Phyloseq_Structure_Visualisation
+FROGS_Tree
+FROGS_affiliation_OTU
+FROGS_affiliations_stat
+FROGS_clustering
+FROGS_clusters_stat
+FROGS_filters
+FROGS_itsx
+FROGS_normalisation
+FROGS_preprocess
+FROGS_remove_chimera
+```
+
+## Set memory and parallelisation settings
+
+If you have more than one CPU, it is recommended to increase the number of CPUs used by tools.
+
+All CPUs must be on the same computer/node.
+
+
+
+ * Specifications
+
+   | Tool                  | RAM per CPU | Minimal RAM | Configuration example |
+   | :-------------------: | :---------: | :---------: | :-------------------: |
+   | preprocess            | 8 GB        | -           | 12 CPUs and 96 GB     |
+   | clustering            | -           | 10 GB       | 16 CPUs and 60 GB     |
+   | ITSx / remove_chimera | 3 GB        | 5 GB        | 12 CPUs and 36 GB     |
+   | affiliation_OTU       | -           | 20 GB       | 30 CPUs and 300 GB    |
+
+ * Galaxy configuration
+
+   You need to add `destination` sections (one per tool) in `<Galaxy_Dir>/config/job_conf.xml`.
+   Example for the SGE scheduler:
+
+
+```
+<destinations>
+    ...
+    <destination id="FROGS_preprocess_job" runner="drmaa">
+        <param id="galaxy_external_runjob_script">scripts/drmaa_external_runner.py</param>
+        <param id="galaxy_external_killjob_script">scripts/drmaa_external_killer.py</param>
+        <param id="galaxy_external_chown_script">scripts/external_chown_script.py</param>
+        <param id="nativeSpecification">-clear -q galaxyq -l mem=5G -l h_vmem=13G -pe parallel_smp 12</param>
+    </destination>
+    <destination id="FROGS_clustering_job" runner="drmaa">
+        <param id="galaxy_external_runjob_script">scripts/drmaa_external_runner.py</param>
+        <param id="galaxy_external_killjob_script">scripts/drmaa_external_killer.py</param>
+        <param id="galaxy_external_chown_script">scripts/external_chown_script.py</param>
+        <param id="nativeSpecification">-clear -q galaxyq -l mem=3G -l h_vmem=10G -pe parallel_smp 16</param>
+    </destination>
+    <destination id="FROGS_remove_chimera_job" runner="drmaa">
+        <param id="galaxy_external_runjob_script">scripts/drmaa_external_runner.py</param>
+        <param id="galaxy_external_killjob_script">scripts/drmaa_external_killer.py</param>
+        <param id="galaxy_external_chown_script">scripts/external_chown_script.py</param>
+        <param id="nativeSpecification">-clear -q galaxyq -l mem=3G -l h_vmem=4G -pe parallel_smp 12</param>
+    </destination>
+    <destination id="FROGS_itsx_job" runner="drmaa">
+        <param id="galaxy_external_runjob_script">scripts/drmaa_external_runner.py</param>
+        <param id="galaxy_external_killjob_script">scripts/drmaa_external_killer.py</param>
+        <param id="galaxy_external_chown_script">scripts/external_chown_script.py</param>
+        <param id="nativeSpecification">-clear -q galaxyq -l mem=3G -l h_vmem=4G -pe parallel_smp 12</param>
+    </destination>
+    <destination id="FROGS_affiliation_OTU_job" runner="drmaa">
+        <param id="galaxy_external_runjob_script">scripts/drmaa_external_runner.py</param>
+        <param id="galaxy_external_killjob_script">scripts/drmaa_external_killer.py</param>
+        <param id="galaxy_external_chown_script">scripts/external_chown_script.py</param>
+        <param id="nativeSpecification">-clear -q galaxyq -l mem=7G -l h_vmem=10G -pe parallel_smp 30</param>
+    </destination>
+</destinations>
+<tools>
+...
+    <tool id="FROGS_preprocess" destination="FROGS_preprocess_job"/>
+    <tool id="FROGS_clustering" destination="FROGS_clustering_job"/>
+    <tool id="FROGS_remove_chimera" destination="FROGS_remove_chimera_job"/>
+    <tool id="FROGS_itsx" destination="FROGS_itsx_job"/>
+    <tool id="FROGS_affiliation_OTU" destination="FROGS_affiliation_OTU_job"/>
+</tools>
+```
+
+# License
+    GNU GPL v3
+
+# Copyright
+    2018 INRA
+
+# Citation
+    Please cite the **FROGS article**: *Escudie F., et al. Bioinformatics, 2018. FROGS: Find, Rapidly, OTUs with Galaxy Solution.*
+
+# Contact
+    frogs@inra.fr
+