# HG changeset patch
# User pjbriggs
# Date 1539868684 14400
# Node ID 3ab198df8f3f86e81e054867cca437144fe1841d
# Parent 43d6f81bc667edc08cd4ae4b599cb2ad5ef4d4aa
planemo upload for repository https://github.com/pjbriggs/Amplicon_analysis-galaxy commit 15390f18b91d838880d952eb2714f689bbd8a042
diff -r 43d6f81bc667 -r 3ab198df8f3f README.rst
--- a/README.rst Wed Jun 13 07:45:06 2018 -0400
+++ b/README.rst Thu Oct 18 09:18:04 2018 -0400
@@ -26,20 +26,8 @@
instance to detect the dependencies and reference data correctly
at run time.
-1. Install the dependencies
----------------------------
-
-The ``install_tool_deps.sh`` script can be used to fetch and install the
-dependencies locally, for example::
-
- install_tool_deps.sh /path/to/local_tool_dependencies
-
-This can take some time to complete. When finished it should have
-created a set of directories containing the dependencies under the
-specified top level directory.
-
-2. Install the tool files
--------------------------
+1. Install the tool from the toolshed
+-------------------------------------
The core tool is hosted on the Galaxy toolshed, so it can be installed
directly from there (this is the recommended route):
@@ -58,7 +46,7 @@
-3. Install the reference data
+2. Install the reference data
-----------------------------
The script ``References.sh`` from the pipeline package at
@@ -72,33 +60,14 @@
will install the data in ``/path/to/pipeline/data``.
**NB** The final amount of data downloaded and uncompressed will be
-around 6GB.
-
-4. Configure dependencies and reference data in Galaxy
-------------------------------------------------------
-
-The final steps are to make your Galaxy installation aware of the
-tool dependencies and reference data, so it can locate them both when
-the tool is run.
-
-To target the tool dependencies installed previously, add the
-following lines to the ``dependency_resolvers_conf.xml`` file in the
-Galaxy ``config`` directory::
+around 9GB.
-
- ...
-
-
- ...
-
+3. Configure reference data location in Galaxy
+----------------------------------------------
-(NB it is recommended to place these *before* the ````
-resolvers)
-
-(If you're not familiar with dependency resolvers in Galaxy then
-see the documentation at
-https://docs.galaxyproject.org/en/master/admin/dependency_resolvers.html
-for more details.)
+The final step is to make your Galaxy installation aware of the
+location of the reference data, so that it can be located when the
+tool is run.
The tool locates the reference data via an environment variable called
``AMPLICON_ANALYSIS_REF_DATA_PATH``, which needs to be set to the parent
@@ -108,7 +77,8 @@
installation is configured:
* **For local instances:** add a line to set it in the
- ``config/local_env.sh`` file of your Galaxy installation, e.g.::
+ ``config/local_env.sh`` file of your Galaxy installation (you
+ may need to create a new empty file first), e.g.::
export AMPLICON_ANALYSIS_REF_DATA_PATH=/path/to/pipeline/data
@@ -124,9 +94,9 @@
(For more about job destinations see the Galaxy documentation at
- https://galaxyproject.org/admin/config/jobs/#job-destinations)
+ https://docs.galaxyproject.org/en/master/admin/jobs.html#job-destinations)
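As a sketch only (the destination ``id`` and ``runner`` below are placeholder values, not names from this repository), the variable can be set for a specific job destination in Galaxy's ``config/job_conf.xml`` using an ``<env>`` tag:

```xml
<!-- Hypothetical destination: set the reference data path for jobs
     routed to this destination (id/runner are placeholders) -->
<destination id="amplicon_analysis" runner="local">
    <env id="AMPLICON_ANALYSIS_REF_DATA_PATH">/path/to/pipeline/data</env>
</destination>
```

The tool would then need to be mapped to that destination in the ``<tools>`` section of the same file.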
-5. Enable rendering of HTML outputs from pipeline
+4. Enable rendering of HTML outputs from pipeline
-------------------------------------------------
To ensure that HTML outputs are displayed correctly in Galaxy
@@ -171,46 +141,32 @@
https://github.com/galaxyproject/galaxy/issues/4490 and
https://github.com/galaxyproject/galaxy/issues/1676
-Appendix: availability of tool dependencies
-===========================================
-
-The tool takes its dependencies from the underlying pipeline script (see
-https://github.com/MTutino/Amplicon_analysis/blob/master/README.md
-for details).
+Appendix: installing the dependencies manually
+==============================================
-As noted above, currently the ``install_tool_deps.sh`` script can be
-used to manually install the dependencies for a local tool install.
+If the tool is installed from the Galaxy toolshed (recommended) then
+the dependencies should be installed automatically and this step can
+be skipped.
-In principle these should also be available if the tool were installed
-from a toolshed. However it would be preferrable in this case to get as
-many of the dependencies as possible via the ``conda`` dependency
-resolver.
+Otherwise the ``install_amplicon_analysis.sh`` script can be used
+to fetch and install the dependencies locally, for example::
-The following are known to be available via conda, with the required
-version:
+ install_amplicon_analysis.sh /path/to/local_tool_dependencies
- - cutadapt 1.8.1
- - sickle-trim 1.33
- - bioawk 1.0
- - fastqc 0.11.3
- - R 3.2.0
-
-Some dependencies are available but with the "wrong" versions:
+(This is the same script that is used to install dependencies from
+the toolshed.) This can take some time to complete; when finished it
+will have created a directory called ``Amplicon_analysis-1.2.3``
+containing the dependencies under the specified top-level directory.
- - spades (need 3.5.0)
- - qiime (need 1.8.0)
- - blast (need 2.2.26)
- - vsearch (need 1.1.3)
-
-The following dependencies are currently unavailable:
+**NB** The installed dependencies will occupy around 2.6GB of disk
+space.
- - fasta_number (need 02jun2015)
- - fasta-splitter (need 0.2.4)
- - rdp_classifier (need 2.2)
- - microbiomeutil (need r20110519)
+You will need to make sure that the ``bin`` subdirectory of this
+directory is on Galaxy's ``PATH`` at runtime, so that the tool can
+access the dependencies; for example, add a line to the
+``local_env.sh`` file like::
-(NB usearch 6.1.544 and 8.0.1623 are special cases which must be
-handled outside of Galaxy's dependency management systems.)
+ export PATH=/path/to/local_tool_dependencies/Amplicon_analysis-1.2.3/bin:$PATH
History
=======
@@ -218,6 +174,8 @@
========== ======================================================================
Version Changes
---------- ----------------------------------------------------------------------
+1.2.3.0 Updated to Amplicon_Analysis_Pipeline version 1.2.3; install
+ dependencies via tool_dependencies.xml.
1.2.2.0 Updated to Amplicon_Analysis_Pipeline version 1.2.2 (removes
jackknifed analysis which is not captured by Galaxy tool)
1.2.1.0 Updated to Amplicon_Analysis_Pipeline version 1.2.1 (adds
diff -r 43d6f81bc667 -r 3ab198df8f3f amplicon_analysis_pipeline.py
--- a/amplicon_analysis_pipeline.py Wed Jun 13 07:45:06 2018 -0400
+++ b/amplicon_analysis_pipeline.py Thu Oct 18 09:18:04 2018 -0400
@@ -60,9 +60,10 @@
sys.stderr.write("%s\n\n" % ('*'*width))
def clean_up_name(sample):
- # Remove trailing "_L[0-9]+_001" from Fastq
- # pair names
- split_name = sample.split('_')
+ # Remove extensions and trailing "_L[0-9]+_001" from
+ # Fastq pair names
+ sample_name = '.'.join(sample.split('.')[:1])
+ split_name = sample_name.split('_')
if split_name[-1] == "001":
split_name = split_name[:-1]
if split_name[-1].startswith('L'):
@@ -139,10 +140,12 @@
# Link to FASTQs and construct Final_name.txt file
sample_names = []
+ print "-- making Final_name.txt"
with open("Final_name.txt",'w') as final_name:
fastqs = iter(args.fastq_pairs)
for sample_name,fqr1,fqr2 in zip(fastqs,fastqs,fastqs):
sample_name = clean_up_name(sample_name)
+ print " %s" % sample_name
r1 = "%s_R1_.fastq" % sample_name
r2 = "%s_R2_.fastq" % sample_name
os.symlink(fqr1,r1)
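The hunk above only shows the start of the revised ``clean_up_name``; a minimal standalone sketch of the cleanup logic (assuming the function ends by rejoining the remaining components with underscores, which is outside the hunk shown) is:

```python
def clean_up_name(sample):
    # Strip extensions (everything after the first '.'), then drop a
    # trailing "_001" component and a lane component ("_L<digits>")
    # from the Fastq pair name
    sample_name = sample.split('.')[0]
    split_name = sample_name.split('_')
    if split_name[-1] == "001":
        split_name = split_name[:-1]
    if split_name[-1].startswith('L'):
        split_name = split_name[:-1]
    return '_'.join(split_name)

print(clean_up_name("PJB1_S1_L001_001.fastq.gz"))  # PJB1_S1
```

So a name like ``PJB1_S1_L001_001.fastq.gz`` is reduced to ``PJB1_S1`` before being used to build the ``Final_name.txt`` entries.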
diff -r 43d6f81bc667 -r 3ab198df8f3f amplicon_analysis_pipeline.xml
--- a/amplicon_analysis_pipeline.xml Wed Jun 13 07:45:06 2018 -0400
+++ b/amplicon_analysis_pipeline.xml Thu Oct 18 09:18:04 2018 -0400
@@ -1,21 +1,7 @@
-
+
analyse 16S rRNA data from Illumina Miseq paired-end reads
- amplicon_analysis_pipeline
- cutadapt
- sickle
- bioawk
- pandaseq
- spades
- fastqc
- qiime
- blast
- fasta-splitter
- rdp-classifier
- R
- vsearch
- microbiomeutil
- fasta_number
+ amplicon_analysis_pipeline
diff -r 43d6f81bc667 -r 3ab198df8f3f install_amplicon_analysis.sh
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/install_amplicon_analysis.sh Thu Oct 18 09:18:04 2018 -0400
@@ -0,0 +1,425 @@
+#!/bin/sh -e
+#
+# Prototype script to setup a conda environment with the
+# dependencies needed for the Amplicon_analysis_pipeline
+# script
+#
+# Handle command line
+usage()
+{
+ echo "Usage: $(basename $0) [DIR]"
+ echo ""
+ echo "Installs the Amplicon_analysis_pipeline package plus"
+ echo "dependencies in directory DIR (or current directory "
+ echo "if DIR not supplied)"
+}
+if [ ! -z "$1" ] ; then
+ # Check if help was requested
+ case "$1" in
+ --help|-h)
+ usage
+ exit 0
+ ;;
+ esac
+ # Assume it's the installation directory
+ cd "$1"
+fi
+# Versions
+PIPELINE_VERSION=1.2.3
+RDP_CLASSIFIER_VERSION=2.2
+# Directories
+TOP_DIR=$(pwd)/Amplicon_analysis-${PIPELINE_VERSION}
+BIN_DIR=${TOP_DIR}/bin
+CONDA_DIR=${TOP_DIR}/conda
+CONDA_BIN=${CONDA_DIR}/bin
+CONDA_LIB=${CONDA_DIR}/lib
+CONDA=${CONDA_BIN}/conda
+ENV_NAME="amplicon_analysis_pipeline@${PIPELINE_VERSION}"
+ENV_DIR=${CONDA_DIR}/envs/$ENV_NAME
+#
+# Functions
+#
+# Report failure and terminate script
+fail()
+{
+ echo ""
+ echo ERROR $@ >&2
+ echo ""
+ echo "$(basename $0): installation failed"
+ exit 1
+}
+#
+# Rewrite the shebangs in the installed conda scripts
+# to remove the full path to conda 'bin' directory
+rewrite_conda_shebangs()
+{
+ pattern="s,^#!${CONDA_BIN}/,#!/usr/bin/env ,g"
+ find ${CONDA_BIN} -type f -exec sed -i "$pattern" {} \;
+}
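The effect of the ``sed`` pattern in ``rewrite_conda_shebangs`` can be demonstrated in isolation; the sketch below applies the same substitution to a throwaway file (the directory here is a temporary placeholder, not a real conda install):

```shell
# Create a fake script whose shebang uses an absolute "conda bin" path,
# then rewrite it to "#!/usr/bin/env ..." as the installer does. This
# avoids breakage when the install path is too long for a shebang line.
CONDA_BIN=$(mktemp -d)
printf '#!%s/python\nprint("hi")\n' "$CONDA_BIN" > "$CONDA_BIN/tool"
pattern="s,^#!${CONDA_BIN}/,#!/usr/bin/env ,g"
sed -i "$pattern" "$CONDA_BIN/tool"
head -1 "$CONDA_BIN/tool"   # #!/usr/bin/env python
```

The comma delimiter in the ``sed`` expression is what allows the pattern to contain slashes from the directory path.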
+#
+# Install conda
+install_conda()
+{
+ echo "++++++++++++++++"
+ echo "Installing conda"
+ echo "++++++++++++++++"
+ if [ -e ${CONDA_DIR} ] ; then
+ echo "*** $CONDA_DIR already exists ***" >&2
+ return
+ fi
+ local cwd=$(pwd)
+ local wd=$(mktemp -d)
+ cd $wd
+ wget -q https://repo.continuum.io/miniconda/Miniconda2-latest-Linux-x86_64.sh
+ bash ./Miniconda2-latest-Linux-x86_64.sh -b -p ${CONDA_DIR}
+ echo Installed conda in ${CONDA_DIR}
+ # Update the installation files
+ # This is to avoid problems when the length of the installation
+ # directory path exceeds the limit for the shebang statement
+ # in the conda files
+ echo ""
+ echo -n "Rewriting conda shebangs..."
+ rewrite_conda_shebangs
+ echo "ok"
+ echo -n "Adding conda bin to PATH..."
+ PATH=${CONDA_BIN}:$PATH
+ echo "ok"
+ cd $cwd
+ rm -rf $wd/*
+ rmdir $wd
+}
+#
+# Create conda environment
+install_conda_packages()
+{
+ echo "+++++++++++++++++++++++++"
+ echo "Installing conda packages"
+ echo "+++++++++++++++++++++++++"
+ local cwd=$(pwd)
+ local wd=$(mktemp -d)
+ cd $wd
+ cat >environment.yml <${BIN_DIR}/Amplicon_analysis_pipeline.sh <${BIN_DIR}/install_reference_data.sh <${BIN_DIR}/ChimeraSlayer.pl <INSTALL.log 2>&1
+ echo "ok"
+ cd R-3.2.1
+ echo -n "Running configure..."
+ ./configure --prefix=$INSTALL_DIR --with-x=no --with-readline=no >>INSTALL.log 2>&1
+ echo "ok"
+ echo -n "Running make..."
+ make >>INSTALL.log 2>&1
+ echo "ok"
+ echo -n "Running make install..."
+ make install >>INSTALL.log 2>&1
+ echo "ok"
+ cd $cwd
+ rm -rf $wd/*
+ rmdir $wd
+ . ${CONDA_BIN}/deactivate
+}
+setup_pipeline_environment()
+{
+ echo "+++++++++++++++++++++++++++++++"
+ echo "Setting up pipeline environment"
+ echo "+++++++++++++++++++++++++++++++"
+ # vsearch113
+ echo -n "Setting up vsearch113..."
+ if [ -e ${BIN_DIR}/vsearch113 ] ; then
+ echo "already exists"
+ elif [ ! -e ${ENV_DIR}/bin/vsearch ] ; then
+ echo "failed"
+ fail "vsearch not found"
+ else
+ ln -s ${ENV_DIR}/bin/vsearch ${BIN_DIR}/vsearch113
+ echo "ok"
+ fi
+ # fasta-splitter.pl
+ echo -n "Setting up fasta-splitter.pl..."
+ if [ -e ${BIN_DIR}/fasta-splitter.pl ] ; then
+ echo "already exists"
+ elif [ ! -e ${ENV_DIR}/share/fasta-splitter/fasta-splitter.pl ] ; then
+ echo "failed"
+ fail "fasta-splitter.pl not found"
+ else
+ ln -s ${ENV_DIR}/share/fasta-splitter/fasta-splitter.pl ${BIN_DIR}/fasta-splitter.pl
+ echo "ok"
+ fi
+ # rdp_classifier.jar
+ local rdp_classifier_jar=rdp_classifier-${RDP_CLASSIFIER_VERSION}.jar
+ echo -n "Setting up rdp_classifier.jar..."
+ if [ -e ${TOP_DIR}/share/rdp_classifier/${rdp_classifier_jar} ] ; then
+ echo "already exists"
+ elif [ ! -e ${ENV_DIR}/share/rdp_classifier/rdp_classifier.jar ] ; then
+ echo "failed"
+ fail "rdp_classifier.jar not found"
+ else
+ mkdir -p ${TOP_DIR}/share/rdp_classifier
+ ln -s ${ENV_DIR}/share/rdp_classifier/rdp_classifier.jar ${TOP_DIR}/share/rdp_classifier/${rdp_classifier_jar}
+ echo "ok"
+ fi
+ # qiime_config
+ echo -n "Setting up qiime_config..."
+ if [ -e ${TOP_DIR}/qiime/qiime_config ] ; then
+ echo "already exists"
+ else
+ mkdir -p ${TOP_DIR}/qiime
+ cat >${TOP_DIR}/qiime/qiime_config <>$INSTALL_DIR/INSTALLATION.log 2>&1
-EOF
- popd
- rm -rf $wd/*
- rmdir $wd
-}
-function install_amplicon_analysis_pipeline_1_2_2() {
- install_amplicon_analysis_pipeline $1 1.2.2
-}
-function install_amplicon_analysis_pipeline_1_2_1() {
- install_amplicon_analysis_pipeline $1 1.2.1
-}
-function install_amplicon_analysis_pipeline_1_1() {
- install_amplicon_analysis_pipeline $1 1.1
-}
-function install_amplicon_analysis_pipeline_1_0() {
- install_amplicon_analysis_pipeline $1 1.0
-}
-function install_amplicon_analysis_pipeline() {
- version=$2
- echo Installing Amplicon_analysis $version
- install_dir=$1/amplicon_analysis_pipeline/$version
- if [ -f $install_dir/env.sh ] ; then
- return
- fi
- mkdir -p $install_dir
- echo Moving to $install_dir
- pushd $install_dir
- wget -q https://github.com/MTutino/Amplicon_analysis/archive/v${version}.tar.gz
- tar zxf v${version}.tar.gz
- mv Amplicon_analysis-${version} Amplicon_analysis
- rm -rf v${version}.tar.gz
- popd
- # Make setup file
- cat > $install_dir/env.sh < $install_dir/env.sh < $INSTALL_DIR/env.sh <$INSTALL_DIR/INSTALLATION.log 2>&1
- mv sickle $INSTALL_DIR/bin
- popd
- rm -rf $wd/*
- rmdir $wd
- # Make setup file
- cat > $INSTALL_DIR/env.sh <$INSTALL_DIR/INSTALLATION.log 2>&1
- mv bioawk $INSTALL_DIR/bin
- mv maketab $INSTALL_DIR/bin
- popd
- rm -rf $wd/*
- rmdir $wd
- # Make setup file
- cat > $INSTALL_DIR/env.sh <$install_dir/INSTALLATION.log 2>&1
- ./configure --prefix=$install_dir >>$install_dir/INSTALLATION.log 2>&1
- make; make install >>$install_dir/INSTALLATION.log 2>&1
- popd
- rm -rf $wd/*
- rmdir $wd
- # Make setup file
- cat > $1/pandaseq/2.8.1/env.sh < $1/spades/3.5.0/env.sh < $1/fastqc/0.11.3/env.sh < test.f90
- gfortran -o test test.f90
- LGF=`ldd test | grep libgfortran | awk '{print $3}'`
- LGF_CANON=`readlink -f $LGF`
- LGF_VERS=`objdump -p $LGF_CANON | grep GFORTRAN_1 | sed -r 's/.*GFORTRAN_1\.([0-9])+/\1/' | sort -n | tail -1`
- if [ $LGF_VERS -gt $BUNDLED_LGF_VERS ]; then
- cp -p $BUNDLED_LGF_CANON ${BUNDLED_LGF_CANON}.bundled
- cp -p $LGF_CANON $BUNDLED_LGF_CANON
- fi
- popd
- rm -rf $wd/*
- rmdir $wd
- # Atlas 3.10 (build from source)
- # NB this stolen from galaxyproject/iuc-tools
- ##local wd=$(mktemp -d)
- ##echo Moving to $wd
- ##pushd $wd
- ##wget -q https://depot.galaxyproject.org/software/atlas/atlas_3.10.2+gx0_src_all.tar.bz2
- ##wget -q https://depot.galaxyproject.org/software/lapack/lapack_3.5.0_src_all.tar.gz
- ##wget -q https://depot.galaxyproject.org/software/atlas/atlas_patch-blas-lapack-1.0_src_all.diff
- ##wget -q https://depot.galaxyproject.org/software/atlas/atlas_patch-shared-lib-1.0_src_all.diff
- ##wget -q https://depot.galaxyproject.org/software/atlas/atlas_patch-cpu-throttle-1.0_src_all.diff
- ##tar -jxvf atlas_3.10.2+gx0_src_all.tar.bz2
- ##cd ATLAS
- ##mkdir build
- ##patch -p1 < ../atlas_patch-blas-lapack-1.0_src_all.diff
- ##patch -p1 < ../atlas_patch-shared-lib-1.0_src_all.diff
- ##patch -p1 < ../atlas_patch-cpu-throttle-1.0_src_all.diff
- ##cd build
- ##../configure --prefix="$INSTALL_DIR" -D c -DWALL -b 64 -Fa alg '-fPIC' --with-netlib-lapack-tarfile=../../lapack_3.5.0_src_all.tar.gz -v 2 -t 0 -Si cputhrchk 0
- ##make
- ##make install
- ##popd
- ##rm -rf $wd/*
- ##rmdir $wd
- export ATLAS_LIB_DIR=$INSTALL_DIR/lib
- export ATLAS_INCLUDE_DIR=$INSTALL_DIR/include
- export ATLAS_BLAS_LIB_DIR=$INSTALL_DIR/lib/atlas
- export ATLAS_LAPACK_LIB_DIR=$INSTALL_DIR/lib/atlas
- export ATLAS_ROOT_PATH=$INSTALL_DIR
- export LD_LIBRARY_PATH=$INSTALL_DIR/lib:$LD_LIBRARY_PATH
- export LD_LIBRARY_PATH=$INSTALL_DIR/lib/atlas:$LD_LIBRARY_PATH
- # Numpy 1.7.1
- local wd=$(mktemp -d)
- echo Moving to $wd
- pushd $wd
- wget -q https://depot.galaxyproject.org/software/numpy/numpy_1.7_src_all.tar.gz
- tar -zxvf numpy_1.7_src_all.tar.gz
- cd numpy-1.7.1
- cat > site.cfg < $INSTALL_DIR/env.sh < $install_dir/env.sh <$install_dir/INSTALLATION.log 2>&1
- mv * $install_dir
- popd
- # Clean up
- rm -rf $wd/*
- rmdir $wd
- # Make setup file
-cat > $install_dir/env.sh < $install_dir/env.sh < $install_dir/env.sh <>$install_dir/INSTALLATION.log
-EOF
- done
- # Install fasta-splitter
- wget -q http://kirill-kryukov.com/study/tools/fasta-splitter/files/fasta-splitter-0.2.4.zip
- unzip -qq fasta-splitter-0.2.4.zip
- chmod 0755 fasta-splitter.pl
- mv fasta-splitter.pl $install_dir/bin
- popd
- # Clean up
- rm -rf $wd/*
- rmdir $wd
- # Make setup file
-cat > $install_dir/env.sh < $install_dir/env.sh < $install_dir/env.sh <$install_dir/bin/uc2otutab.py
- cat uc2otutab.py >>$install_dir/bin/uc2otutab.py
- chmod +x $install_dir/bin/uc2otutab.py
- popd
- # Clean up
- rm -rf $wd/*
- rmdir $wd
- # Make setup file
-cat > $install_dir/env.sh <
+
+
+
+
+ https://raw.githubusercontent.com/pjbriggs/Amplicon_analysis-galaxy/master/install_amplicon_analysis.sh
+
+ sh ./install_amplicon_analysis.sh $INSTALL_DIR
+
+
+ $INSTALL_DIR/Amplicon_analysis-1.2.3/bin
+
+
+
+
+