changeset 3:cfd7c4aa5c26 draft default tip
Uploaded
author | melpetera |
---|---|
date | Fri, 03 Jan 2020 04:09:28 -0500 |
parents | 12cf1eed21f4 |
children | |
files | GalFilter/Dockerfile GalFilter/README.md GalFilter/README.txt GalFilter/about_docker.md GalFilter/filter_wrap.R GalFilter/generic_filter.xml GalFilter/static/images/Thumbs.db GalFilter/static/images/filter.png GalFilter/test-generic_filter.sh |
diffstat | 9 files changed, 327 insertions(+), 46 deletions(-) |
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/GalFilter/Dockerfile	Fri Jan 03 04:09:28 2020 -0500
@@ -0,0 +1,87 @@
+################################################################################
+###
+### [CONTAINER CORE FUNCTIONS]:
+### install "Tool - Generic Filter" Galaxy tool (and required third part softwares, libraries, ...).
+### [NOTE]
+### please refer to README.md and about_docker.md files for further informations
+###
+################################################################################
+
+################################################################################
+### fix parent containter
+FROM ubuntu:16.04
+
+################################################################################
+### set author
+MAINTAINER Nils Paulhe <nils.paulhe@inra.fr>
+
+################################################################################
+### sets the environment variables
+ENV TOOL_VERSION = "release_2016.03.03"
+ENV CONTAINER_VERSION = 0.1
+
+LABEL version = "${CONTAINER_VERSION}"
+LABEL tool_version = "${TOOL_VERSION}"
+
+################################################################################
+### install third part tools
+
+# add debian repo for latest version of R
+RUN echo "deb http://cran.univ-paris1.fr/bin/linux/ubuntu trusty/" >> /etc/apt/sources.list && \
+    apt-key adv --keyserver keyserver.ubuntu.com --recv-keys E084DAB9
+
+# Update and upgrade system
+RUN apt-get update && \
+    apt-get -y upgrade
+
+# install R
+RUN apt-get install -y \
+    r-base \
+    libcurl4-openssl-dev \
+    libxml2-dev
+# NOTE: add `apt-get install -y git` if required
+
+# init R env. (Docker)
+RUN echo "r <- getOption('repos'); r['CRAN'] <- 'http://cran.us.r-project.org'; options(repos = r);" > ~/.Rprofile
+
+# install R libs
+RUN Rscript -e "install.packages('batch', dep=TRUE)"
+
+################################################################################
+### install core scripts
+
+# init. WORKDIR
+RUN [ "mkdir", "/scripts" ]
+
+#
+# [NOTE] to add scripts, we have two options: get them from GitHub OR copy them from this directory
+#
+
+# get scripts using Git (option 1)
+# RUN cd /scripts && \
+#     git clone -b release/${TOOL_VERSION} --recursive https://github.com/workflow4metabolomics/tool-generic_filter.git
+
+# copy scripts files from this directory (option 2)
+COPY "." "/scripts/"
+
+## set WORKDIR
+# WORKDIR "/scripts"
+
+# set authorizations
+RUN ["chmod", "a+x", "/scripts/filter_wrap.R"]
+
+# make tool accessible through PATH
+ENV PATH = $PATH:/scripts
+
+################################################################################
+### clean
+RUN apt-get clean && \
+    apt-get autoremove -y && \
+    rm -rf /var/lib/{apt,dpkg,cache,log}/ /tmp/* /var/tmp/*
+# NOTE: run `apt-get remove -y git && \` if required
+
+################################################################################
+### Define Entry point script
+## ENTRYPOINT ["/scripts/filter_wrap.R"]
+
+### [END]
\ No newline at end of file
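Review note on the cleanup step in the Dockerfile above: `RUN` in shell form executes through `/bin/sh`, which is dash on `ubuntu:16.04`, and dash does not perform brace expansion. The `rm -rf /var/lib/{apt,dpkg,cache,log}/` argument is therefore most likely passed as one literal path, and nothing under `/var/lib` is removed. The following is a minimal, portable sketch of the same cleanup without brace expansion. The function names `cleanup_targets` and `clean_dirs` and the optional root argument are illustrative assumptions, not part of the changeset.

```shell
#!/bin/sh
# Hypothetical helper: print the directories the Dockerfile intends to clean,
# one per line, without relying on brace expansion.
# $1 = optional root prefix (empty means the real filesystem root).
cleanup_targets() {
    root="${1:-}"
    for name in apt dpkg cache log; do
        printf '%s/var/lib/%s\n' "$root" "$name"
    done
}

# Hypothetical helper: remove every target directory under the given root.
# WARNING: with an empty root this deletes the real /var/lib/apt, /var/lib/dpkg, ...
clean_dirs() {
    cleanup_targets "$1" | while IFS= read -r dir; do
        rm -rf -- "$dir"
    done
}
```

Inside a Dockerfile, the same idea would presumably be written inline as a `for d in apt dpkg cache log; do rm -rf "/var/lib/$d"; done` loop within the existing `RUN` instruction.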
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/GalFilter/README.md	Fri Jan 03 04:09:28 2020 -0500
@@ -0,0 +1,53 @@
+Generic Filter
+=======
+
+Metadata
+-----------
+
+ * **@name**: Generic Filter
+ * **@version**: 2020.01
+ * **@authors**: Marion Landi and Melanie Petera for first version - Maintainer: Melanie Petera (PFEM ; INRAE ; MetaboHUB)
+ * **@init date**: 2014, december
+ * **@main usage**: This tool allows to remove all samples and/or variables corresponding to specific values regarding designated factors or numerical variables.
+
+
+Context
+-----------
+
+This tool is provided as one of the [Workflow4Metabolomics](http://workflow4metabolomics.org) Galaxy instance utilities. W4M is a French infrastructure providing software tools to process, analyse and annotate metabolomics data.
+
+User interface is based on the Galaxy platform (homepage: https://galaxyproject.org/). It is an open, web-based platform for data intensive biomedical research. Whether on the free public server or your own instance, you can perform, reproduce, and share complete analyses.
+
+
+Configuration
+-----------
+
+### Requirement:
+ * R software: version > 3.0.0 recommended
+ * Specific R library: 'batch'
+
+### Docker:
+ * Use of this tool in a docker context:
+Information is provided in the [about_docker.md file](about_docker.md).
+
+
+Technical description
+-----------
+
+Main files:
+
+- filter_script.R: R function (core script)
+- filter_wrap.R: R script to link the main R function to inputs
+- generic_filter.xml: XML wrapper (interface for Galaxy)
+
+
+Services provided
+-----------
+
+ * Help and support: support@workflow4metabolomics.org
+
+
+License
+-----------
+
+ * Cea Cnrs Inria Logiciel Libre License, version 2.1 (CECILL-2.1)
--- a/GalFilter/README.txt	Tue Jun 27 05:57:04 2017 -0400
+++ /dev/null	Thu Jan 01 00:00:00 1970 +0000
@@ -1,32 +0,0 @@
-## ****** global_filter environnemnt : ****** ##
-# version April 2017 M Landi / M Petera
-
-## --- R bin and Packages : --- ##
-$ R --version
-R version > 3.0.0
-Platform: x86_64-redhat-linux-gnu (64-bit)
-
-The dependent libs are :
->install.packages("batch", dep=TRUE)
---
-## --- Binary dependencies --- ##
-NA
---
-## --- Config : --- ##
-NA
---
-## --- XML HELP PART --- ##
-one image :
-filter.png
---
-## --- DATASETS --- ##
-No data set! waiting for galaxy pages
---
-## --- ??? COMMENTS ??? --- ##
-NA
---
\ No newline at end of file
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/GalFilter/about_docker.md	Fri Jan 03 04:09:28 2020 -0500
@@ -0,0 +1,121 @@
+# Tool - Generic Filter [Dockerfile]
+
+
+## Metadata
+
+- **@authors**: Nils Paulhe <nils.paulhe@inra.fr> (Only the docker part)
+- **@date creation**: `2017-09-04`
+- **@main usage**: create a Docker environment / container for "Tool - Generic Filter"
+
+## About
+
+For all informations about the tool please refer to its [README file](README.md).
+For further information about Workflow4Metabolomics project and the people involved, please refer to [workflow4metabolomics.org](http://workflow4metabolomics.org/), [W4M github](https://github.com/workflow4metabolomics/) and [W4M Docker Hub](https://hub.docker.com/r/workflow4metabolomics/).
+
+## Configuration
+
+### Requirement:
+ * Docker Engine, Docker skills
+ * a Galaxy server docker compliant
+
+### Warning:
+ * These scripts are provided WITHOUT ANY WARRANTY.
+ * These scripts should be run by a system administrator (expert).
+
+## Services provided
+
+Build a docker container for "Tool - Generic Filter" Galaxy Tool.
+Provide a XML Galaxy wrapper: generic_filter.docker.xml
+
+## Technical description
+
+### Create the docker container
+
+``` bash
+docker build -t workflow4metabolomics/tool-generic_filter:2017.06 .
+```
+
+### Add the tool in Galaxy
+
+Note: the files name and path are just examples. Adapt them to your own Galaxy configuration / practices.
+
+If required, add in `config/job_conf.xml` file the minimal docker options:
+
+``` xml
+    <destinations default="docker_local">
+        <destination id="local" runner="local"/>
+        <destination id="docker_local" runner="local">
+            <param id="docker_enabled">true</param>
+            <param id="docker_sudo">false</param>
+        </destination>
+    </destinations>
+```
+
+For more options please refer to the [official documentation](https://galaxyproject.org/admin/tools/docker/).
+
+Copy or create a symbolic link of generic_filter.docker.xml file into your `tools/docker` directory (feel free to create or change the target directory).
+Then add this XML resource in your `config/tool_conf.xml` file. For example:
+
+``` xml
+    <section id="docker_tools" name="Docker Tools">
+        <tool file="docker/generic_filter.docker.xml"/>
+    </section>
+```
+
+### Modify this tool's XML config. file
+
+replace these sections:
+```xml
+    <!-- requirements -->
+    <requirements>
+        <requirement type="package" version="1.1_4">r-batch</requirement>
+    </requirements>
+
+    <!-- cmd -->
+    <command>
+        Rscript '$__tool_directory__/filter_wrap.R'
+        dataMatrix_in "$dataMatrix_in"
+        sampleMetadata_in "$sampleMetadata_in"
+        <!-- ... -->
+```
+
+by these sections:
+```xml
+    <!-- requirements -->
+    <requirements>
+        <container type="docker">workflow4metabolomics/tool-generic_filter:2017.06</container>
+    </requirements>
+
+    <!-- cmd -->
+    /usr/bin/Rscript /scripts/filter_wrap.R
+        dataMatrix_in "$dataMatrix_in"
+        <!-- ... -->
+```
+
+## License (Dockerfile only!)
+
+The `Dockerfile` file is under the following license:
+```
+    Copyright (c) 2017 workflow4metabolomics.org / INRA
+
+    Permission is hereby granted, free of charge, to any person obtaining a copy
+    of this software and associated documentation files (the "Software"), to deal
+    in the Software without restriction, including without limitation the rights
+    to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
+    copies of the Software, and to permit persons to whom the Software is
+    furnished to do so, subject to the following conditions:
+
+    The above copyright notice and this permission notice shall be included in all
+    copies or substantial portions of the Software.
+
+    THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
+    IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
+    FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
+    AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
+    LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
+    OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
+    SOFTWARE.
+```
+
+- For the Galaxy Tool's license, please refer to its `README` file.
+- For the Galaxy Wrapper's license, please refer to its `XML` file.
--- a/GalFilter/filter_wrap.R	Tue Jun 27 05:57:04 2017 -0400
+++ b/GalFilter/filter_wrap.R	Fri Jan 03 04:09:28 2020 -0500
@@ -10,6 +10,7 @@
 # V-1: Restriction of old filter wrapper to Filter according to factors #
 # V-1.1: Modification to allow the choice of meta-data table for filtering #
 # V-2: Addition of numerical filter #
+# V-2.5: -h option + additional information in stdout #
 # #
 # #
 # Input files: dataMatrix.txt ; sampleMetadata.txt ; variableMetadata.txt #
@@ -19,6 +20,23 @@
 library(batch) #necessary for parseCommandArgs function
+
+# Constants
+argv <- commandArgs(trailingOnly = FALSE)
+script.path <- sub("--file=","",argv[grep("--file=",argv)])
+prog.name <- basename(script.path)
+
+# Help
+if (length(grep('-h', argv)) >0) {
+    cat("Usage:", prog.name,
+        "dataMatrix_in myDataMatrix.tsv",
+        "sampleMetadata_in mySampleMetadata.tsv",
+        "variableMetadata_in myVariableMetadata.tsv",
+        "...",
+        "\n")
+    quit(status = 0)
+}
+
 args = parseCommandArgs(evaluate=FALSE) #interpretation of arguments given in command line as an R list of objects
 source_local <- function(...){
@@ -32,6 +50,14 @@
 if(length(args) < 8){ stop("NOT enough argument !!!") }
+
+cat('\nJob starting time:\n',format(Sys.time(), "%a %d %b %Y %X"),
+'\n\n--------------------------------------------------------------------',
+'\nParameters used in "Generic Filter":\n\n')
+print(args)
+cat('--------------------------------------------------------------------\n\n')
+
+
 list_num <- NULL
 if(!is.null(args$parm_col)){
 for( i in which(names(args)=="num_file") ){
@@ -59,5 +85,13 @@
 # NUM, ls.num, FACT, ls.fact,
 # ion.file.out, meta.samp.file.out, meta.ion.file.out)
+
+cat('\n--------------------------------------------------------------------',
+'\nInformation about R (version, Operating System, attached or loaded packages):\n\n')
+sessionInfo()
+cat('--------------------------------------------------------------------\n',
+'\nJob ending time:\n',format(Sys.time(), "%a %d %b %Y %X"))
+
+
 #delete the parameters to avoid the passage to the next tool in .RData image
 rm(args)
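Review note on the help check added to `filter_wrap.R`: `grep('-h', argv)` matches any argument that contains `-h` as a substring, so a file path such as `my-hplc_matrix.tsv` would print the usage message and quit before any filtering runs. The sketch below illustrates exact matching in shell, the language of the changeset's test script. In R the analogous test would presumably be `'-h' %in% argv`. The function name `wants_help` and the example file name are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if one argument is exactly "-h".
wants_help() {
    for arg in "$@"; do
        case "$arg" in
            -h) return 0 ;;
        esac
    done
    return 1
}

# An argument that merely contains "-h" must not trigger the help message.
if wants_help dataMatrix_in "my-hplc_matrix.tsv"; then
    echo "help requested"
else
    echo "no help requested"
fi
```

A substring test would answer "help requested" for this call, which is the behaviour the exact match avoids.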
--- a/GalFilter/generic_filter.xml	Tue Jun 27 05:57:04 2017 -0400
+++ b/GalFilter/generic_filter.xml	Fri Jan 03 04:09:28 2020 -0500
@@ -1,4 +1,4 @@
-<tool id="generic_filter" name="Generic_Filter" version="2017.06">
+<tool id="generic_filter" name="Generic_Filter" version="2020.01">
 <description>Removes elements according to numerical or qualitative values</description>
 <requirements>
 <requirement type="package" version="1.1_4">r-batch</requirement>
@@ -142,12 +142,14 @@
 .. class:: infomark
-**Authors** Marion Landi and Melanie Petera
+**Authors:**
+Melanie Petera for original R code - Marion Landi and Melanie Petera for original Galaxy wrapper
+**Maintainer:** Melanie Petera (PFEM - INRAE - MetaboHUB)
 ---------------------------------------------------
 ==============
-Generic_Filter
+Generic Filter
 ==============
 -----------
@@ -164,7 +166,7 @@
 .. image:: filter.png
-        :width: 800
+        :width: 700
@@ -175,7 +177,7 @@
 +----------------------------+------------+
 | Parameter : num + label    |   Format   |
 +============================+============+
-| 1 : Data Matrix file       |   tabular  |
+| 1 : Data matrix file       |   tabular  |
 +----------------------------+------------+
 | 2 : Sample metadata file   |   tabular  |
 +----------------------------+------------+
@@ -194,7 +196,7 @@
     | If 'yes' (not default): execution deletes all samples or variables (according to selection) for which the designated
     | numerical parameter ("Name of the column to remove" field) equals the selected range of values (depending on the "Interval of values to remove" chosen).
     | To delete multiple independent intervals of value from the same parameter, it is necessary to add as many times as necessary
-    | the corresponding column via the button "Add new Identify the parameter to filter".
+    | the corresponding column via the "Insert Identify the parameter to filter" button.
     |
 *On file (only if 'yes')*
     | Choice between sample and variable filtering (referring to the input file containing the factor).
@@ -218,7 +220,7 @@
 |
 | *Notes:*
-| - these parameters can be used several times using the "Add new Identify the parameter to filter" button
+| - these parameters can be used several times using the "Insert Identify the parameter to filter" button
 | - Numerical filter does not remove NA values; if you want to suppress them, use the Qualitative filter below
 |
@@ -227,7 +229,7 @@
     | If 'yes' (not default): execution deletes all data samples or variables (according to selection) for which the designated
     | factor ("Name of the column to remove" field) equals the selected level ("Remove factor when" field).
     | To delete multiple values of the same factor, it is necessary to add as many times as necessary the corresponding factor
-    | via the button "Add new Removing a level in factor".
+    | via the "Insert Removing a level in factor" button.
     |
 *On file (only if 'yes')*
     | Choice between sample and variable filtering (referring to the input file containing the factor).
@@ -243,7 +245,7 @@
 |
 | *Notes:*
-| - these parameters can be used several times using the "Add new Removing a level in factor" button
+| - these parameters can be used several times using the "Insert Removing a level in factor" button
 | - this filter can be used to filter NA in numerical parameters, or to remove a specific numerical value
 |
@@ -255,7 +257,7 @@
 dataMatrix
     | tabular output
-    | Identical to the input Data Matrix file with the sample columns and variable lines deleted according to specified filters
+    | Identical to the input Data matrix file with the sample columns and variable lines deleted according to specified filters
     |
 sampleMetadata
@@ -271,14 +273,18 @@
 ---------------------------------------------------
----------------
-Working example
----------------
+----------------------
+Additional information
+----------------------
 .. class:: warningmark
-Soon see the corresponding "Page"
+For more information about input files, refer to the corresponding "W4M HowTo" page:
+ | `W4M table format for Galaxy <http://workflow4metabolomics.org/sites/workflow4metabolomics.org/files/files/w4m_TableFormatForGalaxy_150908.pdf>`_
+ |
+
+
 </help>
 <citations>
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/GalFilter/test-generic_filter.sh	Fri Jan 03 04:09:28 2020 -0500
@@ -0,0 +1,12 @@
+#!/bin/bash
+
+# set path
+scriptdir=$(cd "$( dirname "${BASH_SOURCE[0]}" )" && pwd)
+
+# run test
+Rscript $scriptdir/filter_wrap.R dataMatrix_in "$scriptdir/test-data/in_DM1.tabular" sampleMetadata_in "$scriptdir/test-data/in_SM1.tabular" variableMetadata_in "$scriptdir/test-data/in_VM1.tabular" Numeric "TRUE" num_file "variable" parm_col "rt" Interval "lower" low_value "1.2" Factors "TRUE" qual_file "sample" factor_col "Time" factors_value "3" dataMatrix_out "$scriptdir/out_DM1.tabular" sampleMetadata_out "$scriptdir/out_SM1.tabular" variableMetadata_out "$scriptdir/out_VM1.tabular"
+
+# test diff
+diff $scriptdir/out_DM1.tabular $scriptdir/test-data/out_DM1.tabular || exit 2
+diff $scriptdir/out_SM1.tabular $scriptdir/test-data/out_SM1.tabular || exit 2
+diff $scriptdir/out_VM1.tabular $scriptdir/test-data/out_VM1.tabular || exit 2
\ No newline at end of file
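Review note on the test script: it stops at the first differing file, and `$scriptdir` is unquoted in the `Rscript` and `diff` calls, so a checkout path containing spaces would split into several arguments. The following is a hedged sketch of a comparison loop that quotes its paths and checks all three outputs before failing. The function name `compare_outputs` and its two directory arguments are illustrative assumptions; the `out_DM1`, `out_SM1` and `out_VM1` file names and the exit code 2 come from the script above.

```shell
#!/bin/sh
# Hypothetical helper: compare each produced file with its reference copy.
# $1 = directory with produced outputs, $2 = directory with expected outputs.
# Returns 2 (the exit code used by the original script) if any file differs
# or is missing, after reporting every mismatch on stderr.
compare_outputs() {
    cmp_status=0
    for cmp_name in out_DM1 out_SM1 out_VM1; do
        if ! diff "$1/$cmp_name.tabular" "$2/$cmp_name.tabular" >/dev/null 2>&1; then
            echo "MISMATCH: $cmp_name.tabular" >&2
            cmp_status=2
        fi
    done
    return "$cmp_status"
}
```

In the test script this could replace the three `diff` lines with a single call such as `compare_outputs "$scriptdir" "$scriptdir/test-data" || exit 2`.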