view ordinate.xml @ 2:8569e8d8b712 draft

planemo upload for repository https://github.com/galaxyproject/tools-iuc/tree/master/tools/ampvis2 commit 09b56ef3e09ad6c5923c88616fea5cbd77d87616
author iuc
date Mon, 18 Dec 2023 09:44:03 +0000
parents f84ec808bee3
children 75e025c9bc10

<tool id="ampvis2_ordinate" name="ampvis2 ordination plot" version="@TOOL_VERSION@+galaxy@VERSION_SUFFIX@" profile="@PROFILE@" license="MIT">
    <description></description>
    <macros>
        <import>macros.xml</import>
        <xml name="distmeasure_macro">
            <param argument="distmeasure" type="select" label="Distance measure">
                <option value="wunifrac">Weighted UniFrac. Requires a rooted phylogenetic tree.</option>
                <option value="unifrac">Unweighted UniFrac. Requires a phylogenetic tree.</option>
                <option value="jsd">Jensen-Shannon Divergence</option>
                <option value="manhattan">manhattan</option>
                <option value="euclidean">euclidean</option>
                <option value="canberra">canberra</option>
                <option value="bray" selected="true">bray</option>
                <option value="kulczynski">kulczynski</option>
                <option value="jaccard">jaccard</option>
                <option value="gower">gower</option>
                <option value="altGower">altGower</option>
                <option value="morisita">morisita</option>
                <option value="horn">horn</option>
                <option value="mountford">mountford</option>
                <option value="raup">raup</option>
                <option value="binomial">binomial</option>
                <option value="chao">chao</option>
                <option value="cao">cao</option>
                <option value="mahalanobis">mahalanobis</option>
            </param>
        </xml>
        <!-- default is most often hellinger, but for N/MMDS none is suggested -->
        <xml name="transform_macro">
            <param argument="transform" type="select" label="Transforms the abundances before ordination" help="See details in decostand. Using the Hellinger transformation is a good choice when performing PCA/RDA, as it produces a more ecologically meaningful result (read about the double-zero problem in Numerical Ecology). When the Hellinger transformation is used with CA/CCA, it helps reduce the impact of low-abundance species. When performing nMDS or PCoA (a.k.a. mMDS), also transforming the data is not recommended, as this obscures the chosen distance measure.">
                <option value="none">No transformation</option>
                <option value="total">divide by margin total (total)</option>
                <option value="max">divide by margin maximum (max)</option>
                <option value="freq">divide by margin total and multiply by the number of non-zero items, so that the average of non-zero entries is one (freq)</option>
                <option value="normalize">make margin sum of squares equal to one (normalize)</option>
                <option value="range">standardize values into range 0:1. If all values are constant, they will be transformed to 0. (range)</option>
                <option value="standardize">scale x to zero mean and unit variance (standardize)</option>
                <option value="pa">to presence/absence scale 0/1: absence/presence (pa)</option>
                <option value="chi.square">divide by row sums and square root of column sums, and adjust for square root of matrix total (chi.square)</option>
                <option value="hellinger">square root of method = "total" (hellinger)</option>
                <option value="log">logarithmic transformation (log)</option>
                <option value="sqrt">square root (sqrt)</option>
            </param>
        </xml>
        <xml name="transform_macro_hellinger">
            <expand macro="transform_macro">
                <option value="hellinger" selected="true">square root of method = "total" (hellinger)</option>
            </expand>
        </xml>
        <xml name="constrain_macro">
            <expand macro="metadata_select_discrete" argument="constrain" optional="false" multiple="true" label="Constrain analysis by" help="Variable(s) in the metadata for constrained analyses. Multiple variables can be provided, but keep in mind that the more variables selected the more the result will be similar to unconstrained analysis."/>
        </xml>
    </macros>
    <expand macro="header"/>
    <command detect_errors="exit_code"><![CDATA[
        Rscript '$rscript'
    ]]></command>
    <configfiles>
        <configfile name="rscript"><![CDATA[
            #if $type_cond.type in ['RDA', 'CCA'] and $type_cond.constrain
                #set constrain_list='c("' + '", "'.join(str($type_cond.constrain).split(",")) + '")'
            #else
                #set constrain_list='NULL'
            #end if
            library(ampvis2, quietly = TRUE)
            data <- readRDS("$data")
            details <- amp_ordinate(
                data,
                filter_species = $filter_species,
                type = "$type_cond.type",
                #if $type_cond.type in ['MMDS', 'NMDS']
                    distmeasure = "$type_cond.distmeasure",
                #end if
                transform = "$type_cond.transform",
                #if $type_cond.type in ['RDA', 'CCA']
                    constrain = $constrain_list,
                #end if
                ## x_axis = 1,
                ## y_axis = 2,
                print_caption = $print_caption,
                #if $sample_color_by
                    sample_color_by = "$sample_color_by",
                #end if
                ## sample_color_order = NULL,
                #if $sample_shape_by
                    sample_shape_by = "$sample_shape_by",
                #end if
                #if $sample_colorframe
                    sample_colorframe = "$sample_colorframe",
                #end if
                #if $sample_colorframe_label
                    sample_colorframe_label = "$sample_colorframe_label",
                #end if
                ## sample_colorframe_label_size = 3,
                #if $sample_label_by
                    sample_label_by = "$sample_label_by",
                #end if
                ## sample_label_size = 4,
                ## sample_label_segment_color = "black",
                ## sample_point_size = 2,
                #if $sample_trajectory
                    sample_trajectory = "$sample_trajectory",
                    #if $sample_trajectory_group
                        sample_trajectory_group = "$sample_trajectory_group",
                    #end if
                #end if
                ## sample_plotly = NULL,
                #if $species_plot_cond.species_plot == "TRUE"
                    species_plot = $species_plot_cond.species_plot,
                    species_nlabels = $species_plot_cond.species_nlabels,
                    species_label_taxonomy = "$species_plot_cond.species_label_taxonomy",
                    species_label_size = $species_plot_cond.species_label_size,
                    ## species_label_color = "grey10",
                    ## species_rescale = FALSE,
                    ## species_point_size = 2,
                    ## species_shape = 20,
                    ## species_plotly = FALSE,
                #end if
                #if $envfit_factor
                    envfit_factor = "$envfit_factor",
                #end if
                #if $envfit_numeric
                    envfit_numeric = "$envfit_numeric",
                #end if
                envfit_signif_level = $envfit_signif_level,
                ## envfit_textsize = 3,
                ## envfit_textcolor = "darkred",
                ## envfit_numeric_arrows_scale = 1,
                ## envfit_arrowcolor = "darkred",
                ## envfit_show = TRUE,
                repel_labels = $repel_labels,
                opacity = $opacity,
                tax_empty = "$tax_empty",
                detailed_output = TRUE
                ## num_threads = 1L,
            )
            plot <- details\$plot
            @OUTPUT_TOKEN@
            ggsave("$screeplot", print(details\$screeplot), device="$output_options.out_format")
            ## TODO output more from model
            ## $model
            ## $dsites
            ## $dspecies
            ## $evf_factor_model
            ## $evf_numeric_model
            ## #else
            ##     write.table(raw, file = "$plot_raw", sep = "\t") 
            ## #end if
        ]]></configfile>
    </configfiles>
    <inputs>
        <expand macro="rds_metadata_input_macro"/>
        <param argument="filter_species" type="float" value="0.1" min="0" optional="true" label="Abundance threshold" help="Remove OTUs whose abundance is below this threshold (in percent) across all samples. Setting this to 0 may drastically increase computation time."/>
        <conditional name="type_cond">
            <param argument="type" type="select" label="Ordination method" help="Note that PCoA is not performed by the vegan package, but by the pcoa function from the ape package.">
                <option value="PCA">(PCA) Principal Components Analysis</option>
                <option value="RDA">(RDA) Redundancy Analysis (considered the constrained version of PCA)</option>
                <option value="CA">(CA) Correspondence Analysis</option>
                <option value="CCA">(CCA) Canonical Correspondence Analysis (considered the constrained version of CA)</option>
                <option value="DCA">(DCA) Detrended Correspondence Analysis</option>
                <option value="NMDS">(NMDS) non-metric Multidimensional Scaling</option>
                <option value="MMDS">(MMDS) metric Multidimensional Scaling, a.k.a. Principal Coordinates Analysis (PCoA; not to be confused with PCA)</option>
            </param>
            <when value="PCA">
                <expand macro="transform_macro_hellinger"/>
            </when>
            <when value="RDA">
                <expand macro="transform_macro_hellinger"/>
                <expand macro="constrain_macro"/>
            </when>
            <when value="CA">
                <expand macro="transform_macro_hellinger"/>
            </when>
            <when value="CCA">
                <expand macro="transform_macro_hellinger"/>
                <expand macro="constrain_macro"/>
            </when>
            <when value="DCA">
                <expand macro="transform_macro_hellinger"/>
            </when>
            <when value="NMDS">
                <expand macro="distmeasure_macro"/>
                <expand macro="transform_macro">
                    <option value="none" selected="true">No transformation</option>
                </expand>
            </when>
            <when value="MMDS">
                <expand macro="distmeasure_macro"/>
                <expand macro="transform_macro">
                    <option value="none" selected="true">No transformation</option>
                </expand>
            </when>
        </conditional>
        <param argument="print_caption" type="boolean" truevalue="TRUE" falsevalue="FALSE" label="Auto-generate a figure caption" help="Based on the used arguments. The caption includes a description of how the result has been generated as well as references for the methods used."/>
        <expand macro="metadata_select" argument="sample_color_by" label="Color sample points by"/>
        <expand macro="metadata_select_discrete" argument="sample_shape_by" label="Shape sample points by"/>
        <expand macro="metadata_select_discrete" argument="sample_colorframe" label="Frame the sample points with a polygon by" help="Split by the variable defined by sample_color_by"/>
        <expand macro="metadata_select" argument="sample_colorframe_label" label="Label Frame by"/>
        <expand macro="metadata_select" argument="sample_label_by" label="Label sample points by"/>
        <expand macro="metadata_select" argument="sample_trajectory" label="Make a trajectory between sample points by"/>
        <expand macro="metadata_select" argument="sample_trajectory_group" label="Make a trajectory between sample points by the sample_trajectory argument, but within individual groups."/>
         
        <conditional name="species_plot_cond">
            <param argument="species_plot" type="select" label="Plot species points">
                <option value="TRUE">Yes</option>
                <option value="FALSE" selected="true">No</option>
            </param>
            <when value="TRUE">
                <param argument="species_nlabels" type="integer" value="10" min="1" label="Number of the most extreme species labels to plot" help="Ordered by the sum of the numerical values of the x,y coordinates. Only makes sense with PCA/RDA."/>
                <expand macro="taxlevel_macro" argument="species_label_taxonomy" label="Taxonomic level by which to label the species points">
                    <option value="Genus" selected="true">Genus</option>
                </expand>
                <param argument="species_label_size" type="integer" value="3" min="1" label="Size of the species text labels"/>
            </when>
            <when value="FALSE"/>
        </conditional>

        <expand macro="metadata_select_discrete" argument="envfit_factor" label="Categorical variables to fit onto the ordination plot"/>
        <expand macro="metadata_select_numeric" argument="envfit_numeric" label="Numerical variables to fit arrows onto the ordination plot" help="The lengths of the arrows are scaled by significance."/>
        <param argument="envfit_signif_level" type="float" value="0.005" min="0" max="1" label="Significance threshold" help="For displaying the fitting results"/>

        <param argument="repel_labels" type="boolean" truevalue="TRUE" falsevalue="FALSE" label="Repel all labels" help="To prevent cluttering of the plot"/>
        <param argument="opacity" type="float" value="0.8" min="0" max="1" label="Opacity of plotted points and colorframe" help="0: invisible, 1: opaque."/>
        <expand macro="tax_empty_macro"/>
        <expand macro="out_format_macro"/>
        <param name="output_screeplot" type="boolean" checked="false" label="Output screeplot" help="Plot of variances against the number of the principal component."/>
    </inputs>
    <outputs>
        <expand macro="out_macro"/>
        <expand macro="out_macro" name="screeplot" label=": screeplot">
            <filter>output_screeplot</filter>
        </expand>
    </outputs>
    <tests>
        <!-- defaults (PCA) -->
        <test expect_num_outputs="2">
            <param name="data" value="AalborgWWTPs.rds" ftype="ampvis2"/> 
            <param name="output_screeplot" value="true"/>
            <output name="plot" value="AalborgWWTPs-ordinate.pdf" ftype="pdf"/>
            <output name="screeplot" value="AalborgWWTPs-ordinate-screeplot.pdf" ftype="pdf"/>
        </test>
        <!-- RDA + caption -->
        <test expect_num_outputs="1">
            <param name="data" value="AalborgWWTPs.rds" ftype="ampvis2"/> 
            <param name="metadata_list" value="AalborgWWTPs-metadata.list"/> 
            <conditional name="type_cond">
                <param name="type" value="RDA"/>
                <param name="transform" value="chi.square"/>
                <param name="constrain" value="Plant"/>
            </conditional>
            <output name="plot" value="AalborgWWTPs-ordinate-rda.pdf" ftype="pdf"/>
        </test>
        <!-- MMDS + unifrac (which requires a tree) -->
        <test expect_num_outputs="1">
            <param name="data" value="AalborgWWTPs-complete.rds" ftype="ampvis2"/> 
            <param name="metadata_list" value="AalborgWWTPs-metadata.list"/> 
            <conditional name="type_cond">
                <param name="type" value="MMDS"/>
                <param name="distmeasure" value="unifrac"/>
            </conditional>
            <output name="plot" value="AalborgWWTPs-ordinate-nmds.pdf" ftype="pdf"/>
        </test>
        <!-- color, shape and colorframe -->
        <test expect_num_outputs="1">
            <param name="data" value="AalborgWWTPs.rds" ftype="ampvis2"/> 
            <param name="metadata_list" value="AalborgWWTPs-metadata.list"/> 
            <param name="sample_color_by" value="Year"/>
            <param name="sample_shape_by" value="Plant"/>
            <param name="sample_colorframe" value="Year"/>
            <param name="sample_colorframe_label" value="Year"/>
            <output name="plot" value="AalborgWWTPs-ordinate-color-shape-frame.pdf" ftype="pdf"/>
        </test>
        <!-- label, trajectory -->
        <test expect_num_outputs="1">
            <param name="data" value="AalborgWWTPs.rds" ftype="ampvis2"/> 
            <param name="metadata_list" value="AalborgWWTPs-metadata.list"/> 
            <param name="sample_label_by" value="Plant"/>
            <param name="sample_trajectory" value="Year"/>
            <output name="plot" value="AalborgWWTPs-ordinate-label-traj.pdf" ftype="pdf"/>
        </test>
        <!-- trajectory group -->
        <test expect_num_outputs="1">
            <param name="data" value="AalborgWWTPs.rds" ftype="ampvis2"/> 
            <param name="metadata_list" value="AalborgWWTPs-metadata.list"/> 
            <param name="sample_trajectory_group" value="Plant"/>
            <output name="plot" value="AalborgWWTPs-ordinate-label-traj-group.pdf" ftype="pdf"/>
        </test>
        <!-- species plot -->
        <test expect_num_outputs="1">
            <param name="data" value="AalborgWWTPs.rds" ftype="ampvis2"/> 
            <param name="metadata_list" value="AalborgWWTPs-metadata.list"/> 
            <conditional name="species_plot_cond">
                <param name="species_plot" value="TRUE"/>
                <param name="species_nlabels" value="2"/>
                <param name="species_label_taxonomy" value="Family"/>
                <param name="species_label_size" value="4"/>
            </conditional>
            <output name="plot" value="AalborgWWTPs-ordinate-species.pdf" ftype="pdf"/>
        </test>
        <!-- envfit factor-->
        <test expect_num_outputs="1">
            <param name="data" value="AalborgWWTPs.rds" ftype="ampvis2"/> 
            <param name="metadata_list" value="AalborgWWTPs-metadata.list"/> 
            <param name="envfit_factor" value="Plant"/>
            <output name="plot" value="AalborgWWTPs-ordinate-envfit-factor.pdf" ftype="pdf"/>
        </test>
        <!-- envfit numeric-->
        <test expect_num_outputs="1">
            <param name="data" value="AalborgWWTPs.rds" ftype="ampvis2"/> 
            <param name="metadata_list" value="AalborgWWTPs-metadata.list"/> 
            <param name="envfit_numeric" value="Year"/>
            <output name="plot" value="AalborgWWTPs-ordinate-envfit-num.pdf" ftype="pdf"/>
        </test>
    </tests>
    <help><![CDATA[
What it does
============

Generate ordination plots suited for analysis and comparison of microbial communities.

The Galaxy tool calls the `amp_ordinate
<https://kasperskytte.github.io/ampvis2/reference/amp_ordinate.html>`_ function
of the ampvis2 package, which is a wrapper around the vegan package to generate
ggplot2 ordination plots.

Details
=======

The ``amp_ordinate`` function is primarily based on two packages:

1. the ``vegan`` package, which performs the actual ordination, and
2. the ``ggplot2`` package, which generates the plot.

The function generates an ordination plot by the following process:

- Various input argument checks and error messages
- OTU-table filtering, where low-abundance OTUs across all samples are removed (unless **Abundance threshold** (``filter_species``) is set to 0)
- Data transformation (unless **Transforms the abundances before ordination** (``transform``) is set to "No transformation").
  See details in `decostand <https://rdrr.io/cran/vegan/man/decostand.html>`_
- Calculation of a distance matrix based on the chosen **Distance measure** (``distmeasure``) if the chosen ordination method is PCoA (MMDS) or nMDS.
  See details in `vegdist <https://rdrr.io/cran/vegan/man/vegdist.html>`_
- Perform the actual ordination and calculate the axis scores for both samples and species/OTUs
- Visualise the result with ggplot2 or plotly in various ways defined by the user
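
The filtering, transformation, distance and ordination steps above can be
sketched in base R on a made-up abundance table. This is only an illustration
and not the ampvis2 implementation: the toy counts, the 5 % threshold, the
reading of the filter as "maximum relative abundance in any sample" and the
hand-written ``bray`` helper are assumptions. ``amp_ordinate`` itself relies on
vegan (``decostand``, ``vegdist``) and, for PCoA, on the ``pcoa`` function of
the ape package. Samples are placed in rows here::

    # Toy abundance table (made-up counts): rows are samples, columns are OTUs.
    counts <- matrix(
      c(60, 0, 39, 1,
         8, 2,  0, 0,
         0, 7,  3, 0,
         1, 9,  4, 0),
      nrow = 4, byrow = TRUE,
      dimnames = list(paste0("S", 1:4), paste0("OTU", 1:4))
    )

    # Filtering: keep OTUs reaching at least 5 % relative abundance
    # in any sample (assumed interpretation of the abundance threshold).
    rel <- counts / rowSums(counts) * 100
    kept <- counts[, apply(rel, 2, max) >= 5, drop = FALSE]

    # Transformation: Hellinger, i.e. the square root of the "total" method.
    hell <- sqrt(kept / rowSums(kept))

    # Ordination, PCA case: eigenanalysis of the transformed table.
    pca <- prcomp(hell)

    # Distance matrix and ordination, PCoA case: Bray-Curtis dissimilarities
    # on the untransformed counts, followed by metric MDS.
    bray <- function(a, b) sum(abs(a - b)) / sum(a + b)
    n <- nrow(kept)
    d <- matrix(0, n, n, dimnames = list(rownames(kept), rownames(kept)))
    for (i in seq_len(n)) {
      for (j in seq_len(n)) {
        d[i, j] <- bray(kept[i, ], kept[j, ])
      }
    }
    pcoa <- cmdscale(as.dist(d), k = 2)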

When the chosen ordination method is eigenanalysis-based, the relative
contribution (eigenvalue) of each axis to the total inertia in the data
(sum of all eigenvalues, including those of the constrained space) is given
in percent in the axis titles. When one of the constrained ordination methods
(RDA or CCA) is used, a second value is shown, which indicates the relative
contribution of that axis to the total constrained space only.
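
As a rough illustration of the two percentages, the following base R sketch
uses hypothetical eigenvalues (not taken from any real ordination) for two
constrained and two unconstrained axes. The variable names are made up for
this example::

    # Hypothetical eigenvalues of an RDA/CCA with two constrained axes.
    constrained <- c(0.30, 0.10)
    unconstrained <- c(0.15, 0.05)

    # First value: share of the total inertia (all eigenvalues).
    pct_total <- 100 * constrained / sum(constrained, unconstrained)

    # Second value: share of the constrained space only.
    pct_constrained <- 100 * constrained / sum(constrained)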

Input
=====

@HELP_RDS_INPUT@

@HELP_METADATA_LIST_INPUT@

Output
======

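An ordination plot in the chosen output format. If **Output screeplot** is
enabled, a screeplot is written as an additional output in the same format.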
    ]]></help>
    <expand macro="citations"/>
</tool>