view secimtools/ttest.xml @ 0:b54326490b4d draft

Upload 21.3.4.2 release
author malex
date Mon, 08 Mar 2021 20:55:03 +0000
parents
children
line wrap: on
line source

<tool id="secimtools_ttest" name="T-Test (Paired or Unpaired)" version="@WRAPPER_VERSION@">
    <description>on features.</description>
    <macros>
        <import>macros.xml</import>
    </macros>
    <expand macro="requirements" />
    <command detect_errors="exit_code"><![CDATA[
ttest.py
--input $input
--design $design
--uniqueID $uniqueID
--group $group
--pairing $pairing
--summaries $summaries
--flags $flags
--volcano $volcano
#if $order
    --order $order
#end if
    ]]></command>
    <inputs>
        <param name="input" type="data" format="tabular" label="Wide Dataset" help="Input dataset in wide format and tab separated. If file is not tab separated see TIP below."/>
        <param name="design" type="data" format="tabular" label="Design File" help="Design file tab separated. Note you need a 'sampleID' column. If not tab separated see TIP below."/>
        <param name="uniqueID" type="text" size="30" value="" label="Unique Feature ID" help="Name of the column in your Wide Dataset that has unique Feature IDs."/>
        <param name="group" type="text" size="30" label="Group/Treatment" help="Name of the column in your Design File that contains group classifications."/>
        <param name="pairing" size="30" display="radio" type="select" value="unpaired" label="Select Test" help="Select either paired (dependent samples) or unpaired (independent samples) tests.">
           <option value="unpaired"  selected="true">Unpaired (Independent Samples)</option>
           <option value="paired"    selected="true">Paired (Dependent Samples)</option>
        </param>
        <param name="order" type="text" value="" size="30" label="Pairing ID" help="Name of the column in your Design File that contains Pairing IDs. Ignored for Unpaired (Independent Samples) test."/>
    </inputs>
    <outputs>
        <data format="tabular" name="summaries" label="${tool.name} on ${on_string}: Summaries that include p-values and mean differences."/>
        <data format="tabular" name="flags" label="${tool.name} on ${on_string}: Flags that include 0.01,  0.05 and  0.10 significance levels for the differences. "/>
        <data format="pdf" name="volcano" label="${tool.name} on ${on_string}: Volcano plots for the differences."/>
    </outputs>
    <tests>
        <test>
            <param name="input"    value="ST000006_data.tsv"/>
            <param name="design"   value="ST000006_design.tsv"/>
            <param name="uniqueID" value="Retention_Index" />
            <param name="group"    value="White_wine_type_and_source" />
            <param name="pairing"  value="unpaired" />
            <output name="summaries" file="ST000006_ttest_select_unpaired_summary.tsv" />
            <output name="flags"     file="ST000006_ttest_select_unpaired_flags.tsv" />
            <output name="volcano"   file="ST000006_ttest_select_unpaired_volcano.pdf" compare="sim_size" delta="10000"/>
        </test>
    </tests>
    <help><![CDATA[

@TIP_AND_WARNING@

**Tool Description**

The tool performs a two-sided t-test for two groups of dependent samples (paired or dependent t-test) or multiple (two or more) groups of independent samples (unpaired or independent t-test).
The user selects which test (paired or unpaired) to perform.

In an unpaired t-test, the samples within and between groups are independent.
The test is performed for all pairs of conditions specified using the Group/Treatment field.
For example, if there are three treatment conditions (Control, Time1 and Time2) then t-tests will be performed for: (i) Control vs Time1, (ii) Control vs Time2, and (iii) Time1 vs Time2.
Note that this will give slightly different results than the contrast in an ANOVA because the ANOVA uses all groups to estimate the error.

A paired t-test can be performed for pairs of treatment conditions where sample pairs are known and specified by the user in the Pairing ID field.
Here, the difference between the measurements for the pairs is calculated.
To ensure that the differences are taken in the same order across all pairs, the Group/Treatment variable is required.
The differences will be calculated beween the groups in the order that the groups appear in the Design File.
The Pairing ID specifies which samples are paired. A two sided t-test will be performed for the test that the difference is zero.

--------------------------------------------------------------------------------

**Input**

    - Two input datasets are required.


@WIDE@

**NOTE:** The sample IDs must match the sample IDs in the Design File
(below). Extra columns will automatically be ignored.

@METADATA@

@UNIQID@

**Group/Treatment**

        - List with the name of the column the Design File that contains group classifications.

**Pairing ID**

        - Name of the column in your Design File that contains Pairing IDs.  An example is given below:

    +----------+--------+--------+
    | sampleID | group  | pairID |
    +==========+========+========+
    | sample1  | g1     |   p1   |
    +----------+--------+--------+
    | sample2  | g1     |   p2   |
    +----------+--------+--------+
    | sample3  | g1     |   p3   |
    +----------+--------+--------+
    | sample4  | g2     |   p1   |
    +----------+--------+--------+
    | sample5  | g2     |   p2   |
    +----------+--------+--------+
    | sample6  | g2     |   p3   |
    +----------+--------+--------+
    | ...      | ...    |   ...  |
    +----------+--------+--------+



--------------------------------------------------------------------------------

**Output**

The tool outputs 3 files:

(1) a TSV file with the results table containing p-values for each test and the corresponding differences between the means for comparisons between the groups.
(2) a TSV file with an indicator flag = 1 if the difference between the groups is statistically significant using provided α levels.
(3) a PDF file with volcano plots visual inspection of the differences between group means and p-values. The red dashed line in volcano plot(s) corresponds to a p-value = 0.01 cutoff (2 on the negative log base 10 scale).

    ]]></help>
    <expand macro="citations"/>
</tool>