view gemini_burden.xml @ 0:e799c1a6854b draft

planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/gemini commit 4bbfca6f0e9cae9a8f263aad4eab7304c96358c4
author iuc
date Thu, 18 Feb 2016 08:51:59 -0500

<tool id="gemini_@BINARY@" name="GEMINI @BINARY@" version="@VERSION@.0">
    <description>perform sample-wise gene-level burden calculations</description>
    <macros>
        <import>gemini_macros.xml</import>
        <token name="@BINARY@">burden</token>
    </macros>
    <expand macro="requirements" />
    <expand macro="stdio" />
    <expand macro="version_command" />
    <command>
<![CDATA[
        gemini @BINARY@
          --cases $cases
          --controls $controls
            $save_tscores
            $nonsynonymous
            $calpha
            --permutations $permutations
            #if float( str($min_aaf) ) >= 0.0:
                --min-aaf $min_aaf
            #end if
            #if float( str($max_aaf) ) >= 0.0:
                --max-aaf $max_aaf
            #end if
            "${ infile }"
            > "${ outfile }"
]]>

    </command>
    <inputs>
        <expand macro="infile" />

        <param name="cases" type="text" value="" label="Space-separated list of cases for association testing" help="(--cases)"/>
        <param name="controls" type="text" value="" label="Space-separated list of controls for association testing" help="(--controls)"/>

        <param name="save_tscores" type="boolean" truevalue="--save_tscores" falsevalue="" checked="False"
            label="Save the permuted T-scores in the output file" help="(--save_tscores)"/>

        <param name="nonsynonymous" type="boolean" truevalue="--nonsynonymous" falsevalue="" checked="False"
            label="Count all nonsynonymous variants as contributing burden" help="(--nonsynonymous)"/>
        <param name="calpha" type="boolean" truevalue="--calpha" falsevalue="" checked="False"
            label="Run the C-alpha association test" help="(--calpha)"/>
        <param name="min_aaf" type="float" value="-1" label="Minimum alternate allele frequency for a variant to be included"
            help="(--min-aaf) Leave at a negative value (default: -1) to omit this filter.">
            <!--validator type="in_range" min="0.0"/-->
        </param>
        <param name="max_aaf" type="float" value="-1" label="Maximum alternate allele frequency for a variant to be included"
            help="(--max-aaf) Leave at a negative value (default: -1) to omit this filter.">
            <!--validator type="in_range" min="0.0"/-->
        </param>

        <param name="permutations" type="integer" value="1000" label="Number of permutations to run for the C-alpha test"
            help="(--permutations)">
            <validator type="in_range" min="0"/>
        </param>

    </inputs>
    <outputs>
        <data name="outfile" format="tabular" />
    </outputs>
    <tests>
        <test>
            <param name="infile" value="gemini_burden_input.db" ftype="gemini.sqlite" />
            <param name="controls" value="M10475 M10478" />
            <param name="cases" value="M10500 M128215" />
            <param name="calpha" value="True" />
            <output name="outfile" file="gemini_burden_result.tabular" />
        </test>
    </tests>
    <help><![CDATA[
**What it does**

Burden performs sample-wise gene-level burden calculations.

The burden tool provides a set of utilities to compute burden summaries on a per-gene, per-sample basis. By default, it outputs a table of gene-wise counts of all high-impact variants in coding regions for each sample:

GEMINI burden example::

 gene    M10475  M10478  M10500  M128215
 WDR37   2       2       2       2
 CTBP2   0       0       0       1
 DHODH   1       0       0       0

**Setting examples**

**--nonsynonymous**

To be less restrictive, you can count all nonsynonymous variants instead of only the high-impact ones.

GEMINI output with setting --nonsynonymous::

 gene    M10475  M10478  M10500  M128215
 SYCE1   0       1       1       0
 WDR37   2       2       2       2
 CTBP2   0       0       0       1
 ASAH2C  2       1       1       0
 DHODH   1       0       0       0

**--calpha**

If your database has been loaded with a PED file describing case and control samples, you can calculate the C-alpha statistic for cases vs. controls.

GEMINI output with setting --calpha::

 gene    T       c       Z       p_value
 SYCE1   -0.5    0.25    -1.0    0.841344746069
 WDR37   -1.0    1.5     -0.816496580928 0.792891910879
 CTBP2   0.0     0.0     nan     nan
 ASAH2C  -0.5    0.75    -0.57735026919  0.718148569175
 DHODH   0.0     0.0     nan     nan

To calculate the **P-value** using a permutation test, use the --permutations option, specifying the number of permutations of the case/control labels you want to use.
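
On the command line, a permutation-based run might look like the following minimal sketch. It assumes a database file named test.burden.db (a placeholder) and uses 1000 permutations, which is this tool's default; in Galaxy the actual call is assembled from the form fields::

 gemini burden --calpha --permutations 1000 test.burden.db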

**--min-aaf and --max-aaf for --calpha**

By default, all variants affecting a given gene will be included in the C-alpha computation. However, one may establish alternate allele frequency boundaries for the variants included using the --min-aaf and --max-aaf options.
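
As a rough command-line sketch, restricting the C-alpha test to variants with an alternate allele frequency between 0.0 and 0.01 might look like this (test.burden.db stands in for your own database)::

 gemini burden --calpha --min-aaf 0.0 --max-aaf 0.01 test.burden.db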

Used settings:

  - ``--calpha`` (input database: test.burden.db)
  - ``--min-aaf 0.0``
  - ``--max-aaf 0.01``

**--cases and --controls for --calpha**

If you do not have a PED file loaded, or your PED file does not follow the standard PED phenotype encoding format, you can still perform the C-alpha test, but you have to specify which samples are the controls and which are the cases.

Used settings:

 - ``--controls M10475 M10478``
 - ``--cases M10500 M128215``
 - ``--calpha``
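
The equivalent command-line call, sketched under the assumption that the database is named test.burden.db, passes the sample names directly::

 gemini burden --controls M10475 M10478 --cases M10500 M128215 --calpha test.burden.db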

Output::

 gene    T       c       Z       p_value
 SYCE1   -0.5    0.25    -1.0    0.841344746069
 WDR37   -1.0    1.5     -0.816496580928 0.792891910879
 CTBP2   0.0     0.0     nan     nan
 ASAH2C  -0.5    0.75    -0.57735026919  0.718148569175
 DHODH   0.0     0.0     nan     nan

**--nonsynonymous --calpha**

To consider all nonsynonymous variants in the C-alpha test, rather than only the medium- and high-impact variants, add the --nonsynonymous flag.


    ]]></help>
    <expand macro="citations">
        <citation type="doi">10.1371/journal.pgen.1001322</citation><!-- c-alpha citation -->
    </expand>
</tool>