view kraken-translate.xml @ 4:9a5be5b1a5c9 draft

"planemo upload for repository https://github.com/galaxyproject/tools-iuc/blob/master/tool_collections/kraken/ commit 06345505a91f3dcfa8a37dceb6f25e95806dddc8"
author iuc
date Wed, 04 Dec 2019 06:51:08 -0500
parents 9bdcba262627
children bc3e90fd13be

<tool id="kraken-translate" name="Kraken-translate" version="@WRAPPER_VERSION@">
    <description>convert taxonomy IDs to names</description>
    <macros>
        <import>macros.xml</import>
    </macros>
    <expand macro="requirements" />
    <expand macro="version_command" />
    <command detect_errors="exit_code"><![CDATA[
        @SET_DATABASE_PATH@ &&

        kraken-translate
            @INPUT_DATABASE@
            $mpa_format
            '${input}'
            > '${translated}'
    ]]></command>
    <inputs>
        <param name="input" type="data" format="tabular" label="Kraken output"
               help="Select the taxonomy classification output produced by Kraken"/>
        <param name="mpa_format" type="boolean" truevalue="--mpa-format" falsevalue=""
               label="Restrict labels to standard rank assignments" />

        <expand macro="input_database" />
    </inputs>
    <outputs>
        <data name="translated" format="tabular" label="${tool.name} on ${on_string}: Translated classification"/>
    </outputs>
    <tests>
        <test>
            <param name="input" value="kraken-translate/kraken_translate_test1.tab" ftype="tabular"/>
            <param name="mpa_format" value="false"/>
            <param name="kraken_database" value="new_style_test_entry"/>
            <output name="translated" file="kraken-translate/kraken_translate_test1_output.tab" ftype="tabular"/>
        </test>
    </tests>
    <help><![CDATA[

.. class:: warningmark

**Note**: the database used must be the same as the one used in the original Kraken run.

-------

**What it does**

kraken-translate converts the taxonomy IDs in Kraken output to taxonomy names. Its output is a text file with two tab-delimited columns and one line for each classified sequence in the input; unclassified sequences are not reported. The first column contains the sequence ID of each classified sequence, and the second column contains that sequence's taxonomy. For example, an output line from kraken::

 C     SEQ1    562     36      562:6

Would result in a corresponding output line from kraken-translate::

 SEQ1  root;cellular organisms;Bacteria;Proteobacteria;Gammaproteobacteria;Enterobacteriales;Enterobacteriaceae;Escherichia;Escherichia coli

Alternatively, kraken-translate accepts the option ``--mpa-format``, which reports only the taxonomy levels with standard rank assignments (superkingdom, kingdom, phylum, class, order, family, genus, species) and uses pipes to delimit the levels. For example, ``kraken-translate --mpa-format`` with the above example output from kraken would result in the following line of output::

 SEQ1  d__Bacteria|p__Proteobacteria|c__Gammaproteobacteria|o__Enterobacteriales|f__Enterobacteriaceae|g__Escherichia|s__Escherichia_coli

Taxonomy assignments above the superkingdom rank are represented as just "root" when using the ``--mpa-format`` option with kraken-translate.
    ]]></help>
    <expand macro="citations" />
</tool>