comparison kraken2tax.xml @ 0:065938419efe draft

planemo upload for repository https://github.com/galaxyproject/tools-devteam/blob/master/tool_collections/taxonomy/kraken2tax/ commit f176c58ce66d9db715151061ea43912f0659afc0
author devteam
date Mon, 17 Aug 2015 11:03:38 -0400
parents
children d844fdcce44e
-1:000000000000 0:065938419efe
1 <tool id="Kraken2Tax" name="Convert Kraken" version="1.0">
2 <description>data to Galaxy taxonomy representation</description>
3 <requirements>
4 <requirement type="package" version="4.1.0">gnu_awk</requirement>
5 <requirement type="package" version="8d245994d7">gb_taxonomy</requirement>
6 </requirements>
7 <command>
8 <![CDATA[
9 awk '{ print \$${read_name}, \$${tax_id} }' OFS="\t" "${input}" | taxonomy-reader ${ncbi_taxonomy.fields.path}/names.dmp ${ncbi_taxonomy.fields.path}/nodes.dmp 1 > "${out_file}"
10 ]]>
11 </command>
12 <inputs>
13 <param format="tabular" name="input" type="data" label="Choose dataset to convert"/>
14 <param label="Select a taxonomy database" name="ncbi_taxonomy" type="select">
15 <options from_data_table="ncbi_taxonomy">
16 <validator message="No built-in databases are available" type="no_options" />
17 </options>
18 </param>
19 <param name="read_name" label="Read name" type="data_column" data_ref="input" value="2" help="Select column containing read names"/>
20 <param name="tax_id" label="Taxonomy ID field" type="data_column" data_ref="input" numerical="True" value="3" help="Select column containing taxonomy ID"/>
21 </inputs>
22 <outputs>
23 <data format="taxonomy" name="out_file" />
24 </outputs>
25 <tests>
26 <test>
27 <param name="input" ftype="tabular" value="kraken2tax.txt"/>
28 <param name="read_name" value="2"/>
29 <param name="tax_id" value="3"/>
30 <output name="out_file" file="kraken2tax-test1.txt"/>
31 </test>
32 </tests>
33 <help>
34
35 .. class:: infomark
36
37 Use *Filter and Sort->Filter* to restrict the output of this tool to the desired taxonomic ranks. You can also use *Text Manipulation->Cut* to remove unwanted columns from the output.
38
39 ------
40
41 **What it does**
42
43 This tool translates results of the Kraken metagenomic classifier (see citations below) into the full NCBI taxonomy representation, using the taxonomic ID field that Kraken provides. The output of this tool can be directly visualized by the Krona tool. It is based on `gb_taxonomy_tools` developed by https://github.com/spond.
44
45 -------
46
47 **Example**
48
49 Suppose you have Kraken output that looks like this (here the second field is the name of a sequencing read and the third is the taxonomic ID)::
50
51 C Read_1 9606 465 Q:1
52
53 and you want to obtain the full taxonomic representation for this read. Setting the **Read name** and **Taxonomy ID field** parameters to **2** and **3**, respectively, will produce the following output (you may need to scroll sideways to see the entire line)::
54
55 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24
56 Read_1 9606 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Primates Haplorrhini Hominoidea Hominidae n n n Homo n Homo sapiens n
57
58 In other words, the tool printed *Read name* and *Taxonomy ID field*, then appended 22 columns containing taxonomic ranks from root to subspecies. Below is a formal definition of the output columns::
59
60 Column Definition
61 ------- -------------------------------------------------
62 1 Name (specified by 'Read name' dropdown)
63 2 taxID (specified by 'Taxonomy ID field' dropdown)
64 3 root
65 4 superkingdom
66 5 kingdom
67 6 subkingdom
68 7 superphylum
69 8 phylum
70 9 subphylum
71 10 superclass
72 11 class
73 12 subclass
74 13 superorder
75 14 order
76 15 suborder
77 16 superfamily
78 17 family
79 18 subfamily
80 19 tribe
81 20 subtribe
82 21 genus
83 22 subgenus
84 23 species
85 24 subspecies
86
87 ------
88
89 .. class:: warningmark
90
91 **Why do I have these "n" things?**
92
93 Be aware that the NCBI taxonomy (ftp://ftp.ncbi.nih.gov/pub/taxonomy/) this tool relies upon is incomplete. This means that for many species one or more ranks are absent and are represented as "**n**". In the above example, *subkingdom*, *superphylum*, etc. are missing.
94
95
96 </help>
97 <citations>
98 <citation type="doi">10.1186/gb-2014-15-3-r46</citation>
99 <citation type="doi">10.1101/gr.094508.109</citation>
100 </citations>
101 </tool>
102
103
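
The two steps described in the tool's help text (picking the read-name and taxonomy-ID columns, then reading the 24-column result) can be sketched in Python as follows. This is an illustrative sketch only, not part of the tool: the function names `select_columns` and `label_ranks` are hypothetical, the rank list is copied from the column table in the help text, and the output row is assumed to be tab-delimited. The `taxonomy-reader` lookup itself is not reproduced here.

```python
# Rank names for output columns 3-24, copied from the column table in the help text.
RANKS = [
    "root", "superkingdom", "kingdom", "subkingdom", "superphylum", "phylum",
    "subphylum", "superclass", "class", "subclass", "superorder", "order",
    "suborder", "superfamily", "family", "subfamily", "tribe", "subtribe",
    "genus", "subgenus", "species", "subspecies",
]


def select_columns(line, read_col=2, tax_col=3):
    # Hypothetical helper mirroring the awk step: print two 1-based columns,
    # joined by a tab. Like awk's default, split on runs of whitespace.
    fields = line.split()
    return fields[read_col - 1] + "\t" + fields[tax_col - 1]


def label_ranks(row):
    # Hypothetical helper: map one output row to (name, tax_id, {rank: value}).
    # Assumes the row is tab-delimited; ranks reported as "n" are dropped.
    fields = row.rstrip("\n").split("\t")
    name, tax_id, values = fields[0], fields[1], fields[2:]
    if len(values) != len(RANKS):
        raise ValueError(
            "expected %d rank columns, got %d" % (len(RANKS), len(values))
        )
    known = {rank: value for rank, value in zip(RANKS, values) if value != "n"}
    return name, tax_id, known
```

For the Kraken line in the example, `select_columns("C\tRead_1\t9606\t465\tQ:1")` yields the read name and taxonomy ID separated by a tab, and `label_ranks` applied to the example output row keeps only the ranks that NCBI defines for that taxon.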