comparison tools/taxonomy/lca.xml @ 0:9071e359b9a3
Uploaded
author | xuebing |
---|---|
date | Fri, 09 Mar 2012 19:37:19 -0500 |
parents | |
children |
-1:000000000000 | 0:9071e359b9a3 |
---|---|
1 <tool id="lca1" name="Find lowest diagnostic rank" version="1.0.1"> | |
2 <description></description> | |
3 <requirements> | |
4 <requirement type="package">taxonomy</requirement> | |
5 </requirements> | |
6 <command interpreter="python"> | |
7 lca.py $input1 $out_file1 $rank_bound | |
8 </command> | |
9 <inputs> | |
10 <param format="taxonomy" name="input1" type="data" label="for taxonomy dataset"/> | |
11 <param name="rank_bound" label="require the lowest rank to be at least" type="select"> | |
12 <option value="0">No restriction</option> | |
13 <option value="3">Superkingdom</option> | |
14 <option value="4">Kingdom</option> | |
15 <option value="5">Subkingdom</option> | |
16 <option value="6">Superphylum</option> | |
17 <option value="7">Phylum</option> | |
18 <option value="8">Subphylum</option> | |
19 <option value="9">Superclass</option> | |
20 <option value="10">Class</option> | |
21 <option value="11">Subclass</option> | |
22 <option value="12">Superorder</option> | |
23 <option value="13">Order</option> | |
24 <option value="14">Suborder</option> | |
25 <option value="15">Superfamily</option> | |
26 <option value="16">Family</option> | |
27 <option value="17">Subfamily</option> | |
28 <option value="18">Tribe</option> | |
29 <option value="19">Subtribe</option> | |
30 <option value="20">Genus</option> | |
31 <option value="21">Subgenus</option> | |
32 <option value="22">Species</option> | |
33 <option value="23">Subspecies</option> | |
34 </param> | |
35 </inputs> | |
36 <outputs> | |
37 <data format="taxonomy" name="out_file1" metadata_source="input1" /> | |
38 </outputs> | |
39 <tests> | |
40 <test> | |
41 <param name="input1" value="lca_input.taxonomy" ftype="taxonomy"/> | |
42 <param name="rank_bound" value="0" /> | |
43 <output name="out_file1" file="lca_output.taxonomy" ftype="taxonomy"/> | |
44 </test> | |
45 <test> | |
46 <param name="input1" value="lca_input2.taxonomy" ftype="taxonomy"/> | |
47 <param name="rank_bound" value="7" /> | |
48 <output name="out_file1" file="lca_output2.taxonomy" ftype="taxonomy"/> | |
49 </test> | |
50 | |
51 <!--Test case with invalid lines --> | |
52 <test> | |
53 <param name="input1" value="lca_input3.taxonomy" ftype="taxonomy"/> | |
54 <param name="rank_bound" value="10" /> | |
55 <output name="out_file1" file="lca_output3.taxonomy" ftype="taxonomy"/> | |
56 </test> | |
57 </tests> | |
58 | |
59 <help> | |
60 | |
61 **What it does** | |
62 | |
63 This tool identifies the lowest taxonomic rank for which a metagenomic sequencing read is diagnostic. It takes datasets produced by the *Fetch Taxonomic Ranks* tool (aka Taxonomy format) as input. | |
64 | |
65 ------- | |
66 | |
67 **Example** | |
68 | |
69 Suppose you have two reads, **read_1** and **read_2**, with the following taxonomic profiles (scroll sideways to see the entire dataset):: | |
70 | |
71 read_1 1 root superkingdom1 kingdom1 subkingdom1 superphylum1 phylum1 subphylum1 superclass1 class1 subclass1 superorder1 order1 suborder1 superfamily1 family1 subfamily1 tribe1 subtribe1 genus1 subgenus1 species1 subspecies1 | |
72 read_1 2 root superkingdom1 kingdom1 subkingdom1 superphylum1 phylum1 subphylum1 superclass1 class1 subclass1 superorder1 order1 suborder1 superfamily1 family1 subfamily1 tribe1 subtribe1 genus2 subgenus2 species2 subspecies2 | |
73 read_2 3 root superkingdom1 kingdom1 subkingdom1 superphylum1 phylum3 subphylum3 superclass3 class3 subclass3 superorder3 order3 suborder3 superfamily3 family3 subfamily3 tribe3 subtribe3 genus3 subgenus3 species3 subspecies3 | |
74 read_2 4 root superkingdom1 kingdom1 subkingdom1 superphylum1 phylum4 subphylum4 superclass4 class4 subclass4 superorder4 order4 suborder4 superfamily4 family4 subfamily4 tribe4 subtribe4 genus4 subgenus4 species4 subspecies4 | |
75 | |
76 For **read_1** taxonomic labels are consistent until the genus level, where the taxonomy splits into two branches, one ending with *subspecies1* and the other with *subspecies2*. This implies **that the lowest taxonomic rank read_1 can identify is SUBTRIBE**. Similarly, read_2 is diagnostic up until the **superphylum** level. As a result, the output of this tool will be:: | |
77 | |
78 read_1 2 root superkingdom1 kingdom1 subkingdom1 superphylum1 phylum1 subphylum1 superclass1 class1 subclass1 superorder1 order1 suborder1 superfamily1 family1 subfamily1 tribe1 subtribe1 n n n n | |
79 read_2 3 root superkingdom1 kingdom1 subkingdom1 superphylum1 n n n n n n n n n n n n n n n n n | |
80 | |
81 where **n** means *EMPTY*. | |
82 | |
83 -------- | |
84 | |
85 **What's up with the drop down?** | |
86 | |
87 Why do we need the *require the lowest rank to be at least* dropdown? Let's look at the above example again. Suppose you need to find only those reads that are diagnostic at the phylum level or lower. To do this, set the *require the lowest rank to be at least* dropdown to **Phylum**. As a result your output will look like this:: | |
88 | |
89 read_1 2 root superkingdom1 kingdom1 subkingdom1 superphylum1 phylum1 subphylum1 superclass1 class1 subclass1 superorder1 order1 suborder1 superfamily1 family1 subfamily1 tribe1 subtribe1 n n n n | |
90 | |
91 .. class:: infomark | |
92 | |
93 Note that **read_2** is now omitted as it matches two phyla (**phylum3** and **phylum4**) and therefore is not diagnostic (but rather cosmopolitan) at the *phylum* level. | |
94 | |
95 | |
96 | |
97 | |
98 | |
99 </help> | |
100 </tool> |
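
The rule described in the help text above can be sketched in Python. This is a minimal illustration, not the `lca.py` script the tool actually runs, whose source is not shown here. It assumes that every row has the same 24 columns as the help example (read name, taxon id, then `root` through `subspecies`), that rows for one read are adjacent, and that the `rank_bound` option values are 0-based column indices, which is consistent with `7` selecting Phylum. The names `lowest_diagnostic_rank` and `example_row` are hypothetical, and keeping the taxon id of a read's first row is an arbitrary choice made for this sketch.

```python
from itertools import groupby

# Assumed layout: column 0 is the read name, column 1 the taxon id,
# column 2 is "root", and columns 3..23 run from superkingdom to subspecies.
FIRST_RANK_COL = 2
EMPTY = "n"

RANKS = ("superkingdom kingdom subkingdom superphylum phylum subphylum "
         "superclass class subclass superorder order suborder superfamily "
         "family subfamily tribe subtribe genus subgenus species subspecies").split()


def lowest_diagnostic_rank(rows, rank_bound=0):
    """Collapse the rows of each read into one consensus row.

    Ranks are kept while all rows of a read agree; from the first
    disagreement downward they are replaced with EMPTY. When rank_bound
    is non-zero, reads whose consensus is EMPTY at that column are dropped.
    """
    results = []
    # groupby only merges adjacent rows, so rows of one read must be contiguous.
    for _read, group in groupby(rows, key=lambda row: row[0]):
        group = list(group)
        consensus = list(group[0])  # keeps the first row's taxon id (sketch choice)
        diverged = False
        for col in range(FIRST_RANK_COL, len(consensus)):
            if diverged or len({row[col] for row in group}) > 1:
                diverged = True
                consensus[col] = EMPTY
        if rank_bound and consensus[rank_bound] == EMPTY:
            continue  # not diagnostic at the required rank
        results.append(consensus)
    return results


def example_row(read, taxid, split_rank, suffix):
    """Hypothetical helper that rebuilds a row of the help example:
    ranks above split_rank are labelled '1', the rest use `suffix`."""
    i = RANKS.index(split_rank)
    labels = [f"{r}1" for r in RANKS[:i]] + [f"{r}{suffix}" for r in RANKS[i:]]
    return [read, str(taxid), "root"] + labels


rows = [
    example_row("read_1", 1, "genus", 1),
    example_row("read_1", 2, "genus", 2),
    example_row("read_2", 3, "phylum", 3),
    example_row("read_2", 4, "phylum", 4),
]

for row in lowest_diagnostic_rank(rows):
    print("\t".join(row))

# With the Phylum bound (column 7) only read_1 remains.
print([row[0] for row in lowest_diagnostic_rank(rows, rank_bound=7)])
```

With no restriction, the sketch keeps read_1 down to `subtribe1` and read_2 down to `superphylum1`, matching the rank columns of the help example. Invalid or short lines, which the third test case of the tool covers, are not handled here.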