diff tools/taxonomy/find_diag_hits.xml @ 0:9071e359b9a3
Uploaded
author | xuebing |
---|---|
date | Fri, 09 Mar 2012 19:37:19 -0500 |
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/tools/taxonomy/find_diag_hits.xml Fri Mar 09 19:37:19 2012 -0500
@@ -0,0 +1,99 @@
+<tool id="find_diag_hits" name="Find diagnostic hits" version="1.0.0">
+  <description></description>
+  <requirements>
+    <requirement type="package">taxonomy</requirement>
+  </requirements>
+  <command interpreter="python">find_diag_hits.py $input1 $id_col $rank_list $out_format $out_file1</command>
+  <inputs>
+    <param format="taxonomy" name="input1" type="data" label="Find diagnostic hits in"/>
+    <param name="id_col" type="data_column" data_ref="input1" numerical="False" label="Select column with sequence id" />
+    <param name="rank_list" type="select" display="checkboxes" multiple="true" label="select taxonomic ranks">
+      <option value="superkingdom">Superkingdom</option>
+      <option value="kingdom">Kingdom</option>
+      <option value="subkingdom">Subkingdom</option>
+      <option value="superphylum">Superphylum</option>
+      <option value="phylum">Phylum</option>
+      <option value="subphylum">Subphylum</option>
+      <option value="superclass">Superclass</option>
+      <option value="class">Class</option>
+      <option value="subclass">Subclass</option>
+      <option value="superorder">Superorder</option>
+      <option value="order">Order</option>
+      <option value="suborder">Suborder</option>
+      <option value="superfamily">Superfamily</option>
+      <option value="family">Family</option>
+      <option value="subfamily">Subfamily</option>
+      <option value="tribe">Tribe</option>
+      <option value="subtribe">Subtribe</option>
+      <option value="genus">Genus</option>
+      <option value="subgenus">Subgenus</option>
+      <option selected="true" value="species">Species</option>
+      <option value="subspecies">Subspecies</option>
+    </param>
+    <param name="out_format" type="select" label="Select output format">
+      <option value="reads">Diagnostic read list</option>
+      <option value="counts">Number of diagnostic reads per taxonomic rank</option>
+    </param>
+  </inputs>
+  <outputs>
+    <data format="tabular" name="out_file1" />
+  </outputs>
+  <tests>
+    <test>
+      <param name="input1" value="taxonomyGI.taxonomy" ftype="taxonomy"/>
+      <param name="id_col" value="1" />
+      <param name="rank_list" value="order,genus" />
+      <param name="out_format" value="counts" />
+      <output name="out_file1" file="find_diag_hits.tabular" />
+    </test>
+  </tests>
+
+
+<help>
+
+**What it does**
+
+When performing metagenomic analyses, it is often necessary to identify sequence reads that correspond to a single taxonomic group, in other words reads that are diagnostic at a particular taxonomic rank. This tool does that. It takes the output of *Taxonomy manipulation->Fetch Taxonomic Ranks* as input and produces either a list of reads that are unique to one taxon at each selected rank, or a list of taxa with the number of diagnostic reads for each.
+
+------
+
+**Example**
+
+Suppose *Taxonomy manipulation->Fetch Taxonomic Ranks* generated the following taxonomy representation::
+
+    read1 2 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Laurasiatheria n Ruminantia n Bovidae Bovinae n n Bos n Bos taurus n
+    read2 12585 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Primates Haplorrhini Hominoidea Hominidae n n n Homo n Homo sapiens n
+    read1 58615 root Eukaryota Metazoa n n Arthropoda n Hexapoda Insecta Neoptera Amphiesmenoptera Lepidoptera Glossata Papilionoidea Nymphalidae Nymphalinae Melitaeini Phyciodina Anthanassa n Anthanassa otanes n
+    read3 56785 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Primates Haplorrhini Hominoidea Hominidae n n n Homo n Homo sapiens n
+
+Running this tool with the following parameters:
+
+ * *Select column with sequence id* set to **c1**
+ * *Select taxonomic ranks* with **order** and **genus** checked
+ * *Select output format* set to **Diagnostic read list**
+
+will return::
+
+    read2 Primates order
+    read3 Primates order
+    read2 Homo genus
+    read3 Homo genus
+
+Setting *Select output format* to **Number of diagnostic reads per taxonomic rank** will produce::
+
+    Primates 2 order
+    Homo 2 genus
+
+.. class:: infomark
+
+Note that **read1** is omitted because it is non-unique: it hits both a mammal (*Bos taurus*) and an insect (*Anthanassa otanes*).
+
+--------
+
+.. class:: warningmark
+
+This tool omits "**n**" entries, which correspond to ranks missing from the NCBI taxonomy. In the example above, *Homo sapiens* has an order name (Primates) while *Bos taurus* does not.
+
+
+</help>
+</tool>
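
The selection rule described in the help text can be sketched roughly as follows. This is an illustration only and not the contents of `find_diag_hits.py`, which is not part of this diff. The function names, the 0-based `id_col`, the column offset `FIRST_RANK_COL`, and the rule that an "n" among a read's hits disqualifies it at that rank are all assumptions chosen to reproduce the documented example.

```python
from collections import Counter, defaultdict

# Rank order copied from the tool's <option> list. The column offset is an
# assumption read off the example rows: read id, numeric id, "root", then ranks.
RANKS = [
    "superkingdom", "kingdom", "subkingdom", "superphylum", "phylum",
    "subphylum", "superclass", "class", "subclass", "superorder", "order",
    "suborder", "superfamily", "family", "subfamily", "tribe", "subtribe",
    "genus", "subgenus", "species", "subspecies",
]
FIRST_RANK_COL = 3


def find_diagnostic_hits(rows, id_col, ranks):
    """Return (read, taxon, rank) for reads whose hits all name one taxon.

    id_col is 0-based here (hypothetical; the tool's own convention may differ).
    """
    hits = []
    for rank in ranks:
        col = FIRST_RANK_COL + RANKS.index(rank)
        taxa_by_read = defaultdict(set)
        for fields in rows:
            taxa_by_read[fields[id_col]].add(fields[col])
        for read, taxa in taxa_by_read.items():
            # Assumed rule: exactly one distinct value, and it is not "n".
            if len(taxa) == 1 and "n" not in taxa:
                hits.append((read, next(iter(taxa)), rank))
    return hits


def count_diagnostic_hits(hits):
    """Collapse the read list into (taxon, count, rank) tuples."""
    counts = Counter((taxon, rank) for _, taxon, rank in hits)
    return [(taxon, n, rank) for (taxon, rank), n in counts.items()]


# The four example rows from the help text; "," stands in for the tab separator.
example_rows = [line.split(",") for line in [
    "read1,2,root,Eukaryota,Metazoa,n,n,Chordata,Craniata,Gnathostomata,"
    "Mammalia,n,Laurasiatheria,n,Ruminantia,n,Bovidae,Bovinae,n,n,Bos,n,"
    "Bos taurus,n",
    "read2,12585,root,Eukaryota,Metazoa,n,n,Chordata,Craniata,Gnathostomata,"
    "Mammalia,n,Euarchontoglires,Primates,Haplorrhini,Hominoidea,Hominidae,"
    "n,n,n,Homo,n,Homo sapiens,n",
    "read1,58615,root,Eukaryota,Metazoa,n,n,Arthropoda,n,Hexapoda,Insecta,"
    "Neoptera,Amphiesmenoptera,Lepidoptera,Glossata,Papilionoidea,"
    "Nymphalidae,Nymphalinae,Melitaeini,Phyciodina,Anthanassa,n,"
    "Anthanassa otanes,n",
    "read3,56785,root,Eukaryota,Metazoa,n,n,Chordata,Craniata,Gnathostomata,"
    "Mammalia,n,Euarchontoglires,Primates,Haplorrhini,Hominoidea,Hominidae,"
    "n,n,n,Homo,n,Homo sapiens,n",
]]

hits = find_diagnostic_hits(example_rows, 0, ["order", "genus"])
for read, taxon, rank in hits:
    print(read, taxon, rank)
for taxon, n, rank in count_diagnostic_hits(hits):
    print(taxon, n, rank)
```

With the example rows, the sketch lists read2 and read3 for Primates (order) and Homo (genus) and skips read1, matching the help text. Whether the real script treats "n" the same way for reads with mixed hits is not confirmed by this diff.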