Mercurial > repos > xuebing > sharplabtool
diff tools/taxonomy/gi2taxonomy.xml @ 0:9071e359b9a3
Uploaded
author | xuebing |
---|---|
date | Fri, 09 Mar 2012 19:37:19 -0500 |
parents | |
children |
line wrap: on
line diff
--- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/tools/taxonomy/gi2taxonomy.xml Fri Mar 09 19:37:19 2012 -0500 @@ -0,0 +1,102 @@ +<tool id="Fetch Taxonomic Ranks" name="Fetch taxonomic representation" version="1.1.0"> + <description></description> + <requirements> + <requirement type="package">taxonomy</requirement> + </requirements> + <command interpreter="python">gi2taxonomy.py $input $giField $idField $out_file1 ${GALAXY_DATA_INDEX_DIR}</command> + <inputs> + <param format="tabular" name="input" type="data" label="Show taxonomic representation for"></param> + <param name="giField" label="GIs column" type="data_column" data_ref="input" numerical="True" help="select column containing GI numbers"/> + <param name="idField" label="Name column" type="data_column" data_ref="input" help="select column containing identifiers you want to include into output"/> + </inputs> + <outputs> + <data format="taxonomy" name="out_file1" /> + </outputs> + <requirements> + <requirement type="binary">taxBuilder</requirement> + </requirements> + <tests> + <test> + <param name="input" ftype="tabular" value="taxonomy2gi-input.tabular"/> + <param name="giField" value="1"/> + <param name="idField" value="2"/> + <output name="out_file1" file="taxonomy2gi-output.tabular"/> + </test> + </tests> + + <help> + +.. class:: infomark + +Use *Filter and Sort->Filter* to restrict output of this tool to desired taxonomic ranks. You can also use *Text Manipulation->Cut* to remove unwanted columns from the output. + +------ + +**What it does** + +Fetches taxonomic information for a list of GI numbers (sequences identifiers used by the National Center for Biotechnology Information http://www.ncbi.nlm.nih.gov). + +------- + +**Example** + +Suppose you have BLAST output that looks like this:: + + +-----------------------+----------+----------+-----------------+------------+------+--------+ + | queryId | targetGI | identity | alignmentLength | mismatches | gaps | score | + +-----------------------+----------+----------+-----------------+------------+------+--------+ + | 1L_EYKX4VC01BXWX1_265 | 1430919 | 90.09 | 212 | 15 | 6 | 252.00 | + +-----------------------+----------+----------+-----------------+------------+------+--------+ + +and you want to obtain full taxonomic representation for GIs listed in *targetGI* column. If you set parameters as shown here: + +.. image:: ./static/images/fetchTax.png + + +the tool will generate the following output (you may need to scroll sideways to see the entire line):: + + 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25 + 1L_EYKX4VC01BXWX1_265 9606 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n Euarchontoglires Primates Haplorrhini Hominoidea Hominidae n n n Homo n Homo sapiens n 1430919 + +In other words the tool printed *Name column*, *taxonomy Id*, appended 22 columns containing taxonomic ranks from Superkingdom to Subspecies and added *GI* as the last column. Below is a formal definition of the output columns:: + + Column Definition + ------- ------------------------------------------ + 1 Name (specified by 'Name column' dropdown) + 2 GI (specified by 'GI column' dropdown) + 3 root + 4 superkingdom + 5 kingdom + 6 subkingdom + 7 superphylum + 8 phylum + 9 subphylum + 10 superclass + 11 class + 12 subclass + 13 superorder + 14 order + 15 suborder + 16 superfamily + 17 family + 18 subfamily + 19 tribe + 20 subtribe + 21 genus + 22 subgenus + 23 species + 24 subspecies + +------ + +.. class:: warningmark + +**Why do I have these "n" things?** + +Be aware that the NCBI taxonomy (ftp://ftp.ncbi.nih.gov/pub/taxonomy/) this tool relies upon is incomplete. This means that for many species one or more ranks are absent and represented as "**n**". In the above example *subkingdom*, *superphylum* etc. are missing. + + +</help> +</tool> + +