diff tools/taxonomy/gi2taxonomy.xml @ 0:9071e359b9a3

Uploaded
author xuebing
date Fri, 09 Mar 2012 19:37:19 -0500
parents
children
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/tools/taxonomy/gi2taxonomy.xml	Fri Mar 09 19:37:19 2012 -0500
@@ -0,0 +1,102 @@
+<tool id="Fetch Taxonomic Ranks" name="Fetch taxonomic representation" version="1.1.0">
+  <description></description>
+    <requirements>
+        <requirement type="package">taxonomy</requirement>
+    </requirements>
+  <command interpreter="python">gi2taxonomy.py $input $giField $idField $out_file1 ${GALAXY_DATA_INDEX_DIR}</command>
+  <inputs>
+    <param format="tabular" name="input" type="data" label="Show taxonomic representation for"></param>
+    <param name="giField" label="GIs column" type="data_column" data_ref="input" numerical="True" help="select column containing GI numbers"/>
+    <param name="idField" label="Name column" type="data_column" data_ref="input" help="select column containing identifiers you want to include into output"/>
+  </inputs>
+  <outputs>
+    <data format="taxonomy" name="out_file1" />
+  </outputs>
+  <requirements>
+    <requirement type="binary">taxBuilder</requirement>
+  </requirements>
+  <tests>
+    <test>
+      <param name="input" ftype="tabular" value="taxonomy2gi-input.tabular"/>
+      <param name="giField" value="1"/>
+      <param name="idField" value="2"/>
+      <output name="out_file1" file="taxonomy2gi-output.tabular"/>
+    </test>
+  </tests>
+
+  <help>
+
+.. class:: infomark
+
+Use *Filter and Sort->Filter* to restrict the output of this tool to the desired taxonomic ranks. You can also use *Text Manipulation->Cut* to remove unwanted columns from the output.
+
+------
+
+**What it does**
+
+Fetches taxonomic information for a list of GI numbers (sequence identifiers used by the National Center for Biotechnology Information, http://www.ncbi.nlm.nih.gov).
+
+-------
+
+**Example**
+
+Suppose you have BLAST output that looks like this::
+  
+   +-----------------------+----------+----------+-----------------+------------+------+--------+
+   | queryId               | targetGI | identity | alignmentLength | mismatches | gaps | score  |
+   +-----------------------+----------+----------+-----------------+------------+------+--------+
+   | 1L_EYKX4VC01BXWX1_265 |  1430919 |    90.09 |             212 |         15 |    6 | 252.00 | 
+   +-----------------------+----------+----------+-----------------+------------+------+--------+
+
+and you want to obtain the full taxonomic representation for the GIs listed in the *targetGI* column. If you set the parameters as shown here:
+
+.. image:: ./static/images/fetchTax.png
+
+
+the tool will generate the following output (you may need to scroll sideways to see the entire line)::
+
+  1                     2    3    4         5       6 7 8        9        10            11       12 13               14       15         16          17        18  19  20 21  22  23           24 25
+  1L_EYKX4VC01BXWX1_265 9606 root Eukaryota Metazoa n n Chordata Craniata Gnathostomata Mammalia n  Euarchontoglires Primates Haplorrhini Hominoidea Hominidae n   n   n  Homo n  Homo sapiens n  1430919
+
+In other words, the tool printed the *Name column* and the *taxonomy ID*, appended 22 columns containing taxonomic ranks from root to subspecies, and added the *GI* as the last column (column 25, taken from the *GIs column* dropdown). Below is a formal definition of output columns 1 to 24::
+
+    Column Definition
+   ------- ------------------------------------------
+         1 Name (specified by 'Name column' dropdown)
+         2 taxonomy ID
+         3 root
+         4 superkingdom
+         5 kingdom
+         6 subkingdom
+         7 superphylum
+         8 phylum
+         9 subphylum
+        10 superclass
+        11 class
+        12 subclass
+        13 superorder
+        14 order
+        15 suborder
+        16 superfamily
+        17 family
+        18 subfamily
+        19 tribe
+        20 subtribe
+        21 genus
+        22 subgenus
+        23 species
+        24 subspecies
+
+------
+
+.. class:: warningmark
+
+**Why do I have these "n" things?** 
+
+Be aware that the NCBI taxonomy (ftp://ftp.ncbi.nih.gov/pub/taxonomy/) this tool relies upon is incomplete. This means that for many species one or more ranks are absent and are represented as "**n**". In the above example, *subkingdom*, *superphylum*, and several other ranks are missing.
+
+
+</help>
+</tool>
+
+
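Note on consuming the output: the help text in this changeset describes a 25-column row (name, taxonomy ID, 22 ranks from root to subspecies, then the GI), with "n" standing in for ranks that NCBI does not define. The following is a minimal sketch of how a downstream script might read one such row. It is not part of the changeset. It assumes the output is tab-delimited (the example row contains "Homo sapiens", so a whitespace split would not work) and that the columns follow the example row shown in the help. The names `RANKS` and `parse_taxonomy_line` are hypothetical.

```python
# Rank names for columns 3-24, in the order listed in the tool's help text.
RANKS = [
    "root", "superkingdom", "kingdom", "subkingdom", "superphylum", "phylum",
    "subphylum", "superclass", "class", "subclass", "superorder", "order",
    "suborder", "superfamily", "family", "subfamily", "tribe", "subtribe",
    "genus", "subgenus", "species", "subspecies",
]


def parse_taxonomy_line(line, missing="n"):
    """Split one output row into name, taxonomy ID, known ranks, and GI.

    Assumes tab-delimited fields; ranks equal to `missing` are dropped.
    """
    fields = line.rstrip("\n").split("\t")
    expected = len(RANKS) + 3  # name + taxonomy ID + 22 ranks + GI
    if len(fields) != expected:
        raise ValueError("expected %d columns, got %d" % (expected, len(fields)))
    name, tax_id, gi = fields[0], fields[1], fields[-1]
    ranks = {
        rank: value
        for rank, value in zip(RANKS, fields[2:-1])
        if value != missing
    }
    return {"name": name, "tax_id": tax_id, "gi": gi, "ranks": ranks}


# The example row from the help text, rebuilt with tab separators.
example = "\t".join([
    "1L_EYKX4VC01BXWX1_265", "9606", "root", "Eukaryota", "Metazoa", "n", "n",
    "Chordata", "Craniata", "Gnathostomata", "Mammalia", "n",
    "Euarchontoglires", "Primates", "Haplorrhini", "Hominoidea", "Hominidae",
    "n", "n", "n", "Homo", "n", "Homo sapiens", "n", "1430919",
])
row = parse_taxonomy_line(example)
print(row["name"], row["tax_id"], row["ranks"]["species"], row["gi"])
```

Dropping the "n" placeholders at parse time means later code can test for a rank with `"tribe" in row["ranks"]` rather than comparing against a sentinel string.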