diff tools/ncbi_blast_plus/blastxml_to_top_descr.xml @ 0:075fe5424c32 draft

Uploaded v0.0.1
author peterjc
date Thu, 07 Feb 2013 14:56:18 -0500
parents
children 662fea0fe6b2
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/tools/ncbi_blast_plus/blastxml_to_top_descr.xml	Thu Feb 07 14:56:18 2013 -0500
@@ -0,0 +1,47 @@
+<tool id="blastxml_to_top_descr" name="BLAST top hit descriptions" version="0.0.1">
+    <description>Make a table from BLAST XML</description>
+    <command interpreter="python">
+      blastxml_to_top_descr.py $blastxml_file $tabular_file $topN
+    </command>
+    <inputs>
+        <param name="blastxml_file" type="data" format="blastxml" label="BLAST results as XML"/>
+        <param name="topN" type="integer" min="1" max="100" optional="false" label="Number of descriptions" value="3"/>
+    </inputs>
+    <outputs>
+        <data name="tabular_file" format="tabular" label="Top $topN descriptions from $blastxml_file.name" />
+    </outputs>
+    <requirements>
+    </requirements>
+    <tests>
+        <test>
+            <param name="blastxml_file" value="blastp_four_human_vs_rhodopsin.xml" ftype="blastxml" />
+            <param name="topN" value="3" />
+            <output name="tabular_file" file="blastp_four_human_vs_rhodopsin_top3.tabular" ftype="tabular" />
+        </test>
+    </tests>
+    <help>
+    
+**What it does**
+
+NCBI BLAST+ (and the older NCBI 'legacy' BLAST) can output in a range of
+formats including text, tabular and a more detailed XML format. You can
+do a lot with tabular files in Galaxy (sorting, filtering, joins, etc.).
+However, the BLAST tabular output currently omits the hit descriptions
+found in the other output formats.
+
+This tool turns a BLAST XML file into a simple tabular file with one
+row per query sequence, giving the query identifier followed by the
+top hit descriptions (three by default). If a query has fewer hits
+than that, the remaining entries are left blank.
+
+**Example Usage**
+
+One simple usage would be to take a transcriptome assembly or set of
+gene predictions, run a BLAST search against the NCBI NR database, and
+then use this tool to make a table of the top three BLAST hits. This
+can give you a 'quick and dirty' annotation, potentially enough
+to spot some problems (e.g. bacterial contamination could be very
+obvious).
+
+    </help>
+</tool>
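
For reference, the behaviour described in the help text (one row per query, the top N hit descriptions, blanks when there are fewer hits) can be sketched roughly as below. This is not the blastxml_to_top_descr.py script from this changeset, only an illustration under stated assumptions: the function name top_hit_rows and the SAMPLE data are hypothetical, and taking the first word of Iteration_query-def as the query identifier is an assumption about how the identifier is chosen.

```python
import xml.etree.ElementTree as ET


def top_hit_rows(blast_xml, top_n=3):
    """Return one row per query: [query_id, descr_1, ..., descr_N]."""
    root = ET.fromstring(blast_xml)
    rows = []
    for iteration in root.iter("Iteration"):
        # Assumption: the first word of Iteration_query-def is the query ID.
        words = iteration.findtext("Iteration_query-def", default="").split()
        query_id = words[0] if words else ""
        # Hits appear in the XML in ranked order, so keep the first top_n.
        descriptions = [
            hit.findtext("Hit_def", default="")
            for hit in iteration.findall("Iteration_hits/Hit")
        ][:top_n]
        # Pad with blanks when a query has fewer than top_n hits.
        descriptions += [""] * (top_n - len(descriptions))
        rows.append([query_id] + descriptions)
    return rows


# Hypothetical, heavily trimmed BLAST XML with one query and two hits.
SAMPLE = """<BlastOutput><BlastOutput_iterations>
<Iteration>
  <Iteration_query-def>query1 example protein</Iteration_query-def>
  <Iteration_hits>
    <Hit><Hit_def>rhodopsin [Homo sapiens]</Hit_def></Hit>
    <Hit><Hit_def>rhodopsin [Bos taurus]</Hit_def></Hit>
  </Iteration_hits>
</Iteration>
</BlastOutput_iterations></BlastOutput>"""

for row in top_hit_rows(SAMPLE):
    print("\t".join(row))
```

With the default of three descriptions, the sample query has only two hits, so its row ends with one empty column. Padding every row to the same width keeps the tabular output rectangular, which is what downstream Galaxy tools for sorting, filtering and joining generally expect.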