diff TopHit_namefilter/TopHit_namefilter.xml @ 0:9f1fe290345e default tip

Migrated tool version 0.1.Alx from old tool shed archive to new tool shed repository
author abossers
date Tue, 07 Jun 2011 18:07:34 -0400
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/TopHit_namefilter/TopHit_namefilter.xml	Tue Jun 07 18:07:34 2011 -0400
@@ -0,0 +1,112 @@
+<tool id="TopHit_namefilter" name="TopHit filter" version="0.1.Alx">
+  <description>Simple filter to keep the first N lines per identifier in a file</description>
+  <command interpreter="perl">
+    TopHit_namefilter_galaxy.pl
+      $input
+      $column
+      "$splitter"
+      $hits
+      $output_file
+      <!-- 2&gt;$logfile -->
+  </command>
+  <inputs>
+   <param name="input" type="data" format="tabular,txt" label="Input tabular or plain text file" />
+   <param name="column" type="integer" size="4" value="1" label="Column number to use after splitting each line" />
+   <param name="splitter" type="text" size="10" value="\t" label="Splitter character/code to use" help="See help below for advanced options and how to use {pipe}" >
+		<sanitizer>
+			<valid>
+				<add value="\"/>
+				<add value=">"/>
+				<add value="%"/>
+				<add value="|"/>
+			</valid>
+		</sanitizer>
+   </param>
+   <param name="hits" type="integer" size="4" value="1" label="Number of occurrences to keep" help="The kept lines will not be sorted!" />
+  </inputs>
+  <outputs>
+    <data name="output_file" format="input" label="Filtered table/text" />
+  </outputs>
+  <tests>
+  </tests>
+  <help>
+**What it does**
+
+TopHit_namefilter is a simple filter that keeps only the top hit, i.e. the first N occurrence(s), of each identifier in a file.
+It is useful for keeping only the first N hits per query in BLAST output when multiple hits were returned (and you do not want to rerun the BLAST analysis).
+
+Please be aware that NO additional filtering or checking is done, for instance on the E-values of BLAST hits.
+Top hit = FIRST hit, not necessarily the best. If more than one hit per identifier is kept,
+the hits are NOT sorted: a second hit that occurs somewhere else in the input also ends up
+somewhere else in the output file (see the example below, where hits = 2).
+
+**Comments/feedback** on the Perl script or GALAXY wrapper: alex.bossers@wur.nl
+
+-----
+
+**Note!** Take care when choosing a splitter. The splitter is used as a Perl split pattern, so characters that have a special
+meaning there (such as the pipe) must be escaped with a leading \\.
+
+Examples of splitters. Splitting is used only to find the identifier column; the output always contains the ORIGINAL, unsplit lines:
+
+::
+
+  Splitter   Meaning                           Example line to split          Split result for filtering only!
+  --------   -------------------------------   -----------------------        --------------------------------
+    \t       Single tab                        Foo&lt;tab&gt;Bar&lt;tab&gt;here    ---&gt;   Foo          Bar        here
+    \|       Single pipe                       Foo&lt;tab&gt;Bar|here        ---&gt;   Foo&lt;tab&gt;Bar  here
+    -        Single dash                       Foo-Bar                 ---&gt;   Foo          Bar
+    -|\|     Combined splits on dash OR pipe   Foo-Bar|here            ---&gt;   Foo          Bar        here
+
+
+-----
+
+**EXAMPLE**
+
+Parameters: Column = 1, **hits = 2** and splitter = \\t 
+
+**Input**
+
+Any text/tabular file:
+
+::
+
+   Q3262-21	gi|71066702|gb|AE016828.2|	tja..here something extra
+   Q3262-23	gi|71066702|gb|AE016828.2|	okay
+   Q3262-24	gi|71066702|gb|AE016828.2| nothing there
+   Q3262-21	gi|71066702|gb|AE016828.2| enhier	was zonder space :)
+   Q3262-26	gi|71066702|gb|AE016828.2|	or still
+   Q3262-21	gi|71066702|gb|AE016828.2|
+   Q3262-21	gi|71066702|gb|AE016828.2|
+   Q3262-21	gi|71066702|gb|AE016828.2|
+   Q3262-21	gi|71066702|gb|AE016828.2|
+   Q3262-21	gi|145004|gb|M80806.1|COXTRANSPO
+   Q3262-21	gi|144996|gb|M20482.1|COXHSPAB
+   Q3262-21	gi|161761570|gb|CP000890.1|
+   Q3262-30	gi|161761570|gb|CP000890.1|
+   Q3262-21	gi|161761570|gb|CP000890.1|
+   Q3262-21	gi|161761570|gb|CP000890.1|
+   Q3262-21	gi|161761570|gb|CP000890.1|
+
+
+**Output**
+
+::
+
+   Q3262-21	gi|71066702|gb|AE016828.2|	tja..here something extra
+   Q3262-23	gi|71066702|gb|AE016828.2|	okay
+   Q3262-24	gi|71066702|gb|AE016828.2| nothing there
+   Q3262-21	gi|71066702|gb|AE016828.2| enhier	was zonder space :)
+   Q3262-26	gi|71066702|gb|AE016828.2|	or still
+   Q3262-30	gi|161761570|gb|CP000890.1|
+
+-----
+
+Please acknowledge our work if you find it useful!
+
+|
+
+
+  </help>
+</tool>
+
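The filtering behaviour described in the help text above can be illustrated with a short sketch. This is not the content of `TopHit_namefilter_galaxy.pl`, which is not part of this diff; it is a minimal Python approximation under stated assumptions: the splitter is treated as a regular expression, columns are counted from 1, and kept lines stay in input order. The function name `keep_first_hits` and the handling of lines with too few fields are hypothetical.

```python
import re
from collections import Counter


def keep_first_hits(lines, column=1, splitter=r"\t", hits=1):
    """Return the first `hits` lines seen for each identifier, in input order."""
    if column < 1:
        raise ValueError("column is 1-based and must be at least 1")
    pattern = re.compile(splitter)
    seen = Counter()  # identifier -> number of lines kept so far
    kept = []
    for line in lines:
        # Split only to find the identifier; the original line is what gets kept.
        fields = pattern.split(line.rstrip("\n"))
        if column > len(fields):
            continue  # assumption: skip lines with too few fields
        key = fields[column - 1]
        if seen[key] < hits:
            seen[key] += 1
            kept.append(line)
    return kept


if __name__ == "__main__":
    # Hypothetical sample data, not taken from the tool's test files.
    sample = [
        "Q1\thitA\n",
        "Q2\thitB\n",
        "Q1\thitC\n",
        "Q1\thitD\n",
    ]
    for line in keep_first_hits(sample, column=1, splitter=r"\t", hits=2):
        print(line, end="")
```

With `hits=2` the third `Q1` line is dropped, while the second `Q1` line stays where it occurred in the input. A combined splitter such as `-|\|` splits on a dash or a pipe, matching the last row of the splitter table in the help text.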