Mercurial > repos > xuebing > sharplabtool
view tools/unix_tools/sort_tool.xml @ 1:cdcb0ce84a1b
Uploaded
author | xuebing |
---|---|
date | Fri, 09 Mar 2012 19:45:15 -0500 |
parents | 9071e359b9a3 |
children |
line wrap: on
line source
<tool id="cshl_sort_tool" name="Sort"> <!-- note 1: the 'version' sort (or natual order sort) requires GNU Coreutils 7.1 or later note 2: for greater efficiency, sort buffer size is very large. If your Galaxy server doesn't have so much memory (or the sorts you use don't require it) - you can decrease the memory size. (argument is "-S 2G") --> <command>sort -S 2G $unique #for $key in $sortkeys '-k ${key.column},${key.column}${key.order}${key.style}' #end for $input > $out_file1 </command> <inputs> <param format="txt" name="input" type="data" label="Sort Query" /> <param name="unique" type="select" label="Output only unique values?"> <option value="">No</option> <option value="-u">Yes</option> </param> <repeat name="sortkeys" title="sort key"> <param name="column" label="on column" type="data_column" data_ref="input" accept_default="true" /> <param name="order" type="select" display="radio" label="in"> <option value="r">Descending order</option> <option value="">Ascending order</option> </param> <param name="style" type="select" display="radio" label="Flavor"> <option value="n">Fast numeric sort ([-n])</option> <option value="g">General numeric sort ( scientific notation [-g])</option> <option value="V">Natural/Version sort ([-V]) </option> <option value="">Alphabetical sort</option> </param> </repeat> </inputs> <tests> <test> <!-- Sort Descending numerical order, with scientific notation --> <param name="input" value="unix_sort_input1.txt" /> <output name="output" file="unix_sort_output1.txt" /> <param name="unique" value="No" /> <param name="column" value="2" /> <param name="order" value="r" /> <param name="style" value="g" /> </test> <test> <!-- Sort Ascending numerical order, with scientific notation - outputing unique values only The catch: chr15 appears twice, with the same value (0.0314 and 3.14e-2). In the output, it should appear only once because of the unique flag --> <param name="input" value="unix_sort_input1.txt" /> <output name="output" file="unix_sort_output2.txt" /> <param name="unique" value="Yes" /> <param name="column" value="2" /> <param name="order" value="" /> <param name="style" value="g" /> </test> <test> <!-- Sort Ascending 'natural' order --> <param name="input" value="unix_sort_input1.txt" /> <output name="output" file="unix_sort_output3.txt" /> <param name="unique" value="No" /> <param name="column" value="1" /> <param name="order" value="" /> <param name="style" value="V" /> </test> </tests> <outputs> <data format="input" name="out_file1" metadata_source="input"/> </outputs> <help> **What it does** This tool runs the unix **sort** command on the selected data file. ----- **Sorting Styles** * **Fast Numeric**: sort by numeric values. Handles integer values (e.g. 43, 134) and decimal-point values (e.g. 3.14). *Does not* handle scientific notation (e.g. -2.32e2). * **General Numeric**: sort by numeric values. Handles all numeric notations (including scientific notation). Slower than *fast numeric*, so use only when necessary. * **Natural Sort**: Sort in 'natural' order (natural to humans, not to computers). See example below. * **Alphabetical sort**: Sort in strict alphabetical order. See example below. **Sorting Examples** Given the following list:: chr4 chr13 chr1 chr10 chr20 chr2 **Alphabetical sort** would produce the following sorted list:: chr1 chr10 chr13 chr2 chr20 chr4 **Natural Sort** would produce the following sorted list:: chr1 chr2 chr4 chr10 chr13 chr20 .. class:: infomark If you're planning to use the file with another tool that expected sorted files (such as *join*), you should use the **Alphabetical sort**, not the **Natural Sort**. Natural sort order is easier for humans, but is unnatural for computer programs. </help> </tool>