comparison seqid_uncollapser.xml @ 0:78a7d28f2a15 draft
Uploaded
author | idot |
---|---|
date | Wed, 10 Jul 2013 06:13:48 -0400 |
parents | |
children |
-1:000000000000 | 0:78a7d28f2a15 |
---|---|
1 <tool id="cshl_seqid_uncollapser" name="Uncollapse rows"> | |
2 <description>containing collapsed sequence IDs</description> | |
3 <command> | |
4 cat '$input' | | |
5 fastx_uncollapser -c $idcol -v -o '$output' | |
6 </command> | |
7 <inputs> | |
8 <param format="tabular,pslx" name="input" type="data" label="Library to uncollapse" /> | |
9 <param name="idcol" label="Column with collapsed sequence-identifier" type="data_column" data_ref="input" accept_default="false" > | |
10 <help>This column contains the sequence id from a collapsed FASTA file in the form of "(seq number)-(read count)" (e.g. 15-4). Use 10 if you're analyzing BLAT output</help> | |
11 </param> | |
12 </inputs> | |
13 <tests> | |
14 <test> | |
15 <param name="input" value="fastx_seqid_uncollapse1.psl" /> | |
16 <param name="idcol" value="10" /> | |
17 <output name="output" file="fastx_seqid_uncollapse1.out" /> | |
18 </test> | |
19 </tests> | |
20 | |
21 <outputs> | |
22 <data format="input" name="output" metadata_source="input" | |
23 /> | |
24 </outputs> | |
25 <help> | |
26 | |
27 **What it does** | |
28 | |
29 This tool reads each row (in a table) containing a collapsed sequence ID, and duplicates the row according to the read count in that ID. | |
30 | |
31 .. class:: warningmark | |
32 | |
33 You must specify the column containing the collapsed sequence ID (e.g. 15-4). | |
34 | |
35 -------- | |
36 | |
37 **Example Input File** | |
38 | |
39 The following input file contains two collapsed sequence identifiers in column 10: *84-2* and *87-5* | |
40 | |
41 (meaning the first has a multiplicity count of 2 and the second has a multiplicity count of 5):: | |
42 | |
43 | |
44 23 0 0 0 0 0 0 0 + 84-2 ... | |
45 22 0 0 0 0 0 0 0 + 87-5 ... | |
46 | |
47 | |
48 **Output Example** | |
49 | |
50 After **uncollapsing** (on column 10), the line of the first sequence-identifier is repeated *twice*, and the line of the second sequence-identifier is repeated *five* times:: | |
51 | |
52 23 0 0 0 0 0 0 0 + 84-2 ... | |
53 23 0 0 0 0 0 0 0 + 84-2 ... | |
54 22 0 0 0 0 0 0 0 + 87-5 ... | |
55 22 0 0 0 0 0 0 0 + 87-5 ... | |
56 22 0 0 0 0 0 0 0 + 87-5 ... | |
57 22 0 0 0 0 0 0 0 + 87-5 ... | |
58 22 0 0 0 0 0 0 0 + 87-5 ... | |
59 | |
60 | |
61 Uncollapsing a text file allows analysis results for collapsed FASTA files to be used with any tool that doesn't 'understand' collapsed multiplicity counts. | |
62 | |
63 .. class:: infomark | |
64 | |
65 See the *Collapse* tool in the *FASTA Manipulation* category for more details about collapsing FASTA files. | |
66 | |
67 ----- | |
68 | |
69 This tool is based on `FASTX-toolkit`__ by Assaf Gordon. | |
70 | |
71 .. __: http://hannonlab.cshl.edu/fastx_toolkit/ | |
72 | |
73 </help> | |
74 </tool> | |
75 <!-- FASTX-Uncollapser is part of the FASTX-toolkit, by A.Gordon (gordon@cshl.edu) --> |
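
The row duplication described in the tool help can be sketched in a few lines of Python. This is only an illustration of the documented behaviour, not the FASTX-toolkit implementation: the function name `uncollapse`, the `COLLAPSED_ID` pattern, and the choice to raise an error on a malformed ID are all assumptions made for this example. The column index is taken to be 1-based, as in the help text ("Use 10 if you're analyzing BLAT output").

```python
import re

# Assumed ID shape, taken from the tool help: "(seq number)-(read count)", e.g. "15-4".
COLLAPSED_ID = re.compile(r"^\d+-(\d+)$")


def uncollapse(lines, idcol):
    """Yield each tab-separated row once per read in its collapsed ID.

    `idcol` is 1-based, mirroring the column number used in the tool help.
    Hypothetical helper: how fastx_uncollapser itself treats malformed
    rows is not stated in the source, so this sketch simply raises.
    """
    for line in lines:
        row = line.rstrip("\n")
        fields = row.split("\t")
        if idcol < 1 or idcol > len(fields):
            raise ValueError(f"row has no column {idcol}: {row!r}")
        match = COLLAPSED_ID.match(fields[idcol - 1])
        if match is None:
            raise ValueError(f"not a collapsed sequence ID: {fields[idcol - 1]!r}")
        read_count = int(match.group(1))
        # Repeat the unchanged row once for every read that was collapsed into it.
        for _ in range(read_count):
            yield row


if __name__ == "__main__":
    # The two rows from the help example, with the collapsed ID in column 10.
    rows = [
        "23\t0\t0\t0\t0\t0\t0\t0\t+\t84-2",
        "22\t0\t0\t0\t0\t0\t0\t0\t+\t87-5",
    ]
    for out_row in uncollapse(rows, idcol=10):
        print(out_row)
```

Run on the two example rows, the sketch emits the first row twice and the second row five times, matching the output example in the help text.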