view seqid_uncollapser.xml @ 0:78a7d28f2a15 draft

Uploaded
author idot
date Wed, 10 Jul 2013 06:13:48 -0400
parents
children
line wrap: on
line source

<tool id="cshl_seqid_uncollapser" name="Uncollapse rows">
	<description>containing collapsed sequence IDs</description>
	<command>
cat '$input' |
fastx_uncollapser -c $idcol -v -o '$output'
</command>
	<inputs>
		<param format="tabular,pslx" name="input" type="data" label="Library to uncollapse" />
	    <param name="idcol" label="Column with collased sequence-identifier" type="data_column" data_ref="input" accept_default="false" >
 		  <help>This column contains the sequence id from a collapsed FASTA file in the form of "(seq number)-(read count)" (e.g. 15-4). Use 10 if you're analyzing BLAT output</help>
		</param>
	</inputs>
	<tests>
		<test>
			<param name="input" value="fastx_seqid_uncollapse1.psl" />
			<param name="idcol" value="10" />
			<param name="output" file="fastx_seqid_uncollapse1.out" />
		</test>
	</tests>

	<outputs>
		<data format="input" name="output" metadata_source="input"
		/>
	</outputs>
  <help>

**What it does**

This tool reads a row (in a table) containing a collapsed sequence ID, and duplicates the .

.. class:: warningmark

You must specify the column containing the collapsed sequence ID (e.g. 15-4).

--------

**Example Input File**

The following input file contains two collapsed sequence identifiers at column 10: *84-2* and *87-5*

(meaning the first has multiplicity-count of 2 and the second has multiplicity count of 5)::


  23    0    0    0    0    0    0    0    +    84-2 ...
  22    0    0    0    0    0    0    0    +    87-5 ...


**Output Example**

After **uncollapsing** (on column 10), the line of the first sequence-identifier is repeated *twice*, and the line of the second sequence-identifier is repeated *five* times::

  23    0    0    0    0    0    0    0    +    84-2 ...
  23    0    0    0    0    0    0    0    +    84-2 ...
  22    0    0    0    0    0    0    0    +    87-5 ...
  22    0    0    0    0    0    0    0    +    87-5 ...
  22    0    0    0    0    0    0    0    +    87-5 ...
  22    0    0    0    0    0    0    0    +    87-5 ...
  22    0    0    0    0    0    0    0    +    87-5 ...


Uncollapsing a text file allows analsys of collapsed FASTA files to be used with any tool which doesn't 'understand' collapsed multiplicity counts.

.. class:: infomark

See the *Collapse* tool in the *FASTA Manipulation* category for more details about collapsing FASTA files.

-----

This tool is based on `FASTX-toolkit`__ by Assaf Gordon.

 .. __: http://hannonlab.cshl.edu/fastx_toolkit/
    
</help>
</tool>
<!-- FASTX-Uncollapser is part of the FASTX-toolkit, by A.Gordon (gordon@cshl.edu) -->