view fastx_barcode_splitter.xml @ 0:78a7d28f2a15 draft

Uploaded
author idot
date Wed, 10 Jul 2013 06:13:48 -0400
parents
children
line wrap: on
line source

<tool id="cshl_fastx_barcode_splitter" name="Barcode Splitter">
	<description></description>
	<command interpreter="sh">fastx_barcode_splitter_galaxy_wrapper.sh $BARCODE $input "$input.name" "$output.files_path" --mismatches $mismatches --partial $partial $EOL > $output </command>

	<inputs>
		<param format="txt" name="BARCODE" type="data" label="Barcodes to use" />
		<param format="fasta,fastq,fastqsanger" name="input" type="data" label="Library to split" />

		<param name="EOL" type="select" label="Barcodes found at">
			<option value="--bol">Start of sequence (5' end)</option>
			<option value="--eol">End of sequence (3' end)</option>
		</param>

		<param name="mismatches" type="integer" size="3" value="2" label="Number of allowed mismatches" />
		
		<param name="partial" type="integer" size="3" value="0" label="Number of allowed barcodes nucleotide deletions" />
	
	</inputs>
	
	<tests>
		<test>
			<!-- Split a FASTQ file -->
			<param name="BARCODE" value="fastx_barcode_splitter1.txt" />
			<param name="input" value="fastx_barcode_splitter1.fastq" />
			<param name="EOL" value="Start of sequence (5' end)" />
			<param name="mismatches" value="2" />
			<param name="partial" value="0" />
			<output name="output" file="fastx_barcode_splitter1.out" />
		</test>
	</tests>

	<outputs>
		<data format="html" name="output" 
		/>
	</outputs>
<help>

**What it does**

This tool splits a FASTQ or FASTA file into several files, using barcodes as the split criteria.

--------

**Barcode file Format**

Barcode files are simple text files.
Each line should contain an identifier (descriptive name for the barcode), and the barcode itself (A/C/G/T), separated by a TAB character.
Example::

    #This line is a comment (starts with a 'number' sign)
    BC1	GATCT
    BC2	ATCGT
    BC3	GTGAT
    BC4 TGTCT
    
For each barcode, a new FASTQ file will be created (with the barcode's identifier as part of the file name).
Sequences matching the barcode will be stored in the appropriate file.

One additional FASTQ file will be created (the 'unmatched' file), where sequences not matching any barcode will be stored.

The output of this tool is an HTML file, displaying the split counts and the file locations.

**Output Example**

.. image:: ./static/fastx_icons/barcode_splitter_output_example.png

------

This tool is based on `FASTX-toolkit`__ by Assaf Gordon.

 .. __: http://hannonlab.cshl.edu/fastx_toolkit/    

</help>
</tool>
<!-- FASTX-barcode-splitter is part of the FASTX-toolkit, by A.Gordon (gordon@cshl.edu) -->