comparison tools/fastq_paired_unpaired/fastq_paired_unpaired.xml @ 4:09f9f0e29e47 draft

v0.0.6 use format_source; v0.0.5 error handling & citation
author peterjc
date Wed, 05 Aug 2015 11:06:38 -0400
parents
children b38bbcbd458d
3:528ba9c896e0 4:09f9f0e29e47
1 <tool id="fastq_paired_unpaired" name="Divide FASTQ file into paired and unpaired reads" version="0.1.1">
2 <description>using the read name suffixes</description>
3 <requirements>
4 <requirement type="package" version="1.64">biopython</requirement>
5 <requirement type="python-module">Bio</requirement>
6 </requirements>
7 <stdio>
8 <!-- Anything other than zero is an error -->
9 <exit_code range="1:" />
10 <exit_code range=":-1" />
11 </stdio>
12 <version_command interpreter="python">fastq_paired_unpaired.py --version</version_command>
13 <command interpreter="python">
14 fastq_paired_unpaired.py $input_fastq.extension $input_fastq
15 #if $output_choice_cond.output_choice=="separate"
16 $output_forward $output_reverse
17 #elif $output_choice_cond.output_choice=="interleaved"
18 $output_paired
19 #end if
20 $output_singles
21 </command>
22 <inputs>
23 <param name="input_fastq" type="data" format="fastq" label="FASTQ file to divide into paired and unpaired reads"/>
24 <conditional name="output_choice_cond">
25 <param name="output_choice" type="select" label="How to output paired reads?">
26 <option value="separate">Separate (two FASTQ files, for the forward and reverse reads, in matching order).</option>
27 <option value="interleaved">Interleaved (one FASTQ file, alternating forward read then partner reverse read).</option>
28 </param>
29 <!-- It seems these dummy entries are needed here; compare this to indels/indel_sam2interval.xml -->
30 <when value="separate" />
31 <when value="interleaved" />
32 </conditional>
33 </inputs>
34 <outputs>
35 <data name="output_singles" format_source="input_fastq" label="Orphan or single reads"/>
36 <data name="output_forward" format_source="input_fastq" label="Forward paired reads">
37 <filter>output_choice_cond["output_choice"] == "separate"</filter>
38 </data>
39 <data name="output_reverse" format_source="input_fastq" label="Reverse paired reads">
40 <filter>output_choice_cond["output_choice"] == "separate"</filter>
41 </data>
42 <data name="output_paired" format_source="input_fastq" label="Interleaved paired reads">
43 <filter>output_choice_cond["output_choice"] == "interleaved"</filter>
44 </data>
45 </outputs>
46 <tests>
47 <test>
48 <param name="input_fastq" value="sanger-pairs-mixed.fastq" ftype="fastq"/>
49 <param name="output_choice" value="separate"/>
50 <output name="output_singles" file="sanger-pairs-singles.fastq" ftype="fastq"/>
51 <output name="output_forward" file="sanger-pairs-forward.fastq" ftype="fastq"/>
52 <output name="output_reverse" file="sanger-pairs-reverse.fastq" ftype="fastq"/>
53 </test>
54 <test>
55 <param name="input_fastq" value="sanger-pairs-mixed.fastq" ftype="fastq"/>
56 <param name="output_choice" value="interleaved"/>
57 <output name="output_singles" file="sanger-pairs-singles.fastq" ftype="fastq"/>
58 <output name="output_paired" file="sanger-pairs-interleaved.fastq" ftype="fastq"/>
59 </test>
60 </tests>
61 <help>
62
63 **What it does**
64
65 Using the common read name suffix conventions, it divides a FASTQ file into
66 paired reads, and orphan or single reads.
67
68 The input file should be a valid FASTQ file which has been sorted so that
69 any partner forward+reverse reads are consecutive. The output files all
70 preserve this sort order. Pairings are recognised based on standard name
71 suffixes. See below or run the tool with no arguments for more details.
72
73 Any reads where the forward/reverse naming suffix used is not recognised
74 are treated as orphan reads. The tool supports the /1 and /2 convention
75 originally used by Illumina, the .f and .r convention, the Sanger convention
76 (see http://staden.sourceforge.net/manual/pregap4_unix_50.html for details),
77 and the current Illumina convention where the reads get the same identifier
78 with the fragment number in the description, for example:
79
80 * @HWI-ST916:79:D04M5ACXX:1:1101:10000:100326 1:N:0:TGNCCA
81 * @HWI-ST916:79:D04M5ACXX:1:1101:10000:100326 2:N:0:TGNCCA
82
83 Note that this does support multiple forward and reverse reads per template
84 (which is quite common with Sanger sequencing), e.g. this example, which is
85 sorted alphabetically:
86
87 * WTSI_1055_4p17.p1kapIBF
88 * WTSI_1055_4p17.p1kpIBF
89 * WTSI_1055_4p17.q1kapIBR
90 * WTSI_1055_4p17.q1kpIBR
91
92 or this where the reads already come in pairs:
93
94 * WTSI_1055_4p17.p1kapIBF
95 * WTSI_1055_4p17.q1kapIBR
96 * WTSI_1055_4p17.p1kpIBF
97 * WTSI_1055_4p17.q1kpIBR
98
99 both become:
100
101 * WTSI_1055_4p17.p1kapIBF paired with WTSI_1055_4p17.q1kapIBR
102 * WTSI_1055_4p17.p1kpIBF paired with WTSI_1055_4p17.q1kpIBR
103
104 **References**
105
106 If you use this Galaxy tool in work leading to a scientific publication please
107 cite the following paper:
108
109 Peter J.A. Cock, Björn A. Grüning, Konrad Paszkiewicz and Leighton Pritchard (2013).
110 Galaxy tools and workflows for sequence analysis with applications
111 in molecular plant pathology. PeerJ 1:e167
112 http://dx.doi.org/10.7717/peerj.167
113
114 This tool is available to install into other Galaxy Instances via the Galaxy
115 Tool Shed at http://toolshed.g2.bx.psu.edu/view/peterjc/fastq_paired_unpaired
116 </help>
117 <citations>
118 <citation type="doi">10.7717/peerj.167</citation>
119 </citations>
120 </tool>
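
The suffix-based pairing described in the help text can be illustrated with a minimal Python sketch. This is not the code in fastq_paired_unpaired.py: the names `classify` and `split_pairs` are hypothetical, the regular expressions are a simplified guess covering only the /1 and /2, .f and .r, and current Illumina conventions, and the Sanger convention (including multiple forward and reverse reads per template) is deliberately left out. It only assumes, as the help text states, that partner reads are consecutive in the input.

```python
import re

# Hypothetical, simplified patterns for illustration only; these are NOT the
# regular expressions used by the real tool, and the Sanger naming
# convention is not covered.
FORWARD = re.compile(r"^(\S+)(/1|\.f)$")
REVERSE = re.compile(r"^(\S+)(/2|\.r)$")


def classify(title):
    """Return (template, direction) for a FASTQ title line without the '@'.

    direction is "F", "R", or None when no known suffix is recognised.
    """
    name, _, description = title.partition(" ")
    # Current Illumina style: same identifier, fragment number in description.
    if description.startswith("1:"):
        return name, "F"
    if description.startswith("2:"):
        return name, "R"
    match = FORWARD.match(name)
    if match:
        return match.group(1), "F"
    match = REVERSE.match(name)
    if match:
        return match.group(1), "R"
    return name, None


def split_pairs(titles):
    """Split consecutive titles into (pairs, singles), preserving input order."""
    pairs, singles = [], []
    pending = None  # a forward read still waiting for its reverse partner
    for title in titles:
        template, direction = classify(title)
        if direction == "R" and pending and classify(pending)[0] == template:
            pairs.append((pending, title))
            pending = None
            continue
        # The waiting forward read has no partner, so it becomes an orphan.
        if pending is not None:
            singles.append(pending)
            pending = None
        if direction == "F":
            pending = title
        else:
            singles.append(title)
    if pending is not None:
        singles.append(pending)
    return pairs, singles


titles = ["a/1", "a/2", "b/1", "c/2", "d",
          "HWI:1 1:N:0:TG", "HWI:1 2:N:0:TG"]
pairs, singles = split_pairs(titles)
```

In this sketch, reads with an unrecognised suffix (such as `d`) and reads whose partner is missing (such as `b/1` and `c/2`) end up in `singles`, mirroring the orphan handling described in the help text.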