<tool id="fastq_paired_end_joiner" name="FASTQ joiner" version="2.0.0">
  <description>on paired end reads</description>
  <requirements>
    <requirement type="package" version="1.0.0">galaxy_sequence_utils</requirement>
  </requirements>
  <command interpreter="python">fastq_paired_end_joiner.py '$input1_file' '${input1_file.extension[len( 'fastq' ):]}' '$input2_file' '${input2_file.extension[len( 'fastq' ):]}' '$output_file' '$style'</command>
  <inputs>
    <param name="input1_file" type="data" format="fastqsanger,fastqcssanger" label="Left-hand Reads" />
    <param name="input2_file" type="data" format="fastqsanger,fastqcssanger" label="Right-hand Reads" />
    <param name="style" type="select" label="FASTQ Header Style">
      <option value="old" selected="true">old</option>
      <option value="new">new</option>
    </param>
  </inputs>
  <outputs>
    <data name="output_file" format="input" />
  </outputs>
  <tests>
    <test>
      <param name="input1_file" value="split_pair_reads_1.fastqsanger" ftype="fastqsanger" />
      <param name="input2_file" value="split_pair_reads_2.fastqsanger" ftype="fastqsanger" />
      <output name="output_file" file="3.fastqsanger" />
    </test>
  </tests>
  <help>
**What it does**

This tool joins each pair of paired-end FASTQ reads, stored in two
separate files, into a single read in one output file. Reads are
matched by sequence identifier, so the two files do not need to list
their reads in the same order. A sequence identifier that does not
appear in both files is excluded from the output.
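
The matching rule described above can be sketched in a few lines of Python. This is only an illustration, not the tool's actual implementation: the function name ``pair_by_identifier`` and the in-memory tuple representation of a read are assumptions made for the example.

```python
def pair_by_identifier(left_reads, right_reads):
    """Pair reads that share an identifier, regardless of input order.

    Both arguments are iterables of (identifier, sequence, quality)
    tuples, a hypothetical representation used only for this sketch.
    """
    # Index the right-hand reads by identifier for direct lookup.
    right_by_id = {read[0]: read for read in right_reads}
    pairs = []
    for left in left_reads:
        right = right_by_id.get(left[0])
        if right is None:
            # Identifier missing from the right-hand file: not in the output.
            continue
        pairs.append((left, right))
    return pairs


left_reads = [("r1", "ACGT", "hhhh"), ("r2", "TTTT", "hhhh")]
right_reads = [("r3", "CCCC", "hhhh"), ("r1", "GGGG", "hhhh")]
print(pair_by_identifier(left_reads, right_reads))
```

Only ``r1`` appears in both inputs, so it is the only pair returned; ``r2`` and ``r3`` are dropped.
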

-----

**Input formats**

Both old and new (from recent Illumina software) style FASTQ headers
are supported. The following example uses the "old" style.

Left-hand Read::

  @HWI-EAS91_1_30788AAXX:7:21:1542:1758/1
  GTCAATTGTACTGGTCAATACTAAAAGAATAGGATC
  +HWI-EAS91_1_30788AAXX:7:21:1542:1758/1
  hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh

Right-hand Read::

  @HWI-EAS91_1_30788AAXX:7:21:1542:1758/2
  GCTCCTAGCATCTGGAGTCTCTATCACCTGAGCCCA
  +HWI-EAS91_1_30788AAXX:7:21:1542:1758/2
  hhhhhhhhhhhhhhhhhhhhhhhh`hfhhVZSWehR

-----

**Output**

A multiple-fastq file, for example::

  @HWI-EAS91_1_30788AAXX:7:21:1542:1758
  GTCAATTGTACTGGTCAATACTAAAAGAATAGGATCGCTCCTAGCATCTGGAGTCTCTATCACCTGAGCCCA
  +HWI-EAS91_1_30788AAXX:7:21:1542:1758
  hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh`hfhhVZSWehR
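
The example output above is consistent with dropping the ``/1`` and ``/2`` suffixes and concatenating the left-hand and right-hand sequence and quality strings. A minimal sketch of that step follows. The helper names ``join_old_style`` and ``format_fastq`` are hypothetical, and the code is inferred from the example rather than taken from the tool's source.

```python
def join_old_style(left, right):
    """Join two (header, sequence, quality) records with old-style headers.

    Hypothetical helper: headers are given without the leading "@"
    and are assumed to end in "/1" (left) or "/2" (right).
    """
    left_id = left[0].rsplit("/", 1)[0]
    right_id = right[0].rsplit("/", 1)[0]
    if left_id != right_id:
        raise ValueError("reads do not belong to the same pair")
    # Sequence and quality strings are concatenated, left first.
    return (left_id, left[1] + right[1], left[2] + right[2])


def format_fastq(record):
    """Render a (header, sequence, quality) record as four FASTQ lines."""
    header, sequence, quality = record
    return "@%s\n%s\n+%s\n%s\n" % (header, sequence, header, quality)


left_read = ("HWI-EAS91_1_30788AAXX:7:21:1542:1758/1",
             "GTCAATTGTACTGGTCAATACTAAAAGAATAGGATC",
             "hhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhhh")
right_read = ("HWI-EAS91_1_30788AAXX:7:21:1542:1758/2",
              "GCTCCTAGCATCTGGAGTCTCTATCACCTGAGCCCA",
              "hhhhhhhhhhhhhhhhhhhhhhhh`hfhhVZSWehR")
print(format_fastq(join_old_style(left_read, right_read)))
```
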

------

**The "new" style**

Recent Illumina FASTQ headers are structured as follows::

  @COORDS FLAGS
  COORDS = INSTRUMENT:RUN_#:FLOWCELL_ID:LANE:TILE:X:Y
  FLAGS = READ:IS_FILTERED:CONTROL_NUMBER:INDEX_SEQUENCE

where the whitespace character between COORDS and FLAGS can be either
a space or a tab.
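
As an illustration of the layout above, the following Python sketch splits a new-style header into named fields. The function ``parse_new_style_header``, the lower-case field names, and the sample header are assumptions made for this example and are not taken from the tool's source or test data.

```python
def parse_new_style_header(header):
    """Split a new-style header into its COORDS and FLAGS fields.

    Hypothetical helper; field names follow the layout described above.
    """
    # str.split() with None as separator splits on any whitespace, so
    # both a space and a tab between COORDS and FLAGS are accepted.
    coords, flags = header.lstrip("@").split(None, 1)
    coord_names = ("instrument", "run", "flowcell_id", "lane", "tile", "x", "y")
    flag_names = ("read", "is_filtered", "control_number", "index_sequence")
    parsed = dict(zip(coord_names, coords.split(":")))
    parsed.update(zip(flag_names, flags.split(":")))
    return parsed


# Made-up header used only to exercise the sketch.
example = "@EAS139:136:FC706VJ:2:2104:15343:197393 1:Y:18:ATCACG"
fields = parse_new_style_header(example)
print(fields["lane"], fields["read"], fields["index_sequence"])
```

In this layout the READ flag distinguishes the two mates of a pair, while COORDS is the part that both mates share.
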

------

**Credits**

This is an extended version (adds support for "new" style FASTQ headers)
of D. Blankenberg's fastq joiner:

`Blankenberg D, Gordon A, Von Kuster G, Coraor N, Taylor J, Nekrutenko A; Galaxy Team. Manipulation of FASTQ data with Galaxy. Bioinformatics. 2010 Jul 15;26(14):1783-5. &lt;http://www.ncbi.nlm.nih.gov/pubmed/20562416&gt;`_

New style header support added by Simone Leo &lt;simone.leo@crs4.it&gt;
  </help>
</tool>