annotate FastQ_QualConverter.xml @ 1:3990d6b37e2d draft default tip

Uploaded
author geert-vandeweyer
date Thu, 13 Feb 2014 08:24:43 -0500
<tool id="fastq_qual_convert" name="FASTQ QualityConverter" version="1.0.4">
  <description>convert from various base-FASTQ quality formats to fastqsanger</description>
  <command interpreter="perl">FastQ_QualConverter.pl -i '$input_file' -f '$input_type' -o '$output_file'</command>
  <inputs>
    <param name="input_file" type="data" format="fastq" label="File to Convert" />
    <param name="input_type" type="select" label="Input FASTQ quality scores type">
      <option value="Auto" selected="True">Auto</option>
      <option value="solexa">Solexa</option>
      <option value="illumina">Illumina 1.3-1.7</option>
      <option value="sanger">Sanger (does nothing)</option>
    </param>
  </inputs>
  <outputs>
    <data name="output_file" format="fastqsanger">
    </data>
  </outputs>
  <tests>
    <!-- These tests include test files adapted from supplemental material in Cock PJ, Fields CJ, Goto N, Heuer ML, Rice PM. The Sanger FASTQ file format for sequences with quality scores, and the Solexa/Illumina FASTQ variants. Nucleic Acids Res. 2009 Dec 16. -->
    <!-- Unfortunately, cannot test for expected failures -->
    <!-- Test basic options -->
    <test>
      <param name="input_file" value="sanger_full_range_original_sanger.fastqsanger" ftype="fastq" />
      <param name="input_type" value="sanger" />
      <output name="output_file" file="sanger_full_range_original_sanger.fastqsanger" />
    </test>
    <test>
      <param name="input_file" value="illumina_full_range_original_illumina.fastqillumina" ftype="fastq" />
      <param name="input_type" value="illumina" />
      <output name="output_file" file="illumina_full_range_as_sanger.fastqsanger" />
    </test>
    <test>
      <param name="input_file" value="solexa_full_range_original_solexa.fastqsolexa" ftype="fastq" />
      <param name="input_type" value="solexa" />
      <output name="output_file" file="solexa_full_range_as_sanger.fastqsanger" />
    </test>
  </tests>
  <help>
**What it does**

This tool offers several conversion options relating to the FASTQ format. Output is always fastqsanger. The input encoding can be specified or auto-detected (based on the first 15000 reads).

It is intended to be faster than the default FASTQ Groomer.


-----

**Quality Score Comparison**

::

    SSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSSS
    ...............................IIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIIII
    ..........................XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX
    !"#$%&amp;'()*+,-./0123456789:;&lt;=&gt;?@ABCDEFGHIJKLMNOPQRSTUVWXYZ[\]^_`abcdefghijklmnopqrstuvwxyz{|}~
    |                         |    |        |                              |                     |
    33                        59   64       73                             104                   126

    S - Sanger        Phred+33,  94 values (0, 93)  (0 to 60 expected in raw reads)  (sanger = input)
    I - Illumina 1.3  Phred+64,  63 values (0, 62)  (0 to 40 expected in raw reads)  (sanger = input - 31)
    X - Solexa        Solexa+64, 68 values (-5, 62) (-5 to 40 expected in raw reads) (sanger = 33 + 10 * log(1 + 10 ** ((input - 64) / 10.0)) / log(10))

Diagram adapted from http://en.wikipedia.org/wiki/FASTQ_format

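The conversion rules in the legend above can be sketched in Python. This is only an illustration and not the code in FastQ_QualConverter.pl: the function names are hypothetical, ``input`` is taken to be the ASCII code of the quality character, and rounding the Solexa result to the nearest integer is an assumption.

```python
import math


def illumina_to_sanger(ch):
    """Map one Illumina 1.3-1.7 (Phred+64) character to Sanger (Phred+33)."""
    # Both encodings store a Phred score; only the ASCII offset differs (64 - 33 = 31).
    return chr(ord(ch) - 31)


def solexa_to_sanger(ch):
    """Map one Solexa+64 character to Sanger (Phred+33)."""
    solexa_q = ord(ch) - 64
    # Solexa scores are log-odds based, so they are converted to Phred first.
    phred_q = 10 * math.log10(1 + 10 ** (solexa_q / 10.0))
    # Assumption: round to the nearest integer Phred score.
    return chr(33 + int(round(phred_q)))


def convert_quality(qual, input_type):
    """Convert a whole quality string to Sanger encoding (hypothetical helper)."""
    converters = {
        "sanger": lambda ch: ch,  # already Sanger, nothing to do
        "illumina": illumina_to_sanger,
        "solexa": solexa_to_sanger,
    }
    if input_type not in converters:
        raise ValueError("unknown input type: " + input_type)
    return "".join(converters[input_type](ch) for ch in qual)
```

With this sketch, the Illumina character ``h`` (ASCII 104) maps to ``I`` (Phred 40), and the lowest Solexa character ``;`` (score -5) maps to ``"`` (Phred 1).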

.. class:: infomark

Output from Illumina 1.8+ pipelines is Sanger encoded.

  </help>
</tool>