annotate tools/next_gen_conversion/fastq_gen_conv.xml @ 0:9071e359b9a3

Uploaded
author xuebing
date Fri, 09 Mar 2012 19:37:19 -0500
<tool id="fastq_gen_conv" name="FASTQ Groomer" version="1.0.0">
  <description>converts any FASTQ to Sanger</description>
  <command interpreter="python">
    fastq_gen_conv.py
    --input=$input
    --origType=$origTypeChoice.origType
    #if $origTypeChoice.origType == "sanger":
      --allOrNot=$origTypeChoice.howManyBlocks.allOrNot
      #if $origTypeChoice.howManyBlocks.allOrNot == "not":
        --blocks=$origTypeChoice.howManyBlocks.blocks
      #else:
        --blocks="None"
      #end if
    #else:
      --allOrNot="None"
      --blocks="None"
    #end if
    --output=$output
  </command>
  <inputs>
    <param name="input" type="data" format="fastq" label="Groom this dataset" />
    <conditional name="origTypeChoice">
      <param name="origType" type="select" label="How do you think quality values are scaled?" help="See below for explanation">
        <option value="solexa">Solexa/Illumina 1.0</option>
        <option value="illumina">Illumina 1.3+</option>
        <option value="sanger">Sanger (validation only)</option>
      </param>
      <when value="solexa" />
      <when value="illumina" />
      <when value="sanger">
        <conditional name="howManyBlocks">
          <param name="allOrNot" type="select" label="Since your fastq is already in Sanger format you can check it for consistency">
            <option value="all">Check all (may take a while)</option>
            <option selected="true" value="not">Check selected number of blocks</option>
          </param>
          <when value="all" />
          <when value="not">
            <param name="blocks" type="integer" value="1000" label="How many blocks (four lines each) do you want to check?" />
          </when>
        </conditional>
      </when>
    </conditional>
  </inputs>
  <outputs>
    <data name="output" format="fastqsanger"/>
  </outputs>
  <tests>
    <test>
      <param name="input" value="fastq_gen_conv_in1.fastq" ftype="fastq" />
      <param name="origType" value="solexa" />
      <output name="output" format="fastqsanger" file="fastq_gen_conv_out1.fastqsanger" />
    </test>
    <test>
      <param name="input" value="fastq_gen_conv_in2.fastq" ftype="fastq" />
      <param name="origType" value="sanger" />
      <param name="allOrNot" value="not" />
      <param name="blocks" value="3" />
      <output name="output" format="fastqsanger" file="fastq_gen_conv_out2.fastqsanger" />
    </test>
  </tests>
  <help>

**What it does**

The Galaxy pipeline for mapping Illumina data requires fastq input whose quality values conform to the so-called "Sanger" format. Unfortunately, there are many other types of fastq. The main objective of this tool is therefore to "groom" multiple types of fastq into Sanger-conforming fastq that can be used in downstream applications such as mapping.

.. class:: infomark

**TIP**: If the input dataset is already in Sanger format, the tool does not perform conversion. However, validation (described below) is still performed.

-----

**Types of fastq datasets**

A good description of fastq datasets can be found `here`__, while a description of Galaxy's fastq "logic" can be found `here`__. Because the ranges of quality values in different types of fastq datasets overlap, it is very difficult to detect the type automatically. This tool supports conversion of two commonly found types (Solexa/Illumina 1.0 and Illumina 1.3+) into fastq Sanger.

.. __: http://en.wikipedia.org/wiki/FASTQ_format
.. __: http://wiki.g2.bx.psu.edu/Admin/NGS%20Local%20Setup

.. class:: warningmark

**NOTE** that there is also a type of fastq format where quality values are represented by a list of space-delimited integers (e.g., 40 40 20 15 -5 20 ...). This tool **does not** handle such fastq. If you have such a dataset, it needs to be converted into ASCII-type fastq (where quality values are encoded by characters) by the "Numeric-to-ASCII" utility before it can be accepted by this tool.

-----

**Validation**

In addition to converting quality values to Sanger format, the tool also checks the input dataset for consistency. Specifically, it does the following:

- skips empty lines
- checks that blocks are properly formed by making sure that:

  #. there are four lines per block
  #. the first line starts with "@"
  #. the third line starts with "+"
  #. the lengths of the second line (sequence) and the fourth line (quality string) are identical

- checks that quality values are within range for the chosen fastq format (i.e., the format selected by the user in the **How do you think quality values are scaled?** drop-down).

To see exactly what the tool does, you can take a look at its source code `here`__.

.. __: http://bitbucket.org/galaxy/galaxy-central/src/tip/tools/next_gen_conversion/fastq_gen_conv.py


  </help>
</tool>
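
The validation checks listed in the help text above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the contents of fastq_gen_conv.py: the names `QUALITY_ASCII_RANGE`, `read_blocks`, `check_block`, and `illumina_to_sanger` are hypothetical, and the ASCII ranges are the commonly documented ones for each format rather than values read from the tool.

```python
# Hypothetical sketch of the checks described in the help text;
# it is NOT the code shipped in fastq_gen_conv.py.

# Inclusive ASCII ranges of quality characters per format. These are the
# commonly documented ranges, assumed here rather than taken from the tool.
QUALITY_ASCII_RANGE = {
    "sanger": (33, 126),    # Phred scores 0..93, ASCII offset 33
    "solexa": (59, 126),    # Solexa scores -5..62, ASCII offset 64
    "illumina": (64, 126),  # Phred scores 0..62, ASCII offset 64
}


def read_blocks(text, max_blocks=None):
    """Yield blocks of up to four lines from FASTQ text, skipping empty lines."""
    lines = [line for line in text.splitlines() if line.strip()]
    for count, start in enumerate(range(0, len(lines), 4)):
        if max_blocks is not None and count >= max_blocks:
            break
        yield lines[start:start + 4]


def check_block(lines, orig_type):
    """Return a list of problems found in one FASTQ block (empty if none)."""
    if len(lines) != 4:
        return ["expected 4 lines, got %d" % len(lines)]
    header, sequence, separator, quality = lines
    problems = []
    if not header.startswith("@"):
        problems.append("first line does not start with '@'")
    if not separator.startswith("+"):
        problems.append("third line does not start with '+'")
    if len(sequence) != len(quality):
        problems.append("sequence and quality lengths differ")
    low, high = QUALITY_ASCII_RANGE[orig_type]
    if any(not low <= ord(ch) <= high for ch in quality):
        problems.append("quality value out of range for " + orig_type)
    return problems


def illumina_to_sanger(quality):
    """Shift an Illumina 1.3+ quality string (offset 64) to Sanger (offset 33)."""
    return "".join(chr(ord(ch) - 64 + 33) for ch in quality)


if __name__ == "__main__":
    sample = "@read1\nACGT\n\n+\nhhhh\n"
    for block in read_blocks(sample, max_blocks=1000):
        print(check_block(block, "illumina"))
        print(illumina_to_sanger(block[3]))
```

The `max_blocks` argument mirrors the "Check selected number of blocks" option, and passing `None` corresponds to "Check all". Solexa-scaled qualities need a logarithmic rescaling rather than a plain offset shift, so they are deliberately left out of this sketch.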