comparison fastqvalidator.xml @ 32:9fe9e0e77afd draft

Uploaded
author nilesh
date Thu, 27 Jun 2013 11:13:27 -0400
31:1849c4284f72 32:9fe9e0e77afd
1 <tool id="fastq_validator_wrapper" name="FastQ Validator">
2 <description>for each sequence in a file</description>
3 <requirements>
4 <requirement type="package" version="1.0.0">libStatGen</requirement>
5 <!-- <requirement type="package" version="1.0.0">fastq_validator</requirement> -->
6 </requirements>
7 <command> fastQValidator --file $input --minReadLen $minReadLen --maxErrors $maxErrors --printableErrors $printableErrors $baseComposition $disableSeqIDCheck $quiet $avgQual $spacetype $params > $output</command>
8 <inputs>
9 <param name="input" type="data" format="fastq,txt" label="FASTQ file"/>
10 <param name="minReadLen" value="10" type="integer" min="1" label="Minimum allowed read length (Default=10)"/>
11 <param name="maxErrors" type="integer" value="-1" min="-1" label="Number of errors to allow (Default=-1)" />
12 <param name="printableErrors" type="integer" value="20" optional="true" min="0" label="Max errors to print before suppressing (Default=20)" />
13 <param name="baseComposition" type="boolean" optional="true" label="Print Base Composition Statistics" truevalue="--baseComposition" falsevalue=""/>
14 <param name="avgQual" type="boolean" optional="true" label="Print Avg Phred Quality/Cycle and Overall Avg Quality" truevalue="--avgQual" falsevalue=""/>
15 <param name="disableSeqIDCheck" type="boolean" optional="true" label="Disable unique sequence identifier check (check to save memory)" truevalue="--disableSeqIDCheck" falsevalue=""/>
16 <param name="quiet" type="boolean" optional="false" label="Suppress error/summary statistics display" truevalue="--quiet" falsevalue=""/>
17 <param name="params" type="boolean" optional="false" label="Print parameter settings" truevalue="--params" falsevalue=""/>
18 <param name="spacetype" type="select" label="Space Options for Raw Sequence (Default=Auto)" display="radio">
19 <option selected="true" value="--auto">Auto</option>
20 <option value="--baseSpace">BaseSpace</option>
21 <option value="--colorSpace">ColorSpace</option>
22 </param>
23 </inputs>
24 <outputs>
25 <data format="txt" name="output" />
26 </outputs>
27 <help>
28
29 About
30 +++++
31
32 fastQValidator validates the format of FASTQ files.
33 The initial version of the FASTQ Validator is complete. It is built on LibStatGen: FASTQ, which is part of the libStatGen library.
34
35
36 Info on Errors
37 ++++++++++++++
38
39 Number of errors to allow (default=-1):
40 Number of errors to allow before the tool stops reading and validating the file. -1 (the default) means the entire file is read regardless of how many errors are found. 0 means nothing is read or validated.
41
42 Max errors to print before suppressing (default=20):
43 Maximum number of errors to print before further error messages are suppressed. Unlike maxErrors, reaching printableErrors does not stop the run: the file is still read and validated to the end, and only the printing of errors stops.
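How the two limits interact can be sketched in Python. This is only an illustration of the behaviour described above, not the fastQValidator implementation; run_validation and its arguments are hypothetical names, and each record is assumed to produce at most one error.

```python
def run_validation(errors_per_record, max_errors=-1, printable_errors=20):
    """Simulate the two limits (hypothetical helper, not the real tool).

    errors_per_record holds one entry per record: an error message,
    or None when the record is valid.
    Returns (records_read, errors_found, printed_messages).
    """
    records_read = 0
    errors_found = 0
    printed = []
    for message in errors_per_record:
        # maxErrors: stop reading once the limit is reached.
        # -1 means no limit; 0 means nothing is read at all.
        if max_errors != -1 and errors_found >= max_errors:
            break
        records_read += 1
        if message is None:
            continue
        errors_found += 1
        # printableErrors: validation continues, only printing stops.
        if printable_errors >= errors_found:
            printed.append(message)
    return records_read, errors_found, printed
```

With printable_errors=1 every record is still read and counted, but only the first message is kept. With max_errors=0 the loop exits before reading anything.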
44
45 **Info on Space Options for Raw Sequence**
46 auto: Determine baseSpace/colorSpace from the Raw Sequence in the file (Default)
47 baseSpace: ACTGN only
48 colorSpace: 0123. only (with 1 character primer base)
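
The auto option can be pictured with a small Python sketch that classifies one raw sequence line by the character sets listed above. The function name detect_space_type is hypothetical, and requiring the primer base to be one of ACTGN is an assumption; the real tool may decide differently.

```python
def detect_space_type(raw_sequence):
    """Classify a raw sequence as 'baseSpace', 'colorSpace', or None."""
    sequence = raw_sequence.strip()
    if not sequence:
        return None
    # baseSpace: only the characters A, C, T, G and N are allowed.
    if set(sequence).issubset(set("ACTGN")):
        return "baseSpace"
    # colorSpace: one primer base character, then only 0, 1, 2, 3 or '.'.
    # Assumption: the primer base is itself one of ACTGN.
    primer, colors = sequence[0], sequence[1:]
    if primer in "ACTGN" and colors and set(colors).issubset(set("0123.")):
        return "colorSpace"
    return None
```

A sequence that mixes both alphabets, or contains any other character, matches neither rule and returns None.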
49
50
51 Output
52 ++++++
53
54 When running the fastQValidator Executable, if the --params option is specified, the output starts with a summary of the parameters::
55
56 =============================================================================
57 The following parameters are available. Ones with "[]" are in effect::
58
59 Input Parameters
60 --file [../fastqValidator/test/testFile.txt], --baseComposition,
61 --disableSeqIDCheck, --quiet, --params [ON], --minReadLen [10],
62 --maxErrors [-1]
63 Space Type : --baseSpace, --colorSpace, --auto [ON]
64 Errors : --ignoreErrors, --printableErrors [20]
65 =============================================================================
66
67 The Validator Executable outputs error messages for invalid sequences based on Validation Criteria. For Example: ::
68
69 ======================================================================
70 ERROR on Line 25: The sequence identifier line was too short.
71 ERROR on Line 29: First line of a sequence does not begin wtih @
72 ERROR on Line 33: No Sequence Identifier specified before the comment.
73 ======================================================================
74
75 Base composition percentages by read index are printed if --baseComposition is set to ON. For Example: ::
76
77 ========================================================================
78 Base Composition Statistics:
79 Read Index %A %C %G %T %N Total Reads At Index
80 0 100.00 0.00 0.00 0.00 0.00 20
81 1 5.00 95.00 0.00 0.00 0.00 20
82 2 5.00 0.00 5.00 90.00 0.00 20
83 ========================================================================
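
The columns of this table can be reproduced with a short Python sketch. It is a simplified illustration for base-space reads only, not the tool's own code; base_composition is a hypothetical name.

```python
from collections import Counter


def base_composition(sequences):
    """Return one (index, percentages, total) tuple per read index."""
    counts = []  # counts[i] tallies the bases seen at read index i
    for sequence in sequences:
        for index, base in enumerate(sequence):
            if index == len(counts):
                counts.append(Counter())
            counts[index][base] += 1
    rows = []
    for index, counter in enumerate(counts):
        # Total reads that are long enough to have a base at this index.
        total = sum(counter.values())
        percentages = {b: 100.0 * counter[b] / total for b in "ACGTN"}
        rows.append((index, percentages, total))
    return rows
```

For 18 reads of ACT plus one each of AAG and ACA, the percentages at indexes 0 to 2 match the three rows of the example table.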
84
85 Average Phred quality by read index is printed if --avgQual is set to ON (available in fastQValidator versions after May 29, 2012). Only valid qualities are included in these averages. For Example::
86
87 ==================================================
88 Average Phred Quality by Read Index (starts at 0):
89 Read Index Average Quality
90 0 44.10
91 1 45.55
92 2 51.11
93 3 47.68
94 4 47.37
95
96 Overall Average Phred Quality = 50.40
97 ==================================================
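
A rough Python sketch of how such averages could be computed follows. It assumes quality characters use an ASCII offset of 33 and treats any non-negative score as valid; both are assumptions for illustration, and average_phred is a hypothetical name, not part of fastQValidator.

```python
def average_phred(quality_strings, offset=33):
    """Return (per_index_averages, overall_average) for quality strings."""
    sums = []    # sums[i]: total of valid scores at read index i
    counts = []  # counts[i]: number of valid scores at read index i
    for quality in quality_strings:
        for index, char in enumerate(quality):
            if index == len(sums):
                sums.append(0)
                counts.append(0)
            score = ord(char) - offset
            # Assumed validity rule: only non-negative scores are counted.
            if score >= 0:
                sums[index] += score
                counts[index] += 1
    per_index = [s / c if c else None for s, c in zip(sums, counts)]
    total = sum(counts)
    overall = sum(sums) / total if total else None
    return per_index, overall
```

The overall value is the mean over all valid scores, not the mean of the per-index averages, so longer reads weigh more.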
98
99 Summary of the number of lines, sequences, and errors: ::
100
101 =======================================================================
102 Finished processing testFile.txt with 92 lines containing 20 sequences.
103 There were a total of 17 errors.
104 =======================================================================
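
When the report is post-processed, the two summary lines can be picked out with regular expressions. The sketch below is written against the sample output above; parse_summary is a hypothetical helper, and the patterns would need adjusting if the wording of the summary differs between versions.

```python
import re

SUMMARY_RE = re.compile(
    r"Finished processing (.+) with (\d+) lines containing (\d+) sequences\."
)
ERRORS_RE = re.compile(r"There were a total of (\d+) errors\.")


def parse_summary(report_text):
    """Extract file name, line, sequence and error counts from a report."""
    summary = SUMMARY_RE.search(report_text)
    errors = ERRORS_RE.search(report_text)
    # Either line may be absent, e.g. when the summary is suppressed.
    if summary is None or errors is None:
        return None
    return {
        "file": summary.group(1),
        "lines": int(summary.group(2)),
        "sequences": int(summary.group(3)),
        "errors": int(errors.group(1)),
    }
```

The function returns None rather than raising when no summary is found, which is what would be expected for a run with --quiet.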
105
106 </help>
107
108 </tool>