annotate cutadapt.xml @ 5:1dada50cca8a
Support for cutadapt 0.9.5, added quality trimming and additional output options
author | Lance Parsons <lparsons@princeton.edu> |
---|---|
date | Fri, 22 Jul 2011 11:03:00 -0400 |
parents | 0a872e59164c |
children | 2d6671b10919 |
rev | line source |
---|---|
5 | 1 <tool id="cutadapt" name="Cutadapt" version="0.9.5.a"> |
5 | 2 <description>Remove adapter sequences from Fastq/Fasta</description> |
0:8b064ea16722 (Initial version with multiple adapter support) | 3 <requirements> |
0 | 4 <requirement type="python-module">cutadapt</requirement> |
0 | 5 </requirements> |
0 | 6 |
5 | 7 <command>cutadapt |
0 | 8 #if $input.extension.startswith( "fastq"): |
0 | 9 --format=fastq |
0 | 10 #else |
0 | 11 --format=$input.extension |
0 | 12 #end if |
0 | 13 #for $a in $adapters |
5 | 14 --adapter='${a.adapter_source.adapter}' |
0 | 15 #end for |
0 | 16 #for $aa in $anywhere_adapters |
5 | 17 --anywhere='${aa.anywhere_adapter_source.anywhere_adapter}' |
0 | 18 #end for |
5 | 19 --error-rate=$error_rate |
5 | 20 --times=$count |
5 | 21 --overlap=$overlap |
0 | 22 #if str($min) != '0': |
5 | 23 --minimum-length=$min |
0 | 24 #end if |
0 | 25 #if str($max) != '0': |
5 | 26 --maximum-length=$max |
5 | 27 #end if |
5 | 28 #if str($quality_cutoff) != '0': |
5 | 29 --quality-cutoff=$quality_cutoff |
0 | 30 #end if |
5 | 31 $discard |
5 | 32 --output='$output' |
5 | 33 #if str( $output_params.output_type ) == "additional": |
5 | 34 #if $output_params.rest_file: |
5 | 35 --rest-file=$rest_output |
5 | 36 #end if |
5 | 37 #if $output_params.too_short_file: |
5 | 38 --too-short-output=$too_short_output |
5 | 39 #end if |
5 | 40 #if $output_params.untrimmed_file: |
5 | 41 --untrimmed-output=$untrimmed_output |
5 | 42 #end if |
3:7ed26fc9fa8a (Updated for cutadapt 0.9.4, no longer need python wrapper) | 43 #end if |
3 | 44 '$input' |
4:0a872e59164c (Added discard_stderr_wrapper.sh script to catch report and redirect to stdout) | 45 > $report |
0 | 46 </command> |
0 | 47 <inputs> |
0 | 48 <param format="fastqsanger, fasta" name="input" type="data" optional="false" label="Fastq file to trim" length="100"/> |
0 | 49 |
0 | 50 <repeat name="adapters" title="3' Adapters"> |
0 | 51 <conditional name="adapter_source"> |
0 | 52 <param name="adapter_source_list" type="select" label="Source" > |
0 | 53 <option value="prebuilt" selected="true">Standard (select from the list below)</option> |
0 | 54 <option value="user">Enter custom sequence</option> |
0 | 55 </param> |
0 | 56 |
0 | 57 <when value="user"> |
0 | 58 <param name="adapter" size="30" label="Enter custom 3' adapter sequence" type="text" value="AATTGGCC" help="Sequence of an adapter that was ligated to the 3' end. The adapter itself and anything that follows is trimmed. If multiple adapters are specified, only the best matching adapter is trimmed."/> |
0 | 59 </when> |
0 | 60 |
0 | 61 <when value="prebuilt"> |
0 | 62 <param name="adapter" type="select" label="Choose 3' adapter" help="Sequence of an adapter that was ligated to the 3' end. The adapter itself and anything that follows is trimmed. If multiple adapters are specified, only the best matching adapter is trimmed."> |
0 | 63 <options from_file="fastx_clipper_sequences.txt"> |
0 | 64 <column name="name" index="1"/> |
0 | 65 <column name="value" index="0"/> |
0 | 66 </options> |
0 | 67 </param> |
0 | 68 </when> |
0 | 69 </conditional> |
0 | 70 </repeat> |
0 | 71 |
0 | 72 <repeat name="anywhere_adapters" title="5' or 3' (Anywhere) Adapters" help="Sequence of an adapter that was ligated to the 5' or 3' end. If the adapter is found within the read or overlapping the 3' end of the read, the behavior is the same as for the -a option. If the adapter overlaps the 5' end (beginning of the read), the initial portion of the read matching the adapter is trimmed, but anything that follows is kept. If multiple -a or -b options are given, only the best matching adapter is trimmed."> |
0 | 73 <conditional name="anywhere_adapter_source"> |
0 | 74 <param name="anywhere_adapter_source_list" type="select" label="Source"> |
0 | 75 <option value="prebuilt" selected="true">Standard (select from the list below)</option> |
0 | 76 <option value="user">Enter custom sequence</option> |
0 | 77 </param> |
0 | 78 |
0 | 79 <when value="user"> |
0 | 80 <param name="anywhere_adapter" size="30" label="Enter custom 5' or 3' adapter sequence" type="text" value="AATTGGCC" help="Sequence of an adapter that was ligated to the 5' or 3' end. If the adapter is found within the read or overlapping the 3' end of the read, the behavior is the same as for the -a option. If the adapter overlaps the 5' end (beginning of the read), the initial portion of the read matching the adapter is trimmed, but anything that follows is kept. If multiple -a or -b options are given, only the best matching adapter is trimmed."/> |
0 | 81 </when> |
0 | 82 <when value="prebuilt"> |
0 | 83 <param name="anywhere_adapter" type="select" label="Choose 5' or 3' adapter" help="Sequence of an adapter that was ligated to the 5' or 3' end. If the adapter is found within the read or overlapping the 3' end of the read, the behavior is the same as for the -a option. If the adapter overlaps the 5' end (beginning of the read), the initial portion of the read matching the adapter is trimmed, but anything that follows is kept. If multiple -a or -b options are given, only the best matching adapter is trimmed."> |
0 | 84 <options from_file="fastx_clipper_sequences.txt"> |
0 | 85 <column name="name" index="1"/> |
0 | 86 <column name="value" index="0"/> |
0 | 87 </options> |
0 | 88 </param> |
0 | 89 </when> |
0 | 90 </conditional> |
0 | 91 </repeat> |
0 | 92 |
0 | 93 <param name="error_rate" type="float" min="0" max="1" value="0.1" label="Maximum error rate" help="Maximum allowed error rate (no. of errors divided by the length of the matching region)." /> |
0 | 94 <param name="count" type="integer" min="1" value="1" label="Match times" help="Try to remove adapters at most COUNT times. Useful when an adapter gets appended multiple times." /> |
0 | 95 <param name="overlap" type="integer" min="1" value="3" label="Minimum overlap length" help="Minimum overlap length. If the overlap between the adapter and the sequence is shorter than LENGTH, the read is not modified." /> |
5 | 96 <param name="discard" type="boolean" value="false" truevalue="--discard" falsevalue="" label="Discard Trimmed Reads" help="Discard reads that contain the adapter instead of trimming them. Use the 'Minimum overlap length' option in order to avoid throwing away too many randomly matching reads!" /> |
0 | 97 <param name="min" type="integer" min="0" optional="true" value="0" label="Minimum length" help="Discard trimmed reads that are shorter than LENGTH. Reads that are too short even before adapter removal are also discarded. In colorspace, an initial primer is not counted. Value of 0 means no minimum length." /> |
0 | 98 <param name="max" type="integer" min="0" optional="true" value="0" label="Maximum length" help="Discard trimmed reads that are longer than LENGTH. Reads that are too long even before adapter removal are also discarded. In colorspace, an initial primer is not counted. Value of 0 means no maximum length." /> |
5 | 99 <param name="quality_cutoff" type="integer" min="0" optional="true" value="0" label="Quality cutoff" help="Trim low-quality ends from reads before adapter removal. The algorithm is the same as the one used by BWA (Subtract CUTOFF from all qualities; compute partial sums from all indices to the end of the sequence; cut sequence at the index at which the sum is minimal). Value of 0 means no quality trimming." /> |
5 | 100 <conditional name="output_params"> |
5 | 101 <param name="output_type" type="select" label="Additional output options" help="By default all reads will be put in the same file. However, reads with adapters matching in the middle, unmatched reads, and too-short reads can be saved in separate files."> |
5 | 102 <option value="default">Default</option> |
5 | 103 <option value="additional">Additional output files</option> |
5 | 104 </param> |
5 | 105 <when value="default" /> |
5 | 106 <when value="additional"> |
5 | 107 <param name="rest_file" type="boolean" value="false" label="Rest of Read" help="When the adapter matches in the middle of a read, write the rest (after the adapter) into a file."/> |
5 | 108 <param name="too_short_file" type="boolean" value="false" label="Too Short Reads" help="Write reads that are too short (according to minimum length specified) to a file. (default: discard reads)"/> |
5 | 109 <param name="untrimmed_file" type="boolean" value="false" label="Untrimmed Reads" help="Write reads that do not contain the adapter to a separate file, instead of writing them to the regular output file. (default: output to same file as trimmed)"/> |
5 | 110 </when> |
5 | 111 </conditional> |
0 | 112 </inputs> |
0 | 113 <outputs> |
0 | 114 <data format="txt" name="report" label="${tool.name} on ${on_string} (Report)" /> |
0 | 115 <data format="input" name="output" metadata_source="input"/> |
5 | 116 <data format="input" name="rest_output" metadata_source="input" label="${tool.name} on ${on_string} (Rest of Reads)" > |
5 | 117 <filter>(output_params['output_type'] == "additional")</filter> |
5 | 118 <filter>(output_params['rest_file'] is True)</filter> |
5 | 119 </data> |
5 | 120 <data format="input" name="too_short_output" metadata_source="input" label="${tool.name} on ${on_string} (Too Short Reads)" > |
5 | 121 <filter>(output_params['output_type'] == "additional")</filter> |
5 | 122 <filter>(output_params['too_short_file'] is True)</filter> |
5 | 123 </data> |
5 | 124 <data format="input" name="untrimmed_output" metadata_source="input" label="${tool.name} on ${on_string} (Untrimmed Reads)" > |
5 | 125 <filter>(output_params['output_type'] == "additional")</filter> |
5 | 126 <filter>(output_params['untrimmed_file'] is True)</filter> |
5 | 127 </data> |
0 | 128 </outputs> |
0 | 129 |
0 | 130 <tests> |
0 | 131 <test> |
5 | 132 <param name="input" value="cutadapt_small.fastq" ftype="fastqsanger"/> |
5 | 133 <param name="adapter_source_list" value="user"/> |
5 | 134 <param name="adapter" value=""/> |
5 | 135 <param name="anywhere_adapter_source_list" value="user"/> |
5 | 136 <param name="anywhere_adapter" value="TTAGACATATCTCCGTCG"/> |
5 | 137 <param name="output_type" value="default"/> |
5 | 138 <output name="output" file="cutadapt_small.out"/> |
5 | 139 </test> |
5 | 140 <test> |
5 | 141 <param name="input" value="cutadapt_small.fastq" ftype="fastqsanger"/> |
5 | 142 <param name="adapter_source_list" value="user"/> |
5 | 143 <param name="adapter" value="TTAGACATATCTCCGTCG"/> |
5 | 144 <param name="anywhere_adapter_source_list" value="user"/> |
5 | 145 <param name="anywhere_adapter" value=""/> |
5 | 146 <param name="discard" value="true"/> |
5 | 147 <param name="output_type" value="default"/> |
5 | 148 <output name="output" file="cutadapt_discard.out"/> |
5 | 149 </test> |
    <test>
      <param name="input" value="cutadapt_rest.fa" ftype="fasta"/>
      <param name="adapter_source_list" value="user"/>
      <param name="adapter" value="ADAPTER"/>
      <param name="anywhere_adapter_source_list" value="user"/>
      <param name="anywhere_adapter" value=""/>
      <param name="output_type" value="additional"/>
      <param name="rest_file" value="true"/>
      <output name="output" file="cutadapt_rest.out"/>
      <output name="rest_output" file="cutadapt_rest2.out"/>
    </test>
  </tests>

  <help>
This tool removes adapter sequences from DNA high-throughput
sequencing data. This is usually necessary when the read length of the
machine is longer than the molecule that is sequenced, such as in
microRNA data.

The tool is based on the open-source cutadapt_ tool.

-----

**Algorithm**

Cutadapt uses a simple semi-global alignment algorithm, without any special optimizations.
For speed, the algorithm is implemented as a Python extension module in calignmodule.c.
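
To illustrate the idea, here is a minimal Python sketch of semi-global alignment for a 3' adapter. It is not cutadapt's actual C implementation: the function names ``find_3prime_adapter`` and ``trim_3prime``, the ``min_overlap`` parameter, and the rule used to choose among candidate matches are assumptions made for this example.

```python
def find_3prime_adapter(read, adapter, max_error_rate=0.1, min_overlap=3):
    """Return the index in `read` where the adapter match starts, or None."""
    n, m = len(read), len(adapter)
    # cost[i][j]: fewest edits aligning adapter[:i] to a stretch of the read ending at j.
    # start[i][j]: index in the read where that stretch begins.
    cost = [[0] * (n + 1) for _ in range(m + 1)]
    start = [[0] * (n + 1) for _ in range(m + 1)]
    for j in range(n + 1):
        start[0][j] = j  # the adapter may begin anywhere in the read at no cost
    for i in range(1, m + 1):
        cost[i][0] = i  # adapter characters placed before the read start count as errors
        for j in range(1, n + 1):
            mismatch = 0 if adapter[i - 1] == read[j - 1] else 1
            cost[i][j], start[i][j] = min(
                (cost[i - 1][j - 1] + mismatch, start[i - 1][j - 1]),  # match or mismatch
                (cost[i - 1][j] + 1, start[i - 1][j]),  # adapter base missing from the read
                (cost[i][j - 1] + 1, start[i][j - 1]),  # extra base in the read
            )
    # Allowed end points: the whole adapter ending anywhere in the read,
    # or an adapter prefix that runs off the 3' end of the read.
    candidates = [(m, j) for j in range(n + 1)]
    candidates += [(i, n) for i in range(min_overlap, m)]
    best_key, best_start = None, None
    for i, j in candidates:
        errors = cost[i][j]
        if i * max_error_rate >= errors:
            key = (i - errors, -errors)  # prefer more matched bases, then fewer errors
            if best_key is None or key > best_key:
                best_key, best_start = key, start[i][j]
    return best_start


def trim_3prime(read, adapter, max_error_rate=0.1, min_overlap=3):
    """Remove the adapter and everything after it; return the read unchanged if not found."""
    pos = find_3prime_adapter(read, adapter, max_error_rate, min_overlap)
    return read if pos is None else read[:pos]
```

In this sketch, ``trim_3prime("MYSEQUENCEADAP", "ADAPTER")`` returns ``"MYSEQUENCE"``, because a prefix of the adapter is allowed to end the alignment at the 3' end of the read.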


**Partial adapter matches**

Cutadapt correctly deals with partial adapter matches. As an example, suppose
your adapter sequence is "ADAPTER" (specified via the *3' Adapters* parameter).
If you have these input sequences:

::

    MYSEQUENCEADAPTER
    MYSEQUENCEADAP
    MYSEQUENCEADAPTERSOMETHINGELSE

All of them will be trimmed to "MYSEQUENCE". If the sequence starts with an
adapter, like this:

::

    ADAPTERSOMETHING

the read will be empty after trimming.

When the allowed error rate is sufficiently high, errors in
the adapter sequence are allowed. For example, ADABTER (1 mismatch), ADAPTR (1 deletion),
and ADAPPTER (1 insertion) will all be recognized if the error rate is set to 0.15.
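
The arithmetic behind the 0.15 figure can be sketched as follows. This assumes that the number of tolerated errors is the matched adapter length multiplied by the error rate, rounded down; the rounding rule and the helper name ``allowed_errors`` are assumptions for illustration, so check the cutadapt documentation for the version you run.

```python
import math


def allowed_errors(matched_length, max_error_rate):
    """Errors tolerated for a match of the given length (assumed rule: round down)."""
    return math.floor(matched_length * max_error_rate)


# The 7-character adapter "ADAPTER" at two different error rates.
print(allowed_errors(7, 0.15))  # 7 * 0.15 = 1.05, so one error is tolerated
print(allowed_errors(7, 0.1))   # 7 * 0.1 = 0.7, so no error is tolerated
```

Under this assumption, a single mismatch, insertion, or deletion in a full-length match of "ADAPTER" is accepted at 0.15 but rejected at 0.1.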


**Allowing adapters anywhere**

Cutadapt assumes that any adapter specified via the *3' Adapters* parameter
was ligated to the 3' end of the sequence. This is the correct assumption for
at least the SOLiD and Illumina small RNA protocols and probably others.

If, on the other hand, your adapter can also be ligated to the 5' end (on
purpose or by accident), you should tell cutadapt so by using the *5' or 3' (Anywhere)
Adapters* parameter. It will then use a different alignment algorithm and
correctly trim adapters that appear at the beginning of a read. An adapter
specified this way will also be found if it appears only partially at the
beginning of a read. For example, these sequences

::

    ADAPTERMYSEQUENCE
    PTERMYSEQUENCE

will be trimmed to "MYSEQUENCE". Note that the regular algorithm would trim
the first read to an empty sequence.

This parameter currently does not work with color space data.
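
The difference between the two modes can be sketched in Python as follows. This is a simplified illustration that uses exact string matching only (no error tolerance) and is not cutadapt's implementation; the function name ``trim_anywhere_exact`` and the ``min_overlap`` parameter are assumptions made for this example.

```python
def trim_anywhere_exact(read, adapter, min_overlap=3):
    """Trim an adapter that may sit at either end of the read (exact matches only)."""
    # 5' case: the read begins with the adapter or with a suffix of it.
    # Remove that part and keep what follows.
    for k in range(len(adapter), min_overlap - 1, -1):
        if read.startswith(adapter[-k:]):
            return read[k:]
    # 3' case: the full adapter occurs inside the read.
    # Remove it and everything after it.
    pos = read.find(adapter)
    if pos != -1:
        return read[:pos]
    # 3' case: the read ends with a prefix of the adapter.
    for k in range(len(adapter) - 1, min_overlap - 1, -1):
        if read.endswith(adapter[:k]):
            return read[:len(read) - k]
    return read


for read in ("ADAPTERMYSEQUENCE", "PTERMYSEQUENCE", "MYSEQUENCEADAP"):
    print(read, trim_anywhere_exact(read, "ADAPTER"))
```

All three reads in the loop are trimmed to "MYSEQUENCE". Checking the 5' end first is what keeps the sequence after a leading adapter, where a 3'-only search would discard it.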


.. _cutadapt: http://code.google.com/p/cutadapt/
  </help>

</tool>