annotate rnaseq/cutadapt/cutadapt.xml @ 12:0773b11fb822

cutadapters added
author jjkoehorst <jasperkoehorst@gmail.com>
date Sat, 21 Feb 2015 16:56:49 +0100
parents a712b378e090
children
rev   line source
11
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
<tool id="cutadapt" name="Cutadapt" version="1.6">
    <description>Remove adapter sequences from Fastq/Fasta</description>
    <requirements>
        <requirement type="package" version="1.6">cutadapt</requirement>
    </requirements>
    <version_command>cutadapt --version</version_command>

    <command>cutadapt
    #if $input.extension.startswith( "fastq"):
        --format=fastq
        #if $input.extension == "fastqillumina":
            --quality-base=64
        #end if
        #if $input.extension == "fastqsolexa":
            --quality-base=64
        #end if
    #else
        --format=$input.extension
    #end if
    #for $a in $adapters
        #if $a.adapter_source.adapter_source_list == 'prebuilt':
            --adapter="${a.adapter_source.adapter.fields.name}"='${a.adapter_source.adapter}'
        #else if str($a.adapter_source.adapter_name) != "":
            --adapter='${a.adapter_source.adapter_name}'='${a.adapter_source.adapter}'
        #else
            --adapter='${a.adapter_source.adapter}'
        #end if
    #end for

    $anywhere_adapters_textbox

    #for $aa in $anywhere_adapters
        #if $aa.anywhere_adapter_source.anywhere_adapter_source_list == 'prebuilt':
            --anywhere="${aa.anywhere_adapter_source.anywhere_adapter.fields.name}"='${aa.anywhere_adapter_source.anywhere_adapter}'
        #else if str($aa.anywhere_adapter_source.anywhere_adapter_name) != "":
            --anywhere='${aa.anywhere_adapter_source.anywhere_adapter_name}'='${aa.anywhere_adapter_source.anywhere_adapter}'
        #else
            --anywhere='${aa.anywhere_adapter_source.anywhere_adapter}'
        #end if
    #end for
    #for $fa in $front_adapters
        #if $fa.front_adapter_source.front_adapter_source_list == 'prebuilt':
            --front="${fa.front_adapter_source.front_adapter.fields.name}"='${fa.front_adapter_source.front_adapter}'
        #else if str($fa.front_adapter_source.front_adapter_name) != "":
            --front='${fa.front_adapter_source.front_adapter_name}'='${fa.front_adapter_source.front_adapter}'
        #else
            --front='${fa.front_adapter_source.front_adapter}'
        #end if
    #end for
    --error-rate=$error_rate
    --times=$count
    --overlap=$overlap
    $no_indels
    $match_read_wildcards

    #if str( $output_filtering_options.output_filtering) == "filter":
        $output_filtering_options.discard
        $output_filtering_options.discard_untrimmed
        $output_filtering_options.no_trim
        $output_filtering_options.mask_adapter
        #if str($output_filtering_options.min) != '0':
            --minimum-length=$output_filtering_options.min
        #end if
        #if str($output_filtering_options.max) != '0':
            --maximum-length=$output_filtering_options.max
        #end if
    #end if

    --output='$output'

    #if $paired_end.paired_end_boolean:
        --paired-output='$paired_output'
    #end if

    #if str( $output_params.output_type ) == "additional":
        #if $output_params.rest_file:
            --rest-file=$rest_output
        #end if
        #if $output_params.wildcard_file:
            --wildcard-file=$wild_output
        #end if
        #if $output_params.too_short_file:
            --too-short-output=$too_short_output
        #end if
        #if $output_params.too_long_file:
            --too-long-output=$too_long_output
        #end if
        #if $output_params.untrimmed_file:
            --untrimmed-output=$untrimmed_output
            #if $paired_end.paired_end_boolean:
                --untrimmed-paired-output=$untrimmed_paired_output
            #end if
        #end if
        #if $output_params.info_file:
            --info-file=$info_file
        #end if

    #end if

    #if str( $read_modification_params.read_modification) == "modify":
        #if str($read_modification_params.quality_cutoff) != '0':
            --quality-cutoff=$read_modification_params.quality_cutoff
        #end if
        #if str($read_modification_params.cut) != '0':
            --cut=$read_modification_params.cut
        #end if
        #if $read_modification_params.prefix != '':
            --prefix="$read_modification_params.prefix"
        #end if
        #if $read_modification_params.suffix != '':
            --suffix="$read_modification_params.suffix"
        #end if
        #if $read_modification_params.length_tag != '':
            --length-tag="$read_modification_params.length_tag"
        #end if
        $read_modification_params.zero_cap
    #end if

    '$input'

    #if $paired_end.paired_end_boolean:
        '$input2'
    #end if

    > $report
    </command>
    <inputs>
        <param format="fastq, fastqsanger, fastqillumina, fastqsolexa, fasta" name="input" type="data" optional="false" label="Fastq file to trim" length="100"/>
        <conditional name="paired_end">
            <param name="paired_end_boolean" type="boolean" value="false" label="Track Paired Reads" help="This option will keep a second file synchronized if you use one of the filtering options that discards reads. It will NOT trim adapters off of the second read. You must run Cutadapt a second time on the output of the first run to trim adapters from both reads (see Cutadapt documentation for details)." />
            <when value="true">
                <param format="fastq, fastqsanger, fastqillumina, fastqsolexa, fasta" name="input2" type="data" optional="false" label="Paired fastq file (NOT trimmed)" length="100"/>
            </when>
        </conditional>
        <param name="anywhere_adapters_textbox" area="True" size="20x40" type="text" value="-b GATCGGAAGAGCACACGTCTGAACTCCAGTCACACAGTGATCTCGTATGCCGTCTTCTGCTTG -b GATCGGAAGAGCACACGTCTGAACTCCAGTCACACTGATATCTCGTATGCCGTCTTCTGCTTG -b GATCGGAAGAGCACACGTCTGAACTCCAGTCACACTTGAATCTCGTATGCCGTCTTCTGCTTG -b GATCGGAAGAGCACACGTCTGAACTCCAGTCACAGTCAAATCTCGTATGCCGTCTTCTGCTTG -b GATCGGAAGAGCACACGTCTGAACTCCAGTCACAGTTCCATCTCGTATGCCGTCTTCTGCTTG -b GATCGGAAGAGCACACGTCTGAACTCCAGTCACATCACGATCTCGTATGCCGTCTTCTGCTTG -b GATCGGAAGAGCACACGTCTGAACTCCAGTCACATGAGCATCTCGTATGCCGTCTTCTGCTTG -b GATCGGAAGAGCACACGTCTGAACTCCAGTCACATGTCAATCTCGTATGCCGTCTTCTGCTTG -b GATCGGAAGAGCACACGTCTGAACTCCAGTCACATTCCTATCTCGTATGCCGTCTTCTGCTTG -b GATCGGAAGAGCACACGTCTGAACTCCAGTCACCAAAAGATCTCGTATGCCGTCTTCTGCTTG -b GATCGGAAGAGCACACGTCTGAACTCCAGTCACCAACTAATCTCGTATGCCGTCTTCTGCTTG -b GATCGGAAGAGCACACGTCTGAACTCCAGTCACCACCGGATCTCGTATGCCGTCTTCTGCTTG -b GATCGGAAGAGCACACGTCTGAACTCCAGTCACCACGATATCTCGTATGCCGTCTTCTGCTTG -b GATCGGAAGAGCACACGTCTGAACTCCAGTCACCACTCAATCTCGTATGCCGTCTTCTGCTTG -b GATCGGAAGAGCACACGTCTGAACTCCAGTCACCAGATCATCTCGTATGCCGTCTTCTGCTTG -b GATCGGAAGAGCACACGTCTGAACTCCAGTCACCAGGCGATCTCGTATGCCGTCTTCTGCTTG -b GATCGGAAGAGCACACGTCTGAACTCCAGTCACCATGGCATCTCGTATGCCGTCTTCTGCTTG -b GATCGGAAGAGCACACGTCTGAACTCCAGTCACCATTTTATCTCGTATGCCGTCTTCTGCTTG -b GATCGGAAGAGCACACGTCTGAACTCCAGTCACCCAACAATCTCGTATGCCGTCTTCTGCTTG -b GATCGGAAGAGCACACGTCTGAACTCCAGTCACCCGTCCATCTCGTATGCCGTCTTCTGCTTG -b GATCGGAAGAGCACACGTCTGAACTCCAGTCACCGATGTATCTCGTATGCCGTCTTCTGCTTG -b GATCGGAAGAGCACACGTCTGAACTCCAGTCACCGGAATATCTCGTATGCCGTCTTCTGCTTG -b GATCGGAAGAGCACACGTCTGAACTCCAGTCACCGTACGATCTCGTATGCCGTCTTCTGCTTG -b GATCGGAAGAGCACACGTCTGAACTCCAGTCACCTAGCTATCTCGTATGCCGTCTTCTGCTTG -b GATCGGAAGAGCACACGTCTGAACTCCAGTCACCTATACATCTCGTATGCCGTCTTCTGCTTG -b GATCGGAAGAGCACACGTCTGAACTCCAGTCACCTCAGAATCTCGTATGCCGTCTTCTGCTTG -b GATCGGAAGAGCACACGTCTGAACTCCAGTCACCTTGTAATCTCGTATGCCGTCTTCTGCTTG -b GATCGGAAGAGCACACGTCTGAACTCCAGTCACGACGACATCTCGTATGCCGTCTTCTGCTTG -b GATCGGAAGAGCACACGTCTGAACTCCAGTCACGAGTGGATCTCGTATGCCGTCTTCTGCTTG -b GATCGGAAGAGCACACGTCTGAACTCCAGTCACGATCAGATCTCGTATGCCGTCTTCTGCTTG -b GATCGGAAGAGCACACGTCTGAACTCCAGTCACGCCAATATCTCGTATGCCGTCTTCTGCTTG -b GATCGGAAGAGCACACGTCTGAACTCCAGTCACGGCTACATCTCGTATGCCGTCTTCTGCTTG -b GATCGGAAGAGCACACGTCTGAACTCCAGTCACGGTAGCATCTCGTATGCCGTCTTCTGCTTG -b GATCGGAAGAGCACACGTCTGAACTCCAGTCACGTAGAGATCTCGTATGCCGTCTTCTGCTTG -b GATCGGAAGAGCACACGTCTGAACTCCAGTCACGTCCGCATCTCGTATGCCGTCTTCTGCTTG -b GATCGGAAGAGCACACGTCTGAACTCCAGTCACGTGAAAATCTCGTATGCCGTCTTCTGCTTG -b GATCGGAAGAGCACACGTCTGAACTCCAGTCACGTGGCCATCTCGTATGCCGTCTTCTGCTTG -b GATCGGAAGAGCACACGTCTGAACTCCAGTCACGTTTCGATCTCGTATGCCGTCTTCTGCTTG -b GATCGGAAGAGCACACGTCTGAACTCCAGTCACTAATCGATCTCGTATGCCGTCTTCTGCTTG -b GATCGGAAGAGCACACGTCTGAACTCCAGTCACTACAGCATCTCGTATGCCGTCTTCTGCTTG -b GATCGGAAGAGCACACGTCTGAACTCCAGTCACTAGCTTATCTCGTATGCCGTCTTCTGCTTG -b GATCGGAAGAGCACACGTCTGAACTCCAGTCACTATAATATCTCGTATGCCGTCTTCTGCTTG -b GATCGGAAGAGCACACGTCTGAACTCCAGTCACTCATTCATCTCGTATGCCGTCTTCTGCTTG -b GATCGGAAGAGCACACGTCTGAACTCCAGTCACTCCCGAATCTCGTATGCCGTCTTCTGCTTG -b GATCGGAAGAGCACACGTCTGAACTCCAGTCACTCGAAGATCTCGTATGCCGTCTTCTGCTTG -b GATCGGAAGAGCACACGTCTGAACTCCAGTCACTCGGCAATCTCGTATGCCGTCTTCTGCTTG -b GATCGGAAGAGCACACGTCTGAACTCCAGTCACTGACCAATCTCGTATGCCGTCTTCTGCTTG -b GATCGGAAGAGCACACGTCTGAACTCCAGTCACTTAGGCATCTCGTATGCCGTCTTCTGCTTG" label="Preset adapters (replace with your own if required)"/>

        <repeat name="adapters" title="3' Adapters" help="Sequence of an adapter that was ligated to the 3' end. The adapter itself and anything that follows is trimmed.">
            <conditional name="adapter_source">
                <param name="adapter_source_list" type="select" label="Source" >
                    <option value="prebuilt" selected="true">Standard (select from the list below)</option>
                    <option value="user">Enter custom sequence</option>
                </param>

                <when value="user">
                    <param name="adapter_name" size="30" label="Enter custom 3' adapter name (Optional)" type="text" value="" />
                    <param name="adapter" size="30" label="Enter custom 3' adapter sequence" type="text" value="AATTGGCC" />
                </when>

                <when value="prebuilt">
                    <param name="adapter" type="select" label="Choose 3' adapter">
                        <options from_file="cutadapt_adapters.txt">
                            <column name="name" index="1"/>
                            <column name="value" index="0"/>
                        </options>
                    </param>
                </when>
            </conditional>
        </repeat>

        <repeat name="anywhere_adapters" title="5' or 3' (Anywhere) Adapters" help="Sequence of an adapter that was ligated to the 5' or 3' end. If the adapter is found within the read or overlapping the 3' end of the read, the behavior is the same as for the -a option. If the adapter overlaps the 5' end (beginning of the read), the initial portion of the read matching the adapter is trimmed, but anything that follows is kept. If multiple -a or -b options are given, only the best matching adapter is trimmed.">
            <conditional name="anywhere_adapter_source">
                <param name="anywhere_adapter_source_list" type="select" label="Source">
                    <option value="prebuilt" selected="true">Standard (select from the list below)</option>
                    <option value="user">Enter custom sequence</option>
                </param>

                <when value="user">
                    <param name="anywhere_adapter_name" size="30" label="Enter custom 5' or 3' adapter name (Optional)" type="text" value="" />
                    <param name="anywhere_adapter" size="30" label="Enter custom 5' or 3' adapter sequence" type="text" value="AATTGGCC" />
                </when>
                <when value="prebuilt">
                    <param name="anywhere_adapter" type="select" label="Choose 5' or 3' adapter">
                        <options from_file="cutadapt_adapters.txt">
                            <column name="name" index="1"/>
                            <column name="value" index="0"/>
                        </options>
                    </param>
                </when>
            </conditional>
        </repeat>

        <repeat name="front_adapters" title="5' (Front) Adapters" help="Sequence of an adapter that was ligated to the 5' end. If the adapter sequence starts with the character '^', the adapter is 'anchored'. An anchored adapter must appear in its entirety at the 5' end of the read (it is a prefix of the read). A non-anchored adapter may appear partially at the 5' end, or it may occur within the read. If it is found within a read, the sequence preceding the adapter is also trimmed. In all cases the adapter itself is trimmed.">
            <conditional name="front_adapter_source">
                <param name="front_adapter_source_list" type="select" label="Source">
                    <option value="prebuilt" selected="true">Standard (select from the list below)</option>
                    <option value="user">Enter custom sequence</option>
                </param>

                <when value="user">
                    <param name="front_adapter_name" size="30" label="Enter custom 5' adapter name (Optional)" type="text" value="" />
                    <param name="front_adapter" size="30" label="Enter custom 5' adapter sequence" type="text" value="AATTGGCC" />
                </when>
                <when value="prebuilt">
                    <param name="front_adapter" type="select" label="Choose 5' adapter">
                        <options from_file="cutadapt_adapters.txt">
                            <column name="name" index="1"/>
                            <column name="value" index="0"/>
                        </options>
                    </param>
                </when>
            </conditional>
        </repeat>

        <param name="error_rate" type="float" min="0" max="1" value="0.1" label="Maximum error rate" help="Maximum allowed error rate (no. of errors divided by the length of the matching region)." />
        <param name="no_indels" type="boolean" value="false" truevalue="--no-indels" falsevalue="" label="Do not allow indels (Use ONLY with anchored 5' (front) adapters)." help="Do not allow indels in the alignments. That is, allow only mismatches. This option is currently only supported for anchored 5' adapters ('^ADAPTER') (default: both mismatches and indels are allowed)." />
        <param name="count" type="integer" min="1" value="1" label="Match times" help="Try to remove adapters at most COUNT times. Useful when an adapter gets appended multiple times." />
        <param name="overlap" type="integer" min="1" value="3" label="Minimum overlap length" help="Minimum overlap length. If the overlap between the adapter and the sequence is shorter than LENGTH, the read is not modified. This reduces the number of bases trimmed purely due to short random adapter matches." />
        <param name="match_read_wildcards" type="boolean" value="false" truevalue="--match-read-wildcards" falsevalue="" label="Match Read Wildcards" help="Allow 'N's in the read as matches to the adapter." />

210 <conditional name="output_filtering_options">
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
211 <param name="output_filtering" type="select" label="Output filtering options" help="Options for filtering processed reads by those that contain the adapter or by minimum or maximum length">
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
212 <option value="default">Default (no filtering)</option>
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
213 <option value="filter">Set Filters</option>
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
214 </param>
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
215 <when value="default" />
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
216 <when value="filter">
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
217 <param name="discard" type="boolean" value="false" truevalue="--discard" falsevalue="" label="Discard Trimmed Reads" help="Discard reads that contain the adapter instead of trimming them. Use the 'Minimum overlap length' option in order to avoid throwing away too many randomly matching reads!" />
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
218 <param name="discard_untrimmed" type="boolean" value="false" truevalue="--discard-untrimmed" falsevalue="" label="Discard Untrimmed Reads" help="Discard reads that do not contain the adapter." />
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
219 <param name="min" type="integer" min="0" optional="true" value="0" label="Minimum length" help="Discard trimmed reads that are shorter than LENGTH. Reads that are too short even before adapter removal are also discarded. In colorspace, an initial primer is not counted. Value of 0 means no minimum length." />
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
220 <param name="max" type="integer" min="0" optional="true" value="0" label="Maximum length" help="Discard trimmed reads that are longer than LENGTH. Reads that are too long even before adapter removal are also discarded. In colorspace, an initial primer is not counted. Value of 0 means no maximum length." />
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
221 <param name="no_trim" type="boolean" value="false" truevalue="--no-trim" falsevalue="" label="Do not trim adapters" help="Match and redirect reads to output/untrimmed-output as usual, but don't remove the adapters (default: trim the adapters)." />
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
222 <param name="mask_adapter" type="boolean" value="false" truevalue="--mask-adapter" falsevalue="" label="Mask Adapters" help="Mask adapter bases with 'N' instead of trimming them (default: trim adapters)." />
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
223 </when>
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
224 </conditional>
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
225
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
226 <conditional name="output_params">
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
227 <param name="output_type" type="select" label="Additional output options" help="By default all reads will be put in the same file. However, reads with adapters matching in the middle, unmatched reads, and too-short reads can be saved in separate files.">
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
228 <option value="default">Default</option>
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
229 <option value="additional">Additional output files</option>
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
230 </param>
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
231 <when value="default" />
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
232 <when value="additional">
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
233 <param name="info_file" type="boolean" value="false" label="Info File" help="Write information about each read and its adapter matches to a file."/>
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
234 <param name="rest_file" type="boolean" value="false" label="Rest of Read" help="When the adapter matches in the middle of a read, write the rest (after the adapter) into a file."/>
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
235 <param name="wildcard_file" type="boolean" value="false" label="Wildcard File" help="When the adapter has wildcard bases ('N's) write adapter bases matching wildcard positions to file."/>
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
236 <param name="too_short_file" type="boolean" value="false" label="Too Short Reads" help="Write reads that are too short (according to minimum length specified) to a file. (default: discard reads)"/>
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
237 <param name="too_long_file" type="boolean" value="false" label="Too Long Reads" help="Write reads that are too long (according to maximum length specified) to a file. (default: discard reads)"/>
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
238 <param name="untrimmed_file" type="boolean" value="false" label="Untrimmed Reads" help="Write reads that do not contain the adapter to a separate file, instead of writing them to the regular output file. (default: output to same file as trimmed)"/>
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
239 </when>
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
240 </conditional>
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
241
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
242 <conditional name="read_modification_params">
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
243 <param name="read_modification" type="select" label="Additional modifications to reads" help="Various options to trim reads based on quality, modify read names and quality scores">
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
244 <option value="none">No Read Modifications</option>
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
245 <option value="modify">Set Modification Options</option>
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
246 </param>
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
247 <when value="none" />
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
248 <when value="modify">
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
249 <param name="cut" type="integer" optional="true" value="0" label="Cut bases from reads before adapter trimming" help="Remove bases from the beginning or end of each read before trimming adapters. If positive, the bases are removed from the beginning of each read. If negative, the bases are removed from the end of each read." />
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
250 <param name="quality_cutoff" type="integer" min="0" optional="true" value="0" label="Quality cutoff" help="Trim low-quality ends from reads before adapter removal. The algorithm is the same as the one used by BWA (Subtract CUTOFF from all qualities; compute partial sums from all indices to the end of the sequence; cut sequence at the index at which the sum is minimal). Value of 0 means no quality trimming." />
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
251 <param name="prefix" label="Prefix" type="text" help="Add this prefix to read names" />
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
252 <param name="suffix" label="Suffix" type="text" help="Add this suffix to read names" />
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
253 <param name="strip_suffix" label="Strip suffix" type="text" help="Remove this suffix from read names if present." />
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
254 <param name="length_tag" label="Length Tag" type="text" help="Search for TAG followed by a decimal number in the name of the read (description/comment field of the FASTA or FASTQ file). Replace the decimal number with the correct length of the trimmed read. For example, use --length-tag 'length=' to search for fields like 'length=123'." />
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
255 <param name="zero_cap" type="boolean" value="false" label="Change negative quality values to zero (0)" truevalue="--zero-cap" falsevalue="" help="Workaround to avoid segmentation faults in BWA" />
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
256 </when>
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
257 </conditional>
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
258 </inputs>
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
259
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
260 <outputs>
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
261 <data format="txt" name="report" label="${tool.name} on ${on_string} (Report)" />
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
262 <data format="input" name="output" metadata_source="input" label="${tool.name} on ${on_string} (Reads)"/>
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
263 <data format="input" name="paired_output" metadata_source="input" label="${tool.name} on ${on_string} (Paired Reads)" >
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
264 <filter>(paired_end['paired_end_boolean'] is True)</filter>
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
265 </data>
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
266 <data format="input" name="rest_output" metadata_source="input" label="${tool.name} on ${on_string} (Rest of Reads)" >
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
267 <filter>(output_params['output_type'] == "additional")</filter>
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
268 <filter>(output_params['rest_file'] is True)</filter>
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
269 </data>
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
270 <data format="txt" name="wild_output" metadata_source="input" label="${tool.name} on ${on_string} (Wildcard File)" >
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
271 <filter>(output_params['output_type'] == "additional")</filter>
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
272 <filter>(output_params['wildcard_file'] is True)</filter>
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
273 </data>
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
274 <data format="input" name="too_short_output" metadata_source="input" label="${tool.name} on ${on_string} (Too Short Reads)" >
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
275 <filter>(output_params['output_type'] == "additional")</filter>
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
276 <filter>(output_params['too_short_file'] is True)</filter>
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
277 </data>
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
278 <data format="input" name="too_long_output" metadata_source="input" label="${tool.name} on ${on_string} (Too Long Reads)" >
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
279 <filter>(output_params['output_type'] == "additional")</filter>
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
280 <filter>(output_params['too_long_file'] is True)</filter>
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
281 </data>
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
282 <data format="input" name="untrimmed_output" metadata_source="input" label="${tool.name} on ${on_string} (Untrimmed Reads)" >
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
283 <filter>(output_params['output_type'] == "additional")</filter>
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
284 <filter>(output_params['untrimmed_file'] is True)</filter>
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
285 </data>
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
286 <data format="input" name="untrimmed_paired_output" metadata_source="input" label="${tool.name} on ${on_string} (Untrimmed Paired Reads)" >
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
287 <filter>(paired_end['paired_end_boolean'] is True)</filter>
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
288 <filter>(output_params['output_type'] == "additional")</filter>
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
289 <filter>(output_params['untrimmed_file'] is True)</filter>
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
290 </data>
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
291 <data format="txt" name="info_file" metadata_source="input" label="${tool.name} on ${on_string} (Info File)" >
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
292 <filter>(output_params['output_type'] == "additional")</filter>
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
293 <filter>(output_params['info_file'] is True)</filter>
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
294 </data>
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
295 </outputs>
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
296
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
297 <stdio>
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
298 <exit_code range="1" level="fatal" description="IOError, FormatError, or Interrupt" />
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
299 <exit_code range="2" level="fatal" description="Invalid options specified" />
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
300 <exit_code range="3:" level="fatal" description="Unknown error" />
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
301 </stdio>
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
302
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
303 <tests>
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
304 <test>
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
305 <param name="input" value="cutadapt_small.fastq" ftype="fastqsanger"/>
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
306 <param name="adapter_source_list" value="user"/>
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
307 <param name="adapter" value=""/>
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
308 <param name="anywhere_adapter_source_list" value="user"/>
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
309 <param name="anywhere_adapter" value="TTAGACATATCTCCGTCG"/>
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
310 <param name="front_adapter_source_list" value="user"/>
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
311 <param name="front_adapter" value=""/>
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
312 <param name="output_filtering" value="default"/>
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
313 <param name="read_modification" value="none"/>
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
314 <param name="output_type" value="default"/>
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
315 <output name="output" file="cutadapt_small.out"/>
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
316 </test>
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
317 <!-- Unable to get tests to function with conditional parameters
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
318 <test>
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
319 <param name="input" value="cutadapt_small.fastq" ftype="fastqsanger"/>
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
320 <param name="adapter_source_list" value="user"/>
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
321 <param name="adapter" value="TTAGACATATCTCCGTCG"/>
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
322 <param name="anywhere_adapter_source_list" value="user"/>
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
323 <param name="anywhere_adapter" value=""/>
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
324 <param name="front_adapter_source_list" value="user"/>
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
325 <param name="front_adapter" value=""/>
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
326 <param name="output_filtering" value="filter"/>
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
327 <param name="discard" value="true"/>
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
328 <param name="read_modification" value="none"/>
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
329 <param name="output_type" value="default"/>
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
330 <output name="output" file="cutadapt_discard.out"/>
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
331 </test>
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
332 <test>
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
333 <param name="input" value="cutadapt_rest.fa" ftype="fasta"/>
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
334 <param name="adapter_source_list" value="user"/>
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
335 <param name="adapter" value="ADAPTER"/>
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
336 <param name="anywhere_adapter_source_list" value="user"/>
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
337 <param name="anywhere_adapter" value=""/>
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
338 <param name="front_adapter_source_list" value="user"/>
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
339 <param name="front_adapter" value=""/>
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
340 <param name="output_filtering" value="default"/>
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
341 <param name="read_modification" value="none"/>
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
342 <param name="output_type" value="additional"/>
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
343 <param name="rest_file" value="true"/>
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
344 <output name="output" file="cutadapt_rest.out"/>
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
345 <output name="rest_output" file="cutadapt_rest2.out"/>
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
346 </test>
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
347 -->
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
348 </tests>
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
349
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
350 <help>
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
351 Note
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
352 -------
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
353 Original code from lparsons, 1% modified to add more reads at once.
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
354
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
355 Summary
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
356 -------
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
357 This tool removes adapter sequences from DNA high-throughput
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
358 sequencing data. This is usually necessary when the read length of the
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
359 machine is longer than the molecule that is sequenced, such as in
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
360 microRNA data.
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
361
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
362 The tool is based on the open-source `cutadapt
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
363 &lt;http://code.google.com/p/cutadapt/>`_ tool. See the `complete cutadapt
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
364 documentation &lt;https://cutadapt.readthedocs.org/en/latest/index.html>`_ for additional details.
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
365
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
366 -----
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
367
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
368 Algorithm
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
369 ---------
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
370
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
371 cutadapt uses a simple semi-global alignment algorithm, without any special optimizations.
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
372 For speed, the algorithm is implemented as a Python extension module in ``calignmodule.c``.
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
373
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
374
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
375 Partial adapter matches
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
376 -----------------------
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
377
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
378 Cutadapt correctly deals with partial adapter matches. As an example, suppose
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
379 your adapter sequence is ``ADAPTER`` (specified via the 3' Adapters parameter).
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
380 If you have these input sequences::
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
381
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
382 MYSEQUENCEADAPTER
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
383 MYSEQUENCEADAP
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
384 MYSEQUENCEADAPTERSOMETHINGELSE
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
385
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
386 All of them will be trimmed to ``MYSEQUENCE``. If the sequence starts with an
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
387 adapter, like this::
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
388
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
389 ADAPTERSOMETHING
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
390
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
391 It will be empty after trimming.
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
392
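The trimming behaviour described above can be sketched in a few lines of Python. This is only an illustration restricted to exact matches; the function name ``trim_3prime`` and its ``min_overlap`` argument are hypothetical and are not part of cutadapt, which uses an error-tolerant alignment instead.

```python
def trim_3prime(read, adapter, min_overlap=3):
    """Remove an exact 3' adapter match and everything after it."""
    # Full adapter somewhere in the read: cut at its first base.
    pos = read.find(adapter)
    if pos != -1:
        return read[:pos]
    # Otherwise look for the longest adapter prefix that ends the read.
    for k in range(min(len(adapter) - 1, len(read)), min_overlap - 1, -1):
        if read.endswith(adapter[:k]):
            return read[:len(read) - k]
    return read


for read in ("MYSEQUENCEADAPTER", "MYSEQUENCEADAP",
             "MYSEQUENCEADAPTERSOMETHINGELSE", "ADAPTERSOMETHING"):
    print(repr(trim_3prime(read, "ADAPTER")))
```

The first three reads come out as ``MYSEQUENCE`` and the last one comes out empty, matching the examples above. The ``min_overlap`` argument plays the role of the 'Minimum overlap length' option: shorter partial matches at the end of the read are left alone.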
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
393 When the allowed error rate is sufficiently high, errors in
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
394 the adapter sequence are allowed. For example, ``ADABTER`` (1 mismatch), ``ADAPTR`` (1 deletion),
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
395 and ``ADAPPTER`` (1 insertion) will all be recognized if the error rate is set to 0.15.
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
396
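As a rough check of the numbers above, the following Python sketch counts errors with a plain edit distance and compares them against the allowed error rate. It is a simplification: the helper names ``edit_distance`` and ``is_recognized`` are hypothetical, and the sketch assumes that errors are counted against the full adapter length, whereas cutadapt counts them against the length of the matching region found by its alignment.

```python
def edit_distance(a, b):
    """Levenshtein distance: mismatches, insertions and deletions cost 1."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,                 # deletion
                           cur[j - 1] + 1,              # insertion
                           prev[j - 1] + (ca != cb)))   # match or mismatch
        prev = cur
    return prev[-1]


def is_recognized(candidate, adapter, error_rate):
    """True if candidate differs from adapter by no more than the allowed errors."""
    # Assumption: the error budget is based on the full adapter length.
    max_errors = int(error_rate * len(adapter))
    return edit_distance(candidate, adapter) <= max_errors


for candidate in ("ADABTER", "ADAPTR", "ADAPPTER"):
    print(candidate, is_recognized(candidate, "ADAPTER", 0.15))
```

Under this assumption a 7-base adapter tolerates one error at an error rate of 0.15 and none at the default of 0.1, so all three variants are accepted only at the higher setting.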
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
397
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
398 Anchoring 5' adapters
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
399 ---------------------
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
400
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
401 If you specify a 5' (Front) adapter, the adapter may overlap the beginning of the read or
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
402 occur anywhere within it. If it appears within the read, the sequence that precedes it
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
403 will also be trimmed in addition to the adapter. For example, when the adapter sequence is
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
404 ``ADAPTER``::
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
405
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
406 HELLOADAPTERTHERE
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
407 APTERTHERE
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
408
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
409 will both be trimmed to ``THERE``. To avoid this, you can prefix the adapter with the character
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
410 ``^``. This will restrict the search, forcing the adapter to be a prefix of the read. With
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
411 the adapter sequence set to ``^ADAPTER``, only reads like this will be trimmed::
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
412
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
413 ADAPTERHELLO
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
414
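The difference between an ordinary and an anchored 5' adapter can be illustrated with a small Python sketch. It handles exact matches only, and the function name ``trim_5prime`` and its ``min_overlap`` argument are hypothetical; cutadapt itself performs an error-tolerant alignment.

```python
def trim_5prime(read, adapter, min_overlap=3):
    """Remove an exact 5' adapter match and everything before it."""
    if adapter.startswith("^"):
        # Anchored: the adapter must be a prefix of the read.
        anchored = adapter[1:]
        return read[len(anchored):] if read.startswith(anchored) else read
    # Unanchored: the full adapter may occur anywhere in the read.
    pos = read.find(adapter)
    if pos != -1:
        return read[pos + len(adapter):]
    # Or the read may begin with the tail end of the adapter.
    for k in range(min(len(adapter) - 1, len(read)), min_overlap - 1, -1):
        if read.startswith(adapter[-k:]):
            return read[k:]
    return read


print(trim_5prime("HELLOADAPTERTHERE", "ADAPTER"))
print(trim_5prime("APTERTHERE", "ADAPTER"))
print(trim_5prime("ADAPTERHELLO", "^ADAPTER"))
print(trim_5prime("HELLOADAPTERTHERE", "^ADAPTER"))
```

The first two calls return ``THERE``, the third returns ``HELLO``, and the last call leaves the read unchanged because the anchored adapter is not a prefix of it.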
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
415
a712b378e090 cutadapt added
jjkoehorst <jasperkoehorst@gmail.com>
parents:
diff changeset
416 Allowing adapters anywhere
417 --------------------------
418
419 Cutadapt assumes that any adapter specified via the 3' Adapter parameter
420 was ligated to the 3' end of the sequence. This is the correct assumption for
421 at least the SOLiD and Illumina small RNA protocols and probably others.
422 The assumption is enforced by the alignment algorithm, which only finds the adapter
423 when its starting position is within the read. In other words, the 5' base of
424 the adapter must appear within the read. The adapter and all bases following
425 it are removed.
426
427 If, on the other hand, your adapter can also be ligated to the 5' end (on
428 purpose or by accident), you should tell cutadapt so by using the Anywhere Adapter
429 parameter. It will then use a slightly different alignment algorithm
430 (so-called semiglobal alignment), which allows any type of overlap between the
431 adapter and the sequence. In particular, the adapter may appear only partially
432 at the beginning of the read, like this::
433
434 PTERMYSEQUENCE
435
436 The decision about which part of the read to remove is made as follows: if there is at
437 least one base before the found adapter, then the adapter is considered to be
438 a 3' adapter and the adapter itself and everything following it are removed.
439 Otherwise, the adapter is considered to be a 5' adapter and it is removed from
440 the read.
441
442 Here are some examples, which may make this clearer (left: read, right: trimmed
443 read)::
444
445 MYSEQUENCEADAPTER -> MYSEQUENCE (3' adapter)
446 MADAPTER -> M (3' adapter)
447 ADAPTERMYSEQUENCE -> MYSEQUENCE (5' adapter)
448 PTERMYSEQUENCE -> MYSEQUENCE (5' adapter)
449
450 The regular algorithm (3' Adapter) would trim the first two examples in the same way,
451 but would trim the third to an empty sequence and leave the fourth untrimmed.
452
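The decision rule described above can be sketched in Python as follows. This is only an approximation under stated assumptions: it uses exact matching instead of cutadapt's error-tolerant semiglobal alignment, it ignores any minimum overlap setting, and the names ``find_overlap``, ``trim_anywhere`` and ``trim_three_prime`` are invented for this example.

```python
from typing import Optional, Tuple


def find_overlap(read: str, adapter: str) -> Optional[Tuple[int, int]]:
    """Return (start, end) of the longest exact adapter overlap in the read."""
    # A complete occurrence of the adapter takes priority.
    pos = read.find(adapter)
    if pos != -1:
        return pos, pos + len(adapter)
    # Otherwise look for partial overlaps, longest first: the end of the
    # adapter at the start of the read, or its start at the end of the read.
    for k in range(len(adapter) - 1, 0, -1):
        if read.startswith(adapter[-k:]):
            return 0, k
        if read.endswith(adapter[:k]):
            return len(read) - k, len(read)
    return None


def trim_anywhere(read: str, adapter: str) -> str:
    """Sketch of the 'Anywhere Adapter' decision rule."""
    match = find_overlap(read, adapter)
    if match is None:
        return read
    start, end = match
    if start > 0:
        # At least one base precedes the match: treat it as a 3' adapter
        # and remove the adapter plus everything after it.
        return read[:start]
    # The match begins at the first base: treat it as a 5' adapter.
    return read[end:]


def trim_three_prime(read: str, adapter: str) -> str:
    """Sketch of the regular 3' Adapter behaviour, for comparison."""
    pos = read.find(adapter)
    if pos != -1:
        return read[:pos]
    # The adapter may be cut off at the 3' end of the read, but its
    # first base must lie within the read.
    for k in range(len(adapter) - 1, 0, -1):
        if read.endswith(adapter[:k]):
            return read[:len(read) - k]
    return read
```

Applied to the four example reads with the adapter ``ADAPTER``, ``trim_anywhere`` returns ``MYSEQUENCE``, ``M``, ``MYSEQUENCE`` and ``MYSEQUENCE``, while ``trim_three_prime`` returns ``MYSEQUENCE``, ``M``, an empty string and the unchanged read ``PTERMYSEQUENCE``.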
453
454 Format of the info file
455 -----------------------
456 The info file contains information about the found adapters. The output is a tab-separated text file. Each line corresponds to one read of the input file. The fields are:
457
458 1. Read name
459 2. Number of errors
460 3. 0-based start coordinate of the adapter match
461 4. 0-based end coordinate of the adapter match
462 5. Sequence of the read to the left of the adapter match (can be empty)
463 6. Sequence of the read that was matched to the adapter
464 7. Sequence of the read to the right of the adapter match (can be empty)
465 8. Name of the found adapter
466
467 Concatenating fields 5-7 yields the full read sequence. In column 8, adapters without a name are numbered starting from 1.
468
469 If no adapter was found, the format is as follows:
470
471 1. Read name
472 2. The value -1
473 3. The read sequence
474
475 When parsing this file, be aware that additional columns may be added in the future. Note also that some fields can be empty, resulting in consecutive tabs within a line. In the current version, when the *Match times* option is set to a value other than 1 (the default), multiple lines are written to the info file for each read.
476
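A minimal Python sketch of parsing one line of the info file, assuming the column layout listed above. The names ``InfoRecord`` and ``parse_info_line`` are invented for this example and are not part of cutadapt. The sketch splits on tabs without stripping them, so empty fields are preserved, and it ignores any columns after the eighth so that it keeps working if columns are added later.

```python
from typing import NamedTuple, Optional


class InfoRecord(NamedTuple):
    """One line of the info file (hypothetical container for this sketch)."""
    name: str
    errors: int            # -1 means that no adapter was found
    start: Optional[int]   # start coordinate of the match, None if no match
    end: Optional[int]     # end coordinate of the match, None if no match
    left: str              # read sequence to the left of the match
    matched: str           # read sequence matched to the adapter
    right: str             # read sequence to the right of the match
    adapter: Optional[str]  # name of the found adapter, None if no match


def parse_info_line(line: str) -> InfoRecord:
    # Strip only the line terminator: tabs must stay, because fields
    # can be empty and then appear as consecutive tabs.
    fields = line.rstrip("\r\n").split("\t")
    name, errors = fields[0], int(fields[1])
    if errors == -1:
        # No adapter found: the third column holds the whole read, which
        # this sketch stores in `left` so that left + matched + right
        # is always the full read sequence.
        return InfoRecord(name, -1, None, None, fields[2], "", "", None)
    start, end = int(fields[2]), int(fields[3])
    # Take exactly columns 5-8 and ignore anything that follows.
    left, matched, right, adapter = fields[4:8]
    return InfoRecord(name, errors, start, end, left, matched, right, adapter)
```

Because several lines can be written per read when *Match times* is not 1, code that aggregates these records should group them by read name rather than assume one line per read.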
477 .. _cutadapt: http://code.google.com/p/cutadapt/
478 </help>
479
480 </tool>
481