annotate edena_ovl_wrapper.xml @ 2:b8c6a38530eb draft default tip

Support Edena v. 3.131028 (new <version_command>, official overlapping log file, covStats output file removed, -lph and -sph options instead of -peHorizon). Use $GALAXY_SLOTS instead of $EDENA_SITE_OPTIONS. Directly call edena, remove edena_ovl_wrapper.py and edena_ass_wrapper.py. Discard stderr instead of redirecting to stdout. Do not redirect stdout to logfile. Add readme.rst.
author crs4
date Fri, 31 Jan 2014 12:08:21 -0500
parents cd6cc6d76708
children
rev  changeset     author  parents  summary
  0  60609a9cef3b  crs4             Uploaded
  1  cd6cc6d76708  crs4    0        Simplify passing repeated params to Python script.
  2  b8c6a38530eb  crs4    1        Support Edena v. 3.131028 (new <version_command>, official overlapping log file, covStats output file removed, -lph and -sph options instead of -peHorizon). Use $GALAXY_SLOTS instead of $EDENA

rev  line  source
  2     1  <tool id="edena_ovl_wrapper" name="Edena (overlapping)" version="0.3">
  0     2    <description></description>
  0     3    <requirements>
  2     4      <requirement type="package" version="3.131028">edena</requirement>
  0     5    </requirements>
  2     6    <version_command>edena | head -n 1</version_command>
  2     7    <command>
  2     8      edena
  2     9      -nThreads \${GALAXY_SLOTS:-2}
  0    10      #if $input_selection.input == "unpaired_file"
  2    11        -r
  1    12        #for $ui in $input_selection.unpaired_input
  2    13          ${ui.unpaired_file}
  0    14        #end for
  0    15      #elif $input_selection.input == "dr_pairs"
  2    16        -DRpairs
  1    17        #for $dpi in $input_selection.dr_pairs_input
  2    18          ${dpi.dr_pair_1} ${dpi.dr_pair_2}
  0    19        #end for
  0    20      #elif $input_selection.input == "rd_pairs"
  2    21        -RDpairs
  1    22        #for $rpi in $input_selection.rd_pairs_input
  2    23          ${rpi.rd_pair_1} ${rpi.rd_pair_2}
  0    24        #end for
  0    25      #end if
  0    26      #if str($minOlap)
  2    27        -M $minOlap
  0    28      #end if
  0    29      #if str($readsTruncation)
  2    30        -t $readsTruncation
  0    31      #end if
  2    32      2&gt;/dev/null ## need to discard stderr because edena writes some progress info there (e.g. "Computing overlaps >=30...")
  0    33    </command>
  0    34
  0    35    <inputs>
  0    36      <conditional name="input_selection">
  0    37        <param name="input" type="select" label="Select input type">
  1    38          <option value="unpaired_file">Unpaired files</option>
  2    39          <option value="dr_pairs">Paired-end (forward-reverse) files</option>
  2    40          <option value="rd_pairs">Mate-pair (reverse-forward) files</option>
  0    41        </param>
  0    42
  0    43        <when value="unpaired_file">
  2    44          <repeat name="unpaired_input" title="Unpaired input" help="(-r)" min="1">
  0    45            <param name="unpaired_file" type="data" format="fasta,fastq" label="Unpaired file" help="FASTA or FASTQ format" />
  0    46          </repeat>
  0    47        </when>
  0    48
  0    49        <when value="dr_pairs">
  2    50          <repeat name="dr_pairs_input" title="Paired-end (forward-reverse) file" help="(-DRpairs, -paired)" min="1">
  2    51            <param name="dr_pair_1" type="data" format="fasta,fastq" label="Paired-end file 1" help="FASTA or FASTQ format" />
  2    52            <param name="dr_pair_2" type="data" format="fasta,fastq" label="Paired-end file 2" help="FASTA or FASTQ format" />
  0    53          </repeat>
  0    54        </when>
  0    55
  0    56        <when value="rd_pairs">
  2    57          <repeat name="rd_pairs_input" title="Mate-pair (reverse-forward) file" help="(-RDpairs, -matePairs)" min="1">
  2    58            <param name="rd_pair_1" type="data" format="fasta,fastq" label="Mate-pair file 1" help="FASTA or FASTQ format" />
  2    59            <param name="rd_pair_2" type="data" format="fasta,fastq" label="Mate-pair file 2" help="FASTA or FASTQ format" />
  0    60          </repeat>
  0    61        </when>
  0    62      </conditional>
  0    63
  0    64      <param name="minOlap" type="integer" value="" optional="true" label="Minimum overlap size to compute (-M)" help="If not specified, this value is set to half of the read length. When the sequencing coverage is sufficient, you can increase this value, which will reduce the computation time. Edena will compute the overlaps whose sizes range from this value to the read length." />
  0    65
  2    66      <param name="readsTruncation" type="integer" value="" optional="true" label="3' end reads truncation (-t)" help="Use this option to truncate the 3’ end of the reads to the specified length. You may consider reads truncation since it can significantly improve the assembly. Since Edena computes exact overlaps, only error-free reads can take part in the assembly. Since errors are likely to occur at the 3’ ends, shortening the reads by some nucleotides may increase the number of error-free reads in the dataset, and thus increase the assembly performance." />
  0    67
  0    68    </inputs>
  0    69
  0    70    <outputs>
  2    71      <data name="logfile" format="txt" label="${tool.name} on ${on_string}: log" from_work_dir="out_overlapping.log" />
  2    72      <data name="output" format="ovl" label="${tool.name} on ${on_string}: overlapping" from_work_dir="out.ovl" />
  0    73    </outputs>
  0    74
  0    75    <tests>
  0    76
  0    77    </tests>
  0    78    <help>
  0    79  **What it does**
  0    80
  2    81  Edena is an overlaps-graph-based short-read assembler suited to Illumina GA reads. It requires all reads to be the same length, as Illumina GA reads are. This is due to historical reasons and because it greatly simplifies several computational steps. 454 or Sanger reads are therefore not suited to Edena. If you provide multiple files with different read lengths, Edena will trim the 3’ end of the reads so that all reads are the same length as the shortest reads in the input files.
  2    82
  2    83  An assembly with Edena is a two-step process: overlapping and assembling.
  1    84
  1    85  In the overlapping step, the read files are provided to the program, which computes the transitively reduced overlaps graph. This structure is then stored together with the sequence reads in the overlapping file.
  1    86
  1    87  Edena accepts both unpaired and paired files, in FASTQ or FASTA format. Note that for technical reasons, all reads are required to be of the same length. You can, however, provide the program with different files containing reads of different lengths. In such a case, Edena will trim the 3’ ends of the longer reads so that they fit the shorter length. It is, however, required that reads within each individual file are of the same length (as Illumina GA reads are). By default, all overlaps with a minimum size corresponding to half of the read length are computed. This is quite conservative. Provided enough coverage, this value can be increased (option -M) to reduce the memory requirements.
  1    88
  1    89  For reads longer than 100 bp, you may consider the reads truncation option, which could help in discarding 3’ base calling errors.
  0    90
  0    91  **License and citation**
  0    92
  0    93  This Galaxy tool is Copyright © 2013 `CRS4 Srl.`_ and is released under the `MIT license`_.
  0    94
  0    95  .. _CRS4 Srl.: http://www.crs4.it/
  0    96  .. _MIT license: http://opensource.org/licenses/MIT
  0    97
  0    98  If you use this tool in Galaxy, please cite |Cuccuru2013|_.
  0    99
  0   100  .. |Cuccuru2013| replace:: Cuccuru, G., Orsini, M., Pinna, A., Sbardellati, A., Soranzo, N., Travaglione, A., Uva, P., Zanetti, G., Fotia, G. (2013) Orione, a web-based framework for NGS analysis in microbiology. *Submitted*
  0   101  .. _Cuccuru2013: http://orione.crs4.it/
  0   102
  0   103  This tool uses `Edena`_, which is licensed separately. Please cite |Hernandez2008|_.
  0   104
  0   105  .. _Edena: http://www.genomic.ch/edena.php
  0   106  .. |Hernandez2008| replace:: Hernandez, D., *et al.* (2008) De novo bacterial genome sequencing: Millions of very short reads assembled on a desktop computer. *Genome Res.* 18(5), 802-809
  0   107  .. _Hernandez2008: http://genome.cshlp.org/content/18/5/802
  0   108    </help>
  0   109  </tool>
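
The <command> template in source lines 7-33 can be hard to read through the Cheetah directives. The following is a minimal Python sketch of the argument list that template appears to expand to, written only as an illustration. The function names build_edena_overlap_argv and to_shell_line and the file names in the usage example are hypothetical and not part of the wrapper; in Galaxy the template is expanded by Cheetah and run by a shell, not by Python code like this.

```python
import os
import shlex


def build_edena_overlap_argv(input_type, files, min_olap=None, reads_truncation=None):
    """Sketch of the argv that the wrapper's <command> template would produce.

    input_type: "unpaired_file", "dr_pairs" or "rd_pairs" (the <option> values).
    files: list of paths for unpaired input, or list of (file_1, file_2)
           tuples for the two paired modes.
    """
    # Mirrors ${GALAXY_SLOTS:-2}: use the variable, or 2 if it is unset or empty.
    threads = os.environ.get("GALAXY_SLOTS") or "2"
    argv = ["edena", "-nThreads", threads]

    if input_type == "unpaired_file":
        argv.append("-r")
        argv.extend(files)
    elif input_type in ("dr_pairs", "rd_pairs"):
        argv.append("-DRpairs" if input_type == "dr_pairs" else "-RDpairs")
        for file_1, file_2 in files:
            argv.extend([file_1, file_2])
    else:
        raise ValueError("unknown input type: %r" % (input_type,))

    # The template tests str($minOlap), so any non-empty value (including 0) is passed.
    if min_olap is not None:
        argv.extend(["-M", str(min_olap)])
    if reads_truncation is not None:
        argv.extend(["-t", str(reads_truncation)])
    return argv


def to_shell_line(argv):
    """Quote argv for a POSIX shell and discard stderr, as the template does."""
    return shlex.join(argv) + " 2>/dev/null"


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    argv = build_edena_overlap_argv("dr_pairs", [("reads_1.fastq", "reads_2.fastq")], min_olap=30)
    print(to_shell_line(argv))
```

Building a list and quoting it at the end keeps file names with spaces intact. The outputs out_overlapping.log and out.ovl are not on the command line because the wrapper collects them from the working directory via from_work_dir.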