Mercurial > repos > crs4 > edena
annotate edena_ovl_wrapper.xml @ 2:b8c6a38530eb draft default tip
Support Edena v. 3.131028 (new <version_command>, official overlapping log file, covStats output file removed, -lph and -sph options instead of -peHorizon). Use $GALAXY_SLOTS instead of $EDENA_SITE_OPTIONS. Directly call edena, remove edena_ovl_wrapper.py and edena_ass_wrapper.py. Discard stderr instead of redirecting to stdout. Do not redirect stdout to logfile. Add readme.rst.
author | crs4 |
---|---|
date | Fri, 31 Jan 2014 12:08:21 -0500 |
parents | cd6cc6d76708 |
children |
<tool id="edena_ovl_wrapper" name="Edena (overlapping)" version="0.3">
  <description></description>
  <requirements>
    <requirement type="package" version="3.131028">edena</requirement>
  </requirements>
  <version_command>edena | head -n 1</version_command>
  <command>
    edena
    -nThreads \${GALAXY_SLOTS:-2}
    #if $input_selection.input == "unpaired_file"
      -r
      #for $ui in $input_selection.unpaired_input
        ${ui.unpaired_file}
      #end for
    #elif $input_selection.input == "dr_pairs"
      -DRpairs
      #for $dpi in $input_selection.dr_pairs_input
        ${dpi.dr_pair_1} ${dpi.dr_pair_2}
      #end for
    #elif $input_selection.input == "rd_pairs"
      -RDpairs
      #for $rpi in $input_selection.rd_pairs_input
        ${rpi.rd_pair_1} ${rpi.rd_pair_2}
      #end for
    #end if
    #if str($minOlap)
      -M $minOlap
    #end if
    #if str($readsTruncation)
      -t $readsTruncation
    #end if
    2>/dev/null ## need to discard stderr because edena writes some progress info there (e.g. "Computing overlaps >=30...")
  </command>

  <inputs>
    <conditional name="input_selection">
      <param name="input" type="select" label="Select input type">
        <option value="unpaired_file">Unpaired files</option>
        <option value="dr_pairs">Paired-end (forward-reverse) files</option>
        <option value="rd_pairs">Mate-pair (reverse-forward) files</option>
      </param>

      <when value="unpaired_file">
        <repeat name="unpaired_input" title="Unpaired input" help="(-r)" min="1">
          <param name="unpaired_file" type="data" format="fasta,fastq" label="Unpaired file" help="FASTA or FASTQ format" />
        </repeat>
      </when>

      <when value="dr_pairs">
        <repeat name="dr_pairs_input" title="Paired-end (forward-reverse) file" help="(-DRpairs, -paired)" min="1">
          <param name="dr_pair_1" type="data" format="fasta,fastq" label="Paired-end file 1" help="FASTA or FASTQ format" />
          <param name="dr_pair_2" type="data" format="fasta,fastq" label="Paired-end file 2" help="FASTA or FASTQ format" />
        </repeat>
      </when>

      <when value="rd_pairs">
        <repeat name="rd_pairs_input" title="Mate-pair (reverse-forward) file" help="(-RDpairs, -matePairs)" min="1">
          <param name="rd_pair_1" type="data" format="fasta,fastq" label="Mate-pair file 1" help="FASTA or FASTQ format" />
          <param name="rd_pair_2" type="data" format="fasta,fastq" label="Mate-pair file 2" help="FASTA or FASTQ format" />
        </repeat>
      </when>
    </conditional>

    <param name="minOlap" type="integer" value="" optional="true" label="Minimum overlap size to compute (-M)" help="If not specified, this value is set to half of the read length. When the sequencing coverage is sufficient, you can increase this value, which reduces the computation time. Edena will compute the overlaps whose sizes range from this value to the read length." />

    <param name="readsTruncation" type="integer" value="" optional="true" label="3' end reads truncation (-t)" help="Use this option to truncate the 3’ end of the reads to the specified length. You may consider reads truncation since it can significantly improve the assembly. Since Edena computes exact overlaps, only error-free reads can take part in the assembly. Since errors are likely to occur at the 3’ ends, shortening the reads by some nucleotides may increase the number of error-free reads in the dataset, and thus improve the assembly performance." />

  </inputs>

  <outputs>
    <data name="logfile" format="txt" label="${tool.name} on ${on_string}: log" from_work_dir="out_overlapping.log" />
    <data name="output" format="ovl" label="${tool.name} on ${on_string}: overlapping" from_work_dir="out.ovl" />
  </outputs>

  <tests>

  </tests>
  <help>
**What it does**

Edena is an overlap-graph-based short-read assembler suited to Illumina GA reads. The program requires all reads to be the same length, as Illumina GA reads are. This is due to historical reasons and because it greatly simplifies several computational steps. 454 or Sanger reads are therefore not suited to Edena. If you provide multiple files with different read lengths, Edena will trim the 3’ ends of the reads so that they are all the same length as the shortest reads in the input files.

An assembly with Edena is a two-step process: overlapping and assembling.

In the overlapping step, the read files are provided to the program, which computes the transitively reduced overlap graph. This structure is then stored together with the sequence reads in the overlapping file.

Edena accepts both unpaired and paired files, in FASTQ or FASTA format. Note that, for technical reasons, all reads are required to be of the same length. You can, however, provide the program with different files containing reads of different lengths. In that case, Edena will trim the 3’ ends of the longer reads so that they fit the shorter length. It is, however, required that reads within each individual file are of the same length (as Illumina GA reads are). By default, all overlaps with a minimum size corresponding to half of the read length are computed. This is quite conservative. Provided enough coverage, this value can be increased (option -M) to reduce the memory requirements.

For reads longer than 100 bp, you may consider the reads truncation option, which could help in discarding 3’ base-calling errors.

**License and citation**

This Galaxy tool is Copyright © 2013 `CRS4 Srl.`_ and is released under the `MIT license`_.

.. _CRS4 Srl.: http://www.crs4.it/
.. _MIT license: http://opensource.org/licenses/MIT

If you use this tool in Galaxy, please cite |Cuccuru2013|_.

.. |Cuccuru2013| replace:: Cuccuru, G., Orsini, M., Pinna, A., Sbardellati, A., Soranzo, N., Travaglione, A., Uva, P., Zanetti, G., Fotia, G. (2013) Orione, a web-based framework for NGS analysis in microbiology. *Submitted*
.. _Cuccuru2013: http://orione.crs4.it/

This tool uses `Edena`_, which is licensed separately. Please cite |Hernandez2008|_.

.. _Edena: http://www.genomic.ch/edena.php
.. |Hernandez2008| replace:: Hernandez, D., *et al.* (2008) De novo bacterial genome sequencing: Millions of very short reads assembled on a desktop computer. *Genome Res.* 18(5), 802-809
.. _Hernandez2008: http://genome.cshlp.org/content/18/5/802
  </help>
</tool>
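
The command template in the tool above assembles the edena call from the selected input type and the two optional integer parameters. As a reading aid only, the same branching can be sketched in Python. This is not part of the tool or of Galaxy: the function name `build_edena_overlap_argv`, its arguments, and the file names `reads_1.fastq` and `reads_2.fastq` are hypothetical, and the `slots` argument merely stands in for `${GALAXY_SLOTS:-2}`. The shell redirection `2>/dev/null` is left out because it is not an edena argument.

```python
import shlex


def build_edena_overlap_argv(input_type, files, min_overlap=None,
                             reads_truncation=None, slots=2):
    """Mirror the Cheetah <command> template as a Python argv list.

    input_type: "unpaired_file", "dr_pairs" or "rd_pairs" (the <select> values).
    files: list of paths for unpaired input, or list of (file_1, file_2)
           tuples for paired input.
    slots: hypothetical stand-in for ${GALAXY_SLOTS:-2}.
    """
    # One edena flag per input type, as in the #if / #elif branches.
    flags = {"unpaired_file": "-r", "dr_pairs": "-DRpairs", "rd_pairs": "-RDpairs"}
    if input_type not in flags:
        raise ValueError("unknown input type: %r" % (input_type,))

    argv = ["edena", "-nThreads", str(slots), flags[input_type]]
    for entry in files:
        if input_type == "unpaired_file":
            argv.append(entry)
        else:
            # Paired input contributes both files of each pair, in order.
            first, second = entry
            argv.extend([first, second])

    # Optional integers are emitted only when a value was given,
    # like the template's `#if str($minOlap)` test.
    if min_overlap is not None:
        argv.extend(["-M", str(min_overlap)])
    if reads_truncation is not None:
        argv.extend(["-t", str(reads_truncation)])
    return argv


if __name__ == "__main__":
    # Hypothetical file names, used only to show the resulting command line.
    argv = build_edena_overlap_argv(
        "dr_pairs", [("reads_1.fastq", "reads_2.fastq")], min_overlap=40)
    print(" ".join(shlex.quote(a) for a in argv))
```

Returning a list rather than a single string keeps the sketch independent of shell quoting; the final `print` only joins the items for display.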