changeset 0:50ae1360fbbe default tip

Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
author konradpaszkiewicz
date Tue, 07 Jun 2011 18:07:56 -0400
parents
children
files README VelvetOptimiser-2.1.7_modified/README VelvetOptimiser-2.1.7_modified/VelvetOpt/Assembly.pm VelvetOptimiser-2.1.7_modified/VelvetOpt/Utils.pm VelvetOptimiser-2.1.7_modified/VelvetOpt/gwrap.pm VelvetOptimiser-2.1.7_modified/VelvetOpt/hwrap.pm VelvetOptimiser-2.1.7_modified/VelvetOptimiser.pl velvet_optimiser.py velvet_optimiser.xml
diffstat 9 files changed, 2810 insertions(+), 0 deletions(-)
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/README	Tue Jun 07 18:07:56 2011 -0400
@@ -0,0 +1,11 @@
+#Created on 07/01/2011 by Konrad Paszkiewicz, Exeter Sequencing Service, University of Exeter
+
+VelvetOptimiser
+
+This tool was designed to incorporate Simon Gladman's VelvetOptimiser Perl tools into Galaxy. The tool files are based on the original velvet tool scripts.
+
+Prerequisites:
+
+1. Working installation of Velvet
+2. Bundled copy of VelvetOptimiser (note that this is a slightly modified version of Simon Gladman's original script; it must be used if the tool is to find the final velvet assembly in the correct location).
+
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/VelvetOptimiser-2.1.7_modified/README	Tue Jun 07 18:07:56 2011 -0400
@@ -0,0 +1,325 @@
+NAME
+====
+
+VelvetOptimiser
+
+VERSION
+=======
+
+Version 2.1.7
+
+LICENCE
+=======
+
+Copyright 2009 - Simon Gladman - CSIRO.
+	
+simon.gladman@csiro.au
+
+This program is free software; you can redistribute it and/or modify
+it under the terms of the GNU General Public License as published by
+the Free Software Foundation; either version 2 of the License, or
+(at your option) any later version.
+    
+This program is distributed in the hope that it will be useful,
+but WITHOUT ANY WARRANTY; without even the implied warranty of
+MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+GNU General Public License for more details.
+        
+You should have received a copy of the GNU General Public License
+along with this program; if not, write to the Free Software
+Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston,
+MA 02110-1301, USA.
+
+
+INTRODUCTION
+============
+
+The VelvetOptimiser is designed to run as a wrapper script for the Velvet
+assembler (Daniel Zerbino, EBI UK) and to assist with optimising the
+assembly.  It searches a supplied hash value range for the optimum,
+estimates the expected coverage and then searches for the optimum coverage
+cutoff.  It uses Velvet's internal mechanism for estimating insert lengths
+for paired end libraries.  It can optimise the assemblies by either the
+default optimisation condition or by a user supplied one.  It outputs the
+results to a subdirectory and records all its operations in a logfile.
+
+Expected coverage is estimated using the length weighted mode of the contig
+coverage in all active short columns of the stats.txt file.
+
+
+PREREQUISITES
+=============
+
+Velvet >= 0.7.51
+Perl >= 5.8.8
+BioPerl >= 1.4
+GNU utilities: grep sed free cut
+
+
+COMMAND LINE
+============
+	
+	VelvetOptimiser.pl [options] -f 'velveth input line'
+  
+  Options:
+  
+  --help          	This help.
+  --v|verbose+    	Verbose logging, includes all velvet output in the logfile. (default '0').
+  --s|hashs=i     	The starting (lower) hash value (default '19').
+  --e|hashe=i     	The end (higher) hash value (default '31').
+  --f|velvethfiles=s The file section of the velveth command line. (default '0').
+  --a|amosfile!   	Turn on velvet's read tracking and amos file output. (default '0').
+  --o|velvetgoptions=s Extra velvetg options to pass through.  eg. -long_mult_cutoff -max_coverage etc (default '').
+  --t|threads=i   	The maximum number of simultaneous velvet instances to run. (default '4').
+  --g|genomesize=f 	The approximate size of the genome to be assembled in megabases.
+					Only used in memory use estimation. If not specified, memory use estimation
+					will not occur. If memory use is estimated, the results are shown and then program exits. (default '0').
+  --k|optFuncKmer=s The optimisation function used for k-mer choice. (default 'n50').
+  --c|optFuncCov=s 	The optimisation function used for cov_cutoff optimisation. (default 'Lbp').
+  --p|prefix=s    	The prefix for the output filenames, the default is the date and time in the format DD-MM-YYYY-HH-MM_. (default 'auto').
+
+Advanced!: Changing the optimisation function(s)
+
+Velvet optimiser assembly optimisation function can be built from the following variables.
+	Lbp = The total number of base pairs in large contigs
+	Lcon = The number of large contigs
+	max = The length of the longest contig
+	n50 = The n50
+	ncon = The total number of contigs
+	tbp = The total number of basepairs in contigs
+Examples are:
+	'Lbp' = Just the total basepairs in contigs longer than 1kb
+	'n50*Lcon' = The n50 times the number of long contigs.
+	'n50*Lcon/tbp+log(Lbp)' = The n50 times the number of long contigs divided
+		by the total bases in all contigs plus the log of the number of bases
+		in long contigs.
+
+
+
+EXAMPLES
+========
+
+Find the best assembly for a lane of Illumina single-end reads, trying k-values between 27 and 31:
+
+% VelvetOptimiser.pl -s 27 -e 31 -f '-short -fastq s_1_sequence.txt'
+
+Print an estimate of how much RAM is needed by the above command, if we use eight threads at once,
+and we estimate our assembled genome to be 4.5 megabases long:
+
+% VelvetOptimiser.pl -s 27 -e 31 -f '-short -fastq s_1_sequence.txt' -g 4.5 -t 8
+
+Find the best assembly for Illumina paired end reads just for k=31, using four threads (eg. quad core CPU), 
+but optimizing for N50 for k-mer length rather than sum of large contig sizes:
+
+% VelvetOptimiser.pl -s 31 -e 31 -f '-shortPaired -fasta interleaved.fasta' -t 4 --optFuncKmer 'n50'
+
+
+DETAILED OPTIONS
+================
+
+-h or --help
+
+	Prints the commandline help to STDOUT.
+
+-v or --verbose
+
+	Adds the full velveth and velvetg output to the logfile. (Handy for
+        looking at the insert lengths and sds that Velvet has chosen for each library.)
+
+-s or --hashs
+
+	Parameter type required: odd integer > 0 & <= the MAXKMERLENGTH velvet was compiled with.
+	Default: 19
+	
+	This is the lower end of the hash value range that the optimiser will search for the optimum.
+	If the supplied value is even, it will be lowered by 1.
+	If the supplied value is higher than MAXKMERLENGTH, it will be dropped to MAXKMERLENGTH.
+	
+-e or --hashe
+
+	Parameter type required: odd integer >= 'hashs' & <= the MAXKMERLENGTH velvet was compiled with.
+	Default: MAXKMERLENGTH
+	
+	This is the upper end of the hash value range that the optimiser will search for the optimum.
+	If the supplied value is even, it will be lowered by 1.
+	If the supplied value is higher than MAXKMERLENGTH, it will be dropped to MAXKMERLENGTH.
+	If the supplied value is lower than 'hashs' then it will be set to equal 'hashs'.
+	
+-f or --velvethfiles
+
+	Parameter type required: string with '' or ""
+	No default.
+	
+	This is a required parameter.  If this option is not specified, then the optimiser's usage will be displayed.
+	
+	You need to supply everything you would normally supply velveth at this point except for the hash size and the 
+        directory name in the following format.  
+		
+	{[-file_format][-read_type] filename} repeated for as many read files as you have.
+
+
+	File format options:
+		-fasta
+		-fastq
+		-fasta.gz
+		-fastq.gz
+		-bam
+		-sam
+		-eland
+		-gerald
+
+	Read type options:
+		-short
+		-shortPaired
+		-short2
+		-shortPaired2
+		-long
+		-longPaired
+
+	At this stage the optimiser does not support more than the 2 standard CATEGORIES but this will be added in the near future.
+	
+	Examples:
+	
+	-f 'reads.fna'
+		reads.fna is short not paired and fasta.  (these are the defaults: -short and -fasta)
+		
+	-f '-shortPaired -fastq paired_reads.fastq -long long_reads.fna'
+		Two read files supplied, first one is a paired end fastq file and the second is a long single ended read file.
+		
+	-f '-shortPaired paired_reads_1.fna -shortPaired2 paired_reads_2.fna'
+		Two read files supplied, both are short paired fastas but come from two different libraries, therefore needing two different CATEGORIES.
+		
+	There is a fairly extensive checker built into the optimiser to check if the format of the string is correct.  However, it won't check the read files for their format (fasta, fastq, eland etc.)
+	
+-a or --amosfile
+
+	Turns on Velvet's read tracking and amos file output.
+	This option is the same as specifying '-amos_file yes -read_trkg yes' in velvetg.  However, it will only be added to the final assembly and not to the intermediate ones.
+
+-o or --velvetgoptions
+
+	Parameter type required: string.
+	No default
+	
+	String should contain extra options to be passed to velvetg as required such as "-long_mult_cutoff 1" or "-max_coverage 50" etc.  Warning, there is no sanity check, so be careful.  The optimiser will crash if you give velvetg something it doesn't handle.
+	
+-t or --threads
+
+	Parameter type required: integer
+	
+	Specifies the maximum number of threads (simultaneous Velvet instances) to run.  It defaults to the number of CPUs in the current computer.
+
+-g or --genomesize
+
+	Parameter type required: float.
+	No default.
+	
+	This option will run the Optimiser's memory estimator.  It will estimate the memory required to run Velvet with the current -f parameter and number of threads.  Once the estimator has finished its calculations, it will display the required memory, make a recommendation and then exit the script.  This is useful for determining if you will have sufficient free RAM to run the assembly before you start.
+	You need to supply the approximate size of the genome you are assembling in megabases.  For example, for a Salmonella genome I would use: -g 5
+	
+-k or --optFuncKmer
+
+	Parameter type required: string.
+	Default: 'n50'
+		
+	This option will change the function that the Optimiser uses to find the best hash value from the given range.  The default is to use the n50.  I have found this function to work for me better than the previous single optimisation function, but you may wish to change it.  A list of possible variables to use in your optimisation function and some examples are shown below.
+
+-c or --optFuncCov
+
+	Parameter type required: string.
+	Default: 'Lbp'
+		
+	This option will change the function that the Optimiser uses to find the best coverage cutoff (cov_cutoff).  The default is to use the number of basepairs in contigs greater than 1 kilobase in length.  I have found this function to work for me but you may wish to change it.  A list of possible variables to use in your optimisation function and some examples are shown below.
+		
+	Velvet optimiser assembly optimisation functions can be built from the following variables:
+		
+		Lbp = The total number of base pairs in large contigs
+		Lcon = The number of large contigs
+		max = The length of the longest contig
+		n50 = The n50
+		ncon = The total number of contigs
+		tbp = The total number of basepairs in contigs
+
+	Examples are:
+	
+		'Lbp' = Just the total basepairs in contigs longer than 1kb
+		'n50*Lcon' = The n50 times the number of long contigs.
+		'n50*Lcon/tbp+log(Lbp)' = The n50 times the number of long contigs divided
+			by the total bases in all contigs plus the log of the number of bases
+			in long contigs.
+
+	Be warned! The optimiser doesn't care what you supply in this string and will attempt to run anyway.  If you give it a nonsensical optimisation function be prepared to receive a nonsensical assembly!
+	
+-p or --prefix
+
+	Parameter type required: string
+	Default: The current date and time in the format "DD-MM-YYYY-HH-MM-SS_"
+	
+	Names the logfile and the output directory with whatever prefix is supplied followed by "_logfile.txt" for the logfile and "_data_k" where k is the optimum hash value for the output data directory.
+
+
+BUGS
+====
+
+* None that I am aware of.
+
+
+CHANGE LOG
+==========
+
+Changes since Version 2.0:
+
+2.0.1:
+
+*	Added Mikael Brandstrom Durling's code to get free_mem and num_cpus for the Mac.
+*	Fixed a bug where if no assembly score was calculable the program crashed.  It now sets the assembly score to 0 instead.
+
+2.1.0:
+
+*	Added two stage optimisation functions.  First one is used to optimise for hash value and second to optimise for cov_cutoff.  Both are user definable and default to "n50" for k-mer size and "Lbp" for cov_cutoff.
+*	Above necessitated change in command line option letters to minimise confusion.  first stage opt. func is -k for k-mer size and second is -c for cov_cutoff
+*	Fixed a bug in Utils.pm where the exp_cov was only calculated for the first two categories and left out the rest.  Now uses all short read categories.
+*	Added a command line option -o to pass through extra commands to velvetg (such as long_mult_cutoff etc.)  NB: No sanity checking here!
+
+2.1.1:
+
+*	Fixed a bug where prefixes containing '-' or '.' would cause the script to fail.
+
+2.1.2:
+
+*	Fixed a bug where estExpCov would try and incorporate columns in the stats.txt file that contained "Inf" or "N/A" into the calculation and thereby crash.
+
+2.1.3:
+
+*	Now gives a nice warning when optimisation function returns undef or 0, instead of cryptic perl error message.
+
+2.1.4:
+
+*	Fixed another bug in estExpCov in Utils.pm so it now doesn't count stats with coverage < 2 and contigs of less than 3 * kmer size - 1.
+
+2.1.5:
+
+*	Added support for velveth's new input file types (bam, sam and raw) and attempted to future-proof it.
+
+2.1.6
+
+*	Now prints Velvet calculated insert sizes and standard deviations in assembly summaries, both in the logfile and on screen
+
+2.1.7
+
+*	Takes new velveth help format into account.  Thanks to Alexie Papanicolaou - CSIRO for the patch.
+
+TO DO
+=====
+	
+	* Add the number of N's in the assembly output to the list of variables available to the optimisation function.  
+
+
+CONTACT
+=======
+
+Simon Gladman <simon.gladman@csiro.au>
+
+
+
+
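Annotation on the -s/--hashs and -e/--hashe rules documented in the README above. The following is a minimal sketch, written in Python (the language of the velvet_optimiser.py wrapper in this changeset), of how those documented adjustments could be applied to a requested range. The function name normalise_hash_range, the max_kmer parameter and the order in which the rules are applied are assumptions made for illustration; they are not taken from VelvetOptimiser.pl.

```python
def normalise_hash_range(hashs, hashe, max_kmer=31):
    """Apply the README's adjustments to a requested k-mer range.

    max_kmer stands in for the MAXKMERLENGTH velvet was compiled with;
    31 is only an illustrative default.
    """

    def clamp_odd(k):
        # Values above MAXKMERLENGTH are dropped to MAXKMERLENGTH.
        k = min(k, max_kmer)
        # Even values are lowered by 1 so that the hash value is odd.
        if k % 2 == 0:
            k -= 1
        return k

    start = clamp_odd(hashs)
    end = clamp_odd(hashe)
    # An end value below the start value is set equal to the start value.
    if end < start:
        end = start
    return start, end
```

For example, under these assumptions a request of -s 20 -e 40 against a velvet built with MAXKMERLENGTH 31 would be searched as 19 to 31, and -s 27 -e 21 would collapse to the single value 27.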
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/VelvetOptimiser-2.1.7_modified/VelvetOpt/Assembly.pm	Tue Jun 07 18:07:56 2011 -0400
@@ -0,0 +1,560 @@
+#       VelvetOpt::Assembly.pm
+#
+#       Copyright 2008,2009 Simon Gladman <simon.gladman@csiro.au>
+#
+#       This program is free software; you can redistribute it and/or modify
+#       it under the terms of the GNU General Public License as published by
+#       the Free Software Foundation; either version 2 of the License, or
+#       (at your option) any later version.
+#
+#       This program is distributed in the hope that it will be useful,
+#       but WITHOUT ANY WARRANTY; without even the implied warranty of
+#       MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+#       GNU General Public License for more details.
+#
+#       You should have received a copy of the GNU General Public License
+#       along with this program; if not, write to the Free Software
+#       Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston,
+#       MA 02110-1301, USA.
+
+#	Version 2.1.2
+#
+#	Changes for 2.0.1
+#	*Bug fix in CalcAssemblyScore.  Now returns 0 if there is no calculable score instead of crashing.
+#
+#	Changes for 2.1.0
+#	*Added 2 stage optimisation functions for optimising kmer size and cov_cutoff differently if required.
+#
+#	Changes for 2.1.1
+#	*Allowed for non-word characters in prefix names.  (. - etc.)  Still no spaces allowed in prefix name or any filenames.
+#
+#	Changes for 2.1.2
+#	*Now warns nicely of optimisation function returning undef or 0. Suggests you choose an alternative.
+#
+#	Changes for 2.1.3
+#	*Now prints the velvet calculated insert sizes and standard deviations in the Assembly summaries (both log files and screen).
+
+package VelvetOpt::Assembly;
+
+=head1 NAME
+
+VelvetOpt::Assembly.pm - Velvet assembly container class.
+
+=head1 AUTHOR
+
+Simon Gladman, CSIRO, 2007, 2008.
+
+=head1 LICENSE
+
+Copyright 2008, 2009 Simon Gladman <simon.gladman@csiro.au>
+
+       This program is free software; you can redistribute it and/or modify
+       it under the terms of the GNU General Public License as published by
+       the Free Software Foundation; either version 2 of the License, or
+       (at your option) any later version.
+
+       This program is distributed in the hope that it will be useful,
+       but WITHOUT ANY WARRANTY; without even the implied warranty of
+       MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+       GNU General Public License for more details.
+
+       You should have received a copy of the GNU General Public License
+       along with this program; if not, write to the Free Software
+       Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston,
+       MA 02110-1301, USA.
+
+=head1 SYNOPSIS
+
+    use VelvetOpt::Assembly;
+    my $object = VelvetOpt::Assembly->new(
+        timestamph => "23 November 2008 15:00:00",
+        ass_id => "1",
+        versionh => "0.7.04",
+        ass_dir => "/home/gla048/Desktop/newVelvetOptimiser/data_1"
+    );
+    print $object->toString();
+
+=head1 DESCRIPTION
+
+A container class to hold the results of a Velvet assembly.  Includes timestamps,
+version information, parameter strings and assembly output metrics.
+
+Version 1.1
+
+=head2 Uses
+
+=over 8
+
+=item strict
+
+=item warnings
+
+=item Carp
+
+=back
+
+=head2 Fields
+
+=over 8
+
+=item assmscore
+
+The assembly score metric for this object
+
+=item timestamph
+
+The timestamp of the start of the velveth run for this assembly
+
+=item timestampg
+
+The date and time of the end of the velvetg run.
+
+=item ass_id
+
+The assembly id number.  Sequential for all the runs for this optimisation.
+
+=item versionh
+
+The version number of velveth used in this assembly
+
+=item versiong
+
+The version number of velvetg used in this assembly
+
+=item readfilename
+
+The name of the file containing all the reads (or a qw of them if more than one...)
+
+=item pstringh
+
+The velveth parameter string used in this assembly
+
+=item pstringg
+
+The velvetg parameter string used in this assembly
+
+=item ass_dir
+
+The assembly directory path (full)
+
+=item hashval
+
+The hash value used for this assembly
+
+=item rmapfs
+
+The roadmap file size
+
+=item sequences
+
+The total number of sequences in the input files
+
+=item nconts
+
+The number of contigs in the final assembly
+
+=item totalbp
+
+The total number of bases in the contigs
+
+=item n50
+
+The n50 of the assembly
+
+=item maxlength
+
+The length of the longest contig in the assembly
+
+=item maxcont
+
+The size of the largest contig in the assembly
+
+=item nconts1k
+
+The number of contigs greater than 1k in size
+
+=item totalbp1k
+
+the sum of the length of contigs > 1k in size
+
+=item velvethout
+
+The velveth output
+
+=item velvetgout
+
+The velvetg output
+
+=back
+
+=head2 Methods
+
+=over 8
+
+=item new
+
+Returns a new VelvetAssembly object.
+
+=item accessor methods
+
+Accessor methods for all fields.
+
+=item calcAssemblyScore
+
+Calculates the assembly score of the object (after velvetg has been run.) and stores it in self.
+
+=item getHashingDetails
+
+Gets the details of the outputs from the velveth run and stores it in self.
+
+=item getAssemblyDetails
+
+Gets the details of the outputs from the velvetg run and stores it in self.
+
+=item toString
+
+Returns a string representation of the object's contents.
+
+=item toStringNoV
+
+Returns a string representation of the object's contents without the velvet outputs which are large.
+
+=item opt_func_toString
+
+Returns the usage of the optimisation function.
+
+=back
+
+=cut
+
+use strict;
+use lib "/usr/local/lib/perl5/site_perl/5.8.8";
+use Carp;
+use warnings;
+#use base "Storable";
+use Cwd;
+use Bio::SeqIO;
+
+my $interested = 0;
+
+
+
+#constructor
+sub new {
+    my $class = shift;
+    my $self = {@_};
+    bless ($self, $class);
+    return $self;
+}
+
+#optimisation function options...
+my %f_opts;
+	$f_opts{'ncon'}->{'intname'} = 'nconts';
+	$f_opts{'ncon'}->{'desc'} = "The total number of contigs";
+	$f_opts{'n50'}->{'intname'} = 'n50';
+	$f_opts{'n50'}->{'desc'} = "The n50";
+	$f_opts{'max'}->{'intname'} = 'maxlength';
+	$f_opts{'max'}->{'desc'} = "The length of the longest contig";
+	$f_opts{'Lcon'}->{'intname'} = 'nconts1k';
+	$f_opts{'Lcon'}->{'desc'} = "The number of large contigs";
+	$f_opts{'tbp'}->{'intname'} = 'totalbp';
+	$f_opts{'tbp'}->{'desc'} = "The total number of basepairs in contigs";
+	$f_opts{'Lbp'}->{'intname'} = 'totalbp1k';
+	$f_opts{'Lbp'}->{'desc'} = "The total number of base pairs in large contigs";
+
+#accessor methods
+sub assmscore{ $_[0]->{assmscore}=$_[1] if defined $_[1]; $_[0]->{assmscore}}
+sub timestamph{ $_[0]->{timestamph}=$_[1] if defined $_[1]; $_[0]->{timestamph}}
+sub timestampg{ $_[0]->{timestampg}=$_[1] if defined $_[1]; $_[0]->{timestampg}}
+sub ass_id{ $_[0]->{ass_id}=$_[1] if defined $_[1]; $_[0]->{ass_id}}
+sub versionh{ $_[0]->{versionh}=$_[1] if defined $_[1]; $_[0]->{versionh}}
+sub versiong{ $_[0]->{versiong}=$_[1] if defined $_[1]; $_[0]->{versiong}}
+sub readfilename{ $_[0]->{readfilename}=$_[1] if defined $_[1]; $_[0]->{readfilename}}
+sub pstringh{ $_[0]->{pstringh}=$_[1] if defined $_[1]; $_[0]->{pstringh}}
+sub pstringg{ $_[0]->{pstringg}=$_[1] if defined $_[1]; $_[0]->{pstringg}}
+sub ass_dir{ $_[0]->{ass_dir}=$_[1] if defined $_[1]; $_[0]->{ass_dir}}
+sub hashval{ $_[0]->{hashval}=$_[1] if defined $_[1]; $_[0]->{hashval}}
+sub rmapfs{ $_[0]->{rmapfs}=$_[1] if defined $_[1]; $_[0]->{rmapfs}}
+sub nconts{ $_[0]->{nconts}=$_[1] if defined $_[1]; $_[0]->{nconts}}
+sub n50{ $_[0]->{n50}=$_[1] if defined $_[1]; $_[0]->{n50}}
+sub maxlength{ $_[0]->{maxlength}=$_[1] if defined $_[1]; $_[0]->{maxlength}}
+sub nconts1k{ $_[0]->{nconts1k}=$_[1] if defined $_[1]; $_[0]->{nconts1k}}
+sub totalbp{ $_[0]->{totalbp}=$_[1] if defined $_[1]; $_[0]->{totalbp}}
+sub totalbp1k{ $_[0]->{totalbp1k}=$_[1] if defined $_[1]; $_[0]->{totalbp1k}}
+sub velvethout{ $_[0]->{velvethout}=$_[1] if defined $_[1]; $_[0]->{velvethout}}
+sub velvetgout{ $_[0]->{velvetgout}=$_[1] if defined $_[1]; $_[0]->{velvetgout}}
+sub sequences{ $_[0]->{sequences}=$_[1] if defined $_[1]; $_[0]->{sequences}}
+sub assmfunc{ $_[0]->{assmfunc}=$_[1] if defined $_[1]; $_[0]->{assmfunc}}
+sub assmfunc2{ $_[0]->{assmfunc2}=$_[1] if defined $_[1]; $_[0]->{assmfunc2}}
+
+#assemblyScoreCalculator
+sub calcAssemblyScore {
+    use Safe;
+	
+	my $self = shift;
+	my $func = shift;
+	
+	my $cpt = new Safe;
+	
+	#Basic variable IO and traversal
+	$cpt->permit(qw(null scalar const padany lineseq leaveeval rv2sv rv2hv helem hslice each values keys exists delete rv2cv));
+	#Comparators
+	$cpt->permit(qw(lt i_lt gt i_gt le i_le ge i_ge eq i_eq ne i_ne ncmp i_ncmp slt sgt sle sge seq sne scmp));
+	#Base math
+	$cpt->permit(qw(preinc i_preinc predec i_predec postinc i_postinc postdec i_postdec int hex oct abs pow multiply i_multiply divide i_divide modulo i_modulo add i_add subtract i_subtract));
+	#Binary math
+	$cpt->permit(qw(left_shift right_shift bit_and bit_xor bit_or negate i_negate not complement));
+	#Regex
+	$cpt->permit(qw(match split qr));
+	#Conditionals
+	$cpt->permit(qw(cond_expr flip flop andassign orassign and or xor));
+	#Advanced math
+	$cpt->permit(qw(atan2 sin cos exp log sqrt rand srand));
+
+	foreach my $key (keys %f_opts){
+		print "\nkey: $key\tintname: ", $f_opts{$key}->{'intname'}, "\n" if $interested;
+		
+		$func =~ s/\b$key\b/$self->{$f_opts{$key}->{'intname'}}/g;
+	}
+		
+	my $r = $cpt->reval($func);
+	warn $@ if $@;
+	$self->{assmscore} = $r;
+	unless(defined $r && $r =~ /^\d+/){
+		warn "Optimisation function did not return a single float.\nOptimisation function was not evaluatable.\nOptfunc: $func";
+		warn "Setting assembly score to 0\n";
+		$self->{assmscore} = $r = 0;
+	}
+	if($r == 0){
+		print STDERR "**********\n";
+		print STDERR "Warning: Assembly score for assembly_id " . $self->{ass_id} .  " is 0\n";
+		print STDERR "You may want to consider choosing a different optimisation variable or function.\n";
+		print STDERR "Current optimisation functions are ", $self->{assmfunc}, " for k value and ", $self->{assmfunc2}, " for cov_cutoff\n";
+		print STDERR "**********\n";
+	}
+	return 1;
+}
+
+#getHashingDetails
+sub getHashingDetails {
+    my $self = shift;
+    unless(!$self->timestamph || !$self->pstringh){
+        my $programPath = cwd;
+        $self->pstringh =~ /^(\S+)\s+(\d+)\s+(.*)$/;
+        $self->{ass_dir} = $programPath . "/" . $1;
+        $self->{rmapfs} = -s $self->ass_dir . "/Roadmaps";
+        $self->{hashval} = $2;
+        $self->{readfilename} = $3;
+        my @t = split /\n/, $self->velvethout;
+        foreach(@t){
+            if(/^(\d+).*total\.$/){
+                $self->{sequences} = $1;
+                last;
+            }
+        }
+        return 1;
+    }
+    return 0;
+}
+
+#getAssemblyDetails
+sub getAssemblyDetails {
+    my $self = shift;
+	my $file = $self->ass_dir . "/contigs.fa";
+    unless(!(-e $file)){
+		
+		my $all = &contigStats($file,1);
+		my $large = &contigStats($file,1000);
+		
+		$self->{nconts} = defined $all->{numSeqs} ? $all->{numSeqs} : 0;
+		$self->{n50} = defined $all->{n50} ? $all->{n50} : 0;
+		$self->{maxlength} = defined $all->{maxLen} ? $all->{maxLen} : 0;
+		$self->{nconts1k} = defined $large->{numSeqs} ? $large->{numSeqs} : 0;
+		$self->{totalbp} = defined $all->{numBases} ? $all->{numBases} : 0;
+		$self->{totalbp1k} = defined $large->{numBases} ? $large->{numBases} : 0;
+		
+		if($self->pstringg =~ m/cov_cutoff/){
+			$self->calcAssemblyScore($self->{assmfunc2});
+		}
+		else {
+			$self->calcAssemblyScore($self->{assmfunc});
+		}
+
+        return 1;
+	}
+    return 0;
+}
+
+#contigStats
+#Original script fa-show.pl by Torsten Seemann (Monash University, Melbourne, Australia)
+#Modified by Simon Gladman to suit.
+sub contigStats {
+	
+	my $file = shift;
+	my $minsize = shift;
+	
+	print "In contigStats with $file, $minsize\n" if $interested;
+	
+	my $numseq=0;
+	my $avglen=0;
+	my $minlen=1E9;
+	my $maxlen=0;
+	my @len;
+	my $toosmall=0;
+	my $nn=0;
+	
+	my $in = Bio::SeqIO->new(-file => $file, -format => 'Fasta');
+	while(my $seq = $in->next_seq()){
+		my $L = $seq->length;
+		#check > minsize
+		if($L < $minsize){
+			$toosmall ++;
+			next;
+		}
+		#count Ns
+		my $s = $seq->seq;
+		my $n = $s =~ s/N/N/gi;
+		$n ||= 0;
+		$nn += $n;
+		#count seqs and other stats
+		$numseq ++;
+		$avglen += $L;
+		$maxlen = $L if $L > $maxlen;
+		$minlen = $L if $L < $minlen;
+		push @len, $L;
+	}
+	@len = sort { $a <=> $b } @len;
+	my $cum = 0;
+	my $n50 = 0;
+	for my $i (0 .. $#len){
+		$cum += $len[$i];
+		if($cum >= $avglen/2) {
+			$n50 = $len[$i];
+			last;
+		}
+	}
+	
+	my %out;
+	if($numseq > 0){
+		$out{numSeqs} = $numseq;
+		$out{numBases} = $avglen;
+		$out{numOK} = ($avglen - $nn);
+		$out{numNs} = $nn;
+		$out{minLen} = $minlen;
+		$out{avgLen} = $avglen/$numseq;
+		$out{maxLen} = $maxlen;
+		$out{n50} = $n50;
+		$out{minsize} = $minsize;
+		$out{numTooSmall} = $toosmall;
+	}
+	else {
+		$out{numSeqs} = 0;
+	}
+	
+	print "Leaving contigstats!\n" if $interested;
+	return (\%out);
+}
+
+
+#toString method
+sub toString {
+    my $self = shift;
+    my $tmp = $self->toStringNoV();
+    if(defined $self->velvethout){
+        $tmp .= "Velveth Output:\n" . $self->velvethout() . "\n";
+    }
+    if(defined $self->velvetgout){
+        $tmp .= "Velvetg Output:\n" . $self->velvetgout() . "\n";
+    }
+    $tmp .= "**********************************************************\n";
+    return $tmp;
+}
+
+
+#toStringNoV method
+sub toStringNoV {
+    my $self = shift;
+    my $tmp = "********************************************************\n";
+    if($self->ass_id()){
+        $tmp .= "Assembly id: " . $self->ass_id(). "\n";
+    }
+    if($self->assmscore()){
+        $tmp .= "Assembly score: " .$self->assmscore(). "\n";
+    }
+    if($self->timestamph()){
+        $tmp .= "Velveth timestamp: " . $self->timestamph(). "\n";
+    }
+    if($self->timestampg()){
+        $tmp .= "Velvetg timestamp: " . $self->timestampg(). "\n";
+    }
+    if(defined $self->versionh){
+        $tmp .= "Velveth version: " . $self->versionh(). "\n";
+    }
+    if(defined $self->versiong){
+        $tmp .= "Velvetg version: " . $self->versiong(). "\n";
+    }
+    if(defined $self->readfilename){
+        $tmp .= "Readfile(s): " . $self->readfilename(). "\n";
+    }
+    if(defined $self->pstringh){
+        $tmp .= "Velveth parameter string: " . $self->pstringh(). "\n";
+    }
+    if(defined $self->pstringg){
+        $tmp .= "Velvetg parameter string: " . $self->pstringg(). "\n";
+    }
+    if(defined $self->ass_dir){
+        $tmp .= "Assembly directory: " . $self->ass_dir(). "\n";
+    }
+    if(defined $self->hashval){
+        $tmp .= "Velvet hash value: " . $self->hashval(). "\n";
+    }
+    if(defined $self->rmapfs){
+        $tmp .= "Roadmap file size: " . $self->rmapfs(). "\n";
+    }
+    if(defined $self->sequences){
+        $tmp .= "Total number of sequences: " . $self->sequences(). "\n";
+    }
+    if(defined $self->nconts){
+        $tmp .= "Total number of contigs: " . $self->nconts(). "\n";
+    }
+    if(defined $self->n50){
+        $tmp .= "n50: " . $self->n50(). "\n";
+    }
+    if(defined $self->maxlength){
+        $tmp .= "length of longest contig: " . $self->maxlength(). "\n";
+    }
+    if(defined $self->totalbp){
+        $tmp .= "Total bases in contigs: " . $self->totalbp(). "\n";
+    }
+    if(defined $self->nconts1k){
+        $tmp .= "Number of contigs > 1k: " . $self->nconts1k(). "\n";
+    }
+    if(defined $self->totalbp1k){
+        $tmp .= "Total bases in contigs > 1k: " . $self->totalbp1k(). "\n";
+    }
+    if($self->pstringh =~ /Pair/ && defined $self->pstringg && $self->pstringg =~ /-exp_cov/){
+		$tmp .= "Paired Library insert stats:\n";
+		my @x = split /\n/, $self->velvetgout;
+		foreach(@x){
+			chomp;
+			if(/^Paired-end library \d+ has/){
+				$tmp .= "$_\n";
+			}
+		}
+	}
+    $tmp .= "**********************************************************\n";
+    return $tmp;
+}
+
+sub opt_func_toString {
+	my $out = "\nVelvet optimiser assembly optimisation function can be built from the following variables.\n";
+	foreach my $key (sort keys %f_opts){
+		$out .= "\t$key = " . $f_opts{$key}->{'desc'} . "\n";
+	}
+	$out .= "Examples are:\n\t'Lbp' = Just the total basepairs in contigs longer than 1kb\n";
+	$out .= "\t'n50*Lcon' = The n50 times the number of long contigs.\n";
+	$out .= "\t'n50*Lcon/tbp+log(Lbp)' = The n50 times the number of long contigs divided\n\t\tby the total bases in all contigs plus the log of the number of bases\n\t\tin long contigs.\n";
+	return $out
+}
+
+1;
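Annotation on calcAssemblyScore in the file above. The Perl method substitutes the metric names (Lbp, Lcon, max, n50, ncon, tbp) into the user-supplied string, evaluates it in a Safe compartment, and falls back to a score of 0 when nothing usable comes back. The sketch below illustrates the same idea in Python (the language of the velvet_optimiser.py wrapper in this changeset). The function name assembly_score, the metrics dictionary and the sample numbers are hypothetical, and eval with stripped builtins is only a stand-in for the Safe compartment; it is not a security boundary and should not be given untrusted input.

```python
import math


def assembly_score(func, metrics):
    """Evaluate an optimisation function string against assembly metrics.

    func    -- expression such as 'n50*Lcon/tbp+log(Lbp)'
    metrics -- dict keyed by the README's variable names (hypothetical input)
    Returns the numeric score, or 0 if the expression cannot be evaluated.
    """
    # Names visible to the expression: a few math helpers plus the metrics.
    names = {"log": math.log, "sqrt": math.sqrt, "exp": math.exp, "abs": abs}
    names.update(metrics)
    try:
        # Builtins are removed so only the names above can be referenced.
        score = eval(func, {"__builtins__": {}}, names)
    except Exception:
        # Mirrors the documented fallback: no calculable score means 0.
        return 0
    if isinstance(score, bool) or not isinstance(score, (int, float)):
        return 0
    return score


# Made-up metrics, purely for illustration.
example = {"Lbp": 4000000, "Lcon": 50, "max": 300000,
           "n50": 100000, "ncon": 200, "tbp": 4500000}
print(assembly_score("n50*Lcon", example))
```

A malformed expression, or one that divides by a zero metric, yields 0 in this sketch instead of raising, which is the behaviour the 2.0.1 change log entry describes for the Perl implementation.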
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/VelvetOptimiser-2.1.7_modified/VelvetOpt/Utils.pm	Tue Jun 07 18:07:56 2011 -0400
@@ -0,0 +1,218 @@
+#
+#       VelvetOpt::Utils.pm
+#
+#       Copyright 2008,2009,2010 Simon Gladman <simon.gladman@csiro.au>
+#
+#       This program is free software; you can redistribute it and/or modify
+#       it under the terms of the GNU General Public License as published by
+#       the Free Software Foundation; either version 2 of the License, or
+#       (at your option) any later version.
+#
+#       This program is distributed in the hope that it will be useful,
+#       but WITHOUT ANY WARRANTY; without even the implied warranty of
+#       MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+#       GNU General Public License for more details.
+#
+#       You should have received a copy of the GNU General Public License
+#       along with this program; if not, write to the Free Software
+#       Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston,
+#       MA 02110-1301, USA.
+
+#		Version 2.1.3
+
+#	Changes for Version 2.0.1
+#	Added Mikael Brandstrom Durling's numCpus and freeMem for the Mac.
+#
+#	Changes for Version 2.1.0
+#	Fixed bug in estExpCov so it now correctly uses all short read categories not just the first two.
+#
+#	Changes for Version 2.1.2
+#	Fixed bug in estExpCov so it now won't take columns with "N/A" or "Inf" into account
+#
+#	Changes for Version 2.1.3
+#	Changed the minimum contig size to use for estimating expected coverage to 3*kmer size -1 and set the minimum coverage to 2 instead of 0.
+#	This should get rid of exp_covs of 1 when it should be very high for assembling reads that weren't mapped to a reference using one of the standard read mapping programs
+
+
+package VelvetOpt::Utils;
+
+use strict;
+use lib "/usr/local/lib/perl5/site_perl/5.8.8";
+use warnings;
+use POSIX qw(ceil floor);
+use Carp;
+use List::Util qw(max);
+use Bio::SeqIO;
+
+# 	num_cpu
+#	Returns the number of CPUs present in the system on Linux.
+#	On a Mac (darwin) it returns the number of cores present.
+#	On any other OS it returns 1.
+#	Written by Torsten Seemann 2009 (linux) and Mikael Brandstrom Durling 2009 (Mac).
+
+sub num_cpu {
+    if ( $^O =~ m/linux/i ) {
+        my ($num) = qx(grep -c ^processor /proc/cpuinfo);
+        chomp $num;
+        return $num if $num =~ m/^\d+/;
+    }
+	elsif( $^O =~ m/darwin/i){
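+		my ($num) = qx(system_profiler SPHardwareDataType | grep Cores);
+		# Only trust the capture when the match succeeds; otherwise
+		# fall through to the default of 1 below.
+		return $1 if defined $num && $num =~ /Cores: (\d+)/;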
+	}
+    return 1;
+}
+
+#	free_mem
+#	Returns the current amount of free memory
+#	Mac Section written by Mikael Brandstrom Durling 2009 (Mac).
+
+sub free_mem {
+	if( $^O =~ m/linux/i ) {
+		my $x       = `free | grep '^Mem:' | sed 's/  */~/g' | cut -d '~' -f 4,7`;
+		my @tmp     = split "~", $x;
+		my $total   = $tmp[0] + $tmp[1];
+		my $totalGB = $total / 1024 / 1024;
+		return $totalGB;
+	}
+	elsif( $^O =~ m/darwin/i){
+		my ($tmp) = qx(vm_stat | grep size);
+		$tmp =~ /.*size of (\d+) bytes.*/;
+		my $page_size = $1;
+		($tmp) = qx(vm_stat | grep free);
+		$tmp =~ /[^0-9]+(\d+).*/;
+		my $free_pages = $1;
+		my $totalGB = ($free_pages * $page_size) / 1024 / 1024 / 1024;
+		return $totalGB;
+	}
+}
+
+#	estExpCov
+#   Returns the expected coverage of short reads from an assembly by
+#   taking the statistical mode of the stats.txt file supplied. It looks at
+#	all the short_cov? columns. Uses minimum contig length and minimum coverage.
+#	Needs the stats.txt file path and name, and the k-value used in the assembly.
+#	Original algorithm by Torsten Seemann 2009 under the GPL.
+#	Adapted by Simon Gladman 2009.
+#	It does a weighted mode...
+
+sub estExpCov {
+    use List::Util qw(max);
+    my $file   = shift;
+    my $kmer   = shift;
+    my $minlen = 3 * $kmer - 1;
+    my $mincov = 2;
+    my $fh;
+    unless ( open IN, '<', $file ) {
+        croak "Unable to open $file for exp_cov determination.\n";
+    }
+    my @cov;
+    while (<IN>) {
+        chomp;
+        my @x = split m/\t/;
+		my $len = scalar @x;
+        next unless @x >= 7;
+        next unless $x[1] =~ m/^\d+$/;
+        next unless $x[1] >= $minlen;
+		
+		#add all the short_cov columns..
+		my $cov = 0;
+		for(my $i = 5; $i < $len; $i += 2){
+			if($x[$i] =~ /\d/){
+				$cov += $x[$i];
+			}
+		}
+        next unless $cov > $mincov;
+        push @cov, ( ( int($cov) ) x $x[1] );
+    }
+    close IN;
+    my %freq_of;
+    map { $freq_of{$_}++ } @cov;
+    my $mode = 0;
+    $freq_of{$mode} = 0;            # sentinel
+    for my $x ( keys %freq_of ) {
+        $mode = $x if $freq_of{$x} > $freq_of{$mode};
+    }
+    return $mode;
+}
+
+#	estVelvetMemUse
+#	returns the estimated memory usage for velvet in GB
+
+sub estVelvetMemUse {
+	my ($readsize, $genomesize, $numreads, $k) = @_;
+	my $velvetgmem = -109635 + 18977*$readsize + 86326*$genomesize + 233353*$numreads - 51092*$k;
+	my $out = ($velvetgmem/1024) / 1024;
+	return $out;
+}
+
+#	getReadSizeNum
+#	returns the number of reads and average size in the short and shortPaired categories...
+
+sub getReadSizeNum {
+	my $f = shift;
+	my %reads;
+	my $num = 0;
+	my $currentfiletype = "fasta";
+	#first pull apart the velveth string and get the short and shortpaired filenames...
+	my @l = split /\s+/, $f;
+	my $i = 0;
+	foreach (@l){
+		if(/^-/){
+			if(/^-fasta$/){
+				$currentfiletype = "fasta";
+			}
+			elsif(/^-fastq$/){
+				$currentfiletype = "fastq";
+			}
+			elsif(/^-(eland|gerald|fasta\.gz|fastq\.gz)$/) {
+				croak "Cannot estimate memory usage from file types other than fasta or fastq..\n";
+			}
+		}
+		elsif(-r $_){
+			my $file = $_;
+			if($currentfiletype eq "fasta"){
+				my $x = `grep -c "^>" $file`;
+				chomp($x);
+				$num += $x;
+				my $l = &getReadLength($file, 'Fasta');
+				$reads{$l} += $x;
+				print STDERR "File: $file has $x reads of length $l\n";
+			}
+			else {
+				my $x = `grep -c "^@" $file`;
+				chomp($x);
+				$num += $x;
+				my $l = &getReadLength($file, 'Fastq');
+				$reads{$l} += $x;
+				print STDERR "File: $file has $x reads of length $l\n";
+			}
+		}
+		$i ++;
+	}
+	my $totlength = 0;
+	foreach my $k (keys %reads){
+		$totlength += ($reads{$k} * $k);
+	}
+	
+	
+	my @results;
+	push @results, floor($totlength/$num);
+	push @results, ($num/1000000);
+	printf STDERR "Total reads: %.1f million. Avg length: %.1f\n",($num/1000000), ($totlength/$num);
+	return @results;
+}
+
+# getReadLength - returns the length of the first read in a file of type fasta or fastq..
+#
+sub getReadLength {
+	my ($f, $t) = @_;
+	my $sio = Bio::SeqIO->new(-file => $f, -format => $t);
+	my $s = $sio->next_seq() or croak "Something went bad while reading file $f!\n";
+	return $s->length;
+}
+
+return 1;
+
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/VelvetOptimiser-2.1.7_modified/VelvetOpt/gwrap.pm	Tue Jun 07 18:07:56 2011 -0400
@@ -0,0 +1,171 @@
+#       VelvetOpt::gwrap.pm
+#
+#       Copyright 2008 Simon Gladman <simon.gladman@csiro.au>
+#
+#       This program is free software; you can redistribute it and/or modify
+#       it under the terms of the GNU General Public License as published by
+#       the Free Software Foundation; either version 2 of the License, or
+#       (at your option) any later version.
+#
+#       This program is distributed in the hope that it will be useful,
+#       but WITHOUT ANY WARRANTY; without even the implied warranty of
+#       MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+#       GNU General Public License for more details.
+#
+#       You should have received a copy of the GNU General Public License
+#       along with this program; if not, write to the Free Software
+#       Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston,
+#       MA 02110-1301, USA.
+#
+package VelvetOpt::gwrap;
+
+=head1 NAME
+
+VelvetOpt::gwrap.pm - Velvet graphing and assembly program wrapper module.
+
+=head1 AUTHOR
+
+Simon Gladman, CSIRO, 2007, 2008.
+
+=head1 LICENSE
+
+Copyright 2008 Simon Gladman <simon.gladman@csiro.au>
+
+       This program is free software; you can redistribute it and/or modify
+       it under the terms of the GNU General Public License as published by
+       the Free Software Foundation; either version 2 of the License, or
+       (at your option) any later version.
+
+       This program is distributed in the hope that it will be useful,
+       but WITHOUT ANY WARRANTY; without even the implied warranty of
+       MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+       GNU General Public License for more details.
+
+       You should have received a copy of the GNU General Public License
+       along with this program; if not, write to the Free Software
+       Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston,
+       MA 02110-1301, USA.
+
+=head1 SYNOPSIS
+
+    use VelvetOpt::gwrap;
+    use VelvetOpt::Assembly;
+    my $object = VelvetOpt::Assembly->new(
+        timestamph => "23 November 2008 15:00:00",
+        ass_id => "1",
+        versiong => "0.7.19",
+        pstringg => "test",
+        ass_dir => "/home/gla048/Desktop/newVelvetOptimiser/test"
+    );
+    my $worked = VelvetOpt::gwrap::objectVelvetg($object);
+    if($worked){
+        print $object->toString();
+    }
+    else {
+        die "Error in velvetg..\n" . $object->toString();
+    }
+
+=head1 DESCRIPTION
+
+A wrapper module to run velvetg on VelvetAssembly objects or on velvetg
+parameter strings. Also contains private methods to check velvetg
+parameter strings, run velvetg and return results.
+
+=head2 Uses
+
+=over 8
+
+=item strict
+
+=item warnings
+
+=item Carp
+
+=item VelvetOpt::Assembly
+
+=item POSIX qw(strftime)
+
+=back
+
+=head2 Private Fields
+
+=over 8
+
+=item interested
+
+STDERR printing debug message toggle.  1 for on, 0 for off.
+
+=back
+
+=head2 Methods
+
+=over 8
+
+=item _runVelvetg
+
+Private method which runs velvetg with the supplied velvetg parameter string and returns velvetg output messages as a string.
+
+=item _checkVGString
+
+Private method which checks for a correctly formatted velvetg parameter string.  Returns 1 or 0.
+
+=item objectVelvetg
+
+Accepts a VelvetAssembly object, looks for the velvetg parameter string it contains, checks it, sends it to _runVelvetg, collects the results and stores them in the VelvetAssembly object.
+
+=item stringVelvetg
+
+Accepts a velvetg parameter string, checks it, sends it to _runVelvetg and then collects and returns the velvetg output messages.
+
+=back
+
+=cut
+
+use warnings;
+use strict;
+use Carp;
+use VelvetOpt::Assembly;
+use POSIX qw(strftime);
+
+my $interested = 0;
+
+sub _runVelvetg {
+    my $cmdline = shift;
+    my $output = "";
+    print STDERR "About to run velvetg!\n" if $interested;
+    $output = `velvetg $cmdline`;
+    $output .= "\nTimestamp: " . strftime("%b %e %Y %H:%M:%S", localtime) . "\n";
+    return $output;
+}
+
+sub _checkVGString {
+    return 1;
+}
+
+sub objectVelvetg {
+    my $va = shift;
+    my $cmdline = $va->{pstringg};
+    if(_checkVGString($cmdline)){
+        $va->{velvetgout} = _runVelvetg($cmdline);
+        my @t = split /\n/, $va->{velvetgout};
+        $t[$#t] =~ s/Timestamp:\s+//;
+        $va->{timestampg} = $t[$#t];
+        return 1;
+    }
+    else {
+        $va->{velvetgout} = "Formatting errors in velvetg parameter string.";
+        return 0;
+    }
+}
+
+sub stringVelvetg {
+    my $cmdline = shift;
+    if(_checkVGString($cmdline)){
+        return _runVelvetg($cmdline);
+    }
+    else {
+        return "Formatting errors in velvetg parameter string.";
+    }
+}
+
+1;
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/VelvetOptimiser-2.1.7_modified/VelvetOpt/hwrap.pm	Tue Jun 07 18:07:56 2011 -0400
@@ -0,0 +1,331 @@
+#       VelvetOpt::hwrap.pm
+#
+#       Copyright 2008 Simon Gladman <simon.gladman@csiro.au>
+#
+#       This program is free software; you can redistribute it and/or modify
+#       it under the terms of the GNU General Public License as published by
+#       the Free Software Foundation; either version 2 of the License, or
+#       (at your option) any later version.
+#
+#       This program is distributed in the hope that it will be useful,
+#       but WITHOUT ANY WARRANTY; without even the implied warranty of
+#       MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+#       GNU General Public License for more details.
+#
+#       You should have received a copy of the GNU General Public License
+#       along with this program; if not, write to the Free Software
+#       Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston,
+#       MA 02110-1301, USA.
+#
+#		Version 1.1 - 14/07/2010 - Added support for changing input file types
+#		Version 1.2 - 11/08/2010 - Changed velveth help parser for new velvet help format
+#									Thanks to Alexie Papanicolaou - CSIRO for the patch.
+
+package VelvetOpt::hwrap;
+
+=head1 NAME
+
+VelvetOpt::hwrap.pm - Velvet hashing program wrapper module.
+
+=head1 AUTHOR
+
+Simon Gladman, CSIRO, 2007, 2008.
+
+=head1 LICENSE
+
+Copyright 2008 Simon Gladman <simon.gladman@csiro.au>
+
+       This program is free software; you can redistribute it and/or modify
+       it under the terms of the GNU General Public License as published by
+       the Free Software Foundation; either version 2 of the License, or
+       (at your option) any later version.
+
+       This program is distributed in the hope that it will be useful,
+       but WITHOUT ANY WARRANTY; without even the implied warranty of
+       MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+       GNU General Public License for more details.
+
+       You should have received a copy of the GNU General Public License
+       along with this program; if not, write to the Free Software
+       Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston,
+       MA 02110-1301, USA.
+
+=head1 SYNOPSIS
+
+    use VelvetOpt::hwrap;
+    use VelvetOpt::Assembly;
+    my $object = VelvetOpt::Assembly->new(
+        timestamph => "23 November 2008 15:00:00",
+        ass_id => "1",
+        versionh => "0.7.04",
+        pstringh => "test 21 -fasta test_reads.fna",
+        ass_dir => "/home/gla048/Desktop/newVelvetOptimiser/data_1"
+    );
+    my $worked = VelvetOpt::hwrap::objectVelveth($object);
+    if($worked){
+        print $object->toString();
+    }
+    else {
+        die "Error in velveth..\n" . $object->toString();
+    }
+
+=head1 DESCRIPTION
+
+A wrapper module to run velveth on VelvetAssembly objects or on velveth
+parameter strings. Also contains private methods to check velveth
+parameter strings, run velveth and return results.
+
+=head2 Uses
+
+=over 8
+
+=item strict
+
+=item warnings
+
+=item Carp
+
+=item VelvetOpt::Assembly
+
+=item POSIX qw(strftime)
+
+=back
+
+=head2 Private Fields
+
+=over 8
+
+=item interested
+
+STDERR printing debug message toggle.  1 for on, 0 for off.
+
+=back
+
+=head2 Methods
+
+=over 8
+
+=item _runVelveth
+
+Private method which runs velveth with the supplied velveth parameter string and returns velveth output messages as a string.
+
+=item _checkVHString
+
+Private method which checks for a correctly formatted velveth string.  Returns 1 or 0.
+
+=item objectVelveth
+
+Accepts a VelvetAssembly object and the number of categories velvet was compiled with, looks for the velveth parameter string it contains, checks it, sends it to _runVelveth, collects the results and stores them in the VelvetAssembly object.
+
+=item stringVelveth
+
+Accepts a velveth parameter string and the number of categories velvet was compiled with, checks it, sends it to _runVelveth and then collects and returns the velveth output messages.
+
+=back
+
+=cut
+
+use warnings;
+use strict;
+use Carp;
+use VelvetOpt::Assembly;
+use POSIX qw(strftime);
+
+my $interested = 0;
+
+my @Fileformats;
+my @Readtypes;
+my $usage;
+my $inited = 0;
+
+sub init {
+	#run a velveth to get its help lines..
+	my $response = &_runVelveth(" ");
+	
+	$response =~ m/CATEGORIES = (\d+)/;
+	my $cats = $1;
+	unless($cats){$cats = 2;}
+	
+	$response =~ m/(File format options:(.*)Read type options)/s;
+	my @t = split /\n/, $1;
+	foreach(@t){
+		#if(/\s+(-\S+)/){
+		while(/\s+(-\S+)/g){
+			push @Fileformats, $1;
+		}
+	}
+	
+	$response =~ m/(Read type options:(.*)Options:)/s;
+	
+	@t = ();
+	@t = split /\n/, $1;
+	foreach(@t){
+		#if(/\s+(-\S+)/){
+		while(/\s+(-\S+)/g){
+			push @Readtypes, $1;
+		}
+	}
+	
+	for(my $i = 3; $i <= $cats; $i++){
+		push @Readtypes, "-short$i";
+		push @Readtypes, "-shortPaired$i";
+	}
+	
+	$usage = "Incorrect velveth parameter string: Needs to be of the form\n{[-file_format][-read_type] filename}\n";
+	$usage .= "Where:\n\tFile format options:\n";
+	foreach(@Fileformats){
+		$usage .= "\t$_\n";
+	}
+	$usage .= "Read type options:\n";
+	foreach(@Readtypes){
+		$usage .= "\t$_\n";
+	}
+	$usage .= "\nThere can be more than one filename specified as long as it's a different type.\nStopping run\n";
+	
+	$inited = 1;
+}
+
+sub _runVelveth {
+	#unless($inited){ &init(); }
+    my $cmdline = shift;
+    my $output = "";
+    print STDERR "About to run velveth!\n" if $interested;
+    $output = `velveth $cmdline`;
+    $output .= "\nTimestamp: " . strftime("%b %e %Y %H:%M:%S", localtime) . "\n";
+    return $output;
+}
+
+sub _checkVHString {
+    unless($inited){ &init(); }
+	my $line = shift;
+	my $cats = shift;
+	
+	
+	
+	my %fileform = ();
+    my %readform = ();
+	
+	foreach(@Fileformats){ $fileform{$_} = 1;}
+    foreach(@Readtypes){ $readform{$_} = 1;}
+
+    my @l = split /\s+/, $line;
+
+    #first check for a directory name as the first parameter...
+    my $dir = shift @l;
+    if(!($dir =~ /\w+/) || ($dir =~ /^\-/)){
+        carp "**** $line\n\tNo directory name specified as first parameter in velveth string. Internal error!\n$usage";
+        return 0;
+    }
+    #print "VH Check passed directory..\n";
+    my $hash = shift @l;
+    unless($hash =~ /^\d+$/){
+        carp "**** $line\n\tHash value in velveth string not a number. Internal error!\n$usage";
+        return 0;
+    }
+
+    #print "VH check passed hash value..\n";
+
+    my $i = 0;
+    my $ok = 1;
+    foreach(@l){
+        if(/^-/){
+            #s/-//;
+            if(!$fileform{$_} && !$readform{$_}){
+                carp "**** $line\n\tIncorrect fileformat or readformat specified.\n\t$_ is an invalid velveth switch.\n$usage";
+                return 0;
+            }
+            elsif($fileform{$_}){
+                if(($i + 1) > $#l){
+                    carp "$line\n\tNo filename supplied after file format type $l[$i].\n$usage";
+                    return 0;
+                }
+                if($readform{$l[$i+1]}){
+                    if(($i+2) > $#l){
+                        carp "$line\n\tNo filename supplied after read format type $l[$i+1].\n$usage";
+                        return 0;
+                    }
+                    if(-e $l[$i+2]){
+                        $ok = 1;
+                    }
+                    else{
+                        carp "**** $line\n\tVelveth filename " . $l[$i+2] . " doesn't exist.\n$usage";
+                        return 0;
+                    }
+                }
+                elsif (-e $l[$i+1]){
+                    $ok = 1;
+                }
+                else {
+                    carp "**** $line\n\tVelveth filename " . $l[$i+1] . " doesn't exist.\n$usage";
+                    return 0;
+                }
+            }
+            elsif($readform{$_}){
+                if(($i + 1) > $#l){
+                    carp "$line\n\tNo filename supplied after read format type $l[$i].\n$usage";
+                    return 0;
+                }
+                if($fileform{$l[$i+1]}){
+                    if(($i+2) > $#l){
+                        carp "$line\n\tNo filename supplied after file format type $l[$i+1].\n$usage";
+                        return 0;
+                    }
+                    if(-e $l[$i+2]){
+                        $ok = 1;
+                    }
+                    else{
+                        carp "**** $line\n\tVelveth filename " . $l[$i+2] . " doesn't exist.\n$usage";
+                        return 0;
+                    }
+                }
+                elsif (-e $l[$i+1]){
+                    $ok = 1;
+                }
+                else {
+                    carp "**** $line\n\tVelveth filename " . $l[$i+1] ." doesn't exist.\n$usage";
+                    return 0;
+                }
+            }
+        }
+        elsif(!-e $_){
+            carp "**** $line\n\tVelveth filename $_ doesn't exist.\n$usage";
+            return 0;
+        }
+        $i ++;
+    }
+    if($ok){
+        return 1;
+    }
+}
+
+sub objectVelveth {
+    unless($inited){ &init(); }
+    my $va = shift;
+	my $cats = shift;
+    my $cmdline = $va->{pstringh};
+    if(_checkVHString($cmdline, $cats)){
+        $va->{velvethout} = _runVelveth($cmdline);
+        my @t = split /\n/, $va->{velvethout};
+        $t[$#t] =~ s/Timestamp:\s+//;
+        $va->{timestamph} = $t[$#t];
+        return 1;
+    }
+    else {
+        $va->{velvethout} = "Formatting errors in velveth parameter string.\n$usage";
+        return 0;
+    }
+}
+
+sub stringVelveth {
+	unless($inited){ &init(); }
+    my $cmdline = shift;
+	my $cats = shift;
+    if(_checkVHString($cmdline,$cats)){
+        return _runVelveth($cmdline);
+    }
+    else {
+        return "Formatting errors in velveth parameter string.\n$usage";
+    }
+}
+
+1;
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/VelvetOptimiser-2.1.7_modified/VelvetOptimiser.pl	Tue Jun 07 18:07:56 2011 -0400
@@ -0,0 +1,855 @@
+#!/usr/bin/perl -w
+#
+##Modified by Konrad Paszkiewicz 07/01/2011 
+
+#       VelvetOptimiser.pl
+#
+#       Copyright 2008, 2009, 2010 Simon Gladman <simon.gladman@csiro.au>
+#
+#       This program is free software; you can redistribute it and/or modify
+#       it under the terms of the GNU General Public License as published by
+#       the Free Software Foundation; either version 2 of the License, or
+#       (at your option) any later version.
+#
+#       This program is distributed in the hope that it will be useful,
+#       but WITHOUT ANY WARRANTY; without even the implied warranty of
+#       MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
+#       GNU General Public License for more details.
+#
+#       You should have received a copy of the GNU General Public License
+#       along with this program; if not, write to the Free Software
+#       Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston,
+#       MA 02110-1301, USA.
+
+#		Version 2.1.7
+
+#
+#   pragmas
+#
+use strict;
+
+#
+#   includes
+#
+use POSIX qw(strftime);
+use FindBin;
+use lib "$FindBin::Bin";
+use threads;
+use threads::shared;
+use VelvetOpt::Assembly;
+use VelvetOpt::hwrap;
+use VelvetOpt::gwrap;
+use VelvetOpt::Utils;
+use Data::Dumper;
+use Storable qw (freeze thaw);
+use Getopt::Long;
+use lib '/usr/local/lib/perl5/site_perl/5.8.8';
+
+#
+#   global var decs
+#
+
+#The maximum hash value is read from velveth's MAXKMERLENGTH output below;
+#it defaults to 31 if velveth does not report it.
+my $maxhash;
+my @hashvals;
+my %assemblies : shared;
+my %assembliesObjs;
+my @Options;
+my $readfile;
+my $interested = 0;
+my $verbose : shared;
+my $hashs;
+my $hashe;
+my $amos;
+my $vgoptions;
+my $genomesize;
+my @shortInserts;
+my $logfile = "logfile.txt";
+my $ass_num = 1;
+my $categories;
+my $prefix;
+my $OUT;
+my $logSem : shared;
+our $num_threads;
+my $current_threads : shared = 0;
+my $opt_func;
+my $opt_func2;
+my $OptVersion = "2.1.7";
+my $threadfailed : shared = 0;
+
+#
+#
+#	main script
+#
+#
+print STDERR "
+****************************************************
+
+           VelvetOptimiser.pl Version $OptVersion
+
+            Simon Gladman - CSIRO 2009
+
+****************************************************\n";
+
+my $currfreemem = VelvetOpt::Utils::free_mem;
+
+print STDERR "Number of CPUs available: " . VelvetOpt::Utils::num_cpu . "\n";
+printf STDERR "Current free RAM: %.3fGB\n", $currfreemem;
+
+#get the velveth and velvetg version numbers...
+my $response = VelvetOpt::hwrap::_runVelveth(" ");
+$response =~ /Version\s+(\d+\.\d+\.\d+)/s;
+my $vhversion = $1;
+unless ($vhversion){ die "Unable to find velveth, please ensure that the velvet executables are in your PATH.\n";}
+$response =~ /CATEGORIES = (\d+)/;
+$categories = $1;
+unless($categories){ $categories = 2; }
+
+$response =~ /MAXKMERLENGTH = (\d+)/;
+$maxhash = $1;
+unless($maxhash){ $maxhash = 31; }
+
+#get the options!
+&setOptions();
+
+if($prefix eq "auto"){
+	$logfile = strftime("%d-%m-%Y-%H-%M-%S", localtime) . "_Logfile.txt";
+} else {
+	$logfile = $prefix . "_logfile.txt";
+}
+
+print "Logfile name: $logfile\n";
+
+#open the logfile
+open $OUT, '>', $logfile or die "Couldn't open $logfile for writing.\n$!\n";
+
+#
+#
+#   Perform common tasks - write details to log file and screen, run velveth and vanilla velvetg
+#
+#
+
+print STDERR "\nMemory use estimation only!  Script will terminate after showing results.\n\n" if($genomesize);
+
+print STDERR "Velvet details:\n";
+print STDERR "\tVelvet version: $vhversion\n";
+print STDERR "\tCompiled categories: $categories\n" if $categories;
+print STDERR "\tCompiled max kmer length: $maxhash\n" if $maxhash;
+print STDERR "\tMaximum number of threads to run: $num_threads\n";
+
+#let user know about parameters to run with.
+print STDERR "Will run velvet optimiser with the following parameters:\n";
+print STDERR "\tVelveth parameter string:\n\t\t$readfile\n";
+print STDERR "\tVelveth start hash value:\t$hashs\n";
+print STDERR "\tVelveth end hash value:\t\t$hashe\n";
+if($vgoptions){
+	print STDERR "\tUser specified velvetg options: $vgoptions\n";
+}
+if($amos){
+    print STDERR "\tRead tracking for final assembly on.\n";
+} else {
+    print STDERR "\tRead tracking for final assembly off.\n";
+}
+
+#build the hashval array
+for(my $i = $hashs; $i <= $hashe; $i += 2){
+    push @hashvals, $i;
+}
+
+if($genomesize){
+	my $x = &estMemUse();
+	printf STDERR "\nMemory use estimated to be: %.1fGB for $num_threads threads.\n\n", $x;
+	if ($x < $currfreemem){
+		print STDERR "You should have enough memory to complete this job. (Though this estimate is no guarantee..)\n";
+		exit;
+	}
+	else {
+		print STDERR "You probably won't have enough memory to run this job.\nTry decreasing the maximum number of threads used.\n(use the -t option to set max threads.)\n";
+		exit;
+	}
+}
+
+
+print $OUT strftime("%b %e %H:%M:%S", localtime), "\n";
+
+#send run parameters to log file.
+print $OUT "Will run velvet optimiser with the following parameters:\n";
+print $OUT "\tVelveth parameter string:\n\t\t$readfile\n";
+print $OUT "\tVelveth start hash value:\t$hashs\n";
+print $OUT "\tVelveth end hash value:\t\t$hashe\n\n";
+if($vgoptions){
+	print $OUT "\tUser specified velvetg options: $vgoptions\n";
+}
+if($amos){
+    print $OUT "\tRead tracking for final assembly on.\n";
+} else {
+    print $OUT "\tRead tracking for final assembly off.\n";
+}
+
+print STDERR strftime("%b %e %H:%M:%S", localtime), " Beginning velveth runs.\n";
+print $OUT strftime("%b %e %H:%M:%S", localtime), "\n\n\tBeginning velveth runs.\n";
+
+#now run velveth for all the hashvalues in a certain number of threads..
+my @threads;
+foreach my $hashval (@hashvals){
+	while($current_threads >= $num_threads){
+		sleep(2);
+	}
+	if($threadfailed){
+		for my $thr (threads->list) {
+			#print STDERR "Waiting for thread ",$thr->tid," to complete.\n";
+			$thr->join;
+		}
+		die "Velveth failed to run! Must be a problem with file types, check by running velveth manually or by using -v option and reading the log file.\n";
+	}	
+	$threads[$ass_num] = threads->create(\&runVelveth, $readfile, $hashval, $vhversion, \$logSem, $ass_num);
+	$ass_num ++;
+	sleep(2);
+}
+
+for my $thr (threads->list) {
+    #print STDERR "Waiting for thread ",$thr->tid," to complete.\n";
+    $thr->join;
+}
+
+#now run velvetg for all the hash values in a certain number of threads..
+#first get velvetg's version number.
+
+$response = VelvetOpt::gwrap::_runVelvetg(" ");
+$response =~ /Version\s+(\d+\.\d+\.\d+)/s;
+my $vgversion = $1;
+
+print STDERR strftime("%b %e %H:%M:%S", localtime), " Finished velveth runs.\n";
+
+print STDERR strftime("%b %e %H:%M:%S", localtime), " Beginning vanilla velvetg runs.\n";
+print $OUT strftime("%b %e %H:%M:%S", localtime), "\n\n\tBeginning vanilla velvetg runs.\n";
+
+foreach my $key (sort { $a <=> $b } keys %assemblies){
+	while($current_threads >= $num_threads){
+		sleep(2);
+	}
+	$threads[$ass_num] = threads->create(\&runVelvetg, $vgversion, \$logSem, $key);
+	sleep(2);
+}
+
+for my $thr (threads->list) {
+    #print STDERR "Waiting for thread ",$thr->tid," to complete.\n";
+    $thr->join;
+}
+
+
+#now to thaw it all out..
+
+foreach my $key(sort keys %assemblies){
+	my $obj = bless thaw($assemblies{$key}), "VelvetOpt::Assembly";
+	$assembliesObjs{$key} = $obj;
+}
+
+
+#find the best assembly...
+
+#
+#
+#   Now perform a velvetg optimisation based upon the file types sent to velveth
+#
+#
+
+#
+#   get the best assembly so far...
+#
+
+my $bestId;
+my $maxScore = -100;
+my $asmscorenotneg = 1;
+
+foreach my $key (keys %assembliesObjs){
+	if(($assembliesObjs{$key}->{assmscore} != -1) && $asmscorenotneg){
+    	if($assembliesObjs{$key}->{assmscore} > $maxScore){
+        	$bestId = $key;
+        	$maxScore = $assembliesObjs{$key}->{assmscore};
+    	}
+	}
+	elsif($assembliesObjs{$key}->{n50} && $asmscorenotneg){
+		if($assembliesObjs{$key}->{n50} > $maxScore){
+			$bestId = $key;
+			$maxScore = $assembliesObjs{$key}->{n50};
+		}
+	}
+	else {
+		$asmscorenotneg = 0;
+		if($assembliesObjs{$key}->{totalbp} > $maxScore){
+        	$bestId = $key;
+        	$maxScore = $assembliesObjs{$key}->{totalbp};
+    	}
+	}
+}
+print "\n\nThe best assembly so far is:\n" if $interested;
+print $assembliesObjs{$bestId}->toStringNoV() if $interested;
+
+#   determine the optimisation route for the assembly based on the velveth parameter string.
+my $optRoute = &getOptRoutine($readfile);
+
+print STDERR strftime("%b %e %H:%M:%S", localtime), " Hash value of best assembly by assembly score: ". $assembliesObjs{$bestId}->{hashval} . "\n";
+
+print $OUT strftime("%b %e %H:%M:%S", localtime), " Best assembly by assembly score - assembly id: $bestId\n";
+
+print STDERR strftime("%b %e %H:%M:%S", localtime), " Optimisation routine chosen for best assembly: $optRoute\n";
+print $OUT strftime("%b %e %H:%M:%S", localtime), " Optimisation routine chosen for best assembly: $optRoute\n";
+
+#now send the best assembly so far to the appropriate optimisation routine...
+
+if($optRoute eq "shortOpt"){
+	
+	&expCov($assembliesObjs{$bestId});
+    &covCutoff($assembliesObjs{$bestId});
+
+}
+elsif($optRoute eq "shortLong"){
+
+    &expCov($assembliesObjs{$bestId});
+    &covCutoff($assembliesObjs{$bestId});
+
+}
+elsif($optRoute eq "longPaired"){
+    &expCov($assembliesObjs{$bestId});
+    &insLengthLong($assembliesObjs{$bestId});
+    &covCutoff($assembliesObjs{$bestId});
+}
+elsif($optRoute eq "shortPaired"){
+    &expCov($assembliesObjs{$bestId});
+    &insLengthShort($assembliesObjs{$bestId});
+    &covCutoff($assembliesObjs{$bestId});
+}
+elsif($optRoute eq "shortLongPaired"){
+    &expCov($assembliesObjs{$bestId});
+    &insLengthShort($assembliesObjs{$bestId});
+    &insLengthLong($assembliesObjs{$bestId});
+    &covCutoff($assembliesObjs{$bestId});
+}
+else{
+    print STDERR "There was an error choosing an optimisation routine for this assembly.  Please change the velveth parameter string and try again.\n";
+    print $OUT "There was an error choosing an optimisation routine for this assembly.  Please change the velveth parameter string and try again.\n";
+}
+
+#   once it comes back from the optimisation routines, we need to turn on read tracking and amos output if it was selected in the options.
+#
+#
+#   The final assembly run!
+#
+#
+if($amos){
+    $assembliesObjs{$bestId}->{pstringg} .= " -amos_file yes -read_trkg yes";
+
+    my $final = VelvetOpt::gwrap::objectVelvetg($assembliesObjs{$bestId});
+    $assembliesObjs{$bestId}->getAssemblyDetails();
+}
+
+print STDERR strftime("%b %e %H:%M:%S", localtime), "\n\n\nFinal optimised assembly details:\n";
+print $OUT strftime("%b %e %H:%M:%S", localtime), "\n\n\nFinal optimised assembly details:\n";
+print STDERR $assembliesObjs{$bestId}->toStringNoV() if !$verbose;
+print $OUT $assembliesObjs{$bestId}->toStringNoV() if !$verbose;
+print STDERR $assembliesObjs{$bestId}->toString() if $verbose;
+print $OUT $assembliesObjs{$bestId}->toString() if $verbose;
+print STDERR "\n\nAssembly output files are in the following directory:\n" . $assembliesObjs{$bestId}->{ass_dir} . "\n\n";
+print $OUT "\n\nAssembly output files are in the following directory:\n" . $assembliesObjs{$bestId}->{ass_dir} . "\n";
+
+#delete superfluous directories..
+#Modified by Konrad Paszkiewicz 07/01/2011 - move best results to execution directory for ease of use in Galaxy
+foreach my $key(keys %assemblies){
+	unless($key == $bestId){ 
+		my $dir = $assembliesObjs{$key}->{ass_dir};
+		`rm -r $dir`;
+	}
+	if($key==$bestId){
+		my $dir = $assembliesObjs{$key}->{ass_dir};
+		
+		qx(mv $dir/* $dir/../);
+		`rm -rf $dir`;
+		
+	} 
+}
+
+
+
+#	subroutines...
+#
+#
+#----------------------------------------------------------------------
+
+# Option setting routines
+
+sub setOptions {
+	use Getopt::Long;
+	
+	my $thmax = VelvetOpt::Utils::num_cpu;
+
+	@Options = (
+		{OPT=>"help",    VAR=>\&usage,             DESC=>"This help"},
+		{OPT=>"v|verbose+", VAR=>\$verbose, DEFAULT=>0, DESC=>"Verbose logging, includes all velvet output in the logfile."},
+		{OPT=>"s|hashs=i", VAR=>\$hashs, DEFAULT=>19, DESC=>"The starting (lower) hash value"}, 
+		{OPT=>"e|hashe=i", VAR=>\$hashe, DEFAULT=>31, DESC=>"The end (higher) hash value"},
+		{OPT=>"f|velvethfiles=s", VAR=>\$readfile, DEFAULT=>0, DESC=>"The file section of the velveth command line."},
+		{OPT=>"a|amosfile!", VAR=>\$amos, DEFAULT=>0, DESC=>"Turn on velvet's read tracking and amos file output."},
+		{OPT=>"o|velvetgoptions=s", VAR=>\$vgoptions, DEFAULT=>'', DESC=>"Extra velvetg options to pass through.  eg. -long_mult_cutoff -max_coverage etc"},
+		{OPT=>"t|threads=i", VAR=>\$num_threads, DEFAULT=>$thmax, DESC=>"The maximum number of simultaneous velvet instances to run."},
+		{OPT=>"g|genomesize=f", VAR=>\$genomesize, DEFAULT=>0, DESC=>"The approximate size of the genome to be assembled in megabases.\n\t\t\tOnly used in memory use estimation. If not specified, memory use estimation\n\t\t\twill not occur. If memory use is estimated, the results are shown and then program exits."},
+		{OPT=>"k|optFuncKmer=s", VAR=>\$opt_func, DEFAULT=>'n50', DESC=>"The optimisation function used for k-mer choice."},
+		{OPT=>"c|optFuncCov=s", VAR=>\$opt_func2, DEFAULT=>'Lbp', DESC=>"The optimisation function used for cov_cutoff optimisation."},
+		{OPT=>"p|prefix=s", VAR=>\$prefix, DEFAULT=>'auto', DESC=>"The prefix for the output filenames, the default is the date and time in the format DD-MM-YYYY-HH-MM_."}
+	);
+
+	(@ARGV < 1) && (usage());
+
+	&GetOptions(map {$_->{OPT}, $_->{VAR}} @Options) || usage();
+
+	# Now setup default values.
+	foreach (@Options) {
+		if (defined($_->{DEFAULT}) && !defined(${$_->{VAR}})) {
+		${$_->{VAR}} = $_->{DEFAULT};
+		}
+	}
+	
+	print STDERR strftime("%b %e %H:%M:%S", localtime), " Starting to check input parameters.\n";
+	
+	unless($readfile){
+		print STDERR "\tYou must supply the velveth parameter line in quotes. eg -f '-short .....'\n";
+		&usage();
+	}
+	
+    if($hashs > $maxhash){
+        print STDERR "\tStart hash value too high.  New start hash value is $maxhash.\n";
+        $hashs = $maxhash;
+    }
+    if(!&isOdd($hashs)){
+        $hashs = $hashs - 1;
+        print STDERR "\tStart hash value not odd.  Subtracting one. New start hash value = $hashs\n";
+    }
+	
+	if($hashe > $maxhash || $hashe < 1){
+        print STDERR "\tEnd hash value not in workable range.  New end hash value is $maxhash.\n";
+        $hashe = $maxhash;
+    }
+    if($hashe < $hashs){
+        print STDERR "\tEnd hash value lower than start hash value.  New end hash value = $hashs.\n";
+        $hashe = $hashs;
+    }
+    if(!&isOdd($hashe)){
+        $hashe = $hashe - 1;
+        print STDERR "\tEnd hash value not odd.  Subtracting one. New end hash value = $hashe\n";
+    }
+	
+	#check the velveth parameter string..
+	my $vh_ok = VelvetOpt::hwrap::_checkVHString("check 21 $readfile", $categories);
+
+	unless($vh_ok){ die "Please re-start with a corrected velveth parameter string." }
+	
+	print STDERR "\tVelveth parameter string OK.\n";
+
+	print STDERR strftime("%b %e %H:%M:%S", localtime), " Finished checking input parameters.\n";
+	
+}
+
+sub usage {
+	print "Usage: $0 [options] -f 'velveth input line'\n";
+	foreach (@Options) {
+		printf "  --%-13s %s%s.\n",$_->{OPT},$_->{DESC},
+			defined($_->{DEFAULT}) ? " (default '$_->{DEFAULT}')" : "";
+	}
+	print "\nAdvanced!: Changing the optimisation function(s)\n";
+	print VelvetOpt::Assembly::opt_func_toString;
+	exit(1);
+}
+ 
+#----------------------------------------------------------------------
+
+
+#
+#	runVelveth
+#
+
+sub runVelveth{
+	
+	{
+		lock($current_threads);
+		$current_threads ++;
+	}
+	
+	my $rf = shift;
+	my $hv = shift;
+	my $vv = shift;
+	my $semRef = shift;
+	my $anum = shift;
+	my $assembly;
+	
+	print STDERR strftime("%b %e %H:%M:%S", localtime), "\t\tRunning velveth with hash value: $hv.\n";
+
+    #make the velveth command line.
+    my $vhline = $prefix . "_data_$hv $hv $rf";
+
+    #make a new VelvetAssembly and store it in the %assemblies hash...
+	$assembly = VelvetOpt::Assembly->new(ass_id => $anum, pstringh => $vhline, versionh =>$vv, assmfunc => $opt_func, assmfunc2 => $opt_func2);
+
+    #run velveth on this assembly object
+    my $vhresponse = VelvetOpt::hwrap::objectVelveth($assembly, $categories);
+
+    unless($vhresponse){ die "Velveth didn't run on hash value of $hv.\n$!\n";}
+	
+	unless(-r ($prefix . "_data_$hv" . "/Roadmaps")){ 
+		print STDERR "Velveth failed!  Response:\n$vhresponse\n";
+		{
+			lock ($threadfailed);
+			$threadfailed = 1;
+		}
+	}
+    
+	#run the hashdetail generation routine.
+    $vhresponse = $assembly->getHashingDetails();
+    
+	#print the objects to the log file...
+	{
+		lock($$semRef);
+		print $OUT $assembly->toStringNoV() if !$verbose;
+		print $OUT $assembly->toString() if $verbose;
+	}
+	
+	{
+		lock(%assemblies);
+		my $ass_str = freeze($assembly);
+		$assemblies{$anum} = $ass_str;
+	}
+	
+	{
+		lock($current_threads);
+		$current_threads --;
+	}
+	print STDERR strftime("%b %e %H:%M:%S", localtime), "\t\tVelveth with hash value $hv finished.\n";
+}
+
+#
+#	runVelvetg
+#
+sub runVelvetg{
+
+	{
+		lock($current_threads);
+		$current_threads ++;
+	}
+	
+	my $vv = shift;
+	my $semRef = shift;
+	my $anum = shift;
+	my $assembly;
+	
+	#get back the object!
+	$assembly = bless thaw($assemblies{$anum}), "VelvetOpt::Assembly";
+	
+	print STDERR strftime("%b %e %H:%M:%S", localtime), "\t\tRunning vanilla velvetg on hash value: " . $assembly->{hashval} . "\n";
+
+	#make the velvetg commandline.
+    my $vgline = $prefix . "_data_" . $assembly->{hashval};
+	
+	$vgline .= " $vgoptions";
+
+    #save the velvetg commandline in the assembly.
+    $assembly->{pstringg} = $vgline;
+	
+	#save the velvetg version in the assembly.
+	$assembly->{versiong} = $vv;
+
+    #run velvetg
+    my $vgresponse = VelvetOpt::gwrap::objectVelvetg($assembly);
+
+    unless($vgresponse){ die "Velvetg didn't run on the directory $vgline.\n$!\n";}
+
+    #run the assembly details routine..
+    $assembly->getAssemblyDetails();
+
+    #print the objects to the log file...
+	{
+		lock($$semRef);
+		print $OUT $assembly->toStringNoV() if !$verbose;
+		print $OUT $assembly->toString() if $verbose;
+	}
+	
+	{
+		lock(%assemblies);
+		my $ass_str = freeze($assembly);
+		$assemblies{$anum} = $ass_str;
+	}
+	
+	{
+		lock($current_threads);
+		$current_threads --;
+	}
+	print STDERR strftime("%b %e %H:%M:%S", localtime), "\t\tVelvetg on hash value: " . $assembly->{hashval} . " finished.\n";
+}
+
+#
+#   isOdd
+#
+sub isOdd {
+    my $x = shift;
+    if($x % 2 == 1){
+        return 1;
+    }
+    else {
+        return 0;
+    }
+}
+
+
+#
+#   getOptRoutine
+#
+sub getOptRoutine {
+
+    my $readfile = shift;
+
+    #   Choose the optimisation path depending on the types of read files sent to velveth
+    #       For short only:                 shortOpt routine
+    #       For short and long:             shortLong routine
+    #       For short paired:               shortPaired routine
+    #       For short and long paired:      longPaired routine
+    #       For short paired and long:      shortPaired routine
+    #       For short paired & long paired: shortLongPaired routine
+
+    #look at velveth string ($readfile) and look for keywords from velvet manual...
+    my $long = 0;
+    my $longPaired = 0;
+    my $shortPaired = 0;
+    my $short = 0;
+
+    #standard cases..
+    if($readfile =~ /-short.? /) { $short = 1; }
+    if($readfile =~ /-long /) { $long = 1; }
+    if($readfile =~ /-shortPaired /) { $shortPaired = 1; }
+    if($readfile =~ /-longPaired /) { $longPaired = 1; }
+
+    #weird cases to cover the non-use of the short keyword (since it's the default.)
+    if(!($readfile =~ /(-short.? )|(-long )|(-shortPaired )|(-longPaired )/)) { $short = 1; } #if nothing is specified, assume short.
+    if(!($readfile =~ /-short.? /) && ($readfile =~ /(-long )|(-longPaired )/)) { $short = 1; } #if long or longPaired is specified, also assume short since very unlikely to only have long...
+
+    if($short && !($long || $longPaired || $shortPaired)){
+        return "shortOpt";
+    }
+    elsif($short && $long && !($longPaired || $shortPaired)){
+        return "shortLong";
+    }
+    elsif($short && $longPaired && !$shortPaired){
+        return "longPaired";
+    }
+    elsif($short && $shortPaired && !$longPaired){
+        return "shortPaired";
+    }
+    elsif($short && $shortPaired && $longPaired){
+        return "shortLongPaired";
+    }
+    elsif($shortPaired && !$short && !$long && !$longPaired){
+        return "shortPaired";
+    }
+    else {
+        return "Unknown";
+    }
+}
+
+#
+#   covCutoff - the coverage cutoff optimisation routine.
+#
+sub covCutoff{
+
+    my $ass = shift;
+    #get the assembly score and set the current cutoff score.
+    my $ass_score = $ass->{assmscore};
+    print "In covCutOff and assembly score is: $ass_score..\n" if $interested;
+	
+
+
+	sub func {
+		my $ass = shift;
+		my $cutoff = shift;
+		my $ass_score = $ass->{assmscore};
+		my $ps = $ass->{pstringg};
+        if($ps =~ /cov_cutoff/){
+            $ps =~ s/cov_cutoff\s+\d+(\.\d+)?/cov_cutoff $cutoff/;
+        }
+        else {
+            $ps .= " -cov_cutoff $cutoff";
+        }
+        $ass->{pstringg} = $ps;
+
+        print STDERR strftime("%b %e %H:%M:%S", localtime);
+		printf STDERR "\t\tSetting cov_cutoff to %.3f.\n", $cutoff;
+        print $OUT strftime("%b %e %H:%M:%S", localtime);
+		printf $OUT "\t\tSetting cov_cutoff to %.3f.\n", $cutoff;
+
+        my $worked = VelvetOpt::gwrap::objectVelvetg($ass);
+        if($worked){
+            $ass->getAssemblyDetails();
+        }
+        else {
+            die "Velvet Error in covCutoff!\n";
+        }
+        $ass_score = $ass->{assmscore};
+		print $OUT $ass->toStringNoV();
+		
+		return $ass_score;
+		
+	}
+	
+	print STDERR strftime("%b %e %H:%M:%S", localtime), " Beginning coverage cutoff optimisation\n";
+    print $OUT strftime("%b %e %H:%M:%S", localtime), " Beginning coverage cutoff optimisation\n";
+
+	my $dir = $ass->{ass_dir};
+    $dir .= "/stats.txt";
+    #print "\tLooking for exp_cov in $dir\n";
+    my $expCov = VelvetOpt::Utils::estExpCov($dir, $ass->{hashval});
+	
+	my $a = 0;
+	my $b = 0.8 * $expCov;
+	my $t = 0.618;
+	my $c = $a + $t * ($b - $a);
+	my $d = $b + $t * ($a - $b);
+	my $fc = func($ass, $c);
+	my $fd = func($ass, $d);
+
+	my $iters = 1;
+	
+	printf STDERR "\t\tLooking for best cutoff score between %.3f and %.3f\n", $a, $b;
+	printf $OUT "\t\tLooking for best cutoff score between %.3f and %.3f\n", $a, $b;
+	
+	while(abs($a -$b) > 1){
+		if($fc > $fd){
+			printf STDERR "\t\tMax cutoff lies between %.3f & %.3f\n", $d, $b;
+			my $absdiff = abs($fc - $fd);
+			print STDERR "\t\tfc = $fc\tfd = $fd\tabs diff = $absdiff\n";
+			printf $OUT "\t\tMax cutoff lies between %.3f & %.3f\n", $d, $b;
+			$a = $d;
+			$d = $c;
+			$fd = $fc;
+			$c = $a + $t * ($b - $a);
+			$fc = func($ass, $c);
+		}
+		else {
+			printf STDERR "\t\tMax cutoff lies between %.3f & %.3f\n", $a, $c;
+			my $absdiff = abs($fc - $fd);
+			print STDERR "\t\tfc = $fc\tfd = $fd\tabs diff = $absdiff\n";
+			printf $OUT "\t\tMax cutoff lies between %.3f & %.3f\n", $a, $c;
+			$b = $c;
+			$c = $d;
+			$fc = $fd;
+			$d = $b + $t * ($a - $b);
+			$fd = func($ass, $d);
+		}
+		$iters ++;
+	}
+
+	printf STDERR "\t\tOptimum value of cutoff is %.2f\n", $b;
+	print STDERR "\t\tTook $iters iterations\n";
+	printf $OUT "\t\tOptimum value of cutoff is %.2f\n", $b;
+	print $OUT "\t\tTook $iters iterations\n";
+
+    return 1;
+
+}
+
+#
+#   expCov - find the expected coverage for the assembly and run velvetg with that exp_cov.
+#
+sub expCov {
+
+    print STDERR strftime("%b %e %H:%M:%S", localtime), " Looking for the expected coverage\n";
+    print $OUT strftime("%b %e %H:%M:%S", localtime), " Looking for the expected coverage\n";
+
+    my $ass = shift;
+
+    #need to get the directory of the assembly and add "stats.txt" to it and then send it to
+    #the coverage estimation routine (estExpCov) in VelvetOpt/Utils.pm...
+    my $dir = $ass->{ass_dir};
+    $dir .= "/stats.txt";
+    my $expCov = VelvetOpt::Utils::estExpCov($dir, $ass->{hashval});
+
+    print STDERR strftime("%b %e %H:%M:%S", localtime), "\t\tExpected coverage set to $expCov\n";
+    print $OUT strftime("%b %e %H:%M:%S", localtime), "\t\tExpected coverage set to $expCov\n";
+
+    #re-write the pstringg with the new velvetg command..
+    my $vg = $ass->{pstringg};
+    if($vg =~ /exp_cov/){
+        $vg =~ s/exp_cov\s+\d+(\.\d+)?/exp_cov $expCov/;
+    }
+    else {
+        $vg .= " -exp_cov $expCov";
+    }
+
+    $ass->{pstringg} = $vg;
+    
+    print $OUT $ass->toStringNoV();
+
+}
+
+#
+#   insLengthLong - get the Long insert length and use it in the assembly..
+#
+sub insLengthLong {
+    print STDERR strftime("%b %e %H:%M:%S", localtime), " Getting the long insert length\n";
+    print $OUT strftime("%b %e %H:%M:%S", localtime), " Getting the long insert length\n";
+    my $ass = shift;
+    my $len = "auto";
+    print STDERR strftime("%b %e %H:%M:%S", localtime), " Setting assembly long insert length to $len\n";
+    print $OUT strftime("%b %e %H:%M:%S", localtime), " Setting assembly long insert length to $len\n";
+
+    #re-write the pstringg with the new velvetg command..
+    #my $vg = $ass->{pstringg};
+    #if($vg =~ /ins_length_long/){
+    #    $vg =~ s/ins_length_long\s+\d+/ins_length_long $len/;
+    #}
+    #else {
+    #    $vg .= " -ins_length_long $len";
+    #}
+}
+
+#
+#   insLengthShort - get the short insert length and use it in the assembly..
+#
+sub insLengthShort {
+    print STDERR strftime("%b %e %H:%M:%S", localtime), " Setting the short insert length\n";
+    print $OUT strftime("%b %e %H:%M:%S", localtime), " Setting the short insert length\n";
+    my $ass = shift;
+	my $len = "auto";
+    print STDERR strftime("%b %e %H:%M:%S", localtime), " Setting assembly short insert length(s) to $len\n";
+    print $OUT strftime("%b %e %H:%M:%S", localtime), " Setting assembly short insert length(s) to $len\n";
+
+    #re-write the pstringg with the new velvetg command..
+    #my $vg = $ass->{pstringg};
+    #if($vg =~ /ins_length /){
+    #    $vg =~ s/ins_length\s+\d+/ins_length $len/;
+    #}
+    #else {
+    #    $vg .= " -ins_length $len";
+    #}
+    #$ass->{pstringg} = $vg;
+}
+
+
+#
+#	estMemUse - estimates the memory usage from the read size, number of reads, genome size and hash values
+#
+sub estMemUse {
+	
+	my $max_runs = @hashvals;
+	my $totmem = 0;
+	#get the read lengths and the number of reads...
+	#need the short read filenames...
+	my ($rs, $nr) = VelvetOpt::Utils::getReadSizeNum($readfile);
+	if ($max_runs > $num_threads){
+		for(my $i = 0; $i < $num_threads; $i ++){
+			$totmem += VelvetOpt::Utils::estVelvetMemUse($rs, $genomesize, $nr, $hashvals[$i]);
+		}
+	}
+	else {
+		foreach my $h (@hashvals){
+			$totmem += VelvetOpt::Utils::estVelvetMemUse($rs, $genomesize, $nr, $h);
+		}
+	}
+	return $totmem;
+}
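Note on covCutoff above: the loop narrows the interval [0, 0.8 * expected coverage] with a golden-section style search, re-running velvetg at each probe and keeping the half that holds the better score. The following is a minimal Python sketch of that bracketing logic only, assuming the score is unimodal over the interval. The names `golden_section_max` and `score` are illustrative and not part of VelvetOptimiser, and the sketch returns the final bracket, whereas the Perl routine reports the upper bound as the optimum.

```python
def golden_section_max(score, lo, hi, tol=1.0, ratio=0.618):
    """Shrink [lo, hi] around the maximum of a unimodal score function.

    Mirrors the probe layout in covCutoff: c is the upper probe and
    d the lower probe, so d < c on every iteration.
    """
    c = lo + ratio * (hi - lo)
    d = hi + ratio * (lo - hi)
    fc, fd = score(c), score(d)
    evaluations = 2
    while abs(hi - lo) > tol:
        if fc > fd:
            # The maximum lies in [d, hi]: drop the lower part, reuse c as the new d.
            lo, d, fd = d, c, fc
            c = lo + ratio * (hi - lo)
            fc = score(c)
        else:
            # The maximum lies in [lo, c]: drop the upper part, reuse d as the new c.
            hi, c, fc = c, d, fd
            d = hi + ratio * (lo - hi)
            fd = score(d)
        evaluations += 1
    return lo, hi, evaluations


# Toy score with a single peak at 7 standing in for an assembly metric.
low, high, n_evals = golden_section_max(lambda x: -(x - 7.0) ** 2, 0.0, 16.0)
print("maximum bracketed in [%.3f, %.3f] after %d evaluations" % (low, high, n_evals))
```

Each pass costs one new evaluation because one probe is reused, which matters here since every evaluation is a full velvetg run.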
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/velvet_optimiser.py	Tue Jun 07 18:07:56 2011 -0400
@@ -0,0 +1,102 @@
+#!/usr/bin/env python
+
+"""
+VelvetOptimiser Wrapper
+Adapted from velveth and velvetg tools in Galaxy
+Konrad Paszkiewicz	University of Exeter, UK.
+
+"""
+import pkg_resources;
+import logging, os, string, sys, tempfile, glob, shutil, types, urllib
+import shlex, subprocess
+from optparse import OptionParser, OptionGroup
+from stat import *
+
+
+log = logging.getLogger( __name__ )
+
+assert sys.version_info[:2] >= ( 2, 4 )
+
+def stop_err( msg ):
+    sys.stderr.write( "%s\n" % msg )
+    sys.exit()
+
+def __main__():
+    #Parse Command Line
+    s = 'velvetg_optimiser.py:  argv = %s\n' % (sys.argv)
+    #print >> sys.stderr, s # so will appear as blurb for file
+    argcnt = len(sys.argv)
+    starthash = sys.argv[1]
+    endhash = sys.argv[2]
+    inputs = sys.argv[3]
+    threads = sys.argv[4]
+    afgFile = sys.argv[5]
+    kmeropt = sys.argv[6]
+    covopt = sys.argv[7]
+    contigs = sys.argv[8]
+    LastGraph = sys.argv[9]
+    velvet_asm = sys.argv[10]
+    unusedReadsFile = sys.argv[11]
+    stats = sys.argv[12]
+    othervelvetgoptions = sys.argv[13]
+    working_dir = ''
+    
+    # Discard VelvetOptimiser's stdout; stderr is captured below so that failures can be reported.
+    cmdline = '/usr/local/velvet/contrib/VelvetOptimiser-2.1.7/VelvetOptimiser.pl -s %s -e %s -f \' %s \' -t %s -a 1 -k %s -c %s -o \'-unused_reads yes %s\' >/dev/null' % (starthash, endhash, inputs, threads, kmeropt, covopt, othervelvetgoptions)
+    #print >> sys.stderr, cmdline # so will appear as blurb for file
+    try:
+        proc = subprocess.Popen( args=cmdline, shell=True, stderr=subprocess.PIPE )
+        # communicate() drains stderr while waiting, so a large log cannot fill the pipe and stall the child
+        stderr = proc.communicate()[1]
+        returncode = proc.returncode
+        if returncode != 0:
+            raise Exception( stderr )
+    except Exception, e:
+        stop_err( 'Error running velvet_optimiser.py: ' + str( e ) )
+    
+    out = open(contigs,'w')
+    contigs_path = os.path.join(working_dir,'contigs.fa')
+    #print >> sys.stderr, contigs_path
+    for line in open(contigs_path ):
+        out.write( "%s" % (line) )
+    out.close()
+    out = open(stats,'w')
+    stats_path = os.path.join(working_dir,'stats.txt')
+    for line in open( stats_path ):
+        out.write( "%s" % (line) )
+    out.close()
+    if LastGraph != 'None':
+        out = open(LastGraph,'w')
+        LastGraph_path = os.path.join(working_dir,'LastGraph')
+        for line in open( LastGraph_path ):
+            out.write( "%s" % (line) )
+        out.close()
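+    # afgFile only carries the '1' flag passed by the tool XML; the AMOS dataset path is velvet_asm
+    if afgFile != 'None' and velvet_asm != 'None':
+        out = open(velvet_asm,'w')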
+        afgFile_path = os.path.join(working_dir,'velvet_asm.afg')
+        try:
+            for line in open( afgFile_path ):
+                out.write( "%s" % (line) )
+        except:
+            logging.warn( 'error reading %s' %(afgFile_path))
+            pass
+        out.close()
+    if unusedReadsFile != 'None':
+        out = open(unusedReadsFile,'w')
+        unusedReadsFile_path = os.path.join(working_dir,'UnusedReads.fa')
+        try:
+            for line in open( unusedReadsFile_path ):
+                out.write( "%s" % (line) )
+        except:
+            logging.info( 'error reading %s' %(unusedReadsFile_path))
+            pass
+        out.close()
+
+if __name__ == "__main__": __main__()
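Note on capturing stderr: when a wrapper such as velvet_optimiser.py reads a child's stderr through a pipe, waiting for the child to exit before reading can stall once the child has written more than the pipe buffer holds. The following is a minimal Python 3 sketch of one common way to avoid that, using `Popen.communicate()`. The helper name `run_and_capture` is hypothetical and this is not the wrapper's own code, which targets Python 2.

```python
import subprocess
import sys


def run_and_capture(argv):
    """Run argv, discard its stdout, and return (exit status, stderr text)."""
    proc = subprocess.Popen(
        argv,
        stdout=subprocess.DEVNULL,  # stdout is not needed by the caller
        stderr=subprocess.PIPE,     # stderr is kept for error reporting
    )
    # communicate() reads the pipe until EOF and then reaps the child,
    # so the child never blocks on a full stderr pipe.
    _, err = proc.communicate()
    return proc.returncode, err.decode(errors="replace")


# Stand-in child process that fails with a message on stderr.
status, message = run_and_capture(
    [sys.executable, "-c", "import sys; sys.stderr.write('boom'); sys.exit(3)"]
)
print(status, message)
```

Passing an argument list rather than a shell string also sidesteps quoting problems with dataset paths, at the cost of not being able to use shell redirection.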
--- /dev/null	Thu Jan 01 00:00:00 1970 +0000
+++ b/velvet_optimiser.xml	Tue Jun 07 18:07:56 2011 -0400
@@ -0,0 +1,237 @@
+<tool id="velvetoptimiser" name="velvetoptimiser" version="1.0.0">
+	<description>Auto optimise a genomic velvet assembly</description>
+	<command interpreter="python">
+	velvet_optimiser.py  '$start_hash_length' '$end_hash_length' 
+	'#for $i in $inputs
+		${i.file_format}
+		${i.read_type}
+		${i.input}
+           #end for
+	'
+	'$threads'
+	'1'
+	'$kmeropt'
+	'$covopt'
+	'$contigs'
+	'$LastGraph'
+	'$velvet_asm'
+	'$unused_reads_fasta'
+	'$stats'
+	'$othervelvetgoptions'
+	'$contigs.extra_files_path'
+	</command>
+
+        <inputs>
+          <param label="Start Hash Length" name="start_hash_length" type="select" help="k-mer length in base pairs of the words being hashed. Shorter hash lengths (i.e. less than 31) may cause out-of-memory problems.">
+
+                        <option value="23">23</option>
+                        <option value="25">25</option>
+                        <option value="27">27</option>
+                        <option value="29">29</option>
+			<option value="31" selected="yes">31</option>
+                        <option value="33">33</option>
+                        <option value="35">35</option>
+                        <option value="37">37</option>
+                        <option value="39">39</option>
+                        <option value="41">41</option>
+
+          </param>
+	 <param label="End Hash Length" name="end_hash_length" type="select" help="k-mer length in base pairs of the words being hashed.">
+                        <option value="43">43</option>
+                        <option value="45">45</option>
+                        <option value="47">47</option>
+                        <option value="49">49</option>
+                        <option value="51">51</option>
+                        <option value="53">53</option>
+                        <option value="55">55</option>
+                           <option value="57">57</option>
+                         <option value="59">59</option>
+                        <option value="61">61</option>
+                        <option value="63">63</option>
+                        <option value="65">65</option>
+                        <option value="67">67</option>
+			 <option value="69" selected="yes">69</option>
+
+	 </param>
+
+	 <param label="Number of threads" name="threads" type="select" help="Number of velvetg threads to run. Higher values will result in shorter run times but may cause an out of memory error. Select a lower value if this happens.">
+                        <option value="1" selected="yes">1</option>
+                        <option value="2">2</option>
+                        <option value="3">3</option>
+                        <option value="4">4</option>
+
+         </param>
+  	<param label="Kmer optimisation metric" name="kmeropt" type="select" help="Metric used to identify the optimal kmer size.">
+                        <option value="Lbp">Total number of bp in contigs >1kb</option>
+                        <option value="Lcon">Number of contigs >1kb</option>
+                        <option value="max">Length of single longest contig</option>
+                        <option value="n50" selected="yes">N50 size</option>
+			 <option value="ncon">Total number of contigs</option>
+			 <option value="tbp">Total number of bp in contigs of any size</option>
+
+         </param>
+
+	 <param label="Coverage optimisation metric" name="covopt" type="select" help="Metric used to identify the optimal coverage cutoff in velvetg.">
+                        <option value="Lbp" selected="yes">Total number of bp in contigs >1kb</option>
+                        <option value="Lcon">Number of contigs >1kb</option>
+                        <option value="max">Length of single longest contig</option>
+                        <option value="n50">N50 size</option>
+                         <option value="ncon">Total number of contigs</option>
+                         <option value="tbp">Total number of bp in contigs of any size</option>
+
+         </param>
+
+	<param label="Advanced velvetg options" name="othervelvetgoptions" title="Advanced velvetg options" help="Other Velvetg options - see below for details" type="text" size="30">
+		
+	</param>
+    <!--      <param name="strand_specific" type="boolean" checked="false" truevalue="-strand_specific" falsevalue="" label="Use strand specific transcriptome sequencing" help="If you are using a strand specific transcriptome sequencing protocol, you may wish to use this option for better results."/> -->
+          <repeat name="inputs" title="Input Files">
+                      <param label="file format" name="file_format" type="select">
+                              <option value="-fasta" selected="yes">fasta</option>
+                              <option value="-fastq">fastq</option>
+                              <option value="-eland">eland</option>
+                              <option value="-gerald">gerald</option>
+                      </param>
+                      <param label="read type" name="read_type" type="select">
+                              <option value="-short" selected="yes">short reads</option>
+                              <option value="-shortPaired">shortPaired reads</option>
+                              <option value="-short2">short2 reads</option>
+                              <option value="-shortPaired2">shortPaired2 reads</option>
+                              <option value="-long">long reads (reads >200bp e.g. 454 reads)</option>
+                              <option value="-longPaired">longPaired reads</option>
+                      </param>
+
+
+            <param name="input" type="data" format="fasta,fastq,eland,gerald" label="Dataset"/>
+          </repeat>
+        </inputs>
+	
+	<outputs>
+
+	        <data format="fasta" name="contigs" label="${tool.name} on ${on_string}: Contigs"/>
+                       <data format="fasta" name="unused_reads_fasta" label="${tool.name} on ${on_string}: Unused Reads">
+                 <!--  <filter>unused_reads['generate_unused'] == "yes"</filter>-->
+                </data>
+               
+                <data format="tabular" name="stats" label="${tool.name} on ${on_string}: Stats"/>
+   		  <data format="afg" name="velvet_asm" label="${tool.name} on ${on_string}: AMOS.afg">
+                  <!-- <filter>generate_amos['afg'] == "yes"</filter> -->
+                </data>
+
+  		  <data format="txt" name="LastGraph" label="${tool.name} on ${on_string}: LastGraph">
+		    <!-- <filter>last_graph['generate_graph'] == "yes"</filter> -->
+                </data>
+
+            
+	</outputs>
+	<requirements>
+		<requirement type="package">velvet</requirement>
+	</requirements>
+	
+	<help>
+**Velvet Optimiser Overview**
+
+Velvet_ is a de novo genomic assembler specially designed for short read sequencing technologies, such as Solexa or 454, developed by Daniel Zerbino and Ewan Birney at the European Bioinformatics Institute (EMBL-EBI), near Cambridge, in the United Kingdom.
+
+Velvet currently takes in short read sequences, removes errors then produces high quality unique contigs. It then uses paired-end read and long read information, when available, to retrieve the repeated areas between contigs.
+
+Read the Velvet `documentation`__ for details on using the Velvet assembler.
+
+.. _Velvet: http://www.ebi.ac.uk/~zerbino/velvet/
+
+.. __: http://www.ebi.ac.uk/~zerbino/velvet/Manual.pdf
+
+------
+
+**VelvetOptimiser**
+
+VelvetOptimiser was originally written by Simon Gladman of CSIRO.
+
+VelvetOptimiser performs a number of velveth and velvetg steps to try and optimise an assembly based on the metrics provided below. 
+
+------
+
+**Outputs**
+
+
+**Contigs**
+
+The *contigs.fa* file.
+This fasta file contains the sequences of the contigs longer than 2k, where k is the word-length used in velveth. If you have specified a -min_contig_lgth threshold, then the contigs shorter than that value are omitted.
+Note that the length and coverage information provided in the header of each contig should therefore be understood in k-mers and in k-mer coverage (cf. section 5.1 of the Velvet manual) respectively.
+The N's in the sequence correspond to gaps between scaffolded contigs. The number of N's corresponds to the estimated length of the gap. For reasons of compatibility with the archives, any gap shorter than 10bp is represented by a sequence of 10 N's.
+
+**Stats**
+
+The *stats.txt* file.
+This file is a simple tab-delimited description of the nodes. The column names are pretty much self-explanatory. Note however that node lengths are given in k-mers. To obtain the length in nucleotides of each node you simply need to add k - 1, where k is the word-length used in velveth.
+The in and out columns correspond to the number of arcs on the 5' and 3' ends of the contig respectively.
+The coverages in columns short1_cov, short1_Ocov, short2_cov, and short2_Ocov are provided in k-mer coverage (cf. section 5.1 of the Velvet manual).
+Also, the difference between the cov and Ocov columns is the way these values are computed. In the first count, slightly divergent sequences are added to the coverage tally. However, in the second, stricter count, only the sequences which map perfectly onto the consensus sequence are taken into account.
+
+**LastGraph**
+
+The *LastGraph* file.
+This file describes in its entirety the graph produced by Velvet.
+
+**AMOS.afg**
+
+The *velvet_asm.afg* file.
+This file is mainly designed to be read by the open-source AMOS genome assembly package. Nonetheless, a number of programs are available to transform this kind of file into other assembly file formats (namely ACE, TIGR, Arachne and Celera). See http://amos.sourceforge.net/ for more information.
+The file describes all the contigs contained in the contigs.fa file (cf. section 4.2.1 of the Velvet manual).
+
+**Advanced options**
+
+The following velvetg options can be entered in the Advanced velvetg options field::
+
+        -scaffolding  yes|no             : scaffolding of contigs using paired end information (default: on)
+        -max_branch_length  integer      : maximum length in base pair of bubble (default: 100)
+        -max_divergence  floating-point  : maximum divergence rate between two branches in a bubble (default: 0.2)
+        -max_gap_count  integer          : maximum number of gaps allowed in the alignment of the two branches of a bubble (default: 3)
+        -min_pair_count  integer         : minimum number of paired end connections to justify the scaffolding of two long contigs (default: 10)
+        -max_coverage  floating point    : removal of high coverage nodes AFTER tour bus (default: no removal)
+        -long_mult_cutoff  int           : minimum number of long reads required to merge contigs (default: 2)
+
+
+
+**Hash Length**
+
+The hash length, also known as k-mer length, corresponds to the length, in base pairs, of the words being hashed. 
+
+The hash length is the length of the k-mers being entered in the hash table. Firstly, you must observe three technical constraints:
+
+1. It must be an odd number, to avoid palindromes. If you put in an even number, Velvet will just decrement it and proceed.
+2. It must be below or equal to MAXKMERLENGTH (a compile-time setting, cf. section 2.3.3 of the Velvet manual; 31 by default), because it is stored on 64 bits.
+3. It must be strictly less than the read length, otherwise you simply will not observe any overlaps between reads.
+
+Now you still have quite a lot of possibilities. As is often the case, it's a trade-off between specificity and sensitivity. Longer k-mers bring you more specificity (i.e. fewer spurious overlaps) but lower coverage (cf. below), so there's a sweet spot to be found with time and experience.
+We like to think in terms of "k-mer coverage", i.e. how many times a k-mer has been seen among the reads. The relation between k-mer coverage Ck and standard (nucleotide-wise) coverage C is Ck = C * (L - k + 1) / L, where k is your hash length and L your read length.
+Experience shows that this kmer coverage should be above 10 to start getting decent results. If Ck is above 20, you might be "wasting" coverage. Experience also shows that empirical tests with different values for k are not that costly to run! VelvetOptimiser automates these tests for you.
+
+
+**Velvetg options**
+
+
+
+**Input Files**
+
+Velvet works mainly with fasta and fastq formats. For paired-end reads, the assumption is that each read is next to its mate
+read. In other words, if the reads are indexed from 0, then reads 0 and 1 are paired, 2 and 3, 4 and 5, etc.
+
+Supported file formats are::
+
+  fasta
+  fastq 
+  eland
+  gerald
+
+Read categories are::
+
+  short (default)
+  shortPaired
+  short2 (same as short, but for a separate insert-size library - i.e. if you have two libraries of different lengths)
+  shortPaired2 (see above)
+  long (for Sanger, 454 or even reference sequences)
+  longPaired
+
+	</help>
+</tool>
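Note on the Hash Length help text: the k-mer coverage relation Ck = C * (L - k + 1) / L can be checked quickly before choosing a start and end hash length. The following is a small Python sketch of that arithmetic; the function name `kmer_coverage` and the example figures (30x coverage, 36 bp reads) are illustrative assumptions, not values taken from the tool.

```python
def kmer_coverage(nucleotide_coverage, read_length, k):
    """Return Ck = C * (L - k + 1) / L for hash length k."""
    if k > read_length:
        raise ValueError("hash length must not exceed the read length")
    return nucleotide_coverage * (read_length - k + 1) / float(read_length)


# Hypothetical library: 30x nucleotide coverage from 36 bp reads.
for k in (21, 25, 31):
    print("k=%d  Ck=%.2f" % (k, kmer_coverage(30, 36, k)))
```

With these assumed figures, k = 31 gives a k-mer coverage of 5, below the threshold of about 10 that the help text suggests, which illustrates why longer hash lengths need deeper sequencing or longer reads.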