annotate VelvetOptimiser-2.1.7_modified/VelvetOpt/Utils.pm @ 0:50ae1360fbbe default tip

Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
author konradpaszkiewicz
date Tue, 07 Jun 2011 18:07:56 -0400
parents
children
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
rev   line source
0
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
1 #
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
2 # VelvetOpt::Utils.pm
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
3 #
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
4 # Copyright 2008,2009,2010 Simon Gladman <simon.gladman@csiro.au>
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
5 #
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
6 # This program is free software; you can redistribute it and/or modify
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
7 # it under the terms of the GNU General Public License as published by
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
8 # the Free Software Foundation; either version 2 of the License, or
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
9 # (at your option) any later version.
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
10 #
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
11 # This program is distributed in the hope that it will be useful,
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
12 # but WITHOUT ANY WARRANTY; without even the implied warranty of
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
13 # MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
14 # GNU General Public License for more details.
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
15 #
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
16 # You should have received a copy of the GNU General Public License
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
17 # along with this program; if not, write to the Free Software
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
18 # Foundation, Inc., 51 Franklin Street, Fifth Floor, Boston,
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
19 # MA 02110-1301, USA.
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
20
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
21 # Version 2.1.3
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
22
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
23 # Changes for Version 2.0.1
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
24 # Added Mikael Brandstrom Durling's numCpus and freeMem for the Mac.
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
25 #
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
26 # Changes for Version 2.1.0
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
27 # Fixed bug in estExpCov so it now correctly uses all short read categories not just the first two.
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
28 #
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
29 # Changes for Version 2.1.2
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
30 # Fixed bug in estExpCov so it now won't take columns with "N/A" or "Inf" into account
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
31 #
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
32 # Changes for Version 2.1.3
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
33 # Changed the minimum contig size to use for estimating expected coverage to 3*kmer size -1 and set the minimum coverage to 2 instead of 0.
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
34 # This should get rid of exp_covs of 1 when it should be very high for assembling reads that weren't ampped to a reference using one of the standard read mapping programs
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
35
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
36
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
37 package VelvetOpt::Utils;
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
38
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
39 use strict;
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
40 use lib "/usr/local/lib/perl5/site_perl/5.8.8";
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
41 use warnings;
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
42 use POSIX qw(ceil floor);
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
43 use Carp;
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
44 use List::Util qw(max);
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
45 use Bio::SeqIO;
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
46
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
47 # num_cpu
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
48 # It returns the number of cpus present in the system if linux.
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
49 # If it is MAC then it returns the number of cores present.
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
50 # If the OS is not linux or Mac then it returns 1.
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
51 # Written by Torsten Seemann 2009 (linux) and Mikael Brandstrom Durling 2009 (Mac).
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
52
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
53 sub num_cpu {
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
54 if ( $^O =~ m/linux/i ) {
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
55 my ($num) = qx(grep -c ^processor /proc/cpuinfo);
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
56 chomp $num;
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
57 return $num if $num =~ m/^\d+/;
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
58 }
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
59 elsif( $^O =~ m/darwin/i){
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
60 my ($num) = qx(system_profiler SPHardwareDataType | grep Cores);
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
61 $num =~ /.*Cores: (\d+)/;
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
62 $num =$1;
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
63 return $num;
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
64 }
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
65 return 1;
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
66 }
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
67
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
68 # free_mem
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
69 # Returns the current amount of free memory
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
70 # Mac Section written by Mikael Brandstrom Durling 2009 (Mac).
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
71
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
72 sub free_mem {
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
73 if( $^O =~ m/linux/i ) {
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
74 my $x = `free | grep '^Mem:' | sed 's/ */~/g' | cut -d '~' -f 4,7`;
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
75 my @tmp = split "~", $x;
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
76 my $total = $tmp[0] + $tmp[1];
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
77 my $totalGB = $total / 1024 / 1024;
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
78 return $totalGB;
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
79 }
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
80 elsif( $^O =~ m/darwin/i){
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
81 my ($tmp) = qx(vm_stat | grep size);
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
82 $tmp =~ /.*size of (\d+) bytes.*/;
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
83 my $page_size = $1;
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
84 ($tmp) = qx(vm_stat | grep free);
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
85 $tmp =~ /[^0-9]+(\d+).*/;
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
86 my $free_pages = $1;
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
87 my $totalGB = ($free_pages * $page_size) / 1024 / 1024 / 1024;
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
88 return $totalGB;
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
89 }
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
90 }
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
91
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
92 # estExpCov
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
93 # it returns the expected coverage of short reads from an assembly by
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
94 # performing a math mode on the stats.txt file supplied.. It looks at
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
95 # all the short_cov? columns.. Uses minimum contig length and minimum coverage.
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
96 # needs the stats.txt file path and name, and the k-value used in the assembly.
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
97 # Original algorithm by Torsten Seemann 2009 under the GPL.
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
98 # Adapted by Simon Gladman 2009.
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
99 # It does a weighted mode...
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
100
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
101 sub estExpCov {
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
102 use List::Util qw(max);
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
103 my $file = shift;
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
104 my $kmer = shift;
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
105 my $minlen = 3 * $kmer - 1;
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
106 my $mincov = 2;
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
107 my $fh;
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
108 unless ( open IN, $file ) {
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
109 croak "Unable to open $file for exp_cov determination.\n";
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
110 }
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
111 my @cov;
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
112 while (<IN>) {
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
113 chomp;
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
114 my @x = split m/\t/;
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
115 my $len = scalar @x;
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
116 next unless @x >= 7;
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
117 next unless $x[1] =~ m/^\d+$/;
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
118 next unless $x[1] >= $minlen;
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
119
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
120 #add all the short_cov columns..
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
121 my $cov = 0;
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
122 for(my $i = 5; $i < $len; $i += 2){
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
123 if($x[$i] =~ /\d/){
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
124 $cov += $x[$i];
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
125 }
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
126 }
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
127 next unless $cov > $mincov;
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
128 push @cov, ( ( int($cov) ) x $x[1] );
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
129 }
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
130
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
131 my %freq_of;
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
132 map { $freq_of{$_}++ } @cov;
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
133 my $mode = 0;
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
134 $freq_of{$mode} = 0; # sentinel
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
135 for my $x ( keys %freq_of ) {
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
136 $mode = $x if $freq_of{$x} > $freq_of{$mode};
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
137 }
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
138 return $mode;
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
139 }
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
140
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
141 # estVelvetMemUse
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
142 # returns the estimated memory usage for velvet in GB
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
143
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
144 sub estVelvetMemUse {
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
145 my ($readsize, $genomesize, $numreads, $k) = @_;
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
146 my $velvetgmem = -109635 + 18977*$readsize + 86326*$genomesize + 233353*$numreads - 51092*$k;
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
147 my $out = ($velvetgmem/1024) / 1024;
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
148 return $out;
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
149 }
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
150
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
151 # getReadSizeNum
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
152 # returns the number of reads and average size in the short and shortPaired categories...
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
153
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
154 sub getReadSizeNum {
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
155 my $f = shift;
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
156 my %reads;
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
157 my $num = 0;
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
158 my $currentfiletype = "fasta";
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
159 #first pull apart the velveth string and get the short and shortpaired filenames...
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
160 my @l = split /\s+/, $f;
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
161 my $i = 0;
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
162 foreach (@l){
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
163 if(/^-/){
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
164 if(/^-fasta/){
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
165 $currentfiletype = "fasta";
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
166 }
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
167 elsif(/^-fastq/){
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
168 $currentfiletype = "fastq";
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
169 }
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
170 elsif(/(-eland)|(-gerald)|(-fasta.gz)|(-fastq.gz)/) {
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
171 croak "Cannot estimate memory usage from file types other than fasta or fastq..\n";
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
172 }
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
173 }
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
174 elsif(-r $_){
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
175 my $file = $_;
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
176 if($currentfiletype eq "fasta"){
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
177 my $x = `grep -c "^>" $file`;
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
178 chomp($x);
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
179 $num += $x;
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
180 my $l = &getReadLength($file, 'Fasta');
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
181 $reads{$l} += $x;
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
182 print STDERR "File: $file has $x reads of length $l\n";
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
183 }
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
184 else {
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
185 my $x = `grep -c "^@" $file`;
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
186 chomp($x);
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
187 $num += $x;
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
188 my $l = &getReadLength($file, 'Fastq');
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
189 $reads{$l} += $x;
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
190 print STDERR "File: $file has $x reads of length $l\n";
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
191 }
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
192 }
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
193 $i ++;
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
194 }
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
195 my $totlength = 0;
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
196 foreach my $k (keys %reads){
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
197 $totlength += ($reads{$k} * $k);
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
198 }
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
199
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
200
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
201 my @results;
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
202 push @results, floor($totlength/$num);
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
203 push @results, ($num/1000000);
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
204 printf STDERR "Total reads: %.1f million. Avg length: %.1f\n",($num/1000000), ($totlength/$num);
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
205 return @results;
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
206 }
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
207
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
208 # getReadLength - returns the length of the first read in a file of type fasta or fastq..
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
209 #
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
210 sub getReadLength {
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
211 my ($f, $t) = @_;
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
212 my $sio = Bio::SeqIO->new(-file => $f, -format => $t);
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
213 my $s = $sio->next_seq() or croak "Something went bad while reading file $f!\n";
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
214 return $s->length;
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
215 }
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
216
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
217 return 1;
50ae1360fbbe Migrated tool version 1.0.0 from old tool shed archive to new tool shed repository
konradpaszkiewicz
parents:
diff changeset
218