annotate export_to_fastq/README @ 0:97792524cc9c default tip

Migrated tool version 0.1 from old tool shed archive to new tool shed repository
author louise
date Tue, 07 Jun 2011 17:21:49 -0400
Here is the class needed to handle the Solexa Export file type.

The tool and class were written by Nicolas Delhomme (delhomme@embl.de).
Released under the GNU GPL: http://www.opensource.org/licenses/gpl-3.0.html

The threshold parameter is commented out, but it can still be used: uncomment the commented-out code and comment out the current command tag in the XML file.

To apply this file as a patch, run:
patch <path_to_galaxy>/lib/galaxy/datatypes/tabular.py README
---

diff -r 50e249442c5a lib/galaxy/datatypes/tabular.py
--- a/lib/galaxy/datatypes/tabular.py Thu Apr 07 08:39:07 2011 -0400
+++ b/lib/galaxy/datatypes/tabular.py Tue May 24 14:16:12 2011 +0200
@@ -504,3 +504,95 @@
 
     def get_track_type( self ):
         return "FeatureTrack", {"data": "interval_index", "index": "summary_tree"}
+
+class Export( Tabular ):
+    file_ext = 'export'
+    def __init__(self, **kwd):
+        """Initialize export datatype"""
+        Tabular.__init__( self, **kwd )
+        self.column_names = ['MACHINE', 'RUN', 'LANE', 'TILE',
+                             'X', 'Y', 'MULTIPLEX', 'PAIRID',
+                             'READ', 'QUALITY', 'CHROMOSOME', 'CONTIG',
+                             'POSITION','STRAND','ALN_QUAL','CHASTITY'
+                             ]
+
+    def make_html_table( self, dataset, skipchars=[] ):
+        """Create HTML table, used for displaying peek"""
+        out = ['<table cellspacing="0" cellpadding="3">']
+        try:
+            # Generate column header
+            out.append( '<tr>' )
+            for i, name in enumerate( self.column_names ):
+                out.append( '<th>%s.%s</th>' % ( str( i+1 ), name ) )
+            # This data type requires at least 16 columns in the data
+            if dataset.metadata.columns - len( self.column_names ) > 0:
+                for i in range( len( self.column_names ), dataset.metadata.columns ):
+                    out.append( '<th>%s</th>' % str( i+1 ) )
+            out.append( '</tr>' )
+            out.append( self.make_html_peek_rows( dataset, skipchars=skipchars ) )
+            out.append( '</table>' )
+            out = "".join( out )
+        except Exception, exc:
+            out = "Can't create peek %s" % exc
+        return out
+
+    def set_meta( self, dataset, overwrite = True, **kwd ):
+
+        #we'll arbitrarily only use the first 100 data lines in the export file to calculate tabular attributes (column types)
+        #optional metadata values set in Tabular class will be 'None'
+        Tabular.set_meta( self, dataset, overwrite = overwrite, max_data_lines = 100 )
+
+    def sniff( self, filename ):
+        """
+        Determines whether the file is in Export format
+
+        A file in Export format consists of lines of tab-separated data.
+        It does not have any header
+
+        Rules for sniffing as True:
+            There must be 22 columns of data on each line
+            Columns 2 to 8 must be numbers
+            Column 22 should be either Y or N
+            We will only check that up to the first 5 alignments are correctly formatted.
+
+        """
+        try:
+            fh = open( filename )
+            count = 0
+            while True:
+                line = fh.readline()
+                line = line.strip()
+                if not line:
+                    break #EOF
+                if line:
+                    if line[0] != '@':
+                        linePieces = line.split('\t')
+                        if len(linePieces) != 22:
+                            return False
+                        try:
+                            check = int(linePieces[1])
+                            check = int(linePieces[2])
+                            check = int(linePieces[3])
+                            check = int(linePieces[4])
+                            check = int(linePieces[5])
+                            check = int(linePieces[6])
+                            check = int(linePieces[7])
+                            assert linePieces[21] in [ 'Y', 'N' ]
+                        except ValueError:
+                            return False
+                        count += 1
+                        if count == 5:
+                            return True
+            fh.close()
+            if count < 5 and count > 0:
+                return True
+        except:
+            pass
+        return False
+
+class BarcodeSet( Tabular ):
+    file_ext = 'bs'
+    column_names = ['SAMPLE', 'BARCODE']
+
+    def sniff( self, filename ):
+        return False

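The sniffing rules used by the Export class above can be tried outside Galaxy with a minimal standalone sketch. It is not part of the patch: it is written for Python 3, and the names looks_like_export, is_int and write_lines are hypothetical helpers introduced only for illustration. It assumes the 22-column layout that the sniff code in the patch enforces.

```python
import os
import tempfile

EXPECTED_COLUMNS = 22   # column count checked by sniff() in the patch
MAX_CHECKED_LINES = 5   # the patch inspects at most the first 5 data lines


def is_int(text):
    """Return True if `text` parses as an integer (hypothetical helper)."""
    try:
        int(text)
        return True
    except ValueError:
        return False


def looks_like_export(path):
    """Apply the same checks as the patch's sniff() to the file at `path`."""
    checked = 0
    with open(path) as handle:
        for raw in handle:
            line = raw.strip()
            if not line:
                break  # like the patch, stop at the first empty line
            if line.startswith('@'):
                continue  # lines starting with '@' are skipped, not validated
            fields = line.split('\t')
            if len(fields) != EXPECTED_COLUMNS:
                return False
            if not all(is_int(field) for field in fields[1:8]):
                return False  # columns 2 to 8 must be numbers
            if fields[-1] not in ('Y', 'N'):
                return False  # column 22 must be Y or N
            checked += 1
            if checked == MAX_CHECKED_LINES:
                break
    # At least one valid data line is required for a positive result.
    return checked > 0


def write_lines(lines):
    """Write `lines` to a temporary file and return its path (demo helper)."""
    handle = tempfile.NamedTemporaryFile('w', suffix='.export', delete=False)
    with handle:
        handle.write(''.join(line + '\n' for line in lines))
    return handle.name


if __name__ == '__main__':
    # Made-up record: 10 filled fields, 11 empty ones, then the Y/N flag.
    record = '\t'.join(['HWI', '1', '2', '3', '4', '5', '0', '1',
                        'ACGT', 'IIII'] + [''] * 11 + ['Y'])
    path = write_lines([record])
    print(looks_like_export(path))
    os.remove(path)
```

Unlike the patch, this sketch closes the file on every return path by using a with block; the accept and reject decisions are meant to match.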