annotate sra.py @ 0:9f74a22d2060 draft default tip

planemo upload for repository https://github.com/ARTbio/tools-artbio/tree/master/tools/sra-tools commit 70fadb7e8972b1db550d0e067584930ce1ec8673-dirty
author mvdbeek
date Wed, 04 Nov 2015 06:57:32 -0500
parents
children
"""
NCBI sra class
"""
import logging
import binascii
from galaxy.datatypes.data import nice_size
from galaxy.datatypes.binary import Binary

log = logging.getLogger(__name__)

class Sra(Binary):
    """ Sequence Read Archive (SRA) """
    file_ext = 'sra'

    def __init__( self, **kwd ):
        Binary.__init__( self, **kwd )

    def sniff( self, filename ):
        """ The first 8 bytes of any NCBI sra file are 'NCBI.sra', and the file is binary.
        For details about the format, see http://www.ncbi.nlm.nih.gov/books/n/helpsra/SRA_Overview_BK/#SRA_Overview_BK.4_SRA_Data_Structure
        """
        try:
            # Read the header in binary mode and close the file promptly.
            with open(filename, 'rb') as handle:
                header = handle.read(8)
            return binascii.hexlify(header) == binascii.hexlify(b'NCBI.sra')
        except Exception:
            # An unreadable file is treated as not being an sra file.
            return False

    def set_peek(self, dataset, is_multi_byte=False):
        if not dataset.dataset.purged:
            dataset.peek = 'Binary sra file'
            dataset.blurb = nice_size(dataset.get_size())
        else:
            dataset.peek = 'file does not exist'
            dataset.blurb = 'file purged from disk'

    def display_peek(self, dataset):
        try:
            return dataset.peek
        except Exception:
            # Fall back to a generic description if no peek has been set.
            return 'Binary sra file (%s)' % (nice_size(dataset.get_size()))

Binary.register_sniffable_binary_format('sra', 'sra', Sra)