changeset 9:2bda64d39931 draft

Uploaded v0.0.19, adds blastdbd and pssm-asn1 datatypes.
author peterjc
date Wed, 26 Nov 2014 06:55:48 -0500
parents de11e1a921c4
children 5482a8cd0f36
files README.rst blast.py datatypes_conf.xml
diffstat 3 files changed, 56 insertions(+), 22 deletions(-)
--- a/README.rst	Tue Jan 21 13:33:20 2014 -0500
+++ b/README.rst	Wed Nov 26 06:55:48 2014 -0500
@@ -1,10 +1,9 @@
 Galaxy datatypes for NCBI BLAST+ suite
 ======================================
 
-These Galaxy datatypes are copyright 2010-2013 by Peter Cock, The James Hutton
-Institute (formerly SCRI, Scottish Crop Research Institute), UK. All rights reserved.
-Contributions/revisions copyright 2012 Edward Kirton. All rights reserved.
-Contributions/revisions copyright 2013 Nicola Soranzo. All rights reserved.
+These Galaxy datatypes are copyright 2010-2014 by Peter Cock (The James Hutton
+Institute, UK) and additional contributors including Edward Kirton, Nicola
+Soranzo, and Bjoern Gruening.
 
 See the licence text below.
 
@@ -29,18 +28,24 @@
 ------- ----------------------------------------------------------------------
 v0.0.11 - Final revision as part of the Galaxy main repository, and the
           first release via the Tool Shed
-v0.0.13 - Uses blast.py instead of xml.py to define the datatypes
+v0.0.13 - Uses ``blast.py`` instead of ``xml.py`` to define the datatypes
 v0.0.14 - Includes datatypes for protein and nucleotide BLAST databases
-          (based on work by Edward Kirton)
+          (``blastdbp`` and ``blastdbn``, based on work by Edward Kirton)
 v0.0.15 - Fixes a MetadataElement bug and includes more of the optional
           BLAST database files (contribution from Nicola Soranzo)
 v0.0.16 - Adopt standard MIT License.
         - Use reStructuredText for this README file.
         - Development moved to GitHub, https://github.com/peterjc/galaxy_blast
         - Nucleotide database definition aware of MegaBLAST index superheader
-v0.0.17 - Add maskinfo-asn1 and maskinfo-asn1-binary sub-datatypes
+v0.0.17 - Add ``maskinfo-asn1`` and ``maskinfo-asn1-binary`` sub-datatypes
+          (contribution from Nicola Soranzo)
 v0.0.18 - Add retries to BLAST XML merge code.
         - Modify display_data method to allow unit tests to function.
+v0.0.19 - Add ``blastdbd`` datatype for BLAST protein domain databases, for use
+          with makeprofiledb and rpsblast (contribution from Bjoern Gruening).
+        - Add ``pssm-asn1`` datatype for Position Specific Scoring Matrices
+          (PSSMs) stored in NCBI's "scoremat" ASN.1 format (usually named
+          as *.smp), used as input files for makeprofiledb.
 ======= ======================================================================
 
 
@@ -54,23 +59,29 @@
 ===================
 
 Normally you would install this via the Galaxy ToolShed, which would move
-the provided blast.py file into a suitable location and process the
-datatypes_conf.xml entry to be combined with your local configuration.
+the provided ``blast.py`` file into a suitable location and process the
+``datatypes_conf.xml`` entries to be combined with your local configuration.
 
-However, if you really want to this should work for a manual install. Add
-the following lines to the datatypes_conf.xml file in the Galaxy main folder::
+However, if you really want to, this should work for a manual install. First
+update the ``datatypes_conf.xml`` file in the Galaxy main folder by inserting
+the contents of the ``<registration>`` and ``<sniffers>`` sections from the
+small ``datatypes_conf.xml`` file provided in the tar-ball.
+
+For the ``<registration>`` section you would add several ``<datatype ... />``
+lines, one per new datatype::
 
     <datatype extension="blastxml" type="galaxy.datatypes.blast:BlastXml" mimetype="application/xml" display_in_upload="true"/>
-    <datatype extension="blastdbn" type="galaxy.datatypes.blast:BlastNucDb" mimetype="text/html" display_in_upload="false"/>
-    <datatype extension="blastdbp" type="galaxy.datatypes.blast:BlastProtDb" mimetype="text/html" display_in_upload="false"/>
+    ...
 
-and later in the sniffer section::
+Similarly, some of the new datatypes have ``<sniffer ... />`` lines used to
+automatically recognise the datatype when uploaded into Galaxy::
 
     <sniffer type="galaxy.datatypes.blast:BlastXml"/>
+    ...
 
-Also create the file lib/galaxy/datatypes/blast.py by moving, copying or linking
-the blast.py file provided in this tar-ball.  Finally add 'import blast' near
-the start of file lib/galaxy/datatypes/registry.py (after the other import
+Also create the file ``lib/galaxy/datatypes/blast.py`` by moving, copying or linking
+the ``blast.py`` file provided in this tar-ball.  Finally add ``import blast`` near
+the start of file ``lib/galaxy/datatypes/registry.py`` (after the other import
 lines).
 
 
@@ -84,14 +95,14 @@
 Developers
 ==========
 
-BLAST+ datatypes and wrappers, and other tools were originally developed on the
+These BLAST+ datatypes and associated tools were originally developed on the
 following hg branch: http://bitbucket.org/peterjc/galaxy-central/src/tools
 
 As of July 2013, development is continuing on a dedicated GitHub repository:
 https://github.com/peterjc/galaxy_blast
 
 For making the "Galaxy Tool Shed" http://toolshed.g2.bx.psu.edu/ tarball I use
-the following command from the blast_datatypes  folder::
+the following command from the ``blast_datatypes`` folder::
 
     $ tar -czf blast_datatypes.tar.gz README.rst datatypes_conf.xml blast.py
 
@@ -103,7 +114,7 @@
     blast.py
 
 For development, rather than having a local ToolShed running, I currently
-use a symlink from lib/galaxy/datatypes/blast.py to the actual file as
+use a symlink from ``lib/galaxy/datatypes/blast.py`` to the actual file as
 described above.
 
 
--- a/blast.py	Tue Jan 21 13:33:20 2014 -0500
+++ b/blast.py	Wed Nov 26 06:55:48 2014 -0500
@@ -3,7 +3,7 @@
 """
 
 from galaxy.datatypes.data import get_file_peek
-from galaxy.datatypes.data import Text, Data
+from galaxy.datatypes.data import Text, Data, GenericAsn1
 from galaxy.datatypes.xml import GenericXml
 from galaxy.datatypes.metadata import MetadataElement
 
@@ -180,8 +180,10 @@
             title = "This is a nucleotide BLAST database"
         elif self.file_ext =="blastdbp":
             title = "This is a protein BLAST database"
+        elif self.file_ext =="blastdbd":
+            title = "This is a domain BLAST database"
         else:
-            #Error?                                                                                                                                                                     
+            #Error?
             title = "This is a BLAST database."
         msg = ""
         try:
@@ -259,3 +261,22 @@
 #        self.add_composite_file('blastdb.pab', is_binary=True, optional=True)
 #        self.add_composite_file('blastdb.pac', is_binary=True, optional=True)
 # The last 3 lines should be repeated for each WriteDB column, with filename extensions like ('.pba', '.pbb', '.pbc'), ('.pca', '.pcb', '.pcc'), etc.
+
+
+class BlastDomainDb( _BlastDb, Data ):
+    """Class for domain BLAST database files."""
+    file_ext = 'blastdbd'
+    allow_datatype_change = False
+    composite_type = 'basic'
+
+    def __init__(self, **kwd):
+        Data.__init__(self, **kwd)
+        self.add_composite_file('blastdb.phr', is_binary=True)
+        self.add_composite_file('blastdb.pin', is_binary=True)
+        self.add_composite_file('blastdb.psq', is_binary=True)
+        self.add_composite_file('blastdb.freq', is_binary=True, optional=True)
+        self.add_composite_file('blastdb.loo', is_binary=True, optional=True)
+        self.add_composite_file('blastdb.psd', is_binary=True, optional=True)
+        self.add_composite_file('blastdb.psi', is_binary=True, optional=True)
+        self.add_composite_file('blastdb.rps', is_binary=True, optional=True)
+        self.add_composite_file('blastdb.aux', is_binary=True, optional=True)
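
Reviewer note: the new BlastDomainDb class declares three required composite files and six optional ones. The following is a minimal sketch, not part of blast.py or Galaxy, of how that file list could be checked against a folder on disk. The helper name `check_domain_db` and its `basename` parameter are hypothetical, and the sketch assumes Python 3 rather than the Python 2 the 2014 code targeted.

```python
from pathlib import Path

# Suffixes copied from BlastDomainDb.__init__ in the hunk above.
REQUIRED = ("phr", "pin", "psq")
OPTIONAL = ("freq", "loo", "psd", "psi", "rps", "aux")


def check_domain_db(directory, basename="blastdb"):
    """Return (missing_required, present_optional) for a composite folder.

    Hypothetical helper for illustration only; Galaxy does not provide it.
    """
    folder = Path(directory)
    # Required files that are absent from the folder.
    missing = [ext for ext in REQUIRED
               if not (folder / f"{basename}.{ext}").is_file()]
    # Optional files that happen to be present.
    present = [ext for ext in OPTIONAL
               if (folder / f"{basename}.{ext}").is_file()]
    return missing, present
```

An empty first list would mean the folder holds every file the datatype marks as mandatory. The second list only reports which optional files were found.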
--- a/datatypes_conf.xml	Tue Jan 21 13:33:20 2014 -0500
+++ b/datatypes_conf.xml	Wed Nov 26 06:55:48 2014 -0500
@@ -7,8 +7,10 @@
         <datatype extension="blastxml" type="galaxy.datatypes.blast:BlastXml" mimetype="application/xml" display_in_upload="true"/>
         <datatype extension="blastdbn" type="galaxy.datatypes.blast:BlastNucDb" mimetype="text/html" display_in_upload="false"/>
         <datatype extension="blastdbp" type="galaxy.datatypes.blast:BlastProtDb" mimetype="text/html" display_in_upload="false"/>
+        <datatype extension="blastdbd" type="galaxy.datatypes.blast:BlastDomainDb" mimetype="text/html" display_in_upload="false"/>
         <datatype extension="maskinfo-asn1" type="galaxy.datatypes.data:GenericAsn1" mimetype="text/plain" subclass="True" display_in_upload="true" />
         <datatype extension="maskinfo-asn1-binary" type="galaxy.datatypes.binary:GenericAsn1Binary" mimetype="application/octet-stream" subclass="True" display_in_upload="true" />
+        <datatype extension="pssm-asn1" type="galaxy.datatypes.data:GenericAsn1" mimetype="text/plain" subclass="True" display_in_upload="true" />
     </registration>
     <sniffers>
         <sniffer type="galaxy.datatypes.blast:BlastXml"/>
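
Reviewer note: for a manual install it can help to confirm that the new entries were actually merged into the local configuration. The sketch below parses a registration fragment with the standard library and maps each extension to its type. The two `<datatype>` lines are copied from the hunk above. The enclosing `<datatypes>` root element is an assumption, since it is not visible in this diff, and `registered_types` is a hypothetical helper name.

```python
import xml.etree.ElementTree as ET

# The two <datatype> entries added by this changeset. The <datatypes>
# root element is assumed, because the diff does not show it.
CONF = """<datatypes>
    <registration>
        <datatype extension="blastdbd" type="galaxy.datatypes.blast:BlastDomainDb" mimetype="text/html" display_in_upload="false"/>
        <datatype extension="pssm-asn1" type="galaxy.datatypes.data:GenericAsn1" mimetype="text/plain" subclass="True" display_in_upload="true" />
    </registration>
</datatypes>"""


def registered_types(xml_text):
    """Map each datatype extension to its module:Class string."""
    root = ET.fromstring(xml_text)
    return {dt.get("extension"): dt.get("type")
            for dt in root.iter("datatype")}


types = registered_types(CONF)
```

Running the same function on the text of a real `datatypes_conf.xml` and looking up `blastdbd` and `pssm-asn1` would show whether both registrations are present.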