Mercurial > repos > devteam > blast_datatypes
changeset 9:2bda64d39931 draft
Uploaded v0.0.19, adds blastdbp and pssm-asn1 datatypes.
author | peterjc |
---|---|
date | Wed, 26 Nov 2014 06:55:48 -0500 |
parents | de11e1a921c4 |
children | 5482a8cd0f36 |
files | README.rst blast.py datatypes_conf.xml |
diffstat | 3 files changed, 56 insertions(+), 22 deletions(-) |
--- a/README.rst Tue Jan 21 13:33:20 2014 -0500
+++ b/README.rst Wed Nov 26 06:55:48 2014 -0500
@@ -1,10 +1,9 @@
 Galaxy datatypes for NCBI BLAST+ suite
 ======================================
 
-These Galaxy datatypes are copyright 2010-2013 by Peter Cock, The James Hutton
-Institute (formerly SCRI, Scottish Crop Research Institute), UK. All rights reserved.
-Contributions/revisions copyright 2012 Edward Kirton. All rights reserved.
-Contributions/revisions copyright 2013 Nicola Soranzo. All rights reserved.
+These Galaxy datatypes are copyright 2010-2014 by Peter Cock (The James Hutton
+Institute, UK) and additional contributors including Edward Kirton, Nicola
+Soranzo, and Bjoern Gruening.
 
 See the licence text below.
 
@@ -29,18 +28,24 @@
 ------- ----------------------------------------------------------------------
 v0.0.11 - Final revision as part of the Galaxy main repository, and the
           first release via the Tool Shed
-v0.0.13 - Uses blast.py instead of xml.py to define the datatypes
+v0.0.13 - Uses ``blast.py`` instead of ``xml.py`` to define the datatypes
 v0.0.14 - Includes datatypes for protein and nucleotide BLAST databases
-          (based on work by Edward Kirton)
+          (``blastdbp`` and ``blastdbn``, based on work by Edward Kirton)
 v0.0.15 - Fixes a MetadataElement bug and includes more of the optional
           BLAST database files (contribution from Nicola Soranzo)
 v0.0.16 - Adopt standard MIT License.
         - Use reStructuredText for this README file.
         - Development moved to GitHub, https://github.com/peterjc/galaxy_blast
         - Nucleotide database definition aware of MegaBLAST index superheader
-v0.0.17 - Add maskinfo-asn1 and maskinfo-asn1-binary sub-datatypes
+v0.0.17 - Add ``maskinfo-asn1`` and ``maskinfo-asn1-binary`` sub-datatypes
+          (contribution from Nicola Soranzo)
 v0.0.18 - Add retries to BLAST XML merge code.
         - Modify display_data method to allow unit tests to function.
+v0.0.19 - Add ``blastdbp`` datatype for BLAST protein domain databases, for use
+          with makeprofiledb and rpsblast (contribution from Bjoern Gruening).
+        - Add ``pssm-asn1`` datatype for Position Specific Scoring Matrices
+          (PSSMs) stored in NCBI's "scoremat" ASN.1 format (usually named
+          as *.smp), used as input files for makeprofiledb.
 ======= ======================================================================
 
 
@@ -54,23 +59,29 @@
 ===================
 
 Normally you would install this via the Galaxy ToolShed, which would move
-the provided blast.py file into a suitable location and process the
-datatypes_conf.xml entry to be combined with your local configuration.
+the provided ``blast.py`` file into a suitable location and process the
+``datatypes_conf.xml`` entries to be combined with your local configuration.
 
-However, if you really want to this should work for a manual install. Add
-the following lines to the datatypes_conf.xml file in the Galaxy main folder::
+However, if you really want to this should work for a manual install. First
+update the ``datatypes_conf.xml`` file in the Galaxy main folder by inserting
+the contents of the ``<registration>`` and ``<sniffers>`` sections from the
+small ``datatypes_conf.xml`` file provided in the tar-ball.
+
+For the ``<registration>`` section you would add several ``<datatype ... />``
+lines, one per new datatype::
 
     <datatype extension="blastxml" type="galaxy.datatypes.blast:BlastXml" mimetype="application/xml" display_in_upload="true"/>
-    <datatype extension="blastdbn" type="galaxy.datatypes.blast:BlastNucDb" mimetype="text/html" display_in_upload="false"/>
-    <datatype extension="blastdbp" type="galaxy.datatypes.blast:BlastProtDb" mimetype="text/html" display_in_upload="false"/>
+    ...
 
-and later in the sniffer section::
+Similarly, some of the new dataypes have ``<sniffer ... />`` lines used to
+automatically recognise the datatype when uploaded into Galaxy::
 
     <sniffer type="galaxy.datatypes.blast:BlastXml"/>
+    ...
 
-Also create the file lib/galaxy/datatypes/blast.py by moving, copying or linking
-the blast.py file provided in this tar-ball. Finally add 'import blast' near
-the start of file lib/galaxy/datatypes/registry.py (after the other import
+Also create the file ``lib/galaxy/datatypes/blast.py`` by moving, copying or linking
+the ``blast.py`` file provided in this tar-ball. Finally add ``import blast`` near
+the start of file ``lib/galaxy/datatypes/registry.py`` (after the other import
 lines).
 
 
@@ -84,14 +95,14 @@
 Developers
 ==========
 
-BLAST+ datatypes and wrappers, and other tools were originally developed on the
+These BLAST+ datatypes and associated tools were originally developed on the
 following hg branch:
 http://bitbucket.org/peterjc/galaxy-central/src/tools
 
 As of July 2013, development is continuing on a dedicated GitHub repository:
 https://github.com/peterjc/galaxy_blast
 
 For making the "Galaxy Tool Shed" http://toolshed.g2.bx.psu.edu/ tarball I use
-the following command from the blast_datatypes folder::
+the following command from the ``blast_datatypes`` folder::
 
     $ tar -czf blast_datatypes.tar.gz README.rst datatypes_conf.xml blast.py
 
@@ -103,7 +114,7 @@
     blast.py
 
 For development, rather than having a local ToolShed running, I currently
-use a symlink from lib/galaxy/datatypes/blast.py to the actual file as
+use a symlink from ``lib/galaxy/datatypes/blast.py`` to the actual file as
 described above.
 
 
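The manual-install step in the README hunk above (copying the `<datatype ... />` and `<sniffer ... />` entries into Galaxy's main `datatypes_conf.xml`) could be scripted roughly as follows. This is a minimal sketch and not part of the changeset: the function name `merge_datatypes_conf` is hypothetical, and it assumes both files have a single root element containing `<registration>` and `<sniffers>` children, as the fragment in this diff suggests.

```python
import xml.etree.ElementTree as ET


def merge_datatypes_conf(main_xml, extra_xml):
    """Return main_xml with the <datatype> and <sniffer> entries of extra_xml added.

    Hypothetical helper: entries are matched on the 'extension' attribute
    (datatypes) or the 'type' attribute (sniffers) so nothing is added twice.
    """
    main_root = ET.fromstring(main_xml)
    extra_root = ET.fromstring(extra_xml)
    sections = (
        ("registration", "datatype", "extension"),
        ("sniffers", "sniffer", "type"),
    )
    for section, child_tag, key in sections:
        source = extra_root.find(section)
        if source is None:
            continue  # the small file has nothing to contribute here
        target = main_root.find(section)
        if target is None:
            target = ET.SubElement(main_root, section)
        existing = {el.get(key) for el in target.findall(child_tag)}
        for el in source.findall(child_tag):
            if el.get(key) not in existing:
                target.append(el)
                existing.add(el.get(key))
    return ET.tostring(main_root, encoding="unicode")
```

Comments and formatting in the main file are not preserved by this approach, so editing by hand as the README describes may still be preferable for a heavily customised configuration.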
--- a/blast.py Tue Jan 21 13:33:20 2014 -0500
+++ b/blast.py Wed Nov 26 06:55:48 2014 -0500
@@ -3,7 +3,7 @@
 """
 
 from galaxy.datatypes.data import get_file_peek
-from galaxy.datatypes.data import Text, Data
+from galaxy.datatypes.data import Text, Data, GenericAsn1
 from galaxy.datatypes.xml import GenericXml
 from galaxy.datatypes.metadata import MetadataElement
 
@@ -180,8 +180,10 @@
             title = "This is a nucleotide BLAST database"
         elif self.file_ext =="blastdbp":
             title = "This is a protein BLAST database"
+        elif self.file_ext =="blastdbd":
+            title = "This is a domain BLAST database"
         else:
-            #Error?
+            #Error?
             title = "This is a BLAST database."
         msg = ""
         try:
@@ -259,3 +261,22 @@
 # self.add_composite_file('blastdb.pab', is_binary=True, optional=True)
 # self.add_composite_file('blastdb.pac', is_binary=True, optional=True)
 # The last 3 lines should be repeated for each WriteDB column, with filename extensions like ('.pba', '.pbb', '.pbc'), ('.pca', '.pcb', '.pcc'), etc.
+
+
+class BlastDomainDb( _BlastDb, Data ):
+    """Class for domain BLAST database files."""
+    file_ext = 'blastdbd'
+    allow_datatype_change = False
+    composite_type = 'basic'
+
+    def __init__(self, **kwd):
+        Data.__init__(self, **kwd)
+        self.add_composite_file('blastdb.phr', is_binary=True)
+        self.add_composite_file('blastdb.pin', is_binary=True)
+        self.add_composite_file('blastdb.psq', is_binary=True)
+        self.add_composite_file('blastdb.freq', is_binary=True, optional=True)
+        self.add_composite_file('blastdb.loo', is_binary=True, optional=True)
+        self.add_composite_file('blastdb.psd', is_binary=True, optional=True)
+        self.add_composite_file('blastdb.psi', is_binary=True, optional=True)
+        self.add_composite_file('blastdb.rps', is_binary=True, optional=True)
+        self.add_composite_file('blastdb.aux', is_binary=True, optional=True)
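The new `BlastDomainDb` class registers three required composite files and six optional ones. As an illustration only, the same split can be expressed as a standalone check on a list of filenames. The helper `check_domain_db` and the two tuples are hypothetical names, not Galaxy API; the filenames are copied from the `add_composite_file` calls in the hunk above.

```python
# Filenames taken from the add_composite_file() calls in BlastDomainDb.
REQUIRED_FILES = ("blastdb.phr", "blastdb.pin", "blastdb.psq")
OPTIONAL_FILES = (
    "blastdb.freq",
    "blastdb.loo",
    "blastdb.psd",
    "blastdb.psi",
    "blastdb.rps",
    "blastdb.aux",
)


def check_domain_db(filenames):
    """Return (missing_required, optional_present) for a list of filenames.

    Hypothetical helper, independent of Galaxy: it only compares names and
    does not inspect file contents.
    """
    present = set(filenames)
    missing_required = [name for name in REQUIRED_FILES if name not in present]
    optional_present = [name for name in OPTIONAL_FILES if name in present]
    return missing_required, optional_present
```

For example, `check_domain_db(["blastdb.phr", "blastdb.pin", "blastdb.rps"])` reports `blastdb.psq` as missing and `blastdb.rps` as an optional file that is present.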
--- a/datatypes_conf.xml Tue Jan 21 13:33:20 2014 -0500
+++ b/datatypes_conf.xml Wed Nov 26 06:55:48 2014 -0500
@@ -7,8 +7,10 @@
         <datatype extension="blastxml" type="galaxy.datatypes.blast:BlastXml" mimetype="application/xml" display_in_upload="true"/>
         <datatype extension="blastdbn" type="galaxy.datatypes.blast:BlastNucDb" mimetype="text/html" display_in_upload="false"/>
         <datatype extension="blastdbp" type="galaxy.datatypes.blast:BlastProtDb" mimetype="text/html" display_in_upload="false"/>
+        <datatype extension="blastdbd" type="galaxy.datatypes.blast:BlastDomainDb" mimetype="text/html" display_in_upload="false"/>
         <datatype extension="maskinfo-asn1" type="galaxy.datatypes.data:GenericAsn1" mimetype="text/plain" subclass="True" display_in_upload="true" />
         <datatype extension="maskinfo-asn1-binary" type="galaxy.datatypes.binary:GenericAsn1Binary" mimetype="application/octet-stream" subclass="True" display_in_upload="true" />
+        <datatype extension="pssm-asn1" type="galaxy.datatypes.data:GenericAsn1" mimetype="text/plain" subclass="True" display_in_upload="true" />
     </registration>
     <sniffers>
         <sniffer type="galaxy.datatypes.blast:BlastXml"/>