changeset 9:887adf823bc0 draft
v0.0.10 - Python 3 compatibility etc (overdue upload)
| author | peterjc | 
|---|---|
| date | Tue, 06 Dec 2022 16:26:16 +0000 | 
| parents | e23b621eb7bb | 
| children | 8664c4c94764 | 
| files | tools/blast2go/README.rst tools/blast2go/blast2go.py tools/blast2go/blast2go.xml tools/blast2go/massage_xml_for_blast2go.py tools/blast2go/repository_dependencies.xml tools/blast2go/tool_dependencies.xml | 
| diffstat | 6 files changed, 93 insertions(+), 87 deletions(-) | 
--- a/tools/blast2go/README.rst Thu Mar 26 11:15:22 2015 -0400
+++ b/tools/blast2go/README.rst Tue Dec 06 16:26:16 2022 +0000
@@ -1,7 +1,7 @@
 Galaxy wrapper for Blast2GO for pipelines, b2g4pipe
 ===================================================
 
-This wrapper is copyright 2011-2014 by Peter Cock, The James Hutton Institute
+This wrapper is copyright 2011-2015 by Peter Cock, The James Hutton Institute
 (formerly SCRI, Scottish Crop Research Institute), UK. All rights reserved.
 See the licence text below (MIT licence).
 
@@ -16,8 +16,10 @@
 
 http://toolshed.g2.bx.psu.edu/view/peterjc/blast2go
 
-References
-==========
+Citation
+========
+
+Please cite the following papers:
 
 Peter Cock, Bjoern Gruening, Konrad Paszkiewicz and Leighton Pritchard (2013).
 Galaxy tools and workflows for sequence analysis with applications
@@ -46,8 +48,8 @@
 ======================
 
 Installation via the Galaxy Tool Shed should take care of the Galaxy side of
-things, including the dependency on 'blast_datatypes' which defines the
-'blastxml' file format. However, you will also probably need to configure
+things, including the dependency on ``blast_datatypes`` which defines the
+``blastxml`` file format. However, you will also probably need to configure
 the Blast2GO property file(s), for example if you have a local Blast2GO
 database (which we recommend for speed).
 
@@ -62,30 +64,30 @@
 
 * http://www.blast2go.com/data/blast2go/b2g4pipe_v2.5.zip
 
-You can change the path by setting the B2G4PIPE environement variable to
+You can change the path by setting the ``$B2G4PIPE`` environment variable to
 the desired folder, but by default the script looks for the JAR file here::
 
     /opt/b2g4pipe_v2.5/blast2go.jar
 
-To install the wrapper manually, first install 'blast_datatypes', then
+To install the wrapper manually, first install ``blast_datatypes``, then
 copy or move the following files under the Galaxy tools folder, e.g. in a
-tools/blast2go/ folder:
+``tools/blast2go/`` folder:
 
-* blast2go.xml (the Galaxy tool definition)
-* blast2go.py (the Python wrapper script)
-* massage_xml_for_blast2go.py (Python XML reformatting script)
-* README.rst (this file)
+- ``blast2go.xml`` (the Galaxy tool definition)
+- ``blast2go.py`` (the Python wrapper script)
+- ``massage_xml_for_blast2go.py`` (Python BLAST XML reformatting script)
+- ``README.rst`` (this file)
 
 For a manual installation of the wrapper you will also need to modify the
-tools_conf.xml file to tell Galaxy to offer the tool. We suggest putting
+``tools_conf.xml`` file to tell Galaxy to offer the tool. We suggest putting
 it next to the NCBI BLAST+ wrappers. Just add the line::
 
   <tool file="blast2go/blast2go.xml" />
 
-If you wish to run the unit tests, also add this to tools_conf.xml.sample
-and move/copy the test-data files under Galaxy's test-data folder. Then::
+If you wish to run the unit tests, also move/copy the ``test-data/`` files
+under Galaxy's ``test-data/`` folder. Then::
 
-    $ ./run_functional_tests.sh -id blast2go
+    $ ./run_tests.sh -id blast2go
 
 
 Configuration
@@ -162,6 +164,8 @@
           of the Blast2GO command line tool. For now b2g4pipe v2.5 is still
           available as a free download.
         - Tool definition now embeds citation information.
+v0.0.10 - Reorder XML elements (internal change only).
+        - Planemo for Tool Shed upload (``.shed.yml``, internal change only).
 ======= ======================================================================
 
 
@@ -175,23 +179,33 @@
 As of September 2013, development is continuing on a dedicated GitHub
 repository: https://github.com/peterjc/galaxy_blast
 
-For making the "Galaxy Tool Shed" http://toolshed.g2.bx.psu.edu/ tarball I use
-the following command from the Galaxy root folder::
+For pushing a release to the test or main "Galaxy Tool Shed", use the following
+Planemo commands (which requires you have set your Tool Shed access details in
+``~/.planemo.yml`` and that you have access rights on the Tool Shed)::
 
-    $ tar -czf blast2go.tar.gz tools/blast2go/README.rst tools/blast2go/blast2go.xml tools/blast2go/blast2go.py tools/blast2go/massage_xml_for_blast2go.py tools/blast2go/repository_dependencies.xml tools/blast2go/tool_dependencies.xml tool-data/blast2go.loc.sample test-data/blastp_sample.xml test-data/blastp_sample.blast2go.tabular
+    $ planemo shed_update -t testtoolshed --check_diff ~/repositories/galaxy_blast/tools/blast2go/
+    ...
+
+or::
+
+    $ planemo shed_update -t toolshed --check_diff ~/repositories/galaxy_blast/tools/blast2go/
+    ...
 
-Check this worked::
+To just build and check the tar ball, use::
 
-    $ tar -tzf blast2go.tar.gz
+    $ planemo shed_upload --tar_only ~/repositories/galaxy_blast/tools/blast2go/
+    ...
+    $ tar -tzf shed_upload.tar.gz
+    test-data/blastp_sample.blast2go.tabular
+    test-data/blastp_sample.xml
+    tool-data/blast2go.loc.sample
     tools/blast2go/README.rst
+    tools/blast2go/blast2go.py
     tools/blast2go/blast2go.xml
-    tools/blast2go/blast2go.py
     tools/blast2go/massage_xml_for_blast2go.py
     tools/blast2go/repository_dependencies.xml
     tools/blast2go/tool_dependencies.xml
-    tool-data/blast2go.loc.sample
-    test-data/blastp_sample.xml
-    test-data/blastp_sample.blast2go.tabular
+
 
 
 Licence (MIT)
--- a/tools/blast2go/blast2go.py Thu Mar 26 11:15:22 2015 -0400
+++ b/tools/blast2go/blast2go.py Tue Dec 06 16:26:16 2022 +0000
@@ -30,57 +30,52 @@
 import os
 import subprocess
 
-#You may need to edit this to match your local setup,
+# You may need to edit this to match your local setup,
 blast2go_dir = os.environ.get("B2G4PIPE", "/opt/b2g4pipe_v2.5/")
 blast2go_jar = os.path.join(blast2go_dir, "blast2go.jar")
 
-def stop_err(msg, error_level=1):
-    """Print error message to stdout and quit with given error level."""
-    sys.stderr.write("%s\n" % msg)
-    sys.exit(error_level)
 
 try:
     from massage_xml_for_blast2go import prepare_xml
 except ImportError:
-    stop_err("Missing sister file massage_xml_for_blast2go.py")
+    sys.exit("Missing sister file massage_xml_for_blast2go.py")
 
 if len(sys.argv) != 4:
-    stop_err("Require three arguments: XML filename, properties filename, output tabular filename")
+    sys.exit("Require three arguments: XML filename, properties filename, output tabular filename")
 
 xml_file, prop_file, tabular_file = sys.argv[1:]
 
-#We should have write access here:
+# We should have write access here:
 tmp_xml_file = tabular_file + ".tmp.xml"
 
 if not os.path.isfile(blast2go_jar):
-    stop_err("Blast2GO JAR file not found: %s" % blast2go_jar)
+    sys.exit("Blast2GO JAR file not found: %s" % blast2go_jar)
 
 if not os.path.isfile(xml_file):
-    stop_err("Input BLAST XML file not found: %s" % xml_file)
+    sys.exit("Input BLAST XML file not found: %s" % xml_file)
 
 if not os.path.isfile(prop_file):
     tmp = os.path.join(os.path.split(blast2go_jar)[0], prop_file)
     if os.path.isfile(tmp):
-        #The properties file seems to have been given relative to the JAR
+        # The properties file seems to have been given relative to the JAR
         prop_file = tmp
     else:
-        stop_err("Blast2GO configuration file not found: %s" % prop_file)
+        sys.exit("Blast2GO configuration file not found: %s" % prop_file)
     del tmp
 
 
 def run(cmd):
-    #Avoid using shell=True when we call subprocess to ensure if the Python
-    #script is killed, so too is the child process.
+    # Avoid using shell=True when we call subprocess to ensure if the Python
+    # script is killed, so too is the child process.
     try:
         child = subprocess.Popen(cmd, stdout=subprocess.PIPE, stderr=subprocess.PIPE)
     except Exception, err:
-        stop_err("Error invoking command:\n%s\n\n%s\n" % (" ".join(cmd), err))
-    #Use .communicate as can get deadlocks with .wait(),
+        sys.exit("Error invoking command:\n%s\n\n%s\n" % (" ".join(cmd), err))
     stdout, stderr = child.communicate()
     return_code = child.returncode
 
-    #keep stdout minimal as shown prominently in Galaxy
-    #Record it in case a silent error needs diagnosis
+    # keep stdout minimal as shown prominently in Galaxy
+    # Record it in case a silent error needs diagnosis
     if stdout:
         sys.stderr.write("Standard out:\n%s\n\n" % stdout)
     if stderr:
@@ -90,14 +85,14 @@
     if return_code:
         cmd_str = " ".join(cmd)
         error_msg = "Return code %i from command:\n%s" % (return_code, cmd_str)
-    elif "Database or network connection (timeout) error" in stdout+stderr:
+    elif "Database or network connection (timeout) error" in stdout + stderr:
         error_msg = "Database or network connection (timeout) error"
-    elif "Annotation of 0 seqs with 0 annots finished." in stdout+stderr:
+    elif "Annotation of 0 seqs with 0 annots finished." in stdout + stderr:
         error_msg = "No sequences processed!"
 
     if error_msg:
         print error_msg
-        stop_err(error_msg)
+        sys.exit(error_msg)
 
 
 blast2go_classpath = os.path.split(blast2go_jar)[0]
@@ -105,30 +100,30 @@
     blast2go_classpath = "%s/*:%s/ext/*:" % (blast2go_classpath, blast2go_classpath)
 
 prepare_xml(xml_file, tmp_xml_file)
-#print "XML file prepared for Blast2GO"
+# print "XML file prepared for Blast2GO"
 
-#We will have write access wherever the output should be,
-#so we'll ask Blast2GO to use that as the stem for its output
-#(it will append .annot to the filename)
+# We will have write access wherever the output should be,
+# so we'll ask Blast2GO to use that as the stem for its output
+# (it will append .annot to the filename)
 cmd = ["java", "-cp", blast2go_classpath, "es.blast2go.prog.B2GAnnotPipe",
        "-in", tmp_xml_file,
        "-prop", prop_file,
-       "-out", tabular_file, #Used as base name for output files
-       "-annot", # Generate *.annot tabular file
-       #NOTE: For v2.3.5 must use -a, for v2.5 must use -annot instead
-       #"-img", # Generate images, feature not in v2.3.5
+       "-out", tabular_file,  # Used as base name for output files
+       "-annot",  # Generate *.annot tabular file
+       # NOTE: For v2.3.5 must use -a, for v2.5 must use -annot instead
+       # "-img", # Generate images, feature not in v2.3.5
        ]
-#print " ".join(cmd)
+# print " ".join(cmd)
 run(cmd)
 
-#Remove the temp XML file
+# Remove the temp XML file
 os.remove(tmp_xml_file)
 
 out_file = tabular_file + ".annot"
 if not os.path.isfile(out_file):
-    stop_err("ERROR - No output annotation file from Blast2GO")
+    sys.exit("ERROR - No output annotation file from Blast2GO")
 
-#Move the output file where Galaxy expects it to be:
+# Move the output file where Galaxy expects it to be:
 os.rename(out_file, tabular_file)
 
 print "Done"
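Note on the `stop_err` to `sys.exit` substitution in this file: when `sys.exit` is given a string, Python writes it to stderr and exits with status 1, which is what `stop_err` did at its default `error_level`. The following is a minimal sketch to check that equivalence, not code from the wrapper; the `run_snippet` helper and the example message are illustrative assumptions.

```python
import subprocess
import sys

# Old style: the helper removed by this changeset, called with one argument.
OLD_STYLE = r'''
import sys
def stop_err(msg, error_level=1):
    sys.stderr.write("%s\n" % msg)
    sys.exit(error_level)
stop_err("Input BLAST XML file not found: example.xml")
'''

# New style: sys.exit() given a string prints it to stderr and exits with 1.
NEW_STYLE = r'''
import sys
sys.exit("Input BLAST XML file not found: example.xml")
'''


def run_snippet(code):
    """Run code in a child interpreter; return (return code, stderr text)."""
    child = subprocess.Popen(
        [sys.executable, "-c", code],
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        universal_newlines=True,
    )
    _, stderr = child.communicate()
    return child.returncode, stderr


old_result = run_snippet(OLD_STYLE)
new_result = run_snippet(NEW_STYLE)
print(old_result == new_result)
```

The one capability lost is a custom exit status: `stop_err` accepted an `error_level` argument, but every call visible in this diff relied on the default.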
--- a/tools/blast2go/blast2go.xml Thu Mar 26 11:15:22 2015 -0400
+++ b/tools/blast2go/blast2go.xml Tue Dec 06 16:26:16 2022 +0000
@@ -1,16 +1,16 @@
-<tool id="blast2go" name="Blast2GO" version="0.0.9">
+<tool id="blast2go" name="Blast2GO" version="0.0.10">
     <description>Maps BLAST results to GO annotation terms</description>
     <requirements>
         <requirement type="package" version="2.5">b2g4pipe</requirement>
     </requirements>
-    <command interpreter="python">
-        blast2go.py "${xml}" "${prop.fields.path}" "${tab}"
-    </command>
     <stdio>
         <!-- Wrapper ensures anything other than zero is an error -->
         <exit_code range="1:" />
         <exit_code range=":-1" />
     </stdio>
+    <command interpreter="python">
+        blast2go.py "${xml}" "${prop.fields.path}" "${tab}"
+    </command>
     <inputs>
         <param name="xml" type="data" format="blastxml" label="BLAST XML results" description="You must have run BLAST against a protein database such as the NCBI non-redundant (NR) database. Use BLASTX for nucleotide queries, BLASTP for protein queries." />
         <param name="prop" type="select" label="Blast2GO settings" description="One or more configurations can be setup, such as using the Blast2GO team's server in Spain, or a local database.">
--- a/tools/blast2go/massage_xml_for_blast2go.py Thu Mar 26 11:15:22 2015 -0400
+++ b/tools/blast2go/massage_xml_for_blast2go.py Tue Dec 06 16:26:16 2022 +0000
@@ -14,19 +14,14 @@
 
 This script is called from my Galaxy wrapper for Blast2GO for pipelines,
 available from the Galaxy Tool Shed here:
-http://toolshed.g2.bx.psu.edu/view/peterjc/blast2go
+http://toolshed.g2.bx.psu.edu/view/peterjc/blast2go
 
 This script is under version control here:
 https://github.com/peterjc/galaxy_blast/tree/master/blast2go
 """
 import sys
 import os
-import subprocess
 
-def stop_err(msg, error_level=1):
-    """Print error message to stdout and quit with given error level."""
-    sys.stderr.write("%s\n" % msg)
-    sys.exit(error_level)
 
 def prepare_xml(original_xml, mangled_xml):
     """Reformat BLAST XML to suit Blast2GO.
@@ -45,19 +40,19 @@
     while True:
         line = in_handle.readline()
         if not line:
-            #No hits?
-            stop_err("Problem with XML file?")
+            # No hits?
+            sys.exit("Problem with XML file?")
         if line.strip() == "<Iteration>":
             break
         header += line
 
     if "<BlastOutput_program>blastx</BlastOutput_program>" in header:
-        print "BLASTX output identified"
+        print("BLASTX output identified")
     elif "<BlastOutput_program>blastp</BlastOutput_program>" in header:
-        print "BLASTP output identified"
+        print("BLASTP output identified")
     else:
         in_handle.close()
-        stop_err("Expect BLASTP or BLASTX output")
+        sys.exit("Expect BLASTP or BLASTX output")
 
     out_handle = open(mangled_xml, "w")
     out_handle.write(header)
@@ -68,25 +63,25 @@
         if not line:
             break
         elif line.strip() == "<Iteration>":
-            #Insert footer/header
-            out_handle.write(footer)
-            out_handle.write(header)
-            count += 1
+            # Insert footer/header
+            out_handle.write(footer)
+            out_handle.write(header)
+            count += 1
         out_handle.write(line)
 
     out_handle.close()
     in_handle.close()
-    print "Input has %i queries" % count
+    print("Input has %i queries" % count)
 
 
 if __name__ == "__main__":
     # Run the conversion...
     if len(sys.argv) != 3:
-        stop_err("Require two arguments: XML input filename, XML output filename")
+        sys.exit("Require two arguments: XML input filename, XML output filename")
 
     xml_file, out_xml_file = sys.argv[1:]
 
     if not os.path.isfile(xml_file):
-        stop_err("Input BLAST XML file not found: %s" % xml_file)
+        sys.exit("Input BLAST XML file not found: %s" % xml_file)
 
     prepare_xml(xml_file, out_xml_file)
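Note on `prepare_xml`: the hunks in this file show only fragments of the function. A rough sketch of the technique they suggest, working on a list of lines rather than file handles: everything before the first `<Iteration>` is kept as a header, and a footer plus a fresh copy of the header is emitted before each later `<Iteration>`, so every query ends up in its own complete BLAST XML document. The name `split_queries`, the use of `ValueError` instead of `sys.exit`, and the footer lines are assumptions; the real footer is defined outside the lines shown in this diff.

```python
def split_queries(lines, footer):
    """Repeat the header (and close the previous document) per <Iteration>.

    Returns the rewritten lines and the number of queries seen.
    """
    lines = iter(lines)
    header = []
    for line in lines:
        if line.strip() == "<Iteration>":
            break
        header.append(line)
    else:
        # Ran out of input without finding any <Iteration> element.
        raise ValueError("Problem with XML file?")

    # Keep the first <Iteration> line, which ended the header scan.
    out = header + [line]
    count = 1
    for line in lines:
        if line.strip() == "<Iteration>":
            # Close the previous document and start a new one.
            out.extend(footer)
            out.extend(header)
            count += 1
        out.append(line)
    return out, count


# Assumed footer: the closing tags matching the header's opening tags.
FOOTER = ["</BlastOutput_iterations>\n", "</BlastOutput>\n"]

example = [
    "<BlastOutput>\n",
    "<BlastOutput_iterations>\n",
    "<Iteration>\n",
    "query one\n",
    "</Iteration>\n",
    "<Iteration>\n",
    "query two\n",
    "</Iteration>\n",
    "</BlastOutput_iterations>\n",
    "</BlastOutput>\n",
]

output, count = split_queries(example, FOOTER)
print("Input has %i queries" % count)
```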
--- a/tools/blast2go/repository_dependencies.xml Thu Mar 26 11:15:22 2015 -0400
+++ b/tools/blast2go/repository_dependencies.xml Tue Dec 06 16:26:16 2022 +0000
@@ -1,4 +1,4 @@
-<?xml version="1.0"?>
+<?xml version="1.0" ?>
 <repositories description="Requires BLAST XML and database datatype definitions.">
-    <repository changeset_revision="5482a8cd0f36" name="blast_datatypes" owner="devteam" toolshed="https://toolshed.g2.bx.psu.edu" />
-</repositories>
+    <repository name="blast_datatypes" owner="devteam" toolshed="https://toolshed.g2.bx.psu.edu" changeset_revision="01b38f20197e"/>
+</repositories>
\ No newline at end of file
--- a/tools/blast2go/tool_dependencies.xml Thu Mar 26 11:15:22 2015 -0400
+++ b/tools/blast2go/tool_dependencies.xml Tue Dec 06 16:26:16 2022 +0000
@@ -9,13 +9,15 @@
                 <!-- Galaxy moves into the unzipped folder b2g4pipe -->
                 <action type="shell_command">
 cp b2gPipe.properties Spain_2012_August.properties &&
-sed -i "s/Dbacces.dbname=b2g_apr12/Dbacces.dbname=b2g_aug12/g" Spain_2012_August.properties &&
-sed -i "s/Dbacces.dbhost=10.10.100.203/Dbacces.dbhost=publicdb.blast2go.com/g" Spain_2012_August.properties
+sed -i.bak "s/Dbacces.dbname=b2g_apr12/Dbacces.dbname=b2g_aug12/g" Spain_2012_August.properties &&
+sed -i.bak "s/Dbacces.dbhost=10.10.100.203/Dbacces.dbhost=publicdb.blast2go.com/g" Spain_2012_August.properties
+rm Spain_2012_August.properties.bak
                 </action>
                 <action type="shell_command">
 cp b2gPipe.properties Spain_2011_June.properties &&
-sed -i "s/Dbacces.dbname=b2g_apr12/Dbacces.dbname=b2g_jun11/g" Spain_2011_June.properties &&
-sed -i "s/Dbacces.dbhost=10.10.100.203/Dbacces.dbhost=publicdb.blast2go.com/g" Spain_2011_June.properties
+sed -i.bak "s/Dbacces.dbname=b2g_apr12/Dbacces.dbname=b2g_jun11/g" Spain_2011_June.properties &&
+sed -i.bak "s/Dbacces.dbhost=10.10.100.203/Dbacces.dbhost=publicdb.blast2go.com/g" Spain_2011_June.properties
+rm Spain_2011_June.properties.bak
                 </action>
                 <action type="move_directory_files"><source_directory>.</source_directory><destination_directory>$INSTALL_DIR/</destination_directory></action>
                 <!-- Set environment variable $B2G4PIPE so Python script knows where to look -->
