# HG changeset patch # User fubar # Date 1609806888 0 # Node ID 8ea1133b9d9ab7376ee0c09e60b133dcec6ffa5c # Parent e43c43396a7094d21fc20a86dcff39c878738ea7 Uploaded diff -r e43c43396a70 -r 8ea1133b9d9a toolfactory/.planemo.yml --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/toolfactory/.planemo.yml Tue Jan 05 00:34:48 2021 +0000 @@ -0,0 +1,23 @@ +## Planemo Global Configuration File. +## Everything in this file is completely optional - these values can all be +## configured via command line options for the corresponding commands. + +## Specify a default galaxy_root for test and server commands here. +galaxy_root: /galaxy-central +## Username used with toolshed(s). +shed_username: galaxy +sheds: + # For each tool shed you wish to target, uncomment key or both email and + # password. + toolshed: + #key: "" + #email: "" + #password: "" + testtoolshed: + #key: "" + #email: "" + #password: "" + local: + key: "fakekey" + email: "admin@galaxy.org" + password: "password" diff -r e43c43396a70 -r 8ea1133b9d9a toolfactory/.shed.yml --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/toolfactory/.shed.yml Tue Jan 05 00:34:48 2021 +0000 @@ -0,0 +1,13 @@ +name: toolfactory +owner: fubar +description: ToolFactory - tool to make Galaxy tools ready for the toolshed +homepage_url: https://github.com/fubar2/toolfactory +long_description: | + ToolFactory - turn executable packages and R/python/perl/bash scripts into ordinary Galaxy tools + + Creating re-usable tools from scripts: The Galaxy Tool Factory Ross Lazarus; Antony Kaspi; Mark Ziemann; The Galaxy Team + Bioinformatics 2012; doi: 10.1093/bioinformatics/bts573 +remote_repository_url: https://github.com/fubar2/toolfactory +type: tool_dependency_definition +categories: +- Tool Generators diff -r e43c43396a70 -r 8ea1133b9d9a toolfactory/README.md --- a/toolfactory/README.md Fri Dec 11 04:23:48 2020 +0000 +++ b/toolfactory/README.md Tue Jan 05 00:34:48 2021 +0000 @@ -1,55 +1,255 @@ -**Breaking news! 
Docker container is recommended as at August 2020** +## Breaking news! Docker container at https://github.com/fubar2/toolfactory-galaxy-docker recommended as at December 2020 -A Docker container can be built - see the docker directory. -It is highly recommended for isolation. It also has an integrated toolshed to allow installation of new tools back -into the Galaxy being used to generate them. +## This is the original ToolFactory suitable for non-docker situations. Please use the docker container if you can because it's integrated with a Toolshed... -Built from quay.io/bgruening/galaxy:20.05 but updates the -Galaxy code to the dev branch - it seems to work fine with updated bioblend>=0.14 -with planemo and the right version of gxformat2 needed by the ToolFactory (TF). +# WARNING -The runclean.sh script run from the docker subdirectory of your local clone of this repository -should create a container (eventually) and serve it at localhost:8080 with a toolshed at -localhost:9009. +Install this tool to a throw-away private Galaxy or Docker container ONLY! -Once it's up, please restart Galaxy in the container with -```docker exec [container name] supervisorctl restart galaxy: ``` -Jobs just do not seem to run properly otherwise and the next steps won't work! - -The generated container includes a workflow and 2 sample data sets for the workflow +Please NEVER on a public or production instance where a hostile user may +be able to gain access if they can acquire an administrative account login. -Load the workflow. Adjust the inputs for each as labelled. The perl example counts GC in phiX.fasta. -The python scripts use the rgToolFactory.py as their input - any text file will work but I like the -recursion. The BWA example has some mitochondrial reads and reference. Run the workflow and watch. -This should fill the history with some sample tools you can rerun and play with. -Note that each new tool will have been tested using Planemo. In the workflow, in Galaxy. 
-Extremely cool to watch. +It only runs for server administrators - the ToolFactory tool will refuse to execute for an ordinary user since +it can install new tools to the Galaxy server it executes on! This is not something you should allow other than +on a throw away instance that is protected from potentially hostile users. -*WARNING* - - Install this tool on a throw-away private Galaxy or Docker container ONLY - Please NEVER on a public or production instance - -*Short Story* +## Short Story Galaxy is easily extended to new applications by adding a new tool. Each new scientific computational package added as -a tool to Galaxy requires some special instructions to be written. This is sometimes termed "wrapping" the package -because the instructions tell Galaxy how to run the package as a new Galaxy tool. Any tool in a Galaxy is -readily available to all the users through a consistent and easy to use interface. +a tool to Galaxy requires an XML document describing how the application interacts with Galaxy. +This is sometimes termed "wrapping" the package because the instructions tell Galaxy how to run the package +as a new Galaxy tool. Any tool that has been wrapped is readily available to all the users through a consistent +and easy to use interface once installed in the local Galaxy server. + +Most Galaxy tool wrappers have been manually prepared by skilled programmers, many using Planemo because it +automates much of the boilerplate and makes the process much easier. +The ToolFactory (TF) now uses Planemo under the hood for testing, but hides the command +line complexities. The user will still need appropriate skills in terms of describing the interface between +Galaxy and the new application, but will be helped by a Galaxy tool form to collect all the needed +settings, together with automated testing and uploading to a toolshed with optional local installation. 
+ + +## ToolFactory generated tools are ordinary Galaxy tools + +A TF generated tool that passes the Planemo test is ready to publish in any Galaxy Toolshed and ready to install in any running Galaxy instance. +They are fully workflow compatible and work exactly like any hand-written tool. The user can select input files of the specified type(s) from their +history and edit each of the specified parameters. The tool form will show all the labels and help text supplied when the tool was built. When the tool +is executed, the dependent binary or script will be passed all the i/o files and parameters as specified, and will write outputs to the specified new +history datasets - just like any other Galaxy tool. + +## Models for tool command line construction + +The key to turning any software package into a Galaxy tool is the automated construction of a suitable command line. + +The TF can build a new tool that will allow the tool user to select input files from their history, set any parameters and when run will send the +new output files to the history as specified when the tool builder completed the form and built the new tool. + +That tool can contain instructions to run any Conda dependency or a system executable like bash. Whether a bash script you have written or +a Conda package like bwa, the executable will expect to find settings for input, output and parameters on a command line. + +These are often passed as "--name value" (argparse style) or in a fixed order (positional style). + +The ToolFactory allows either, or for "filter" applications that process input from STDIN and write processed output to STDOUT. + +The simplest tool model wraps a simple script or Conda dependency package requiring only input and output files, with no user supplied settings illustrated by +the Tacrev demonstration tool found in the Galaxy running in the ToolFactory docker container. It passes a user selected input file from the current history on STDIN +to a bash script. 
The bash script runs the unix tac utility (reverse cat, which reverses the order of the lines) piped to the unix rev utility (which reverses the characters within each line). It's a one liner: + +`tac | rev` + +The tool building form allows zero or more Conda package name(s) and version(s) and an optional script to be executed by either a system +executable like ``bash`` or the first of any named Conda dependency package/version. Tacrev uses the tiny bash script shown above and the system +bash. Conda bash can be specified if it is important to use the same version consistently for the tool. + +On the tool form, the repeat section allowing zero or more input files was set to be a text file to be selected by the tool user and +in the repeat section allowing one or more outputs, a new output file with special value `STDOUT` as the positional parameter causes the TF to +generate a command to capture STDOUT and send it to the new history file containing the reversed input text. + +By reversed, we mean really, truly reversed. + +That simple model can be made much more complicated, and can pass inputs and outputs as named or positional parameters, +to allow more complicated scripts or dependent binaries that require: + +1. Any number of input data files selected by the user from existing history data +2. Any number of output data files written to the user's history +3. Any number of user supplied parameters. These can be passed as command line arguments to the script or the dependency package. Either +positional or named (argparse) style command line parameter passing can be used. + +More complex models can be seen in the Sedtest, Pyrevpos and Pyrevargparse tools illustrating positional and argparse parameter passing. + +The most complex demonstration is the Planemo advanced tool tutorial BWA tool. There is one version using a command-override to implement +exactly the same command structure as in the Planemo tutorial. A second version uses a bash script and positional parameters to achieve the same +result. 
Some tool builders may find the bash version more familiar and cleaner but the choice is yours. + +## Overview + +![Hello example ToolFactory tool form](files/hello_toolfactory_form.png?raw=true "Part of the Hello world example ToolFactory tool form") + + +Steps in building a new Galaxy tool are all conducted through Galaxy running in the docker container: + +1. Log in to the Galaxy running in the container at http://localhost:8080 using an admin account. Admin accounts are specified in config/galaxy.yml and + in the documentation at + and the ToolFactory will error out and refuse to run for non-administrative tool builders as a minimal protection from opportunistic hostile use. + +2. Start the TF and fill in the form, providing sample inputs and parameter values to suit the Conda package being wrapped. + +3. Execute the tool to create a new XML tool wrapper using the sample inputs and parameter settings for the inbuilt tool test. Planemo runs twice, + firstly to generate the test outputs and then to perform a proper test. The completed toolshed archive is written to the history + together with the planemo test report. Optionally the new tool archive can be uploaded + to the toolshed running in the same container (http://localhost:9009) and then installed inside the Galaxy in the container for further testing. + +4. If the test fails, rerun the failed history job and correct errors on the tool form before rerunning until everything works correctly. + + + +![How it works](files/TFasIDE.png?raw=true "Overview of the ToolFactory as an Integrated Development Environment") + +## Planning and building new Galaxy tool wrappers. + +It is best to have all the required planning done to wrap any new script or binary before firing up the TF. +Conda is the only current dependency manager supported. 
Before starting, at the very least, the tool builder will need +to know the required software package name in Conda and the version to use, how the command line for +the package must be constructed, and there must be sample inputs in the working history for each of the required data inputs +for the package, together with values for every parameter to suit these sample inputs. These are required on the TF form +for preparing the inbuilt tool test. That test is run using Planemo, as part of the tool generation process. + +A new tool is specified by filling in the usual Galaxy tool form. + +The form starts with a new tool name. Most tools will need dependency packages and versions +for the executable. Only Conda is currently supported. + +If a script is needed, it can be pasted into a text box and the interpreter named. Available system executables +can be used such as bash, or an interpreter such as python, perl or R can be nominated as conda dependencies +to ensure reproducible analyses. + +The tool form will be generated from the input data and the tool builder supplied parameters. The command line for the +executable is built using positional or argparse (named e.g. --input_file /foo/baz) style +parameters and is completely dependent on the executable. These can include: + +1. Any number of input data sets needed by the executable. Each appears to the tool user on the run form and is included +on the command line for the executable. The tool builder must supply a small representative sample for each one as +an input for the automated tool test. -Most Galaxy tool wrappers have been manually prepared by skilled programmers, many using Planemo because it -automates much of the basic boilerplate and makes the process much easier. The ToolFactory (TF) -uses Planemo under the hood for many functions, but hides the command -line complexities from the TF user. +2. 
Any number of output data sets generated by the package can be added to the command line and will appear in +the user's history at the end of the job. + +3. Any number of text or numeric parameters. Each will appear to the tool user on the run form and is included +on the command line to the executable. The tool builder must supply a suitable representative value for each one as +the value to be used for the automated tool test. + +Once the form is completed, executing the TF will build a new XML tool wrapper +including a functional test based on the sample settings and data. + +If the Planemo test passes, the tool can be optionally uploaded to the local Galaxy used in the image for more testing. + +A local toolshed runs inside the container to allow an automated installation, although any toolshed and any accessible +Galaxy can be specified for this process by editing the default URL and API keys to provide appropriate credentials. + +## Generated Tool Dependency management + +Conda is used for all dependency management, although system utilities like sed, bash or awk +may already be available on job execution nodes. Sed and friends are available as Conda (conda-forge) dependencies if necessary. +Versioned Conda dependencies are always baked in to the tool and will be used for reproducible calculation. + +## Requirements + +These are all managed automagically. The TF relies on galaxyxml to generate tool xml and uses ephemeris and +bioblend to load tools to the toolshed and to Galaxy. Planemo is used for testing and runs in a biocontainer currently at +https://quay.io/fubar2/planemo-biocontainer + +## Caveats + +This docker image requires privileged mode so exposes potential security risks if hostile tool builders gain access. +Please, do not run it in any situation where that is a problem - never, ever on a public facing Galaxy server. +Running it on a laptop or workstation should be fine in a non-hostile environment. 
+ + +## Example generated XML -*More Explanation* +For the bwa-mem example, a supplied bash script is included as a configfile and so has escaped characters. +``` + + + + + Planemo advanced tool building sample bwa mem mapper as a ToolFactory demo + + bwa + samtools + + + tempsam +samtools view -Sb tempsam > temporary_bam_file.bam +samtools sort -o "\$BAMOUT" temporary_bam_file.bam -The TF is an unusual Galaxy tool, designed to allow a skilled user to make new Galaxy tools. +]]> + + + + + + + + + + + + + + + + + tempsam + samtools view -Sb tempsam > temporary_bam_file.bam + samtools sort -o "$BAMOUT" temporary_bam_file.bam + +]]> + + +``` + + + +## More Explanation + +The TF is an unusual Galaxy tool, designed to allow a skilled user to make new Galaxy tools. It appears in Galaxy just like any other tool but outputs include new Galaxy tools generated using instructions provided by the user and the results of Planemo lint and tool testing using small sample inputs provided by the TF user. The small samples become tests built in to the new tool. -It offers a familiar Galaxy form driven way to define how the user of the new tool will +It offers a familiar Galaxy form driven way to define how the user of the new tool will choose input data from their history, and what parameters the new tool user will be able to adjust. The TF user must know, or be able to read, enough about the tool to be able to define the details of the new Galaxy interface and the ToolFactory offers little guidance on that other than some examples. @@ -57,8 +257,8 @@ Tools always depend on other things. Most tools in Galaxy depend on third party scientific packages, so TF tools usually have one or more dependencies. These can be scientific packages such as BWA or scripting languages such as Python and are -usually managed by Conda. 
If the new tool relies on a system utility such as bash or awk -where the importance of version control on reproducibility is low, these can be used without +managed by Conda. If the new tool relies on a system utility such as bash or awk +where the importance of version control on reproducibility is low, these can be used without Conda management - but remember the potential risks of unmanaged dependencies on computational reproducibility. @@ -72,7 +272,7 @@ environment is actually possible, but not recommended being somewhat clumsy and inefficient. Tools nearly always take one or more data sets from the user's history as input. TF tools -allow the TF user to define what Galaxy datatypes the tool end user will be able to choose and what +allow the TF user to define what Galaxy datatypes the tool end user will be able to choose and what names or positions will be used to pass them on a command line to the package or script. Tools often have various parameter settings. The TF allows the TF user to define how each @@ -83,7 +283,7 @@ Best practice Galaxy tools have one or more automated tests. These should use small sample data sets and specific parameter settings so when the tool is tested, the outputs can be compared with their expected -values. The TF will automatically create a test for the new tool. It will use the sample data sets +values. The TF will automatically create a test for the new tool. It will use the sample data sets chosen by the TF user when they built the new tool. The TF works by exposing *unrestricted* and therefore extremely dangerous scripting @@ -91,108 +291,86 @@ run scripts in R, python, sh and perl. For this reason, a Docker container is available to help manage the associated risks. -*Scripting uses* +## Scripting uses To use a scripting language to create a new tool, you must first prepared and properly test a script. Use small sample data sets for testing. 
When the script is working correctly, upload the small sample datasets into a new history, start configuring a new ToolFactory tool, and paste the script into the script text box on the TF form. -*Outputs* - -Once the script runs sucessfully, a new Galaxy tool that runs your script -can be generated. Select the "generate" option and supply some help text and -names. The new tool will be generated in the form of a new Galaxy datatype -*tgz* - as the name suggests, it's an archive ready to upload to a -Galaxy ToolShed as a new tool repository. +### Outputs -It is also possible to run a tool to generate test outputs, then test it -using planemo. A toolshed is built in to the Docker container and configured -so a tool can be tested, sent to that toolshed, then installed in the Galaxy -where the TF is running. +The TF will generate the new tool described on the TF form, and test it +using planemo. Optionally if a local toolshed is running, it can be used to +install the new tool back into the generating Galaxy. -If the tool requires a command or test XML override, then planemo is -needed to generate test outputs to make a complete tool, rerun to test -and if required upload to the local toolshed and install in the Galaxy -where the TF is running. +A toolshed is built in to the Docker container and configured +so a tool can be tested, sent to that toolshed, then installed in the Galaxy +where the TF is running using the default toolshed and Galaxy URL and API keys. Once it's in a ToolShed, it can be installed into any local Galaxy server from the server administrative interface. -Once the new tool is installed, local users can run it - each time, the +Once the new tool is installed, local users can run it - each time, the package and/or script that was supplied when it was built will be executed with the input chosen from the user's history, together with user supplied parameters. 
In other words, the tools you generate with the -ToolFactory run just like any other Galaxy tool. +TF run just like any other Galaxy tool. TF generated tools work as normal workflow components. -*Limitations* +## Limitations The TF is flexible enough to generate wrappers for many common scientific packages but the inbuilt automation will not cope with all possible situations. Users can supply overrides for two tool XML segments - tests and command and the BWA -example in the supplied samples workflow illustrates their use. - -*Installation* +example in the supplied samples workflow illustrates their use. It does not deal with +repeated elements or conditional parameters such as allowing a user to choose to see "simple" +or "advanced" parameters (yet) and there will be plenty of packages it just +won't cover - but it's a quick and efficient tool for the other 90% of cases. Perfect for +that bash one liner you need to get that workflow functioning correctly for this +afternoon's demonstration! -The Docker container is the best way to use the TF because it is preconfigured -to automate new tool testing and has a built in local toolshed where each new tool -is uploaded. If you grab the docker container, it should just work. +## Installation -If you build the container, there are some things to watch out for. Let it run for 10 minutes -or so once you build it - check with top until conda has finished fussing. Once everything quietens -down, find the container with -```docker ps``` -and use -```docker exec [containername] supervisorctl restart galaxy:``` -That colon is not a typographical mistake. -Not restarting after first boot seems to leave the job/worflow system confused and the workflow -just will not run properly until Galaxy has restarted. 
+The Docker container https://github.com/fubar2/toolfactory-galaxy-docker/blob/main/README.md +is the best way to use the TF because it is preconfigured +to automate new tool testing and has a built in local toolshed where each new tool +is uploaded. If you grab the docker container, it should just work after a restart and you +can run a workflow to generate all the sample tools. Running the samples and rerunning the ToolFactory +jobs that generated them allows you to add fields and experiment to see how things work. -Login as admin@galaxy.org with password "password". Feel free to change it once you are logged in. -There should be a companion toolshed at localhost:9090. The history should have some sample data for -the workflow. - -Run the workflow and make sure the right dataset is selected for each of the input files. Most of the -examples use text files so should run, but the bwa example needs the right ones to work properly. - -When the workflow is finished, you will have half a dozen examples to rerun and play with. They have also -all been tested and installed so you should find them in your tool menu under "Generated Tools" - -It is easy to install without Docker, but you will need to make some +It can be installed like any other tool from the Toolshed, but you will need to make some configuration changes (TODO write a configuration). You can install it most conveniently using the administrative "Search and browse tool sheds" link. Find the Galaxy Main toolshed at https://toolshed.g2.bx.psu.edu/ and search for the toolfactory repository in the Tool Maker section. Open it and review the code and select the option to install it. -Otherwise, if not already there pending an accepted PR, -please add: - -to your local data_types_conf.xml. +If not already there please add: + +``` + +``` + +to your local config/data_types_conf.xml. 
-*Restricted execution* +## Restricted execution + +The tool factory tool itself will ONLY run for admin users - +people with IDs in config/galaxy.yml "admin_users". -The tool factory tool itself will then be usable ONLY by admin users - -people with IDs in admin_users. **Yes, that's right. ONLY -admin_users can run this tool** Think about it for a moment. If allowed to -run any arbitrary script on your Galaxy server, the only thing that would -impede a miscreant bent on destroying all your Galaxy data would probably -be lack of appropriate technical skills. +*ONLY admin_users can run this tool* -**Generated tool Security** +That doesn't mean it's safe to install on a shared or exposed instance - please don't. + +## Generated tool Security Once you install a generated tool, it's just another tool - assuming the script is safe. They just run normally and their user cannot do anything unusually insecure but please, practice safe toolshed. Read the code before you install any tool. Especially this one - it is really scary. -**Send Code** - -Pull requests and suggestions welcome as git issues please? - -**Attribution** +## Attribution Creating re-usable tools from scripts: The Galaxy Tool Factory Ross Lazarus; Antony Kaspi; Mark Ziemann; The Galaxy Team @@ -200,12 +378,3 @@ http://bioinformatics.oxfordjournals.org/cgi/reprint/bts573?ijkey=lczQh1sWrMwdYWJ&keytype=ref -**Licensing** - -Copyright Ross Lazarus 2010 -ross lazarus at g mail period com - -All rights reserved. 
- -Licensed under the LGPL - diff -r e43c43396a70 -r 8ea1133b9d9a toolfactory/galaxy-tool-test --- a/toolfactory/galaxy-tool-test Fri Dec 11 04:23:48 2020 +0000 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 @@ -1,457 +0,0 @@ -#!/usr/bin/env python - -import argparse -import datetime as dt -import json -import logging -import os -import sys -import tempfile -from collections import namedtuple -from concurrent.futures import thread, ThreadPoolExecutor - -import yaml - -from galaxy.tool_util.verify.interactor import ( - DictClientTestConfig, - GalaxyInteractorApi, - verify_tool, -) - -DESCRIPTION = """Script to quickly run a tool test against a running Galaxy instance.""" -DEFAULT_SUITE_NAME = "Galaxy Tool Tests" -ALL_TESTS = -1 -ALL_TOOLS = "*" -ALL_VERSION = "*" -LATEST_VERSION = None - - -TestReference = namedtuple("TestReference", ["tool_id", "tool_version", "test_index"]) -TestException = namedtuple("TestException", ["tool_id", "exception", "was_recorded"]) - - -class Results: - - def __init__(self, default_suitename, test_json, append=False): - self.test_json = test_json or "-" - test_results = [] - test_exceptions = [] - suitename = default_suitename - if append: - assert test_json != "-" - with open(test_json) as f: - previous_results = json.load(f) - test_results = previous_results["tests"] - if "suitename" in previous_results: - suitename = previous_results["suitename"] - self.test_results = test_results - self.test_exceptions = test_exceptions - self.suitename = suitename - - def register_result(self, result): - self.test_results.append(result) - - def register_exception(self, test_exception): - self.test_exceptions.append(test_exception) - - def already_successful(self, test_reference): - test_id = _test_id_for_reference(test_reference) - for test_result in self.test_results: - if test_result.get('id') != test_id: - continue - - has_data = test_result.get('has_data', False) - if has_data: - test_data = test_result.get("data", {}) - if 'status' in test_data 
and test_data['status'] == 'success': - return True - - return False - - def write(self): - tests = sorted(self.test_results, key=lambda el: el['id']) - n_passed, n_failures, n_skips = 0, 0, 0 - n_errors = len([e for e in self.test_exceptions if not e.was_recorded]) - for test in tests: - has_data = test.get('has_data', False) - if has_data: - test_data = test.get("data", {}) - if 'status' not in test_data: - raise Exception(f"Test result data {test_data} doesn't contain a status key.") - status = test_data['status'] - if status == "success": - n_passed += 1 - elif status == "error": - n_errors += 1 - elif status == "skip": - n_skips += 1 - elif status == "failure": - n_failures += 1 - report_obj = { - 'version': '0.1', - 'suitename': self.suitename, - 'results': { - 'total': n_passed + n_failures + n_skips + n_errors, - 'errors': n_errors, - 'failures': n_failures, - 'skips': n_skips, - }, - 'tests': tests, - } - if self.test_json == "-": - print(json.dumps(report_obj)) - else: - with open(self.test_json, "w") as f: - json.dump(report_obj, f) - - def info_message(self): - messages = [] - passed_tests = self._tests_with_status('success') - messages.append("Passed tool tests ({}): {}".format( - len(passed_tests), - [t["id"] for t in passed_tests] - )) - failed_tests = self._tests_with_status('failure') - messages.append("Failed tool tests ({}): {}".format( - len(failed_tests), - [t["id"] for t in failed_tests] - )) - skiped_tests = self._tests_with_status('skip') - messages.append("Skipped tool tests ({}): {}".format( - len(skiped_tests), - [t["id"] for t in skiped_tests] - )) - errored_tests = self._tests_with_status('error') - messages.append("Errored tool tests ({}): {}".format( - len(errored_tests), - [t["id"] for t in errored_tests] - )) - return "\n".join(messages) - - @property - def success_count(self): - self._tests_with_status('success') - - @property - def skip_count(self): - self._tests_with_status('skip') - - @property - def error_count(self): - return 
self._tests_with_status('error') + len(self.test_exceptions) - - @property - def failure_count(self): - return self._tests_with_status('failure') - - def _tests_with_status(self, status): - return [t for t in self.test_results if t.get("data", {}).get("status") == status] - - -def test_tools( - galaxy_interactor, - test_references, - results, - log=None, - parallel_tests=1, - history_per_test_case=False, - no_history_cleanup=False, - retries=0, - verify_kwds=None, -): - """Run through tool tests and write report. - - Refactor this into Galaxy in 21.01. - """ - verify_kwds = (verify_kwds or {}).copy() - tool_test_start = dt.datetime.now() - history_created = False - if history_per_test_case: - test_history = None - else: - history_created = True - test_history = galaxy_interactor.new_history(history_name=f"History for {results.suitename}") - verify_kwds.update({ - "no_history_cleanup": no_history_cleanup, - "test_history": test_history, - }) - with ThreadPoolExecutor(max_workers=parallel_tests) as executor: - try: - for test_reference in test_references: - _test_tool( - executor=executor, - test_reference=test_reference, - results=results, - galaxy_interactor=galaxy_interactor, - log=log, - retries=retries, - verify_kwds=verify_kwds, - ) - finally: - # Always write report, even if test was cancelled. 
- try: - executor.shutdown(wait=True) - except KeyboardInterrupt: - executor._threads.clear() - thread._threads_queues.clear() - results.write() - if log: - log.info("Report written to '%s'", os.path.abspath(results.test_json)) - log.info(results.info_message()) - log.info("Total tool test time: {}".format(dt.datetime.now() - tool_test_start)) - if history_created and not no_history_cleanup: - galaxy_interactor.delete_history(test_history) - - -def _test_id_for_reference(test_reference): - tool_id = test_reference.tool_id - tool_version = test_reference.tool_version - test_index = test_reference.test_index - - if tool_version and tool_id.endswith("/" + tool_version): - tool_id = tool_id[:-len("/" + tool_version)] - - label_base = tool_id - if tool_version: - label_base += "/" + str(tool_version) - - test_id = label_base + "-" + str(test_index) - return test_id - - -def _test_tool( - executor, - test_reference, - results, - galaxy_interactor, - log, - retries, - verify_kwds, -): - tool_id = test_reference.tool_id - tool_version = test_reference.tool_version - test_index = test_reference.test_index - # If given a tool_id with a version suffix, strip it off so we can treat tool_version - # correctly at least in client_test_config. 
- if tool_version and tool_id.endswith("/" + tool_version): - tool_id = tool_id[:-len("/" + tool_version)] - - test_id = _test_id_for_reference(test_reference) - - def run_test(): - run_retries = retries - job_data = None - job_exception = None - - def register(job_data_): - nonlocal job_data - job_data = job_data_ - - try: - while run_retries >= 0: - job_exception = None - try: - if log: - log.info("Executing test '%s'", test_id) - verify_tool( - tool_id, galaxy_interactor, test_index=test_index, tool_version=tool_version, - register_job_data=register, **verify_kwds - ) - if log: - log.info("Test '%s' passed", test_id) - break - except Exception as e: - if log: - log.warning("Test '%s' failed", test_id, exc_info=True) - - job_exception = e - run_retries -= 1 - finally: - if job_data is not None: - results.register_result({ - "id": test_id, - "has_data": True, - "data": job_data, - }) - if job_exception is not None: - was_recorded = job_data is not None - test_exception = TestException(tool_id, job_exception, was_recorded) - results.register_exception(test_exception) - - executor.submit(run_test) - - -def build_case_references( - galaxy_interactor, - tool_id=ALL_TOOLS, - tool_version=LATEST_VERSION, - test_index=ALL_TESTS, - page_size=0, - page_number=0, - check_against=None, - log=None, -): - test_references = [] - if tool_id == ALL_TOOLS: - tests_summary = galaxy_interactor.get_tests_summary() - for tool_id, tool_versions_dict in tests_summary.items(): - for tool_version, summary in tool_versions_dict.items(): - for test_index in range(summary["count"]): - test_reference = TestReference(tool_id, tool_version, test_index) - test_references.append(test_reference) - else: - assert tool_id - tool_test_dicts = galaxy_interactor.get_tool_tests(tool_id, tool_version=tool_version) or {} - for i, tool_test_dict in enumerate(tool_test_dicts): - this_tool_version = tool_test_dict.get("tool_version", tool_version) - this_test_index = i - if test_index == ALL_TESTS or i == 
test_index: - test_reference = TestReference(tool_id, this_tool_version, this_test_index) - test_references.append(test_reference) - - if check_against: - filtered_test_references = [] - for test_reference in test_references: - if check_against.already_successful(test_reference): - if log is not None: - log.debug(f"Found successful test for {test_reference}, skipping") - continue - filtered_test_references.append(test_reference) - log.info(f"Skipping {len(test_references)-len(filtered_test_references)} out of {len(test_references)} tests.") - test_references = filtered_test_references - - if page_size > 0: - slice_start = page_size * page_number - slice_end = page_size * (page_number + 1) - test_references = test_references[slice_start:slice_end] - - return test_references - - -def main(argv=None): - if argv is None: - argv = sys.argv[1:] - - args = _arg_parser().parse_args(argv) - log = setup_global_logger(__name__, verbose=args.verbose) - client_test_config_path = args.client_test_config - if client_test_config_path is not None: - log.debug(f"Reading client config path {client_test_config_path}") - with open(client_test_config_path) as f: - client_test_config = yaml.full_load(f) - else: - client_test_config = {} - - def get_option(key): - arg_val = getattr(args, key, None) - if arg_val is None and key in client_test_config: - val = client_test_config.get(key) - else: - val = arg_val - return val - - output_json_path = get_option("output_json") - galaxy_interactor_kwds = { - "galaxy_url": get_option("galaxy_url"), - "master_api_key": get_option("admin_key"), - "api_key": get_option("key"), - "keep_outputs_dir": args.output, - "download_attempts": get_option("download_attempts"), - "download_sleep": get_option("download_sleep"), - } - tool_id = args.tool_id - tool_version = args.tool_version - tools_client_test_config = DictClientTestConfig(client_test_config.get("tools")) - verbose = args.verbose - - galaxy_interactor = 
GalaxyInteractorApi(**galaxy_interactor_kwds) - results = Results(args.suite_name, output_json_path, append=args.append) - check_against = None if not args.skip_successful else results - test_references = build_case_references( - galaxy_interactor, - tool_id=tool_id, - tool_version=tool_version, - test_index=args.test_index, - page_size=args.page_size, - page_number=args.page_number, - check_against=check_against, - log=log, - ) - log.debug(f"Built {len(test_references)} test references to executed.") - verify_kwds = dict( - client_test_config=tools_client_test_config, - force_path_paste=args.force_path_paste, - skip_with_reference_data=not args.with_reference_data, - quiet=not verbose, - ) - test_tools( - galaxy_interactor, - test_references, - results, - log=log, - parallel_tests=args.parallel_tests, - history_per_test_case=args.history_per_test_case, - no_history_cleanup=args.no_history_cleanup, - verify_kwds=verify_kwds, - ) - exceptions = results.test_exceptions - if exceptions: - exception = exceptions[0] - if hasattr(exception, "exception"): - exception = exception.exception - raise exception - - -def setup_global_logger(name, log_file=None, verbose=False): - formatter = logging.Formatter('%(asctime)s %(levelname)-5s - %(message)s') - console = logging.StreamHandler() - console.setFormatter(formatter) - - logger = logging.getLogger(name) - logger.setLevel(logging.DEBUG if verbose else logging.INFO) - logger.addHandler(console) - - if not log_file: - # delete = false is chosen here because it is always nice to have a log file - # ready if you need to debug. Not having the "if only I had set a log file" - # moment after the fact. 
- temp = tempfile.NamedTemporaryFile(prefix="ephemeris_", delete=False) - log_file = temp.name - file_handler = logging.FileHandler(log_file) - logger.addHandler(file_handler) - logger.info(f"Storing log file in: {log_file}") - return logger - - -def _arg_parser(): - parser = argparse.ArgumentParser(description=DESCRIPTION) - parser.add_argument('-u', '--galaxy-url', default="http://localhost:8080", help='Galaxy URL') - parser.add_argument('-k', '--key', default=None, help='Galaxy User API Key') - parser.add_argument('-a', '--admin-key', default=None, help='Galaxy Admin API Key') - parser.add_argument('--force_path_paste', default=False, action="store_true", help='This requires Galaxy-side config option "allow_path_paste" enabled. Allows for fetching test data locally. Only for admins.') - parser.add_argument('-t', '--tool-id', default=ALL_TOOLS, help='Tool ID') - parser.add_argument('--tool-version', default=None, help='Tool Version (if tool id supplied). Defaults to just latest version, use * to test all versions') - parser.add_argument('-i', '--test-index', default=ALL_TESTS, type=int, help='Tool Test Index (starting at 0) - by default all tests will run.') - parser.add_argument('-o', '--output', default=None, help='directory to dump outputs to') - parser.add_argument('--append', default=False, action="store_true", help="Extend a test record json (created with --output-json) with additional tests.") - parser.add_argument('--skip-successful', default=False, action="store_true", help="When used with --append, skip previously run successful tests.") - parser.add_argument('-j', '--output-json', default=None, help='output metadata json') - parser.add_argument('--verbose', default=False, action="store_true", help="Verbose logging.") - parser.add_argument('-c', '--client-test-config', default=None, help="Test config YAML to help with client testing") - parser.add_argument('--suite-name', default=DEFAULT_SUITE_NAME, help="Suite name for tool test output") - 
parser.add_argument('--with-reference-data', dest="with_reference_data", default=False, action="store_true") - parser.add_argument('--skip-with-reference-data', dest="with_reference_data", action="store_false", help="Skip tests the Galaxy server believes use data tables or loc files.") - parser.add_argument('--history-per-suite', dest="history_per_test_case", default=False, action="store_false", help="Create new history per test suite (all tests in same history).") - parser.add_argument('--history-per-test-case', dest="history_per_test_case", action="store_true", help="Create new history per test case.") - parser.add_argument('--no-history-cleanup', default=False, action="store_true", help="Perserve histories created for testing.") - parser.add_argument('--parallel-tests', default=1, type=int, help="Parallel tests.") - parser.add_argument('--retries', default=0, type=int, help="Retry failed tests.") - parser.add_argument('--page-size', default=0, type=int, help="If positive, use pagination and just run one 'page' to tool tests.") - parser.add_argument('--page-number', default=0, type=int, help="If page size is used, run this 'page' of tests - starts with 0.") - parser.add_argument('--download-attempts', default=1, type=int, help="Galaxy may return a transient 500 status code for download if test results are written but not yet accessible.") - parser.add_argument('--download-sleep', default=1, type=int, help="If download attempts is greater than 1, the amount to sleep between download attempts.") - return parser - - -if __name__ == "__main__": - main() diff -r e43c43396a70 -r 8ea1133b9d9a toolfactory/images/TFasIDE.png Binary file toolfactory/images/TFasIDE.png has changed diff -r e43c43396a70 -r 8ea1133b9d9a toolfactory/images/dynamicScriptTool.png Binary file toolfactory/images/dynamicScriptTool.png has changed diff -r e43c43396a70 -r 8ea1133b9d9a toolfactory/images/hello_toolfactory_form.png Binary file toolfactory/images/hello_toolfactory_form.png has changed 
diff -r e43c43396a70 -r 8ea1133b9d9a toolfactory/install_tf_demos.py --- /dev/null Thu Jan 01 00:00:00 1970 +0000 +++ b/toolfactory/install_tf_demos.py Tue Jan 05 00:34:48 2021 +0000 @@ -0,0 +1,47 @@ +import argparse +import os +import subprocess +import sys +import urllib.request + +from bioblend import galaxy + +WF = "https://drive.google.com/uc?export=download&id=13xE8o7tucHGNA0qYkEP98FfUGl2wdOU5" +HIST = ( + "https://drive.google.com/uc?export=download&id=1V0ZN9ZBuqcGJvt2AP7s3g0q11uYEhdDB" +) +WF_FILE = "tf_workflow.ga" +HIST_FILE = "tf_history.tgz" + + +def _parser(): + parser = argparse.ArgumentParser() + parser.add_argument( + "-g", "--galaxy", help="URL of target galaxy", default="http://localhost:9090" + ) + parser.add_argument("-a", "--key", help="Galaxy admin key", default=None) + return parser + + +def main(): + """ + load the planemo tool_factory demonstration history and tool generating workflow + fails in planemo served galaxies because there seems to be no user in trans? 
+ """ + args = _parser().parse_args() + urllib.request.urlretrieve(WF, WF_FILE) + urllib.request.urlretrieve(HIST, HIST_FILE) + assert args.key, "Need an administrative key for the target Galaxy supplied please" + wfp = os.path.abspath(WF_FILE) + hp = os.path.abspath(HIST_FILE) + gi = galaxy.GalaxyInstance( + url=args.galaxy, key=args.key, email="planemo@galaxyproject.org" + ) + x = gi.workflows.import_workflow_from_local_path(WF_FILE, publish=True) + print(f"installed {WF_FILE} Returned = {x}\n") + x = gi.histories.import_history(file_path=HIST_FILE) + print(f"installed {HIST_FILE} Returned = {x}\n") + + +if __name__ == "__main__": + main() diff -r e43c43396a70 -r 8ea1133b9d9a toolfactory/install_tf_demos_toolshed.tgz Binary file toolfactory/install_tf_demos_toolshed.tgz has changed diff -r e43c43396a70 -r 8ea1133b9d9a toolfactory/planemo_install_tfdemo.tar.gz Binary file toolfactory/planemo_install_tfdemo.tar.gz has changed diff -r e43c43396a70 -r 8ea1133b9d9a toolfactory/rgToolFactory2.py --- a/toolfactory/rgToolFactory2.py Fri Dec 11 04:23:48 2020 +0000 +++ b/toolfactory/rgToolFactory2.py Tue Jan 05 00:34:48 2021 +0000 @@ -13,18 +13,9 @@ # 1. Fix the toolfactory so it works - done for simplest case # 2. Fix planemo so the toolfactory function works # 3. Rewrite bits using galaxyxml functions where that makes sense - done -# -# uses planemo in a biodocker sort of image as a requirement -# otherwise planemo seems to leak dependencies back into the -# calling venv. Hilarity ensues. 
- - import argparse import copy -import datetime -import grp -import json import logging import os import re @@ -35,12 +26,9 @@ import tempfile import time - from bioblend import ConnectionError from bioblend import toolshed -import docker - import galaxyxml.tool as gxt import galaxyxml.tool.parameters as gxtp @@ -54,8 +42,9 @@ toolFactoryURL = "https://github.com/fubar2/toolfactory" ourdelim = "~~~" -# --input_files="$intab.input_files~~~$intab.input_CL~~~$intab.input_formats\ -#~~~$intab.input_label~~~$intab.input_help" +# --input_files="$intab.input_files~~~$intab.input_CL~~~ +# $intab.input_formats# ~~~$intab.input_label +# ~~~$intab.input_help" IPATHPOS = 0 ICLPOS = 1 IFMTPOS = 2 @@ -63,7 +52,8 @@ IHELPOS = 4 IOCLPOS = 5 -# --output_files "$otab.history_name~~~$otab.history_format~~~$otab.history_CL~~~$otab.history_test" +# --output_files "$otab.history_name~~~$otab.history_format~~~ +# $otab.history_CL~~~$otab.history_test" ONAMEPOS = 0 OFMTPOS = 1 OCLPOS = 2 @@ -72,7 +62,8 @@ # --additional_parameters="$i.param_name~~~$i.param_value~~~ -# $i.param_label~~~$i.param_help~~~$i.param_type~~~$i.CL~~~i$.param_CLoverride" +# $i.param_label~~~$i.param_help~~~$i.param_type +# ~~~$i.CL~~~i$.param_CLoverride" ANAMEPOS = 0 AVALPOS = 1 ALABPOS = 2 @@ -106,13 +97,21 @@ return '"%s"' % s -html_escape_table = {"&": "&amp;", ">": "&gt;", "<": "&lt;", "$": r"\$","#":"&#35;", "$":"&#36;"} -cheetah_escape_table = {"$": "\$","#":"\#"} +html_escape_table = { + "&": "&amp;", + ">": "&gt;", + "<": "&lt;", + "#": "&#35;", + "$": "&#36;", +} +cheetah_escape_table = {"$": "\\$", "#": "\\#"} + def html_escape(text): """Produce entities within text.""" return "".join([html_escape_table.get(c, c) for c in text]) + def cheetah_escape(text): """Produce entities within text.""" return "".join([cheetah_escape_table.get(c, c) for c in text]) @@ -124,8 +123,8 @@ t = t.replace("&gt;", ">") t = t.replace("&lt;", "<") t = t.replace("\\$", "$") - t = t.replace("&#36;","$") - t = t.replace("&#35;","#") + t = t.replace("&#36;", "$") + t = 
t.replace("&#35;", "#") return t @@ -137,7 +136,9 @@ if citation.startswith("doi"): citation_tuples.append(("doi", citation[len("doi") :].strip())) else: - citation_tuples.append(("bibtex", citation[len("bibtex") :].strip())) + citation_tuples.append( + ("bibtex", citation[len("bibtex") :].strip()) + ) return citation_tuples @@ -168,7 +169,9 @@ self.executeme = self.args.sysexe else: if self.args.packages: - self.executeme = self.args.packages.split(",")[0].split(":")[0] + self.executeme = ( + self.args.packages.split(",")[0].split(":")[0].strip() + ) else: self.executeme = None aCL = self.cl.append @@ -226,8 +229,12 @@ else: aCL(self.executeme) aXCL(self.executeme) - self.elog = os.path.join(self.repdir, "%s_error_log.txt" % self.tool_name) - self.tlog = os.path.join(self.repdir, "%s_runner_log.txt" % self.tool_name) + self.elog = os.path.join( + self.repdir, "%s_error_log.txt" % self.tool_name + ) + self.tlog = os.path.join( + self.repdir, "%s_runner_log.txt" % self.tool_name + ) if self.args.parampass == "0": self.clsimple() @@ -235,15 +242,15 @@ clsuffix = [] xclsuffix = [] for i, p in enumerate(self.infiles): - if p[IOCLPOS] == "STDIN": + if p[IOCLPOS].upper() == "STDIN": appendme = [ - p[IOCLPOS], + p[ICLPOS], p[ICLPOS], p[IPATHPOS], "< %s" % p[IPATHPOS], ] xappendme = [ - p[IOCLPOS], + p[ICLPOS], p[ICLPOS], p[IPATHPOS], "< $%s" % p[ICLPOS], @@ -258,10 +265,14 @@ self.lastclredirect = [">", p[ONAMEPOS]] self.lastxclredirect = [">", "$%s" % p[OCLPOS]] else: - clsuffix.append([p[ONAMEPOS], p[ONAMEPOS], p[ONAMEPOS], ""]) - xclsuffix.append([p[ONAMEPOS], p[ONAMEPOS], "$%s" % p[ONAMEPOS], ""]) + clsuffix.append([p[OCLPOS], p[ONAMEPOS], p[ONAMEPOS], ""]) + xclsuffix.append( + [p[OCLPOS], p[ONAMEPOS], "$%s" % p[ONAMEPOS], ""] + ) for p in self.addpar: - clsuffix.append([p[AOCLPOS], p[ACLPOS], p[AVALPOS], p[AOVERPOS]]) + clsuffix.append( + [p[AOCLPOS], p[ACLPOS], p[AVALPOS], p[AOVERPOS]] + ) xclsuffix.append( [p[AOCLPOS], p[ACLPOS], '"$%s"' % p[ANAMEPOS], p[AOVERPOS]] ) 
@@ -290,52 +301,58 @@ self.spacedScript = [f" {x}" for x in rx if x.strip() > ""] art = "%s.%s" % (self.tool_name, self.executeme) artifact = open(art, "wb") - artifact.write(bytes('\n'.join(self.escapedScript),'utf8')) + artifact.write(bytes("\n".join(self.escapedScript), "utf8")) artifact.close() def cleanuppar(self): """ positional parameters are complicated by their numeric ordinal""" - for i, p in enumerate(self.infiles): - infp = copy.copy(p) - if self.args.parampass == "positional": - assert infp[ - ICLPOS - ].isdigit(), "Positional parameters must be ordinal integers - got %s for %s" % ( - infp[ICLPOS], - infp[ILABPOS], + if self.args.parampass == "positional": + for i, p in enumerate(self.infiles): + assert ( + p[ICLPOS].isdigit() or p[ICLPOS].strip().upper() == "STDIN" + ), "Positional parameters must be ordinal integers - got %s for %s" % ( + p[ICLPOS], + p[ILABPOS], ) - icl = infp[ICLPOS] - infp.append(icl) - if infp[ICLPOS].isdigit() or self.args.parampass == "0": - scl = "input%d" % (i + 1) - infp[ICLPOS] = scl - self.infiles[i] = infp - for i, p in enumerate( - self.outfiles - ): - if self.args.parampass == "positional" and p[OCLPOS].upper() != "STDOUT": - assert p[ - OCLPOS - ].isdigit(), "Positional parameters must be ordinal integers - got %s for %s" % ( + for i, p in enumerate(self.outfiles): + assert ( + p[OCLPOS].isdigit() + or p[OCLPOS].strip().upper() == "STDOUT" + ), "Positional parameters must be ordinal integers - got %s for %s" % ( p[OCLPOS], p[ONAMEPOS], ) - p.append(p[OCLPOS]) # keep copy - if p[OOCLPOS].isdigit() or p[OOCLPOS].upper() == "STDOUT": - scl = p[ONAMEPOS] - p[OCLPOS] = scl - self.outfiles[i] = p - for i, p in enumerate(self.addpar): - if self.args.parampass == "positional": + for i, p in enumerate(self.addpar): assert p[ ACLPOS ].isdigit(), "Positional parameters must be ordinal integers - got %s for %s" % ( p[ACLPOS], p[ANAMEPOS], ) + for i, p in enumerate(self.infiles): + infp = copy.copy(p) + icl = infp[ICLPOS] + 
infp.append(icl) + if ( + infp[ICLPOS].isdigit() + or self.args.parampass == "0" + or infp[ICLPOS].strip().upper() == "STDOUT" + ): + scl = "input%d" % (i + 1) + infp[ICLPOS] = scl + self.infiles[i] = infp + for i, p in enumerate(self.outfiles): + p.append(p[OCLPOS]) # keep copy + if ( + p[OOCLPOS].isdigit() and self.args.parampass != "positional" + ) or p[OOCLPOS].strip().upper() == "STDOUT": + scl = p[ONAMEPOS] + p[OCLPOS] = scl + self.outfiles[i] = p + for i, p in enumerate(self.addpar): p.append(p[ACLPOS]) if p[ACLPOS].isdigit(): - scl = "input%s" % p[ACLPOS] + scl = "param%s" % p[ACLPOS] p[ACLPOS] = scl self.addpar[i] = p @@ -370,7 +387,6 @@ aXCL(self.lastxclredirect[0]) aXCL(self.lastxclredirect[1]) - def clargparse(self): """argparse style""" aCL = self.cl.append @@ -396,7 +412,6 @@ aCL(k) aCL(v) - def getNdash(self, newname): if self.is_positional: ndash = 0 @@ -408,11 +423,17 @@ def doXMLparam(self): """flake8 made me do this...""" - for p in self.outfiles: # --output_files "$otab.history_name~~~$otab.history_format~~~$otab.history_CL~~~$otab.history_test" + for ( + p + ) in ( + self.outfiles + ): # --output_files "$otab.history_name~~~$otab.history_format~~~$otab.history_CL~~~$otab.history_test" newname, newfmt, newcl, test, oldcl = p test = test.strip() ndash = self.getNdash(newcl) - aparm = gxtp.OutputData(name=newname, format=newfmt, num_dashes=ndash, label=newcl) + aparm = gxtp.OutputData( + name=newname, format=newfmt, num_dashes=ndash, label=newcl + ) aparm.positional = self.is_positional if self.is_positional: if oldcl.upper() == "STDOUT": @@ -430,30 +451,30 @@ if test.split(":")[1].isdigit: ld = int(test.split(":")[1]) tp = gxtp.TestOutput( - name=newcl, - value="%s_sample" % newcl, - format=newfmt, - compare= c, - lines_diff=ld, - ) + name=newname, + value="%s_sample" % newname, + format=newfmt, + compare=c, + lines_diff=ld, + ) elif test.startswith("sim_size"): c = "sim_size" tn = test.split(":")[1].strip() - if tn > '': - if '.' 
in tn: + if tn > "": + if "." in tn: delta = None - delta_frac = min(1.0,float(tn)) + delta_frac = min(1.0, float(tn)) else: delta = int(tn) delta_frac = None tp = gxtp.TestOutput( - name=newcl, - value="%s_sample" % newcl, - format=newfmt, - compare= c, - delta = delta, - delta_frac = delta_frac - ) + name=newname, + value="%s_sample" % newname, + format=newfmt, + compare=c, + delta=delta, + delta_frac=delta_frac, + ) self.testparam.append(tp) for p in self.infiles: newname = p[ICLPOS] @@ -477,7 +498,16 @@ tparm = gxtp.TestParam(name=newname, value="%s_sample" % newname) self.testparam.append(tparm) for p in self.addpar: - newname, newval, newlabel, newhelp, newtype, newcl, override, oldcl = p + ( + newname, + newval, + newlabel, + newhelp, + newtype, + newcl, + override, + oldcl, + ) = p if not len(newlabel) > 0: newlabel = newname ndash = self.getNdash(newname) @@ -563,7 +593,9 @@ Hmmm. How to get the command line into correct order... """ if self.command_override: - self.newtool.command_override = self.command_override # config file + self.newtool.command_override = ( + self.command_override + ) # config file else: self.newtool.command_override = self.xmlcl if self.args.help_text: @@ -571,14 +603,14 @@ safertext = "\n".join([cheetah_escape(x) for x in helptext]) if self.args.script_path: scr = [x for x in self.spacedScript if x.strip() > ""] - scr.insert(0,'\n------\n\n\nScript::\n') + scr.insert(0, "\n------\n\n\nScript::\n") if len(scr) > 300: scr = ( scr[:100] + [" >300 lines - stuff deleted", " ......"] + scr[-100:] ) - scr.append('\n') + scr.append("\n") safertext = safertext + "\n".join(scr) self.newtool.help = safertext else: @@ -591,9 +623,9 @@ requirements = gxtp.Requirements() if self.args.packages: for d in self.args.packages.split(","): - ver = '' - d = d.replace('==',':') - d = d.replace('=',':') + ver = "" + d = d.replace("==", ":") + d = d.replace("=", ":") if ":" in d: packg, ver = d.split(":") else: @@ -610,7 +642,11 @@ self.newtool.inputs = 
self.tinputs if self.args.script_path: configfiles = gxtp.Configfiles() - configfiles.append(gxtp.Configfile(name="runme", text="\n".join(self.escapedScript))) + configfiles.append( + gxtp.Configfile( + name="runme", text="\n".join(self.escapedScript) + ) + ) self.newtool.configfiles = configfiles tests = gxtp.Tests() test_a = gxtp.Test() @@ -627,7 +663,9 @@ "Cite: Creating re-usable tools from scripts doi:10.1093/bioinformatics/bts573" ) exml0 = self.newtool.export() - exml = exml0.replace(FAKEEXE, "") # temporary work around until PR accepted + exml = exml0.replace( + FAKEEXE, "" + ) # temporary work around until PR accepted if ( self.test_override ): # cannot do this inside galaxyxml as it expects lxml objects for tests @@ -635,7 +673,7 @@ part2 = exml.split("")[1] fixed = "%s\n%s\n%s" % (part1, self.test_override, part2) exml = fixed - #exml = exml.replace('range="1:"', 'range="1000:"') + # exml = exml.replace('range="1:"', 'range="1000:"') xf = open("%s.xml" % self.tool_name, "w") xf.write(exml) xf.write("\n") @@ -657,14 +695,17 @@ else: ste = open(self.elog, "w") if self.lastclredirect: - sto = open(self.lastclredirect[1], "wb") # is name of an output file + sto = open( + self.lastclredirect[1], "wb" + ) # is name of an output file else: if os.path.exists(self.tlog): sto = open(self.tlog, "a") else: sto = open(self.tlog, "w") sto.write( - "## Executing Toolfactory generated command line = %s\n" % scl + "## Executing Toolfactory generated command line = %s\n" + % scl ) sto.flush() subp = subprocess.run( @@ -685,7 +726,9 @@ subp = subprocess.run( self.cl, env=self.ourenv, shell=False, stdout=sto, stdin=sti ) - sto.write("## Executing Toolfactory generated command line = %s\n" % scl) + sto.write( + "## Executing Toolfactory generated command line = %s\n" % scl + ) retval = subp.returncode sto.close() sti.close() @@ -698,112 +741,6 @@ logging.debug("run done") return retval - def copy_to_container(self, src, dest, container): - """Recreate the src directory tree 
at dest - full path included""" - idir = os.getcwd() - workdir = os.path.dirname(src) - os.chdir(workdir) - _, tfname = tempfile.mkstemp(suffix=".tar") - tar = tarfile.open(tfname, mode="w") - srcb = os.path.basename(src) - tar.add(srcb) - tar.close() - data = open(tfname, "rb").read() - container.put_archive(dest, data) - os.unlink(tfname) - os.chdir(idir) - - def copy_from_container(self, src, dest, container): - """recreate the src directory tree at dest using docker sdk""" - os.makedirs(dest, exist_ok=True) - _, tfname = tempfile.mkstemp(suffix=".tar") - tf = open(tfname, "wb") - bits, stat = container.get_archive(src) - for chunk in bits: - tf.write(chunk) - tf.close() - tar = tarfile.open(tfname, "r") - tar.extractall(dest) - tar.close() - os.unlink(tfname) - - def planemo_biodocker_test(self): - """planemo currently leaks dependencies if used in the same container and gets unhappy after a - first successful run. https://github.com/galaxyproject/planemo/issues/1078#issuecomment-731476930 - - Docker biocontainer has planemo with caches filled to save repeated downloads - - - """ - - def prun(container, tout, cl, user="biodocker"): - rlog = container.exec_run(cl, user=user) - slogl = str(rlog).split("\\n") - slog = "\n".join(slogl) - tout.write(f"## got rlog {slog} from {cl}\n") - - if os.path.exists(self.tlog): - tout = open(self.tlog, "a") - else: - tout = open(self.tlog, "w") - planemoimage = "quay.io/fubar2/planemo-biocontainer" - xreal = "%s.xml" % self.tool_name - repname = f"{self.tool_name}_planemo_test_report.html" - ptestrep_path = os.path.join(self.repdir, repname) - tool_name = self.tool_name - client = docker.from_env() - tvol = client.volumes.create() - tvolname = tvol.name - destdir = "/toolfactory/ptest" - imrep = os.path.join(destdir, repname) - # need to keep the container running so keep it open with sleep - # will stop and destroy it when we are done - container = client.containers.run( - planemoimage, - "sleep 120m", - detach=True, - 
user="biodocker", - volumes={f"{tvolname}": {"bind": "/toolfactory", "mode": "rw"}}, - ) - cl = f"mkdir -p {destdir}" - prun(container, tout, cl, user="root") - # that's how hard it is to get root on a biodocker container :( - cl = f"rm -rf {destdir}/*" - prun(container, tout, cl, user="root") - ptestpath = os.path.join(destdir, "tfout", xreal) - self.copy_to_container(self.tooloutdir, destdir, container) - cl = "chown -R biodocker /toolfactory" - prun(container, tout, cl, user="root") - rlog = container.exec_run(f"ls -la {destdir}") - ptestcl = f"planemo test --update_test_data --no_cleanup --test_data {destdir}/tfout/test-data --galaxy_root /home/biodocker/galaxy-central {ptestpath}" - try: - rlog = container.exec_run(ptestcl) - # fails because test outputs missing but updates the test-data directory - except: - e = sys.exc_info()[0] - tout.write(f"#### error: {e} from {ptestcl}\n") - cl = f"planemo test --test_output {imrep} --no_cleanup --test_data {destdir}/tfout/test-data --galaxy_root /home/biodocker/galaxy-central {ptestpath}" - try: - prun(container, tout, cl) - except: - e = sys.exc_info()[0] - tout.write(f"#### error: {e} from {ptestcl}\n") - testouts = tempfile.mkdtemp(suffix=None, prefix="tftemp", dir=".") - self.copy_from_container(destdir, testouts, container) - src = os.path.join(testouts, "ptest") - if os.path.isdir(src): - shutil.copytree(src, ".", dirs_exist_ok=True) - src = repname - if os.path.isfile(repname): - shutil.copyfile(src, ptestrep_path) - else: - tout.write(f"No output from run to shutil.copytree in {src}\n") - tout.close() - container.stop() - container.remove() - tvol.remove() - shutil.rmtree(testouts) # leave for debugging - def shedLoad(self): """ use bioblend to create new repository @@ -816,7 +753,9 @@ sto = open(self.tlog, "w") ts = toolshed.ToolShedInstance( - url=self.args.toolshed_url, key=self.args.toolshed_api_key, verify=False + url=self.args.toolshed_url, + key=self.args.toolshed_api_key, + verify=False, ) repos = 
ts.repositories.get_repositories() rnames = [x.get("name", "?") for x in repos] @@ -840,7 +779,9 @@ category_ids=catID, ) tid = res.get("id", None) - sto.write(f"#create_repository {self.args.tool_name} tid={tid} res={res}\n") + sto.write( + f"#create_repository {self.args.tool_name} tid={tid} res={res}\n" + ) else: i = rnames.index(self.tool_name) tid = rids[i] @@ -882,16 +823,20 @@ ] tout.write("running\n%s\n" % " ".join(cll)) subp = subprocess.run( - cll, env=self.ourenv, cwd=self.ourcwd, shell=False, stderr=tout, stdout=tout + cll, + env=self.ourenv, + cwd=self.ourcwd, + shell=False, + stderr=tout, + stdout=tout, ) tout.write( - "installed %s - got retcode %d\n" % (self.tool_name, subp.returncode) + "installed %s - got retcode %d\n" + % (self.tool_name, subp.returncode) ) tout.close() return subp.returncode - - def writeShedyml(self): """for planemo""" yuser = self.args.user_email.split("@")[0] @@ -950,7 +895,11 @@ % (tdest, self.testdir) ) tf = tarfile.open(self.newtarpath, "w:gz") - tf.add(name=self.tooloutdir, arcname=self.tool_name, filter=exclude_function) + tf.add( + name=self.tooloutdir, + arcname=self.tool_name, + filter=exclude_function, + ) tf.close() shutil.copyfile(self.newtarpath, self.args.new_tool) @@ -990,7 +939,8 @@ def main(): """ - This is a Galaxy wrapper. It expects to be called by a special purpose tool.xml + This is a Galaxy wrapper. 
+ It expects to be called by a special purpose tool.xml """ parser = argparse.ArgumentParser() @@ -1020,35 +970,48 @@ a("--new_tool", default="new_tool") a("--galaxy_url", default="http://localhost:8080") a("--toolshed_url", default="http://localhost:9009") - # make sure this is identical to tool_sheds_conf.xml localhost != 127.0.0.1 so validation fails + # make sure this is identical to tool_sheds_conf.xml + # localhost != 127.0.0.1 so validation fails a("--toolshed_api_key", default="fakekey") a("--galaxy_api_key", default="fakekey") a("--galaxy_root", default="/galaxy-central") a("--galaxy_venv", default="/galaxy_venv") args = parser.parse_args() assert not args.bad_user, ( - 'UNAUTHORISED: %s is NOT authorized to use this tool until Galaxy admin adds %s to "admin_users" in the galaxy.yml Galaxy configuration file' + 'UNAUTHORISED: %s is NOT authorized to use this tool until Galaxy \ +admin adds %s to "admin_users" in the galaxy.yml Galaxy configuration file' % (args.bad_user, args.bad_user) ) - assert args.tool_name, "## Tool Factory expects a tool name - eg --tool_name=DESeq" + assert ( + args.tool_name + ), "## Tool Factory expects a tool name - eg --tool_name=DESeq" assert ( args.sysexe or args.packages - ), "## Tool Factory wrapper expects an interpreter or an executable package" - args.input_files = [x.replace('"', "").replace("'", "") for x in args.input_files] + ), "## Tool Factory wrapper expects an interpreter \ +or an executable package in --sysexe or --packages" + args.input_files = [ + x.replace('"', "").replace("'", "") for x in args.input_files + ] # remove quotes we need to deal with spaces in CL params for i, x in enumerate(args.additional_parameters): - args.additional_parameters[i] = args.additional_parameters[i].replace('"', "") + args.additional_parameters[i] = args.additional_parameters[i].replace( + '"', "" + ) r = ScriptRunner(args) r.writeShedyml() r.makeTool() if args.make_Tool == "generate": - retcode = r.run() # for testing 
toolfactory itself + retcode = r.run() r.moveRunOutputs() r.makeToolTar() else: - r.planemo_biodocker_test() # test to make outputs and then test + retcode = r.planemo_test(genoutputs=True) # this fails :( - see PR r.moveRunOutputs() r.makeToolTar() + retcode = r.planemo_test(genoutputs=False) + r.moveRunOutputs() + r.makeToolTar() + print(f"second planemo_test returned {retcode}") if args.make_Tool == "gentestinstall": r.shedLoad() r.eph_galaxy_load() diff -r e43c43396a70 -r 8ea1133b9d9a toolfactory/rgToolFactory2.xml --- a/toolfactory/rgToolFactory2.xml Fri Dec 11 04:23:48 2020 +0000 +++ b/toolfactory/rgToolFactory2.xml Tue Jan 05 00:34:48 2021 +0000 @@ -1,4 +1,4 @@ - + Scripts into tools v2.0 @@ -73,8 +73,9 @@ - - + + + - @@ -145,8 +146,7 @@ galaxyxml bioblend ephemeris - docker-py - planemo + planemo - + @@ -339,8 +338,8 @@ - - + + diff -r e43c43396a70 -r 8ea1133b9d9a toolfactory/testtf.sh --- a/toolfactory/testtf.sh Fri Dec 11 04:23:48 2020 +0000 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 @@ -1,2 +0,0 @@ -planemo test --no_cleanup --no_dependency_resolution --skip_venv --galaxy_root ~/galaxy ~/galaxy/tools/tool_makers/toolfactory &>foo -