Mercurial > repos > fubar > tool_factory_2
diff README.txt @ 26:db35d39e1de9 draft
Passes planemo test
Uses galaxyxml to generate new tool.
More outputs will be added...
author | fubar |
---|---|
date | Thu, 30 Jul 2020 06:48:45 -0400 |
parents | d98f5a09137f |
children |
line wrap: on
line diff
--- a/README.txt Mon Mar 02 05:18:21 2015 -0500 +++ b/README.txt Thu Jul 30 06:48:45 2020 -0400 @@ -16,63 +16,28 @@ run scripts in R, python, sh and perl over multiple selected input data sets, writing a single new data set as output. -*Differences between TF2 and the original Tool Factory* - -1. TF2 (this one) allows any number of either fixed or user-editable parameters to be defined -for the new tool. If these are editable, the user can change them but otherwise, they are passed -as fixed and invisible parameters for each execution. Obviously, there are substantial security -implications with editable parameters, but these are always sanitized by Galaxy's inbuilt -parameter sanitization so you may need to "unsanitize" characters - eg translate all "__lt__" -into "<" for certain parameters where that is needed. Please practise safe toolshed. - -2. Any number of (the same datatype) of input files may be defined. - -These changes substantially complicate the way your supplied script is supplied with -all the new and variable parameters. Examples in each scripting language are shown -in the tool help +*You have a working r/python/perl/bash script or any executable with positional or argparse style parameters* -*Automated outputs in named sections* - -If your script writes to the current directory path, arbitrary mix of (eg) -pdfs, tabular analysis results and run logs,the tool factory can optionally -auto-generate a linked Html page with separate sections showing a thumbnail -grid for all pdfs and the log text, grouping all artifacts sharing a file -name and log name prefix:: +It can be turned into an ordinary Galaxy tool in minutes, using a Galaxy tool. - eg: if "foo.log" is emitted then *all* other outputs matching foo_* will - all be grouped together - eg - foo_baz.pdf - foo_bar.pdf and - foo_zot.xls - would all be displayed and linked in the same section with foo.log's contents - - to form the "Foo" section of the Html page. Sections appear in alphabetic - order and there are no limits on the number of files or sections. *Automated generation of new Galaxy tools for installation into any Galaxy* -Once a script is working correctly, this tool optionally generates a -new Galaxy tool, effectively freezing the supplied script into a new, -ordinary Galaxy tool that runs it over one or more input files selected by -the user. Generated tools are installed via a tool shed by an administrator +A test is generated using small sample test data inputs and parameter settings you supply. +Once the test case outputs have been produced, they can be used to build a +new Galaxy tool. The supplied script or executable is baked as a requirement +into a new, ordinary Galaxy tool, fully workflow compatible out of the box. +Generated tools are installed via a tool shed by an administrator and work exactly like all other Galaxy tools for your users. -If you use the Html output option, please ensure that sanitize_all_html is -set to False and uncommented in universe_wsgi.ini - it should show:: - - # By default, all tool output served as 'text/html' will be sanitized - sanitize_all_html = False - -This opens potential security risks and may not be acceptable for public -sites where the lack of stylesheets may make Html pages damage onlookers' -eyeballs but should still be correct. - - *More Detail* To use the ToolFactory, you should have prepared a script to paste into a -text box, and a small test input example ready to select from your history +text box, or have a package in mind and a small test input example ready to select from your history to test your new script. +```planemo test rgToolFactory2.xml --galaxy_root ~/galaxy --test_data ~/galaxy/tools/tool_makers/toolfactory/test-data``` works for me + There is an example in each scripting language on the Tool Factory form. You can just cut and paste these to try it out - remember to select the right interpreter please. You'll also need to create a small test data set using @@ -129,12 +94,6 @@ to your local data_types_conf.xml. ) -Of course, R, python, perl etc are needed on your path if you want to test -scripts using those interpreters. Adding new ones to this tool code should -be easy enough. Please make suggestions as bitbucket issues and code. The -HTML file code automatically shrinks R's bloated pdfs, and depends on -ghostscript. The thumbnails require imagemagick . - * Restricted execution * The tool factory tool itself will then be usable ONLY by admin users - people with IDs in admin_users in universe_wsgi.ini **Yes, that's right. ONLY @@ -184,161 +143,6 @@ Licensed under the LGPL if you want to improve it, feel free https://bitbucket.org/fubar/galaxytoolfactory/wiki/Home -Material for our more enthusiastic and voracious readers continues below - -we salute you. - -**Motivation** Simple transformation, filtering or reporting scripts get -written, run and lost every day in most busy labs - even ours where Galaxy is -in use. This 'dark script matter' is pervasive and generally not reproducible. - -**Benefits** For our group, this allows Galaxy to fill that important dark -script gap - all those "small" bioinformatics tasks. Once a user has a working -R (or python or perl) script that does something Galaxy cannot currently do -(eg transpose a tabular file) and takes parameters the way Galaxy supplies -them (see example below), they: - -1. Install the tool factory on a personal private instance - -2. Upload a small test data set - -3. Paste the script into the 'script' text box and iteratively run the -insecure tool on test data until it works right - there is absolutely no -reason to do this anywhere other than on a personal private instance. - -4. Once it works right, set the 'Generate toolshed gzip' option and run -it again. - -5. A toolshed style gzip appears ready to upload and install like any other -Toolshed entry. - -6. Upload the new tool to the toolshed - -7. Ask the local admin to check the new tool to confirm it's not evil and -install it in the local production galaxy - -**Simple examples on the tool form** - -A simple Rscript "filter" showing how the command line parameters can be -handled, takes an input file, does something (transpose in this case) and -writes the results to a new tabular file:: - - # transpose a tabular input file and write as a tabular output file - ourargs = commandArgs(TRUE) - inf = ourargs[1] - outf = ourargs[2] - inp = read.table(inf,head=F,row.names=NULL,sep='\t') - outp = t(inp) - write.table(outp,outf, quote=FALSE, sep="\t",row.names=F,col.names=F) - -Calculate a multiple test adjusted p value from a column of p values - -for this script to be useful, it needs the right column for the input to be -specified in the code for the given input file type(s) specified when the -tool is generated :: - - # use p.adjust - assumes a HEADER row and column 1 - please fix for any - real use - column = 1 # adjust if necessary for some other kind of input - fdrmeth = 'BH' - ourargs = commandArgs(TRUE) - inf = ourargs[1] - outf = ourargs[2] - inp = read.table(inf,head=T,row.names=NULL,sep='\t') - p = inp[,column] - q = p.adjust(p,method=fdrmeth) - newval = paste(fdrmeth,'p-value',sep='_') - q = data.frame(q) - names(q) = newval - outp = cbind(inp,newval=q) - write.table(outp,outf, quote=FALSE, sep="\t",row.names=F,col.names=T) - - - -Another Rscript example without any input file - generates a random heatmap -pdf - you must make sure the option to create an HTML output file is -turned on for this to work. The heatmap will be presented as a thumbnail -linked to the pdf in the resulting HTML page:: - - # note this script takes NO input or output because it generates random data - foo = data.frame(a=runif(100),b=runif(100),c=runif(100),d=runif(100), - e=runif(100),f=runif(100)) - bar = as.matrix(foo) - pdf( "heattest.pdf" ) - heatmap(bar,main='Random Heatmap') - dev.off() - -A Python example that reverses each row of a tabular file. You'll need -to remove the leading spaces for this to work if cut and pasted into the -script box. Note that you can already do this in Galaxy by setting up the -cut columns tool with the correct number of columns in reverse order,but -this script will work for any number of columns so is completely generic:: - -# reverse order of columns in a tabular file -import sys -inp = sys.argv[1] -outp = sys.argv[2] -i = open(inp,'r') -o = open(outp,'w') -for row in i: - rs = row.rstrip().split('\t') - rs.reverse() - o.write('\t'.join(rs)) - o.write('\n') -i.close() -o.close() - - -Galaxy as an IDE for developing API scripts -If you need to develop Galaxy API scripts and you like to live dangerously, -please read on. - -Galaxy as an IDE? -Amazingly enough, blend-lib API scripts run perfectly well *inside* -Galaxy when pasted into a Tool Factory form. No need to generate a new -tool. Galaxy+Tool_Factory = IDE I think we need a new t-shirt. Seriously, -it is actually quite useable. - -Why bother - what's wrong with Eclipse -Nothing. But, compared with developing API scripts in the usual way outside -Galaxy, you get persistence and other framework benefits plus at absolutely -no extra charge, a ginormous security problem if you share the history or -any outputs because they contain the api script with key so development -servers only please! - -Workflow -Fire up the Tool Factory in Galaxy. - -Leave the input box empty, set the interpreter to python, paste and run an -api script - eg working example (substitute the url and key) below. - -It took me a few iterations to develop the example below because I know -almost nothing about the API. I started with very simple code from one of the -samples and after each run, the (edited..) api script is conveniently recreated -using the redo button on the history output item. So each successive version -of the developing api script you run is persisted - ready to be edited and -rerun easily. It is ''very'' handy to be able to add a line of code to the -script and run it, then view the output to (eg) inspect dicts returned by -API calls to help move progressively deeper iteratively. - -Give the below a whirl on a private clone (install the tool factory from -the main toolshed) and try adding complexity with few rerun/edit/rerun cycles. - -Eg tool factory api script -import sys -from blend.galaxy import GalaxyInstance -ourGal = 'http://x.x.x.x:xxxx' -ourKey = 'xxx' -gi = GalaxyInstance(ourGal, key=ourKey) -libs = gi.libraries.get_libraries() -res = [] -# libs looks like -# u'url': u'/galaxy/api/libraries/441d8112651dc2f3', u'id': -u'441d8112651dc2f3', u'name':.... u'Demonstration sample RNA data', -for lib in libs: - res.append('%s:\n' % lib['name']) - res.append(str(gi.libraries.show_library(lib['id'],contents=True))) -outf=open(sys.argv[2],'w') -outf.write('\n'.join(res)) -outf.close() **Attribution** Creating re-usable tools from scripts: The Galaxy Tool Factory