annotate toolfactory/README.md @ 31:69eed330c91f draft

Uploaded
author fubar
date Fri, 07 Aug 2020 07:55:35 -0400
parents 6f48315c32c1
children 4d578c8c1613
toolfactory_2
=============

This is an upgrade to the tool factory with added parameters
(optionally editable in the generated tool form - otherwise fixed) and
multiple input files.

Specify any number of parameters - well, at
least up to the limit of your patience with repeat groups.

Parameter values supplied at tool generation time are defaults and
can optionally be made editable by the user. Parameter names cannot be
changed once a tool has been generated.

If not editable, they act as hidden parameters passed to the script
and do not appear on the tool form.

Note! Galaxy applies its default sanitization to all
user input parameters, which your script may need to dance around.

Any number of input files can be passed to your script, but of course it
has to deal with them. Both path and metadata name are supplied, either in the environment
(bash/sh) or as command line parameters (python, perl, Rscript) that need to be parsed and
dealt with in the script. This is complicated by the common use case of needing file names
for (eg) column headers, as well as paths. Try the examples shown on the tool factory
form to see how Galaxy file and user supplied parameter values can be recovered in each
of the 4 scripting environments supported.

The best way to deal with multiple outputs is to let the tool factory generate an HTML
page for your users. It automagically lays out pdf images as thumbnail galleries
and can have separate results sections gathering all similarly prefixed files, such as
a Foo section built from the log text (foo_whatever.log) and any artifacts sharing
that prefix (eg foo_MDS_plot.pdf). All artifacts are linked for download.
A copy of the actual script is provided for provenance - be warned, it exposes
real file paths.

**WARNING before you start**

Install this tool on a private Galaxy ONLY.
Please NEVER on a public or production instance.

Please cite the resource at
http://bioinformatics.oxfordjournals.org/cgi/reprint/bts573?ijkey=lczQh1sWrMwdYWJ&keytype=ref
if you use this tool in your published work.


*Short Story*

This is an unusual Galaxy tool capable of generating new Galaxy tools.
It works by exposing *unrestricted* and therefore extremely dangerous scripting
to all designated administrators of the host Galaxy server, allowing them to
run scripts in R, python, sh and perl over multiple selected input data sets,
writing a single new data set as output.

*Differences between TF2 and the original Tool Factory*

1. TF2 (this one) allows any number of either fixed or user-editable parameters to be defined
for the new tool. If these are editable, the user can change them; otherwise, they are passed
as fixed and invisible parameters for each execution. Obviously, there are substantial security
implications with editable parameters, but these are always sanitized by Galaxy's inbuilt
parameter sanitization, so you may need to "unsanitize" characters - eg translate all "__lt__"
into "<" for certain parameters where that is needed. Please practise safe toolshed.

2. Any number of input files (all of the same datatype) may be defined.

These changes substantially complicate the way your script is supplied with
all the new and variable parameters. Examples in each scripting language are shown
in the tool help.

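As a rough illustration of the "unsanitize" step mentioned above, the sketch below reverses a few escape tokens in Python. The token table is an assumption based on Galaxy's default sanitizer and covers only a handful of characters, so check it against your own Galaxy version. `GALAXY_ESCAPES` and `unsanitize` are hypothetical names, not part of the tool factory.

```python
# Assumed subset of Galaxy's default sanitizer substitutions.
# Verify against your Galaxy version before relying on it.
GALAXY_ESCAPES = {
    "__lt__": "<",
    "__gt__": ">",
    "__sq__": "'",
    "__dq__": '"',
    "__ob__": "[",
    "__cb__": "]",
}


def unsanitize(value, escapes=GALAXY_ESCAPES):
    """Replace each escape token in value with the character it stands for."""
    for token, char in escapes.items():
        value = value.replace(token, char)
    return value


if __name__ == "__main__":
    # A parameter value as it might arrive after sanitization
    print(unsanitize("x __lt__ 5"))
```

Only unsanitize the parameters that genuinely need it, since the escaping is there to protect the command line.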

*Automated outputs in named sections*

If your script writes an arbitrary mix of (eg) pdfs, tabular analysis results
and run logs to the current directory, the tool factory can optionally
auto-generate a linked Html page with separate sections showing a thumbnail
grid for all pdfs and the log text, grouping all artifacts that share a log
name prefix. If "foo.log" is emitted, then *all* other outputs matching foo_* will
be grouped together - eg

- foo_baz.pdf
- foo_bar.pdf
- foo_zot.xls

would all be displayed and linked in the same section with foo.log's contents to form the "Foo" section of the Html page.
Sections appear in alphabetic order and there are no limits on the number of files or sections.

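The grouping rule described above can be sketched in a few lines of Python. This is only an illustration of the naming convention, not the tool factory's own code. `group_by_log_prefix` is a hypothetical helper, and giving a file to the longest matching log prefix is an assumption made here to settle ambiguous names.

```python
import os


def group_by_log_prefix(filenames):
    """Map each log prefix to the sorted outputs named '<prefix>_*'."""
    # Every '<prefix>.log' file defines one section; sort for alphabetic order
    prefixes = sorted(
        os.path.splitext(name)[0] for name in filenames if name.endswith(".log")
    )
    sections = {prefix: [] for prefix in prefixes}
    for name in sorted(filenames):
        if name.endswith(".log"):
            continue
        # Assumption: the longest matching prefix wins for ambiguous names
        matches = [p for p in prefixes if name.startswith(p + "_")]
        if matches:
            sections[max(matches, key=len)].append(name)
    return sections


if __name__ == "__main__":
    outputs = ["foo.log", "foo_baz.pdf", "foo_bar.pdf", "foo_zot.xls"]
    for section, files in group_by_log_prefix(outputs).items():
        print(section, files)
```

Files that match no log prefix are left out of every section in this sketch.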

*Automated generation of new Galaxy tools for installation into any Galaxy*

Once a script is working correctly, this tool optionally generates a
new Galaxy tool, effectively freezing the supplied script into a new,
ordinary Galaxy tool that runs it over one or more input files selected by
the user. Generated tools are installed via a tool shed by an administrator
and work exactly like all other Galaxy tools for your users.

If you use the Html output option, please ensure that sanitize_all_html is
set to False and uncommented in universe_wsgi.ini. The file should show the comment

By default, all tool output served as 'text/html' will be sanitized

Change the setting to ```sanitize_all_html = False```

This opens potential security risks and may not be acceptable for public
sites. If sanitization is left on, the lack of stylesheets may make Html
pages damage onlookers' eyeballs, but the content should still be correct.

*More Detail*

To use the ToolFactory, you should have prepared a script to paste into a
text box, and a small test input example ready to select from your history
to test your new script.

There is an example in each scripting language on the Tool Factory form. You
can just cut and paste these to try it out - remember to select the right
interpreter please. You'll also need to create a small test data set using
the Galaxy history add new data tool.

If the script fails somehow, use the "redo" button on the tool output in
your history to recreate the form complete with broken script. Fix the bug
and execute again. Rinse, wash, repeat.

Once the script runs successfully, a new Galaxy tool that runs your script
can be generated. Select the "generate" option and supply some help text and
names. The new tool will be generated in the form of a new Galaxy datatype
- toolshed.gz. As the name suggests, it's an archive ready to upload to a
Galaxy ToolShed as a new tool repository.

Once it's in a ToolShed, it can be installed into any local Galaxy server
from the server administrative interface.

Once the new tool is installed, local users can run it - each time, the script
that was supplied when it was built will be executed with the input chosen
from the user's history. In other words, the tools you generate with the
ToolFactory run just like any other Galaxy tool, but run your script every time.

Tool factory tools are perfect for workflow components. One input, one output,
no variables.

*To fully and safely exploit the awesome power* of this tool,
Galaxy and the ToolShed, you should be a developer installing this
tool on a private/personal/scratch local instance where you are an
admin_user. Then, if you break it, you get to keep all the pieces. See
https://bitbucket.org/fubar/galaxytoolfactory/wiki/Home

**Installation**

This is a Galaxy tool. You can install it most conveniently using the
administrative "Search and browse tool sheds" link. Find the Galaxy Main
toolshed at https://toolshed.g2.bx.psu.edu/ and search for the toolfactory
repository. Open it, review the code and select the option to install it.


If you can't get the tool that way, the xml and py files here need to be
copied into a new tools subdirectory such as tools/toolfactory.
Your tool_conf.xml needs a new entry pointing to the xml file - something like
```
<section name="Tool building tools" id="toolbuilders">
  <tool file="toolfactory/rgToolFactory.xml"/>
</section>
```
If not already there (I just added it to datatypes_conf.xml.sample),
please add:

```
<datatype extension="toolshed.gz" type="galaxy.datatypes.binary:Binary"
          mimetype="multipart/x-gzip" subclass="True" />
```
to your local datatypes_conf.xml.


Of course, R, python, perl etc are needed on your path if you want to test
scripts using those interpreters. Adding new ones to this tool code should
be easy enough. Please make suggestions as bitbucket issues and code. The
HTML file code automatically shrinks R's bloated pdfs, and depends on
ghostscript. The thumbnails require imagemagick.

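Before testing scripts, it can help to confirm that the interpreters and helper programs are actually on the path. The sketch below does that with Python's standard library. The executable names in `REQUIRED` (for example `gs` for ghostscript and `convert` for imagemagick) are assumptions that vary between systems, and `missing_executables` is a hypothetical helper.

```python
import shutil

# Assumed executable names; adjust them for your system.
REQUIRED = ["python", "Rscript", "perl", "sh", "gs", "convert"]


def missing_executables(names):
    """Return the names that cannot be found on the current PATH."""
    return [name for name in names if shutil.which(name) is None]


if __name__ == "__main__":
    missing = missing_executables(REQUIRED)
    if missing:
        print("Not found on PATH: %s" % ", ".join(missing))
    else:
        print("All required executables were found.")
```

Run it as the same user that runs Galaxy jobs, since that user's path is the one that matters.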

*Restricted execution*

The tool factory tool itself will then be usable ONLY by admin users -
people with IDs in admin_users in universe_wsgi.ini. **Yes, that's right. ONLY
admin_users can run this tool.** Think about it for a moment. If allowed to
run any arbitrary script on your Galaxy server, the only thing that would
impede a miscreant bent on destroying all your Galaxy data would probably
be lack of appropriate technical skills.

*What it does* This is a tool factory for simple scripts in python, R and
perl currently. Functional tests are automatically generated. How cool is that?

LIMITED to simple scripts that read one input from the history. Optionally can
write one new history dataset, and optionally collect any number of outputs
into links on an autogenerated HTML index page for the user to navigate -
useful if the script writes images and output files. Pdf outputs are shown
as thumbnails and R's bloated pdfs are shrunk with ghostscript, so that and
imagemagick need to be available.

Generated tools can be edited and enhanced like any Galaxy tool, so start
small and build up, since a generated script gets you a serious leg up to a
more complex one.

*What you do* You paste and run your script, you fix the syntax errors and
eventually it runs. You can use the redo button and edit the script before
trying to rerun it as you debug - it works pretty well.

Once the script works on some test data, you can generate a toolshed compatible
gzip file containing your script ready to run as an ordinary Galaxy tool in
a repository on your local toolshed. That means safe and largely automated
installation in any production Galaxy configured to use your toolshed.

*Generated tool Security* Once you install a generated tool, it's just
another tool - assuming the script is safe. Generated tools just run normally
and their users cannot do anything unusually insecure, but please practise safe
toolshed. Read the fucking code before you install any tool. Especially this one -
it is really scary.

If you opt for an HTML output, you get all the script outputs arranged
as a single Html history item - all output files are linked, with thumbnails for
all the pdfs. Ugly but really inexpensive.

Patches and suggestions are welcome as bitbucket issues, please.

copyright ross lazarus (ross stop lazarus at gmail stop com) May 2012

all rights reserved
Licensed under the LGPL. If you want to improve it, feel free:
https://bitbucket.org/fubar/galaxytoolfactory/wiki/Home

Material for our more enthusiastic and voracious readers continues below -
we salute you.

**Motivation** Simple transformation, filtering or reporting scripts get
written, run and lost every day in most busy labs - even ours where Galaxy is
in use. This 'dark script matter' is pervasive and generally not reproducible.

**Benefits** For our group, this allows Galaxy to fill that important dark
script gap - all those "small" bioinformatics tasks. Once a user has a working
R (or python or perl) script that does something Galaxy cannot currently do
(eg transpose a tabular file) and takes parameters the way Galaxy supplies
them (see example below), they:

1. Install the tool factory on a personal private instance

2. Upload a small test data set

3. Paste the script into the 'script' text box and iteratively run the
   insecure tool on test data until it works right - there is absolutely no
   reason to do this anywhere other than on a personal private instance.

4. Once it works right, set the 'Generate toolshed gzip' option and run
   it again.

5. A toolshed style gzip appears ready to upload and install like any other
   Toolshed entry.

6. Upload the new tool to the toolshed

7. Ask the local admin to check the new tool to confirm it's not evil and
   install it in the local production Galaxy

**Parameter passing and file inputs**

Your script will receive up to 3 named parameters:

- INPATHS is a comma separated list of input file paths
- INNAMES is a comma separated list of input file names in the same order
- OUTPATH is optional. If a file is being generated, your script should write there

Your script should open and write files in the provided working directory if you are using the Html
automatic presentation option.

Python script command lines will have --INPATHS and --additional_arguments etc. to make it easy to use argparse.

Rscript will need to use commandArgs(TRUE) - see the example below - additional arguments will
appear as themselves - eg foo="bar" will mean that foo is defined as "bar" for the script.

Bash and sh will see any additional parameters on their command lines and the 3 named parameters
in their environment magically - well, using env on the command line.
```
***python***::

# argparse for 3 possible comma separated lists
# additional parameters need to be parsed !
# then echo parameters to the output file
import sys
import argparse

argp = argparse.ArgumentParser()
argp.add_argument('--INNAMES', default=None)
argp.add_argument('--INPATHS', default=None)
argp.add_argument('--OUTPATH', default=None)
argp.add_argument('--additional_parameters', default=[], action="append")
argp.add_argument('otherargs', nargs=argparse.REMAINDER)
args = argp.parse_args()
# OUTPATH is optional, so fall back to stdout when it is not supplied
f = open(args.OUTPATH, 'w') if args.OUTPATH else sys.stdout
f.write('### args=%s\n' % str(args))
f.write('sys.argv=%s\n' % sys.argv)
if f is not sys.stdout:
    f.close()


***Rscript***::

# tool factory Rscript parser suggested by Forester
# http://www.r-bloggers.com/including-arguments-in-r-cmd-batch-mode/
# additional parameters will appear in the ls() below - they are available
# to your script
# echo parameters to the output file
ourargs = commandArgs(TRUE)
if(length(ourargs)==0){
    print("No arguments supplied.")
}else{
    for(i in 1:length(ourargs)){
        eval(parse(text=ourargs[[i]]))
    }
    sink(OUTPATH)
    cat('INPATHS=',INPATHS,'\n')
    cat('INNAMES=',INNAMES,'\n')
    cat('OUTPATH=',OUTPATH,'\n')
    x=ls()
    cat('all objects=',x,'\n')
    sink()
}
sessionInfo()
print.noquote(date())


***bash/sh***::

# tool factory sets up these environmental variables
# this example writes those to the output file
# additional params appear on command line
if [ ! -f "$OUTPATH" ] ; then
    touch "$OUTPATH"
fi
echo "INPATHS=$INPATHS" >> "$OUTPATH"
echo "INNAMES=$INNAMES" >> "$OUTPATH"
echo "OUTPATH=$OUTPATH" >> "$OUTPATH"
# "$*" joins all additional parameters into a single string
echo "CL=$*" >> "$OUTPATH"

6f48315c32c1 Uploaded
fubar
parents:
diff changeset
***perl***::

use strict;
use warnings;
# input paths, input names and output path arrive as the first three arguments
my ($INPATHS, $INNAMES, $OUTPATH) = @ARGV;
open(my $fh, '>', $OUTPATH) or die "Could not open file '$OUTPATH' $!";
print $fh "INPATHS=$INPATHS\n INNAMES=$INNAMES\n OUTPATH=$OUTPATH\n";
close $fh;

```

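The R example above evaluates each argument as code to turn it into a named object. As a rough illustration only, and assuming arguments arrive as `NAME=value` strings as that example implies, the same idea can be sketched in Python without `eval`. The function name `parse_named_args` and the sample arguments are hypothetical, and this is not necessarily how the Tool Factory passes parameters to Python scripts.

```python
import sys


def parse_named_args(argv):
    """Split NAME=value strings into a dict; anything else is kept as positional."""
    named, positional = {}, []
    for arg in argv:
        name, sep, value = arg.partition("=")
        if sep and name.isidentifier():
            # drop surrounding quotes such as OUTPATH="/some/path"
            named[name] = value.strip("\"'")
        else:
            positional.append(arg)
    return named, positional


if __name__ == "__main__":
    named, positional = parse_named_args(sys.argv[1:])
    for name in sorted(named):
        print("%s=%s" % (name, named[name]))
    print("positional=%s" % positional)
```

Keeping the values as plain strings avoids executing whatever happens to be on the command line, at the cost of having to convert numbers yourself.
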
**Galaxy as an IDE for developing API scripts**
If you need to develop Galaxy API scripts and you like to live dangerously,
please read on.

**Galaxy as an IDE?**
Amazingly enough, blend-lib API scripts run perfectly well *inside*
Galaxy when pasted into a Tool Factory form. No need to generate a new
tool. Galaxy + Tool Factory = IDE. I think we need a new t-shirt. Seriously,
it is actually quite usable.

**Why bother - what's wrong with Eclipse?**
Nothing. But compared with developing API scripts in the usual way outside
Galaxy, you get persistence and other framework benefits. You also get, at
absolutely no extra charge, a ginormous security problem if you share the
history or any outputs, because they contain the API script along with your
key. Development servers only, please!

**Workflow**
Fire up the Tool Factory in Galaxy.

Leave the input box empty, set the interpreter to python, then paste and run an
API script, such as the working example below (substitute your own url and key).

It took me a few iterations to develop the example below because I know
almost nothing about the API. I started with very simple code from one of the
samples. After each run, the (edited) API script is conveniently recreated
using the redo button on the history output item, so each successive version
of the developing API script you run is persisted, ready to be edited and
rerun easily. It is *very* handy to be able to add a line of code to the
script, run it, then view the output to (e.g.) inspect dicts returned by
API calls, working progressively deeper with each iteration.

Give the below a whirl on a private clone (install the tool factory from
the main toolshed) and try adding complexity with a few rerun/edit/rerun cycles.

**Example Tool Factory API script**
```
import sys
from blend.galaxy import GalaxyInstance
# substitute the url and API key for your own Galaxy server
ourGal = 'http://x.x.x.x:xxxx'
ourKey = 'xxx'
gi = GalaxyInstance(ourGal, key=ourKey)
libs = gi.libraries.get_libraries()
res = []
# libs looks like
# u'url': u'/galaxy/api/libraries/441d8112651dc2f3', u'id':
# u'441d8112651dc2f3', u'name':.... u'Demonstration sample RNA data',
for lib in libs:
    res.append('%s:\n' % lib['name'])
    res.append(str(gi.libraries.show_library(lib['id'],contents=True)))
# the second command line argument is used as the output path
with open(sys.argv[2],'w') as outf:
    outf.write('\n'.join(res))
```

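When iterating on an API script as described above, the raw `str()` of a returned dict can be hard to read in the history output. A minimal sketch of a formatting helper follows. The function name `format_result` is hypothetical, and the sample dict is only modelled on the library entry shown in the script comment; neither is part of the Tool Factory or blend-lib.

```python
import json


def format_result(label, obj):
    """Return a labelled, indented dump of an API result for easy reading."""
    try:
        body = json.dumps(obj, indent=2, sort_keys=True)
    except TypeError:
        # fall back to repr() for objects json cannot serialise
        body = repr(obj)
    return "%s:\n%s" % (label, body)


if __name__ == "__main__":
    # hypothetical sample shaped like one entry returned by get_libraries()
    sample = {
        "url": "/galaxy/api/libraries/441d8112651dc2f3",
        "id": "441d8112651dc2f3",
        "name": "Demonstration sample RNA data",
    }
    print(format_result("library", sample))
```

One way to use it in the script would be to append `format_result(lib['name'], gi.libraries.show_library(lib['id'], contents=True))` to `res` instead of the bare `str()` call.
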
**Attribution**
Creating re-usable tools from scripts: The Galaxy Tool Factory
Ross Lazarus; Antony Kaspi; Mark Ziemann; The Galaxy Team
Bioinformatics 2012; doi: 10.1093/bioinformatics/bts573

http://bioinformatics.oxfordjournals.org/cgi/reprint/bts573?ijkey=lczQh1sWrMwdYWJ&keytype=ref

**Licensing**
Copyright Ross Lazarus 2010
ross lazarus at g mail period com

All rights reserved.

Licensed under the LGPL

**screenshot**

![example run](/images/dynamicScriptTool.png)


```