# WARNING before you start
# Install on a private Galaxy ONLY
# Please NEVER on a public or production instance

*Short Story*
This is an unusual Galaxy tool that generates very simple but potentially
very useful local Galaxy tools that run a user-supplied script (R, python, perl...) over a single input file.
Whenever you run this tool, the ToolFactory, you should have a script ready to paste into a text box
and a small test input ready to select from your history to test your new script.

If the script runs successfully, a new Galaxy tool that runs your script can be generated.
The new tool takes the form of a special new Galaxy datatype, toolshed.gz. As the name suggests,
it's an archive ready to upload to a Galaxy ToolShed as a new tool repository.

Once it's in a ToolShed, it can be installed into any local Galaxy server from
the server administrative interface.

Once your new tool is installed, local users can run it. Each time, the script that was supplied
when it was built will be executed with the input chosen from the user's history. In other words,
the tools you generate with the ToolFactory run just like any other Galaxy tool,
but run your script every time.

*Reasons to read further*

If you use Galaxy to support your research;

You and fellow users are sometimes forced to take data out of Galaxy, process it with ugly
little perl/awk/sed/R... scripts and put it back;

You do this when you can't do some transformation in Galaxy (the 90/10 rule);

You don't have enough developer resources for wrapping dozens of even relatively simple tools;

Your research and your institution would be far better off if those feral scripts were all tucked safely in
your local toolshed and Galaxy histories.

*The good news* If it can be trivially scripted, it can be running safely in your
local Galaxy via your own local toolshed in a few minutes - with functional tests.

*Value proposition* The ToolFactory allows Galaxy to efficiently take over most of your lab's dark script matter,
making it reproducible in Galaxy and shareable through the ToolShed.

That's what this tool does. You paste a simple script and the tool returns
a new, real Galaxy tool, ready to be installed from the local toolshed to local servers.
Scripts can be wrapped and online literally within minutes.

*To fully and safely exploit the awesome power* of this tool, Galaxy and the ToolShed,
you should be a developer installing this tool on a private/personal/scratch local instance where you are an admin_user.
Then, if you break it, you get to keep all the pieces.
See https://bitbucket.org/fubar/galaxytoolfactory/wiki/Home

**Installation**
This is a Galaxy tool. You can install it most conveniently using the administrative "Search and browse tool sheds" link.
Find the Galaxy Test toolshed (not main) and search for the toolfactory repository.
Open it, review the code and select the option to install it.

If you can't get the tool that way, the xml and py files here need to be copied into a new tools subdirectory such as tools/toolfactory.
Your tool_conf.xml needs a new entry pointing to the xml file - something like::

  <section name="Tool building tools" id="toolbuilders">
    <tool file="toolfactory/rgToolFactory.xml"/>
  </section>

If not already there (I just added it to datatypes_conf.xml.sample), please add::

  <datatype extension="toolshed.gz" type="galaxy.datatypes.binary:Binary" mimetype="multipart/x-gzip" subclass="True" />

to your local datatypes_conf.xml.
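
As a rough way to confirm that both manual edits are in place, the sketch below parses the two config files with Python's standard library and looks for the entries described above. The function names and the sample XML strings are illustrative assumptions, and the enclosing `<toolbox>` and `<datatypes>/<registration>` wrappers reflect the usual layout of these files rather than anything stated in this README.

```python
import xml.etree.ElementTree as ET


def has_tool_entry(tool_conf_xml, tool_file="toolfactory/rgToolFactory.xml"):
    """Return True if any <tool> element points at tool_file."""
    root = ET.fromstring(tool_conf_xml)
    return any(t.get("file") == tool_file for t in root.iter("tool"))


def has_datatype(datatypes_conf_xml, extension="toolshed.gz"):
    """Return True if a <datatype> with the given extension is registered."""
    root = ET.fromstring(datatypes_conf_xml)
    return any(d.get("extension") == extension for d in root.iter("datatype"))


# Hypothetical minimal config contents, for illustration only.
SAMPLE_TOOL_CONF = """<toolbox>
  <section name="Tool building tools" id="toolbuilders">
    <tool file="toolfactory/rgToolFactory.xml"/>
  </section>
</toolbox>"""

SAMPLE_DATATYPES_CONF = """<datatypes>
  <registration>
    <datatype extension="toolshed.gz" type="galaxy.datatypes.binary:Binary"
              mimetype="multipart/x-gzip" subclass="True" />
  </registration>
</datatypes>"""

if __name__ == "__main__":
    print(has_tool_entry(SAMPLE_TOOL_CONF))
    print(has_datatype(SAMPLE_DATATYPES_CONF))
```

To check a real installation, read the contents of your own tool_conf.xml and datatypes_conf.xml and pass them in instead of the sample strings.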

Ensure that html sanitization is set to False and uncommented in universe_wsgi.ini.

You'll have to restart the server for the new tool to be available.

Of course, R, python, perl etc. are needed on your path if you want to test scripts using those interpreters.
Adding new ones to this tool code should be easy enough. Please make suggestions as bitbucket issues and code.
The HTML file code automatically shrinks R's bloated pdfs, and depends on ghostscript. The thumbnails require imagemagick.
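
A quick way to see which of these programs are visible on the path is sketched below. The executable names are assumptions: `gs` for ghostscript and `convert` for imagemagick are common on Linux, but your system may use different names, so adjust the tuple as needed.

```python
import shutil

# Assumed executable names; "gs" and "convert" are typical but not guaranteed.
DEFAULT_EXECUTABLES = ("Rscript", "python", "perl", "gs", "convert")


def find_missing(executables=DEFAULT_EXECUTABLES):
    """Return the executables that cannot be found on the current PATH."""
    return [name for name in executables if shutil.which(name) is None]


if __name__ == "__main__":
    missing = find_missing()
    if missing:
        print("Not found on PATH: " + ", ".join(missing))
    else:
        print("All expected executables were found.")
```

Run it as the same user that runs Galaxy, since that is the path the tool will see.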

*Restricted execution*
The new tool factory tool will then be usable ONLY by admin users - people with IDs in admin_users in universe_wsgi.ini.
**Yes, that's right. ONLY admin_users can run this tool.** Think about it for a moment. If allowed to run any
arbitrary script on your Galaxy server, the only thing that would impede a miscreant bent on destroying all your
Galaxy data would probably be lack of appropriate technical skills.
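
The admin-only rule can be pictured as a membership test against the admin_users setting. This is a sketch of the idea only, not the ToolFactory's actual code: the function names and sample identifiers are hypothetical, and it assumes admin_users holds a comma-separated list of user identifiers.

```python
def is_admin(user_id, admin_users_setting):
    """Return True if user_id appears in a comma-separated admin_users value."""
    admins = {entry.strip() for entry in admin_users_setting.split(",") if entry.strip()}
    return user_id.strip() in admins


def require_admin(user_id, admin_users_setting):
    """Raise PermissionError unless the user is listed in admin_users."""
    if not is_admin(user_id, admin_users_setting):
        raise PermissionError("this tool is restricted to admin_users")


if __name__ == "__main__":
    # Hypothetical setting value, for illustration only.
    setting = "alice@example.org, bob@example.org"
    print(is_admin("alice@example.org", setting))
    print(is_admin("mallory@example.org", setting))
```

Refusing by default, and only running the pasted script after the check passes, keeps the arbitrary-script risk limited to people who already control the server.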

*What it does* This is a tool factory for simple scripts, currently in python, R and perl.
Functional tests are automatically generated. How cool is that.

LIMITED to simple scripts that read one input from the history.
A script can optionally write one new history dataset,
and optionally collect any number of outputs into links on an autogenerated HTML
index page for the user to navigate. This is useful if the script writes images and output files: pdf outputs
are shown as thumbnails and R's bloated pdfs are shrunk with ghostscript, so ghostscript and imagemagick need to
be available.
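
To give a feel for the autogenerated index page, here is a minimal sketch that turns a list of output file names into an HTML page of links. It is not the ToolFactory's own code: the function name and page layout are assumptions, and the thumbnail and pdf-shrinking steps are left out entirely.

```python
import html
import os
from urllib.parse import quote


def make_index(title, filenames):
    """Build a simple HTML page with one link per output file."""
    items = []
    for name in sorted(filenames):
        base = os.path.basename(name)
        # URL-quote the link target and HTML-escape the visible text.
        items.append('<li><a href="%s">%s</a></li>' % (quote(base), html.escape(base)))
    safe_title = html.escape(title)
    return (
        "<html><head><title>%s</title></head><body>\n"
        "<h1>%s</h1>\n<ul>\n%s\n</ul>\n</body></html>\n"
        % (safe_title, safe_title, "\n".join(items))
    )


if __name__ == "__main__":
    print(make_index("Script outputs", ["table.tsv", "heattest.pdf"]))
```

Linking by base name assumes the index page sits in the same directory as the files it lists.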

Generated tools can be edited and enhanced like any Galaxy tool, so start small and build up, since
a generated script gets you a serious leg up to a more complex one.

*What you do* You paste and run your script,
you fix the syntax errors and eventually it runs.
You can use the redo button and edit the script before
trying to rerun it as you debug - it works pretty well.

Once the script works on some test data, you can
generate a toolshed compatible gzip file
containing your script ready to run as an ordinary Galaxy tool in a
repository on your local toolshed. That means safe and largely automated installation in any
production Galaxy configured to use your toolshed.
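
The idea of bundling a script and its tool definition into one gzip file can be sketched with Python's tarfile module. This assumes the archive is a gzip-compressed tar file; the file names, contents and layout below are hypothetical and are not the layout the ToolFactory actually writes.

```python
import io
import tarfile


def build_archive(files):
    """Return gzip-compressed tar bytes holding the given {name: bytes} files."""
    buffer = io.BytesIO()
    with tarfile.open(fileobj=buffer, mode="w:gz") as tar:
        for name, data in sorted(files.items()):
            info = tarfile.TarInfo(name=name)
            info.size = len(data)
            tar.addfile(info, io.BytesIO(data))
    return buffer.getvalue()


def list_archive(archive_bytes):
    """Return the member names stored in a gzip-compressed tar archive."""
    with tarfile.open(fileobj=io.BytesIO(archive_bytes), mode="r:gz") as tar:
        return tar.getnames()


# Hypothetical file names and contents, for illustration only.
EXAMPLE_FILES = {
    "mytool/mytool.xml": b'<tool id="mytool" name="My tool"/>\n',
    "mytool/mytool.py": b"import sys\n",
}

if __name__ == "__main__":
    print(list_archive(build_archive(EXAMPLE_FILES)))
```

Listing the members of an archive before uploading it is a cheap way to review what a generated tool will install.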

*Generated tool Security* Once you install a generated tool, it's just
another tool - assuming the script is safe. Generated tools just run normally and their users cannot do anything unusually insecure,
but please, practice safe toolshed.
Read the code before you install any tool.
Especially this one - it is really scary.

If you opt for an HTML output, you get all the script outputs arranged
as a single HTML history item - all output files are linked, with thumbnails for all the pdfs.
Ugly but really inexpensive.

Patches and suggestions are welcome as bitbucket issues, please.

Long route to June 2012 product,
derived from an integrated script model
called rgBaseScriptWrapper.py.
Note to the unwary:
This tool allows arbitrary scripting on your Galaxy as the Galaxy user.
There is nothing stopping a malicious user doing whatever they choose.
Extremely dangerous!!
Totally insecure. So, trusted users only.

copyright ross lazarus (ross stop lazarus at gmail stop com) May 2012

all rights reserved
Licensed under the LGPL. If you want to improve it, feel free: https://bitbucket.org/fubar/galaxytoolfactory/wiki/Home

Material for our more enthusiastic and voracious readers continues below - we salute you.

**Motivation** Simple transformation, filtering or reporting scripts get written, run and lost every day in most busy labs,
even ours where Galaxy is in use. This 'dark script matter' is pervasive and generally not reproducible.

**Benefits** For our group, this allows Galaxy to fill that important dark script gap - all those "small" bioinformatics
tasks. Once a user has a working R (or python or perl) script that does something Galaxy cannot currently do (eg transpose a
tabular file) and takes parameters the way Galaxy supplies them (see example below), they:

1. Install the tool factory on a personal private instance

2. Upload a small test data set

3. Paste the script into the 'script' text box and iteratively run the insecure tool on test data until it works right -
   there is absolutely no reason to do this anywhere other than on a personal private instance.

4. Once it works right, set the 'Generate toolshed gzip' option and run it again.

5. A toolshed style gzip appears ready to upload and install like any other Toolshed entry.

6. Upload the new tool to the toolshed

7. Ask the local admin to check the new tool to confirm it's not evil and install it in the local production galaxy

**Simple examples on the tool form**

A simple Rscript "filter" showing how the command line parameters can be handled. It takes an input file,
does something (transpose in this case) and writes the results to a new tabular file::

  # transpose a tabular input file and write as a tabular output file
  ourargs = commandArgs(TRUE)
  inf = ourargs[1]
  outf = ourargs[2]
  inp = read.table(inf,head=F,row.names=NULL,sep='\t')
  outp = t(inp)
  write.table(outp,outf, quote=FALSE, sep="\t",row.names=F,col.names=F)
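
For comparison, the same two-argument contract (input path first, output path second) might look like this in Python. It is a sketch rather than part of the ToolFactory, the function names are illustrative, and it assumes every row has the same number of columns, since `zip` silently truncates ragged rows.

```python
import sys


def transpose(rows):
    """Transpose a list of equal-length rows (lists of strings)."""
    return [list(column) for column in zip(*rows)]


def main(inf, outf):
    # Read tab-separated rows, dropping only the trailing newline.
    with open(inf) as handle:
        rows = [line.rstrip("\n").split("\t") for line in handle]
    with open(outf, "w") as handle:
        for row in transpose(rows):
            handle.write("\t".join(row) + "\n")


if __name__ == "__main__" and len(sys.argv) == 3:
    main(sys.argv[1], sys.argv[2])
```

Keeping the file handling in `main` and the transformation in its own function makes the logic easy to try on a small list before pasting the script into the tool form.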

Calculate a multiple test adjusted p value from a column of p values. For this script to be useful,
the right column for the given input file type(s), specified when the tool is generated, must be set in the code::

  # use p.adjust - assumes a HEADER row and column 1 - please fix for any real use
  column = 1 # adjust if necessary for some other kind of input
  fdrmeth = 'BH'
  ourargs = commandArgs(TRUE)
  inf = ourargs[1]
  outf = ourargs[2]
  inp = read.table(inf,head=T,row.names=NULL,sep='\t')
  p = inp[,column]
  q = p.adjust(p,method=fdrmeth)
  newval = paste(fdrmeth,'p-value',sep='_')
  q = data.frame(q)
  names(q) = newval
  outp = cbind(inp,newval=q)
  write.table(outp,outf, quote=FALSE, sep="\t",row.names=F,col.names=T)
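
For readers who want to see what the 'BH' adjustment does, the following is a plain Python sketch of the Benjamini-Hochberg step-up calculation. It is an illustration written for this README, not a replacement for R's p.adjust, and it does no handling of missing values.

```python
def bh_adjust(pvalues):
    """Benjamini-Hochberg adjusted p-values, returned in the input order."""
    n = len(pvalues)
    # Indices of the p-values from smallest to largest.
    order = sorted(range(n), key=lambda i: pvalues[i])
    adjusted = [0.0] * n
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values never increase
    # as the raw p-values get smaller.
    for rank in range(n, 0, -1):
        index = order[rank - 1]
        running_min = min(running_min, pvalues[index] * n / rank)
        adjusted[index] = running_min
    return adjusted


if __name__ == "__main__":
    print(bh_adjust([0.01, 0.04, 0.03, 0.005]))
```

Each p-value is scaled by the number of tests divided by its rank, and the running minimum caps the result at 1 while keeping the adjusted values in the same order as the raw ones.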

Another Rscript example without any input file - it generates a random heatmap pdf. You must make sure the option to create an HTML output file is
turned on for this to work. The heatmap will be presented as a thumbnail linked to the pdf in the resulting HTML page::

  # note this script takes NO input or output because it generates random data
  foo = data.frame(a=runif(100),b=runif(100),c=runif(100),d=runif(100),e=runif(100),f=runif(100))
  bar = as.matrix(foo)
  pdf( "heattest.pdf" )
  heatmap(bar,main='Random Heatmap')
  dev.off()

A Python example that reverses each row of a tabular file. You'll need to remove the leading spaces from each line
(keeping the loop body indented) for this to work if cut and pasted into the script box. Note that you can already do this in Galaxy by setting up the cut columns tool with the
correct number of columns in reverse order, but this script will work for any number of columns so is completely generic::

  # reverse order of columns in a tabular file
  import sys
  inp = sys.argv[1]
  outp = sys.argv[2]
  i = open(inp,'r')
  o = open(outp,'w')
  for row in i:
      # strip only the newline so empty trailing columns are kept
      rs = row.rstrip('\n').split('\t')
      rs.reverse()
      o.write('\t'.join(rs))
      o.write('\n')
  i.close()
  o.close()
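
Before pasting a script into the tool form, it can be handy to try the same calling convention outside Galaxy. The sketch below runs a lightly adapted copy of the column-reversing script with an input path and an output path as its two arguments, which is the convention the examples in this README follow. The helper name and temporary file names are hypothetical.

```python
import os
import subprocess
import sys
import tempfile

# A copy of the column-reversing script, held as a string for the demo.
SCRIPT = r"""
import sys
i = open(sys.argv[1], 'r')
o = open(sys.argv[2], 'w')
for row in i:
    rs = row.rstrip('\n').split('\t')
    rs.reverse()
    o.write('\t'.join(rs))
    o.write('\n')
i.close()
o.close()
"""


def run_script(script_text, input_text):
    """Run script_text as `python script input output`; return the output text."""
    with tempfile.TemporaryDirectory() as workdir:
        script = os.path.join(workdir, "script.py")
        infile = os.path.join(workdir, "input.tabular")
        outfile = os.path.join(workdir, "output.tabular")
        with open(script, "w") as handle:
            handle.write(script_text)
        with open(infile, "w") as handle:
            handle.write(input_text)
        # check=True raises CalledProcessError if the script exits non-zero.
        subprocess.run([sys.executable, script, infile, outfile], check=True)
        with open(outfile) as handle:
            return handle.read()


if __name__ == "__main__":
    print(run_script(SCRIPT, "a\tb\tc\n1\t2\t3\n"))
```

Running the script in a throwaway directory mirrors the advice above: debug on small test data first, and only then generate the tool.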


**Attribution** Copyright Ross Lazarus (ross period lazarus at gmail period com) May 2012

All rights reserved.

Licensed under the LGPL

**Obligatory screenshot**

http://bitbucket.org/fubar/galaxytoolmaker/src/fda8032fe989/images/dynamicScriptTool.png