comparison fubar-galaxytoolfactory-2e68c2a22b43/README.txt @ 2:b55b59435fb1 draft
Now with bash working I think. Special case but working..
author | fubar |
---|---|
date | Mon, 13 Aug 2012 06:27:26 -0400 |
parents | |
children |
1:87613ace5113 | 2:b55b59435fb1 |
---|---|
1 # WARNING before you start | |
2 # Install this tool on a private Galaxy ONLY | |
3 # Please NEVER on a public or production instance | |
4 | |
5 *Short Story* | |
6 This is an unusual Galaxy tool that generates very simple new Galaxy tools, each of which runs | |
7 a user-supplied script (R, python, perl, bash...) over a single input file. | |
8 Whenever you run this tool, the ToolFactory, you should have a script prepared to paste into a text box, | |
9 and a small test input ready to select from your history to test your new script. | |
10 | |
11 If the script runs successfully, a new Galaxy tool that runs your script can be generated. | |
12 The new tool is in the form of a special new Galaxy datatype - toolshed.gz - as the name suggests, | |
13 it's an archive ready to upload to a Galaxy ToolShed as a new tool repository. | |
14 | |
15 Once it's in a ToolShed, it can be installed into any local Galaxy server from | |
16 the server administrative interface. | |
17 | |
18 Once your new tool is installed, local users can run it - each time, the script that was supplied | |
19 when it was built will be executed with the input chosen from the user's history. In other words, | |
20 the tools you generate with the ToolFactory run just like any other Galaxy tool, | |
21 but run your script every time. | |
22 | |
23 *Reasons to read further* | |
24 | |
25 If you use Galaxy to support your research; | |
26 | |
27 You and fellow users are sometimes forced to take data out of Galaxy, process it with ugly | |
28 little perl/awk/sed/R... scripts and put it back; | |
29 | |
30 You do this when you can't do some transformation in Galaxy (the 90/10 rule); | |
31 | |
32 You don't have enough developer resources for wrapping dozens of even relatively simple tools; | |
33 | |
34 Your research and your institution would be far better off if those feral scripts were all tucked safely in | |
35 your local toolshed and Galaxy histories. | |
36 | |
37 *The good news* If it can be trivially scripted, it can be running safely in your | |
38 local Galaxy via your own local toolshed in a few minutes - with functional tests. | |
39 | |
40 | |
41 *Value proposition* The ToolFactory allows Galaxy to efficiently take over most of your lab's dark script matter, | |
42 making it reproducible in Galaxy and shareable through the ToolShed. | |
43 | |
44 That's what this tool does. You paste a simple script and the tool returns | |
45 a new, real Galaxy tool, ready to be installed from the local toolshed to local servers. | |
46 Scripts can be wrapped and online literally within minutes. | |
47 | |
48 *To fully and safely exploit the awesome power* of this tool, Galaxy and the ToolShed, | |
49 you should be a developer installing this tool on a private/personal/scratch local instance where you are an admin_user. | |
50 Then, if you break it, you get to keep all the pieces. | |
51 See https://bitbucket.org/fubar/galaxytoolfactory/wiki/Home | |
52 | |
53 **Installation** | |
54 This is a Galaxy tool. You can install it most conveniently using the administrative "Search and browse tool sheds" link. | |
55 Find the Galaxy Test toolshed (not main) and search for the toolfactory repository. | |
56 Open it and review the code and select the option to install it. | |
57 | |
58 If you can't get the tool that way, copy the xml and py files here into a new tools subdirectory such as tools/toolfactory. | |
59 Your tool_conf.xml needs a new entry pointing to the xml file - something like:: | |
60 | |
61 <section name="Tool building tools" id="toolbuilders"> | |
62 <tool file="toolfactory/rgToolFactory.xml"/> | |
63 </section> | |
64 | |
65 If not already there (I just added it to datatypes_conf.xml.sample), please add: | |
66 <datatype extension="toolshed.gz" type="galaxy.datatypes.binary:Binary" mimetype="multipart/x-gzip" subclass="True" /> | |
67 to your local datatypes_conf.xml. | |
68 | |
69 Ensure that html sanitization is set to False and uncommented in universe_wsgi.ini | |
70 | |
71 You'll have to restart the server for the new tool to be available. | |
72 | |
73 Of course, R, python, perl etc are needed on your path if you want to test scripts using those interpreters. | |
74 Adding new ones to this tool code should be easy enough. Please make suggestions as bitbucket issues and code. | |
75 The HTML file code automatically shrinks R's bloated pdfs, which depends on ghostscript. The thumbnails require imagemagick. | |
76 | |
77 *Restricted execution* | |
78 The new tool factory tool will then be usable ONLY by admin users - people with IDs in admin_users in universe_wsgi.ini | |
79 **Yes, that's right. ONLY admin_users can run this tool** Think about it for a moment. If anyone could run any | |
80 arbitrary script on your Galaxy server, the only thing that would impede a miscreant bent on destroying all your | |
81 Galaxy data would probably be a lack of appropriate technical skills. | |
82 | |
83 *What it does* This is a tool factory for simple scripts in python, R and perl currently. | |
84 Functional tests are automatically generated. How cool is that. | |
85 | |
86 LIMITED to simple scripts that read one input from the history. | |
87 A script can optionally write one new history dataset, | |
88 and can optionally collect any number of outputs into links on an autogenerated HTML | |
89 index page for the user to navigate - useful if the script writes images and output files. Pdf outputs | |
90 are shown as thumbnails and R's bloated pdfs are shrunk with ghostscript, so ghostscript and imagemagick need to | |
91 be available. | |
92 | |
93 Generated tools can be edited and enhanced like any Galaxy tool, so start small and build up since | |
94 a generated script gets you a serious leg up to a more complex one. | |
95 | |
96 *What you do* You paste and run your script, | |
97 fix the syntax errors, and eventually it runs. | |
98 You can use the redo button and edit the script before | |
99 trying to rerun it as you debug - it works pretty well. | |
100 | |
101 Once the script works on some test data, you can | |
102 generate a toolshed compatible gzip file | |
103 containing your script ready to run as an ordinary Galaxy tool in a | |
104 repository on your local toolshed. That means safe and largely automated installation in any | |
105 production Galaxy configured to use your toolshed. | |
106 | |
107 *Generated tool Security* Once you install a generated tool, it's just | |
108 another tool - assuming the script is safe. Generated tools run normally and their users cannot do anything unusually insecure, | |
109 but please, practice safe toolshed. | |
110 Read the fucking code before you install any tool. | |
111 Especially this one - it is really scary. | |
112 | |
113 If you opt for an HTML output, you get all the script outputs arranged | |
114 as a single Html history item - all output files are linked, thumbnails for all the pdfs. | |
115 Ugly but really inexpensive. | |
116 | |
117 Patches and suggestions are welcome as bitbucket issues, please. | |
118 | |
119 long route to June 2012 product | |
120 derived from an integrated script model | |
121 called rgBaseScriptWrapper.py | |
122 Note to the unwary: | |
123 This tool allows arbitrary scripting on your Galaxy as the Galaxy user | |
124 There is nothing stopping a malicious user doing whatever they choose | |
125 Extremely dangerous!! | |
126 Totally insecure. So, trusted users only | |
127 | |
128 | |
129 | |
130 | |
131 copyright ross lazarus (ross stop lazarus at gmail stop com) May 2012 | |
132 | |
133 all rights reserved | |
134 Licensed under the LGPL. If you want to improve it, feel free: https://bitbucket.org/fubar/galaxytoolfactory/wiki/Home | |
135 | |
136 Material for our more enthusiastic and voracious readers continues below - we salute you. | |
137 | |
138 **Motivation** Simple transformation, filtering or reporting scripts get written, run and lost every day in most busy labs | |
139 - even ours where Galaxy is in use. This 'dark script matter' is pervasive and generally not reproducible. | |
140 | |
141 **Benefits** For our group, this allows Galaxy to fill that important dark script gap - all those "small" bioinformatics | |
142 tasks. Once a user has a working R (or python or perl) script that does something Galaxy cannot currently do (eg transpose a | |
143 tabular file) and takes parameters the way Galaxy supplies them (see example below), they: | |
144 | |
145 1. Install the tool factory on a personal private instance | |
146 | |
147 2. Upload a small test data set | |
148 | |
149 3. Paste the script into the 'script' text box and iteratively run the insecure tool on test data until it works right - | |
150 there is absolutely no reason to do this anywhere other than on a personal private instance. | |
151 | |
152 4. Once it works right, set the 'Generate toolshed gzip' option and run it again. | |
153 | |
154 5. A toolshed style gzip appears ready to upload and install like any other Toolshed entry. | |
155 | |
156 6. Upload the new tool to the toolshed | |
157 | |
158 7. Ask the local admin to check the new tool to confirm it's not evil and install it in the local production galaxy | |
159 | |
160 **Simple examples on the tool form** | |
161 | |
162 A simple Rscript "filter" showing how the command line parameters can be handled, takes an input file, | |
163 does something (transpose in this case) and writes the results to a new tabular file:: | |
164 | |
165 # transpose a tabular input file and write as a tabular output file | |
166 ourargs = commandArgs(TRUE) # trailing arguments only: input path, then output path | |
167 inf = ourargs[1] | |
168 outf = ourargs[2] | |
169 inp = read.table(inf,header=FALSE,row.names=NULL,sep='\t') | |
170 outp = t(inp) | |
171 write.table(outp,outf, quote=FALSE, sep="\t",row.names=FALSE,col.names=FALSE) | |
172 | |
173 Calculate a multiple test adjusted p value from a column of p values - for this script to be useful, | |
174 it needs the right column for the input to be specified in the code for the | |
175 given input file type(s) specified when the tool is generated :: | |
176 | |
177 # use p.adjust - assumes a HEADER row and column 1 - please fix for any real use | |
178 column = 1 # adjust if necessary for some other kind of input | |
179 fdrmeth = 'BH' | |
180 ourargs = commandArgs(TRUE) | |
181 inf = ourargs[1] | |
182 outf = ourargs[2] | |
183 inp = read.table(inf,header=TRUE,row.names=NULL,sep='\t') | |
184 p = inp[,column] | |
185 q = p.adjust(p,method=fdrmeth) | |
186 newval = paste(fdrmeth,'p-value',sep='_') | |
187 q = data.frame(q) | |
188 names(q) = newval | |
189 outp = cbind(inp,newval=q) | |
190 write.table(outp,outf, quote=FALSE, sep="\t",row.names=FALSE,col.names=TRUE) | |
191 | |
192 | |
193 | |
194 Another Rscript example without any input file - generates a random heatmap pdf - you must make sure the option to create an HTML output file is | |
195 turned on for this to work. The heatmap will be presented as a thumbnail linked to the pdf in the resulting HTML page:: | |
196 | |
197 # note this script takes NO input or output because it generates random data | |
198 foo = data.frame(a=runif(100),b=runif(100),c=runif(100),d=runif(100),e=runif(100),f=runif(100)) | |
199 bar = as.matrix(foo) | |
200 pdf( "heattest.pdf" ) | |
201 heatmap(bar,main='Random Heatmap') | |
202 dev.off() | |
203 | |
204 A Python example that reverses each row of a tabular file. You'll need to remove the leading spaces, while keeping the loop body indented, for this to work if cut | |
205 and pasted into the script box. Note that you can already do this in Galaxy by setting up the cut columns tool with the | |
206 correct number of columns in reverse order, but this script will work for any number of columns so is completely generic:: | |
207 | |
208 # reverse order of columns in a tabular file | |
209 import sys | |
210 inp = sys.argv[1] # input path | |
211 outp = sys.argv[2] # output path | |
212 i = open(inp,'r') | |
213 o = open(outp,'w') | |
214 for row in i: | |
215     rs = row.rstrip('\r\n').split('\t') # strip only the line ending so trailing empty columns survive | |
216     rs.reverse() | |
217     o.write('\t'.join(rs)) | |
218     o.write('\n') | |
219 i.close() | |
220 o.close() | |
221 | |
222 | |
223 **Attribution** Copyright Ross Lazarus (ross period lazarus at gmail period com) May 2012 | |
224 | |
225 All rights reserved. | |
226 | |
227 Licensed under the LGPL | |
228 | |
229 | |
230 **Obligatory screenshot** | |
231 | |
232 http://bitbucket.org/fubar/galaxytoolmaker/src/fda8032fe989/images/dynamicScriptTool.png | |
233 |
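
The R transpose example in the README can also be written in Python. The following is a minimal sketch, not part of the original README, assuming the same calling convention the README's examples use: the input path arrives as the first command line argument and the output path as the second. The function name transpose_tabular is hypothetical, and padding short rows with empty strings is a design choice made here so that ragged input does not silently lose columns.

```python
# Sketch only: a Python counterpart to the README's R transpose example.
# Assumes the script is called as: python script.py <input path> <output path>
import sys


def transpose_tabular(in_path, out_path):
    """Read a tab-separated file and write its transpose (hypothetical helper)."""
    with open(in_path) as handle:
        # Strip only the newline so empty trailing columns are preserved.
        rows = [line.rstrip("\n").split("\t") for line in handle]
    # Pad short rows with empty strings so zip() does not drop columns.
    width = max((len(row) for row in rows), default=0)
    padded = [row + [""] * (width - len(row)) for row in rows]
    with open(out_path, "w") as handle:
        # zip(*padded) yields one tuple per input column, i.e. one output row.
        for column in zip(*padded):
            handle.write("\t".join(column) + "\n")


# Only run as a script when both paths were supplied on the command line.
if __name__ == "__main__" and len(sys.argv) == 3:
    transpose_tabular(sys.argv[1], sys.argv[2])
```

As with the other examples, remove any leading spaces but keep the relative indentation if you paste this into the script box, then test it against a small tabular dataset from your history before generating a tool.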