comparison toolfactory/README.md @ 119:8ea1133b9d9a draft

Uploaded
author fubar
date Tue, 05 Jan 2021 00:34:48 +0000
parents 68fbdbe35f08
children
118:e43c43396a70 119:8ea1133b9d9a
1 **Breaking news! Docker container is recommended as at August 2020** 1 ## Breaking news! Docker container at https://github.com/fubar2/toolfactory-galaxy-docker recommended as at December 2020
2 2
3 A Docker container can be built - see the docker directory. 3 ## This is the original ToolFactory suitable for non-docker situations. Please use the docker container if you can because it's integrated with a Toolshed...
4 It is highly recommended for isolation. It also has an integrated toolshed to allow installation of new tools back 4
5 into the Galaxy being used to generate them. 5 # WARNING
6 6
7 Built from quay.io/bgruening/galaxy:20.05 but updates the 7 Install this tool to a throw-away private Galaxy or Docker container ONLY!
8 Galaxy code to the dev branch - it seems to work fine with updated bioblend>=0.14 8
9 with planemo and the right version of gxformat2 needed by the ToolFactory (TF). 9 Please NEVER on a public or production instance where a hostile user may
10 10 be able to gain access if they can acquire an administrative account login.
11 The runclean.sh script run from the docker subdirectory of your local clone of this repository 11
12 should create a container (eventually) and serve it at localhost:8080 with a toolshed at 12 It only runs for server administrators - the ToolFactory tool will refuse to execute for an ordinary user since
13 localhost:9009. 13 it can install new tools to the Galaxy server it executes on! This is not something you should allow other than
14 14 on a throw away instance that is protected from potentially hostile users.
15 Once it's up, please restart Galaxy in the container with 15
16 ```docker exec [container name] supervisorctl restart galaxy: ``` 16 ## Short Story
17 Jobs just do not seem to run properly otherwise and the next steps won't work!
18
19 The generated container includes a workflow and 2 sample data sets for the workflow
20
21 Load the workflow. Adjust the inputs for each as labelled. The perl example counts GC in phiX.fasta.
22 The python scripts use the rgToolFactory.py as their input - any text file will work but I like the
23 recursion. The BWA example has some mitochondrial reads and reference. Run the workflow and watch.
24 This should fill the history with some sample tools you can rerun and play with.
25 Note that each new tool will have been tested using Planemo. In the workflow, in Galaxy.
26 Extremely cool to watch.
27
28 *WARNING*
29
30 Install this tool on a throw-away private Galaxy or Docker container ONLY
31 Please NEVER on a public or production instance
32
33 *Short Story*
34 17
35 Galaxy is easily extended to new applications by adding a new tool. Each new scientific computational package added as 18 Galaxy is easily extended to new applications by adding a new tool. Each new scientific computational package added as
36 a tool to Galaxy requires some special instructions to be written. This is sometimes termed "wrapping" the package 19 a tool to Galaxy requires an XML document describing how the application interacts with Galaxy.
37 because the instructions tell Galaxy how to run the package as a new Galaxy tool. Any tool in a Galaxy is 20 This is sometimes termed "wrapping" the package because the instructions tell Galaxy how to run the package
38 readily available to all the users through a consistent and easy to use interface. 21 as a new Galaxy tool. Any tool that has been wrapped is readily available to all the users through a consistent
39 22 and easy to use interface once installed in the local Galaxy server.
40 Most Galaxy tool wrappers have been manually prepared by skilled programmers, many using Planemo because it 23
41 automates much of the basic boilerplate and makes the process much easier. The ToolFactory (TF) 24 Most Galaxy tool wrappers have been manually prepared by skilled programmers, many using Planemo because it
42 uses Planemo under the hood for many functions, but hides the command 25 automates much of the boilerplate and makes the process much easier.
43 line complexities from the TF user. 26 The ToolFactory (TF) now uses Planemo under the hood for testing, but hides the command
44 27 line complexities. The user will still need appropriate skills in terms of describing the interface between
45 *More Explanation* 28 Galaxy and the new application, but will be helped by a Galaxy tool form to collect all the needed
46 29 settings, together with automated testing and uploading to a toolshed with optional local installation.
47 The TF is an unusual Galaxy tool, designed to allow a skilled user to make new Galaxy tools. 30
31
32 ## ToolFactory generated tools are ordinary Galaxy tools
33
34 A TF generated tool that passes the Planemo test is ready to publish in any Galaxy Toolshed and ready to install in any running Galaxy instance.
35 They are fully workflow compatible and work exactly like any hand-written tool. The user can select input files of the specified type(s) from their
36 history and edit each of the specified parameters. The tool form will show all the labels and help text supplied when the tool was built. When the tool
37 is executed, the dependent binary or script will be passed all the i/o files and parameters as specified, and will write outputs to the specified new
38 history datasets - just like any other Galaxy tool.
39
40 ## Models for tool command line construction
41
42 The key to turning any software package into a Galaxy tool is the automated construction of a suitable command line.
43
44 The TF can build a new tool that will allow the tool user to select input files from their history, set any parameters and when run will send the
45 new output files to the history as specified when the tool builder completed the form and built the new tool.
46
47 That tool can contain instructions to run any Conda dependency or a system executable like bash. Whether a bash script you have written or
48 a Conda package like bwa, the executable will expect to find settings for input, output and parameters on a command line.
49
50 These are often passed as "--name value" (argparse style) or in a fixed order (positional style).
51
52 The ToolFactory supports either style, and also supports "filter" applications that read input from STDIN and write processed output to STDOUT.
53
54 The simplest tool model wraps a simple script or Conda dependency package requiring only input and output files, with no user supplied settings. It is illustrated by
55 the Tacrev demonstration tool found in the Galaxy running in the ToolFactory docker container. It passes a user selected input file from the current history on STDIN
56 to a bash script. The bash script runs the unix tac utility (reverse cat) piped to the unix rev (reverse lines in a text file) utility. It's a one liner:
57
58 `tac | rev`
59
60 The tool building form allows zero or more Conda package name(s) and version(s) and an optional script to be executed by either a system
61 executable like ``bash`` or the first of any named Conda dependency package/version. Tacrev uses a tiny bash script shown above and uses the system
62 bash. Conda bash can be specified if it is important to use the same version consistently for the tool.
63
64 On the tool form, the repeat section allowing zero or more input files was set to be a text file to be selected by the tool user.
65 In the repeat section allowing one or more outputs, a new output file was given the special value `STDOUT` as its positional parameter. This causes the TF to
66 generate a command to capture STDOUT and send it to the new history file containing the reversed input text.
67
68 By reversed, we mean really, truly reversed.
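
For readers less familiar with these utilities, the effect of `tac | rev` can be sketched in Python. This only illustrates the transformation; it is not part of the ToolFactory or of the generated Tacrev tool, and the function name `tacrev` is made up for this example.

```python
def tacrev(text: str) -> str:
    """Mimic `tac | rev`: reverse the order of the lines, then reverse each line."""
    lines = text.splitlines()
    reversed_lines = [line[::-1] for line in reversed(lines)]
    # Terminate every output line with a newline, matching what the unix
    # utilities produce for newline-terminated input.
    return "".join(line + "\n" for line in reversed_lines)


# Small demonstration on a two-line input
print(tacrev("abc\ndef\n"), end="")
```

The last line of the input becomes the first line of the output, and every line is spelled backwards.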
69
70 That simple model can be made much more complicated, and can pass inputs and outputs as named or positional parameters,
71 to allow more complicated scripts or dependent binaries that require:
72
73 1. Any number of input data files selected by the user from existing history data
74 2. Any number of output data files written to the user's history
75 3. Any number of user supplied parameters. These can be passed as command line arguments to the script or the dependency package. Either
76 positional or named (argparse) style command line parameter passing can be used.
77
78 More complex models can be seen in the Sedtest, Pyrevpos and Pyrevargparse tools illustrating positional and argparse parameter passing.
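
The difference between the two parameter passing styles is easiest to see from the side of the script being wrapped. The sketch below is not the code of the Pyrevpos or Pyrevargparse demonstration tools; the parameter names (`input`, `output`, `nlines`) and both function names are hypothetical. It shows one script interface read positionally and the same interface read with argparse.

```python
import argparse


def parse_positional(argv):
    """Positional style: the meaning of each value is fixed by its order."""
    infile, outfile, nlines = argv  # raises ValueError if the count is wrong
    return {"input": infile, "output": outfile, "nlines": int(nlines)}


def parse_named(argv):
    """Argparse style: each value is introduced by a --name, in any order."""
    parser = argparse.ArgumentParser(description="Hypothetical demo script")
    parser.add_argument("--input", required=True)
    parser.add_argument("--output", required=True)
    parser.add_argument("--nlines", type=int, default=10)
    return vars(parser.parse_args(argv))


positional = parse_positional(["in.txt", "out.txt", "5"])
named = parse_named(["--nlines", "5", "--output", "out.txt", "--input", "in.txt"])
print(positional == named)
```

Both calls yield the same settings. The named form tolerates reordering and defaults, while the positional form depends entirely on the order agreed between the command line and the script.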
79
80 The most complex demonstration is the Planemo advanced tool tutorial BWA tool. There is one version using a command-override to implement
81 exactly the same command structure in the Planemo tutorial. A second version uses a bash script and positional parameters to achieve the same
82 result. Some tool builders may find the bash version more familiar and cleaner but the choice is yours.
83
84 ## Overview
85
86 ![Hello example ToolFactory tool form](files/hello_toolfactory_form.png?raw=true "Part of the Hello world example ToolFactory tool form")
87
88
89 Steps in building a new Galaxy tool are all conducted through Galaxy running in the docker container:
90
91 1. Login to the Galaxy running in the container at http://localhost:8080 using an admin account. They are specified in config/galaxy.yml and
92 in the documentation at
93 and the ToolFactory will error out and refuse to run for non-administrative tool builders as a minimal protection from opportunistic hostile use.
94
95 2. Start the TF and fill in the form, providing sample inputs and parameter values to suit the Conda package being wrapped.
96
97 3. Execute the tool to create a new XML tool wrapper using the sample inputs and parameter settings for the inbuilt tool test. Planemo runs twice,
98 first to generate the test outputs and then to perform a proper test. The completed toolshed archive is written to the history
99 together with the planemo test report. Optionally the new tool archive can be uploaded
100 to the toolshed running in the same container (http://localhost:9009) and then installed inside the Galaxy in the container for further testing.
101
102 4. If the test fails, rerun the failed history job and correct errors on the tool form before rerunning until everything works correctly.
103
104
105
106 ![How it works](files/TFasIDE.png?raw=true "Overview of the ToolFactory as an Integrated Development Environment")
107
108 ## Planning and building new Galaxy tool wrappers.
109
110 It is best to have all the required planning done to wrap any new script or binary before firing up the TF.
111 Conda is the only current dependency manager supported. Before starting, at the very least, the tool builder will need
112 to know the required software package name in Conda and the version to use, how the command line for
113 the package must be constructed, and there must be sample inputs in the working history for each of the required data inputs
114 for the package, together with values for every parameter to suit these sample inputs. These are required on the TF form
115 for preparing the inbuilt tool test. That test is run using Planemo, as part of the tool generation process.
116
117 A new tool is specified by filling in the usual Galaxy tool form.
118
119 The form starts with a new tool name. Most tools will need dependency packages and versions
120 for the executable. Only Conda is currently supported.
121
122 If a script is needed, it can be pasted into a text box and the interpreter named. Available system executables
123 can be used such as bash, or an interpreter such as python, perl or R can be nominated as conda dependencies
124 to ensure reproducible analyses.
125
126 The tool form will be generated from the input data and the tool builder supplied parameters. The command line for the
127 executable is built using positional or argparse (named e.g. --input_file /foo/baz) style
128 parameters and is completely dependent on the executable. These can include:
129
130 1. Any number of input data sets needed by the executable. Each appears to the tool user on the run form and is included
131 on the command line for the executable. The tool builder must supply a small representative sample for each one as
132 an input for the automated tool test.
133
134 2. Any number of output data sets generated by the package can be added to the command line and will appear in
135 the user's history at the end of the job
136
137 3. Any number of text or numeric parameters. Each will appear to the tool user on the run form and is included
138 on the command line to the executable. The tool builder must supply a suitable representative value for each one as
139 the value to be used for the automated tool test.
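
How a command line might be assembled from the form settings in either style can be sketched as follows. This is a simplified illustration under stated assumptions, not the ToolFactory's actual code: `build_command`, the executable name `myscript.py` and the setting names are all hypothetical.

```python
import shlex


def build_command(executable, settings, style="argparse"):
    """Assemble a command line from (name, value) pairs.

    `settings` is a list of (name, value) tuples in the order the tool
    builder listed them; in positional style only that order matters.
    """
    parts = [executable]
    for name, value in settings:
        if style == "argparse":
            parts.extend([f"--{name}", str(value)])
        elif style == "positional":
            parts.append(str(value))
        else:
            raise ValueError(f"unknown style: {style}")
    # shlex.join quotes any value containing spaces or shell metacharacters
    return shlex.join(parts)


settings = [("input_file", "/foo/baz"), ("output_file", "out.txt"), ("min_len", 20)]
print(build_command("myscript.py", settings, style="argparse"))
print(build_command("myscript.py", settings, style="positional"))
```

The same settings produce `--name value` pairs in one case and a bare ordered list of values in the other, which is why the choice is dictated by what the wrapped executable expects.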
140
141 Once the form is completed, executing the TF will build a new XML tool wrapper
142 including a functional test based on the sample settings and data.
143
144 If the Planemo test passes, the tool can be optionally uploaded to the local Galaxy used in the image for more testing.
145
146 A local toolshed runs inside the container to allow an automated installation, although any toolshed and any accessible
147 Galaxy can be specified for this process by editing the default URL and API keys to provide appropriate credentials.
148
149 ## Generated Tool Dependency management
150
151 Conda is used for all dependency management, although system utilities like sed, bash or awk
152 may already be available on job execution nodes. Sed and friends are available as Conda (conda-forge) dependencies if necessary.
153 Versioned Conda dependencies are always baked-in to the tool and will be used for reproducible calculation.
154
155 ## Requirements
156
157 These are all managed automagically. The TF relies on galaxyxml to generate tool xml and uses ephemeris and
158 bioblend to load tools to the toolshed and to Galaxy. Planemo is used for testing and runs in a biocontainer currently at
159 https://quay.io/fubar2/planemo-biocontainer
160
161 ## Caveats
162
163 This docker image requires privileged mode so exposes potential security risks if hostile tool builders gain access.
164 Please, do not run it in any situation where that is a problem - never, ever on a public facing Galaxy server.
165 Running it on a laptop or workstation should be fine in a non-hostile environment.
166
167
168 ## Example generated XML
169
170 For the bwa-mem example, a supplied bash script is included as a configfile and so has escaped characters.
171 ```
172 <tool name="bwatest" id="bwatest" version="0.01">
173 <!--Cite: Creating re-usable tools from scripts doi:10.1093/bioinformatics/bts573-->
174 <!--Source in git at: https://github.com/fubar2/toolfactory-->
175 <!--Created by admin@galaxy.org at 30/11/2020 07:12:10 using the Galaxy Tool Factory.-->
176 <description>Planemo advanced tool building sample bwa mem mapper as a ToolFactory demo</description>
177 <requirements>
178 <requirement version="0.7.15" type="package">bwa</requirement>
179 <requirement version="1.3" type="package">samtools</requirement>
180 </requirements>
181 <configfiles>
182 <configfile name="runme"><![CDATA[
183 REFFILE=\$1
184 FASTQ=\$2
185 BAMOUT=\$3
186 rm -f "refalias"
187 ln -s "\$REFFILE" "refalias"
188 bwa index -a is "refalias"
189 bwa mem -t "2" -v 1 "refalias" "\$FASTQ" > tempsam
190 samtools view -Sb tempsam > temporary_bam_file.bam
191 samtools sort -o "\$BAMOUT" temporary_bam_file.bam
192
193 ]]></configfile>
194 </configfiles>
195 <version_command/>
196 <command><![CDATA[bash
197 $runme
198 $input1
199 $input2
200 $bam_output]]></command>
201 <inputs>
202 <param optional="false" label="Reference sequence for bwa to map the fastq reads against" help="" format="fasta" multiple="false" type="data" name="input1" argument="input1"/>
203 <param optional="false" label="Reads as fastqsanger to align to the reference sequence" help="" format="fastqsanger" multiple="false" type="data" name="input2" argument="input2"/>
204 </inputs>
205 <outputs>
206 <data name="bam_output" format="bam" label="bam_output" hidden="false"/>
207 </outputs>
208 <tests>
209 <test>
210 <output name="bam_output" value="bam_output_sample" compare="sim_size" format="bam" delta_frac="0.1"/>
211 <param name="input1" value="input1_sample"/>
212 <param name="input2" value="input2_sample"/>
213 </test>
214 </tests>
215 <help><![CDATA[
216
217 **What it Does**
218
219 Planemo advanced tool building sample bwa mem mapper
220
221 Reimagined as a bash script for a ToolFactory demonstration
222
223
224 ------
225
226 Script::
227
228 REFFILE=$1
229 FASTQ=$2
230 BAMOUT=$3
231 rm -f "refalias"
232 ln -s "$REFFILE" "refalias"
233 bwa index -a is "refalias"
234 bwa mem -t "2" -v 1 "refalias" "$FASTQ" > tempsam
235 samtools view -Sb tempsam > temporary_bam_file.bam
236 samtools sort -o "$BAMOUT" temporary_bam_file.bam
237
238 ]]></help>
239 </tool>
240
241 ```
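
The generated test above compares the BAM output with `compare="sim_size"` and `delta_frac="0.1"` rather than byte for byte, presumably because BAM files can differ between runs in details such as header lines. The idea behind a fractional size comparison can be sketched as below. This is an assumption-laden illustration, not Galaxy's implementation, which as far as I know also applies an absolute byte `delta` and may differ in other details; the function name `sizes_similar` is hypothetical.

```python
def sizes_similar(expected_size: int, actual_size: int, delta_frac: float = 0.1) -> bool:
    """Return True if actual_size is within delta_frac of expected_size."""
    allowed_difference = expected_size * delta_frac
    return abs(actual_size - expected_size) <= allowed_difference


# A 1000 byte expected output tolerates results between about 900 and 1100 bytes
print(sizes_similar(1000, 1090))
print(sizes_similar(1000, 1200))
```

A test of this kind only shows that the tool produced an output of plausible size, so it is weaker than a content comparison.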
242
243
244
245 ## More Explanation
246
247 The TF is an unusual Galaxy tool, designed to allow a skilled user to make new Galaxy tools.
48 It appears in Galaxy just like any other tool but outputs include new Galaxy tools generated 248 It appears in Galaxy just like any other tool but outputs include new Galaxy tools generated
49 using instructions provided by the user and the results of Planemo lint and tool testing using 249 using instructions provided by the user and the results of Planemo lint and tool testing using
50 small sample inputs provided by the TF user. The small samples become tests built in to the new tool. 250 small sample inputs provided by the TF user. The small samples become tests built in to the new tool.
51 251
52 It offers a familiar Galaxy form driven way to define how the user of the new tool will 252 It offers a familiar Galaxy form driven way to define how the user of the new tool will
53 choose input data from their history, and what parameters the new tool user will be able to adjust. 253 choose input data from their history, and what parameters the new tool user will be able to adjust.
54 The TF user must know, or be able to read, enough about the tool to be able to define the details of 254 The TF user must know, or be able to read, enough about the tool to be able to define the details of
55 the new Galaxy interface and the ToolFactory offers little guidance on that other than some examples. 255 the new Galaxy interface and the ToolFactory offers little guidance on that other than some examples.
56 256
57 Tools always depend on other things. Most tools in Galaxy depend on third party 257 Tools always depend on other things. Most tools in Galaxy depend on third party
58 scientific packages, so TF tools usually have one or more dependencies. These can be 258 scientific packages, so TF tools usually have one or more dependencies. These can be
59 scientific packages such as BWA or scripting languages such as Python and are 259 scientific packages such as BWA or scripting languages such as Python and are
60 usually managed by Conda. If the new tool relies on a system utility such as bash or awk 260 managed by Conda. If the new tool relies on a system utility such as bash or awk
61 where the importance of version control on reproducibility is low, these can be used without 261 where the importance of version control on reproducibility is low, these can be used without
62 Conda management - but remember the potential risks of unmanaged dependencies on computational 262 Conda management - but remember the potential risks of unmanaged dependencies on computational
63 reproducibility. 263 reproducibility.
64 264
65 The TF user can optionally supply a working script where scripting is 265 The TF user can optionally supply a working script where scripting is
66 required and the chosen dependency is a scripting language such as Python or a system 266 required and the chosen dependency is a scripting language such as Python or a system
70 the new tool is run. It is highly recommended that scripts and their command lines be developed 270 the new tool is run. It is highly recommended that scripts and their command lines be developed
71 and tested until proven to work before the TF is invoked. Galaxy as a software development 271 and tested until proven to work before the TF is invoked. Galaxy as a software development
72 environment is actually possible, but not recommended, being somewhat clumsy and inefficient. 272 environment is actually possible, but not recommended, being somewhat clumsy and inefficient.
73 273
74 Tools nearly always take one or more data sets from the user's history as input. TF tools 274 Tools nearly always take one or more data sets from the user's history as input. TF tools
75 allow the TF user to define what Galaxy datatypes the tool end user will be able to choose and what 275 allow the TF user to define what Galaxy datatypes the tool end user will be able to choose and what
76 names or positions will be used to pass them on a command line to the package or script. 276 names or positions will be used to pass them on a command line to the package or script.
77 277
78 Tools often have various parameter settings. The TF allows the TF user to define how each 278 Tools often have various parameter settings. The TF allows the TF user to define how each
79 parameter will appear on the tool form to the end user, and what names or positions will be 279 parameter will appear on the tool form to the end user, and what names or positions will be
80 used to pass them on the command line to the package. At present, parameters are limited to 280 used to pass them on the command line to the package. At present, parameters are limited to
81 simple text and number fields. Pull requests for other kinds of parameters that galaxyxml 281 simple text and number fields. Pull requests for other kinds of parameters that galaxyxml
82 can handle are welcomed. 282 can handle are welcomed.
83 283
84 Best practice Galaxy tools have one or more automated tests. These should use small sample data sets and 284 Best practice Galaxy tools have one or more automated tests. These should use small sample data sets and
85 specific parameter settings so when the tool is tested, the outputs can be compared with their expected 285 specific parameter settings so when the tool is tested, the outputs can be compared with their expected
86 values. The TF will automatically create a test for the new tool. It will use the sample data sets 286 values. The TF will automatically create a test for the new tool. It will use the sample data sets
87 chosen by the TF user when they built the new tool. 287 chosen by the TF user when they built the new tool.
88 288
89 The TF works by exposing *unrestricted* and therefore extremely dangerous scripting 289 The TF works by exposing *unrestricted* and therefore extremely dangerous scripting
90 to all designated administrators of the host Galaxy server, allowing them to 290 to all designated administrators of the host Galaxy server, allowing them to
91 run scripts in R, python, sh and perl. For this reason, a Docker container is 291 run scripts in R, python, sh and perl. For this reason, a Docker container is
92 available to help manage the associated risks. 292 available to help manage the associated risks.
93 293
94 *Scripting uses* 294 ## Scripting uses
95 295
96 To use a scripting language to create a new tool, you must first prepare and properly test a script. Use small sample 296 To use a scripting language to create a new tool, you must first prepare and properly test a script. Use small sample
97 data sets for testing. When the script is working correctly, upload the small sample datasets 297 data sets for testing. When the script is working correctly, upload the small sample datasets
98 into a new history, start configuring a new ToolFactory tool, and paste the script into the script text box on the TF form. 298 into a new history, start configuring a new ToolFactory tool, and paste the script into the script text box on the TF form.
99 299
100 *Outputs* 300 ### Outputs
101 301
102 Once the script runs successfully, a new Galaxy tool that runs your script 302 The TF will generate the new tool described on the TF form, and test it
103 can be generated. Select the "generate" option and supply some help text and 303 using planemo. Optionally if a local toolshed is running, it can be used to
104 names. The new tool will be generated in the form of a new Galaxy datatype 304 install the new tool back into the generating Galaxy.
105 *tgz* - as the name suggests, it's an archive ready to upload to a 305
106 Galaxy ToolShed as a new tool repository. 306 A toolshed is built in to the Docker container and configured
107
108 It is also possible to run a tool to generate test outputs, then test it
109 using planemo. A toolshed is built in to the Docker container and configured
110 so a tool can be tested, sent to that toolshed, then installed in the Galaxy 307 so a tool can be tested, sent to that toolshed, then installed in the Galaxy
111 where the TF is running. 308 where the TF is running using the default toolshed and Galaxy URL and API keys.
112
113 If the tool requires a command or test XML override, then planemo is
114 needed to generate test outputs to make a complete tool, rerun to test
115 and if required upload to the local toolshed and install in the Galaxy
116 where the TF is running.
117 309
118 Once it's in a ToolShed, it can be installed into any local Galaxy server 310 Once it's in a ToolShed, it can be installed into any local Galaxy server
119 from the server administrative interface. 311 from the server administrative interface.
120 312
121 Once the new tool is installed, local users can run it - each time, the 313 Once the new tool is installed, local users can run it - each time, the
122 package and/or script that was supplied when it was built will be executed with the input chosen 314 package and/or script that was supplied when it was built will be executed with the input chosen
123 from the user's history, together with user supplied parameters. In other words, the tools you generate with the 315 from the user's history, together with user supplied parameters. In other words, the tools you generate with the
124 ToolFactory run just like any other Galaxy tool. 316 TF run just like any other Galaxy tool.
125 317
126 TF generated tools work as normal workflow components. 318 TF generated tools work as normal workflow components.
127 319
128 320
129 *Limitations* 321 ## Limitations
130 322
131 The TF is flexible enough to generate wrappers for many common scientific packages 323 The TF is flexible enough to generate wrappers for many common scientific packages
132 but the inbuilt automation will not cope with all possible situations. Users can 324 but the inbuilt automation will not cope with all possible situations. Users can
133 supply overrides for two tool XML segments - tests and command and the BWA 325 supply overrides for two tool XML segments - tests and command and the BWA
134 example in the supplied samples workflow illustrates their use. 326 example in the supplied samples workflow illustrates their use. It does not deal with
135 327 repeated elements or conditional parameters such as allowing a user to choose to see "simple"
136 *Installation* 328 or "advanced" parameters (yet) and there will be plenty of packages it just
137 329 won't cover - but it's a quick and efficient tool for the other 90% of cases. Perfect for
138 The Docker container is the best way to use the TF because it is preconfigured 330 that bash one liner you need to get that workflow functioning correctly for this
331 afternoon's demonstration!
332
333 ## Installation
334
335 The Docker container https://github.com/fubar2/toolfactory-galaxy-docker/blob/main/README.md
336 is the best way to use the TF because it is preconfigured
139 to automate new tool testing and has a built in local toolshed where each new tool 337 to automate new tool testing and has a built in local toolshed where each new tool
140 is uploaded. If you grab the docker container, it should just work. 338 is uploaded. If you grab the docker container, it should just work after a restart and you
141 339 can run a workflow to generate all the sample tools. Running the samples and rerunning the ToolFactory
142 If you build the container, there are some things to watch out for. Let it run for 10 minutes 340 jobs that generated them allows you to add fields and experiment to see how things work.
143 or so once you build it - check with top until conda has finished fussing. Once everything quietens 341
144 down, find the container with 342 It can be installed like any other tool from the Toolshed, but you will need to make some
145 ```docker ps```
146 and use
147 ```docker exec [containername] supervisorctl restart galaxy:```
148 That colon is not a typographical mistake.
149 Not restarting after first boot seems to leave the job/workflow system confused and the workflow
150 just will not run properly until Galaxy has restarted.
151
152 Login as admin@galaxy.org with password "password". Feel free to change it once you are logged in.
153 There should be a companion toolshed at localhost:9090. The history should have some sample data for
154 the workflow.
155
156 Run the workflow and make sure the right dataset is selected for each of the input files. Most of the
157 examples use text files so should run, but the bwa example needs the right ones to work properly.
158
159 When the workflow is finished, you will have half a dozen examples to rerun and play with. They have also
160 all been tested and installed so you should find them in your tool menu under "Generated Tools"
161
162 It is easy to install without Docker, but you will need to make some
163 configuration changes (TODO write a configuration). You can install it most conveniently using the 343 configuration changes (TODO write a configuration). You can install it most conveniently using the
164 administrative "Search and browse tool sheds" link. Find the Galaxy Main 344 administrative "Search and browse tool sheds" link. Find the Galaxy Main
165 toolshed at https://toolshed.g2.bx.psu.edu/ and search for the toolfactory 345 toolshed at https://toolshed.g2.bx.psu.edu/ and search for the toolfactory
166 repository in the Tool Maker section. Open it and review the code and select the option to install it. 346 repository in the Tool Maker section. Open it and review the code and select the option to install it.
167 347
168 Otherwise, if not already there pending an accepted PR, 348 If not already there please add:
169 please add: 349
170 <datatype extension="tgz" type="galaxy.datatypes.binary:Binary" 350 ```
171 mimetype="multipart/x-gzip" subclass="True" /> 351 <datatype extension="tgz" type="galaxy.datatypes.binary:Binary" mimetype="multipart/x-gzip" subclass="True" />
172 to your local data_types_conf.xml. 352 ```
173 353
174 354 to your local config/data_types_conf.xml.
175 *Restricted execution* 355
176 356
177 The tool factory tool itself will then be usable ONLY by admin users - 357 ## Restricted execution
178 people with IDs in admin_users. **Yes, that's right. ONLY 358
179 admin_users can run this tool** Think about it for a moment. If allowed to 359 The tool factory tool itself will ONLY run for admin users -
180 run any arbitrary script on your Galaxy server, the only thing that would 360 people with IDs in config/galaxy.yml "admin_users".
181 impede a miscreant bent on destroying all your Galaxy data would probably 361
182 be lack of appropriate technical skills. 362 *ONLY admin_users can run this tool*
183 363
184 **Generated tool Security** 364 That doesn't mean it's safe to install on a shared or exposed instance - please don't.
365
366 ## Generated tool Security
185 367
186 Once you install a generated tool, it's just 368 Once you install a generated tool, it's just
187 another tool - assuming the script is safe. They just run normally and their 369 another tool - assuming the script is safe. They just run normally and their
188 user cannot do anything unusually insecure but please, practice safe toolshed. 370 user cannot do anything unusually insecure but please, practice safe toolshed.
189 Read the code before you install any tool. Especially this one - it is really scary. 371 Read the code before you install any tool. Especially this one - it is really scary.
190 372
191 **Send Code** 373 ## Attribution
192
193 Pull requests and suggestions welcome as git issues please?
194
195 **Attribution**
196 374
197 Creating re-usable tools from scripts: The Galaxy Tool Factory 375 Creating re-usable tools from scripts: The Galaxy Tool Factory
198 Ross Lazarus; Antony Kaspi; Mark Ziemann; The Galaxy Team 376 Ross Lazarus; Antony Kaspi; Mark Ziemann; The Galaxy Team
199 Bioinformatics 2012; doi: 10.1093/bioinformatics/bts573 377 Bioinformatics 2012; doi: 10.1093/bioinformatics/bts573
200 378
201 http://bioinformatics.oxfordjournals.org/cgi/reprint/bts573?ijkey=lczQh1sWrMwdYWJ&keytype=ref 379 http://bioinformatics.oxfordjournals.org/cgi/reprint/bts573?ijkey=lczQh1sWrMwdYWJ&keytype=ref
202 380
203 **Licensing**
204
205 Copyright Ross Lazarus 2010
206 ross lazarus at g mail period com
207
208 All rights reserved.
209
210 Licensed under the LGPL
211