comparison tools/protein_analysis/promoter2.xml @ 7:9b45a8743100 draft

Uploaded v0.1.0, which adds a wrapper for Promoter 2.0 (DNA tool) and enables use of Galaxy's <parallelism> tag for SignalP, TMHMM and Promoter wrappers.
author peterjc
date Mon, 30 Jul 2012 10:25:07 -0400
parents
children 976a5f2833cd
6:a290c6d4e658 7:9b45a8743100
1 <tool id="promoter2" name="Promoter 2.0" version="0.0.1">
2 <description>Find eukaryotic PolII promoters in DNA sequences</description>
3 <!-- If job splitting is enabled, break up the query file into parts -->
4 <!-- Using 2000 per chunk so 4 threads each doing 500 is ideal -->
5 <parallelism method="basic" split_inputs="fasta_file" split_mode="to_size" split_size="2000" merge_outputs="tabular_file"></parallelism>
6 <command interpreter="python">
7 promoter2.py "\$NSLOTS" $fasta_file $tabular_file
8 ##Set the number of threads in the runner entry in universe_wsgi.ini
9 ##which (on SGE at least) will set the $NSLOTS environment variable.
10 ##If the environment variable isn't set, the script receives "" and defaults to one thread.
11 </command>
12 <inputs>
13 <param name="fasta_file" type="data" format="fasta" label="FASTA file of DNA sequences"/>
14 </inputs>
15 <outputs>
16 <data name="tabular_file" format="tabular" label="Promoter2 on ${fasta_file.name}" />
17 </outputs>
18 <requirements>
19 <requirement type="binary">promoter</requirement>
20 </requirements>
21 <help>
22
23 **What it does**
24
25 This calls the Promoter 2.0 tool for prediction of eukaryotic PolII promoter sequences using a Neural Network (NN) model.
26
27 The input is a FASTA file of nucleotide sequences (e.g. upstream regions of your genes), and the output is tabular with four columns (one row per promoter):
28
29 1. Sequence identifier (first word of FASTA header)
30 2. Promoter position, e.g. 600
31 3. Promoter score, e.g. 1.063
32 4. Promoter likelihood, e.g. Highly likely prediction
33
34 The scores are classified very simply as follows:
35
36 ========= ========================
37 Score Description
38 --------- ------------------------
39 below 0.5 ignored
40 0.5 - 0.8 Marginal prediction
41 0.8 - 1.0 Medium likely prediction
42 above 1.0 Highly likely prediction
43 ========= ========================
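
As an illustration only, the classification in the table above could be reproduced in Python roughly as follows. The function name ``classify_score`` is hypothetical, and the handling of the exact boundary values (0.5, 0.8, 1.0) is an assumption, since the table does not say which band each boundary belongs to.

```python
def classify_score(score):
    """Map a Promoter 2.0 score to the description used in the table above.

    Assumption: each boundary value (0.5, 0.8, 1.0) is placed in the
    higher band; the table itself leaves this open.
    """
    if score >= 1.0:
        return "Highly likely prediction"
    if score >= 0.8:
        return "Medium likely prediction"
    if score >= 0.5:
        return "Marginal prediction"
    return None  # below 0.5: ignored, so no description applies
```

With this sketch, the example score 1.063 from the column list maps to "Highly likely prediction".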
44
45 Internally the input FASTA file is divided into parts (to allow multiple processors to be used), and the raw output is reformatted into this tabular layout suitable for downstream analysis within Galaxy.
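
The splitting step can be pictured with a minimal Python sketch. This is not the wrapper's actual code: the function name ``split_fasta`` and the round-robin assignment of records to parts are assumptions made for illustration.

```python
def split_fasta(lines, n_parts):
    """Distribute FASTA records round-robin into n_parts lists of lines.

    Assumption: records are dealt out in turn; the real wrapper may
    divide the file differently (for example into contiguous blocks).
    """
    parts = [[] for _ in range(n_parts)]
    index = -1  # becomes 0 when the first header line is seen
    for line in lines:
        if line.startswith(">"):
            index += 1  # a header line starts a new record
        if index >= 0:  # skip any text before the first header
            parts[index % n_parts].append(line)
    return parts
```

Each part could then be handed to a separate Promoter 2.0 process and the per-part results concatenated into the final table.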
46
47 **References**
48
49 Knudsen.
50 Promoter2.0: for the recognition of PolII promoter sequences.
51 Bioinformatics, 15:356-61, 1999.
52 http://dx.doi.org/10.1093/bioinformatics/15.5.356
53
54 http://www.cbs.dtu.dk/services/Promoter/output.php
55
56 </help>
57 </tool>
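
The comments in the command block say that an unset $NSLOTS arrives as "" and that the thread count then defaults to one. The promoter2.py script itself is not shown on this page, so the following is only a sketch of how such a fallback might look; the helper name ``parse_thread_count`` is hypothetical.

```python
def parse_thread_count(value, default=1):
    """Return a positive thread count parsed from a string, else the default.

    Assumption: empty, non-numeric, zero, or negative values all fall
    back to the default of one thread.
    """
    try:
        count = int(value)
    except (TypeError, ValueError):
        return default  # e.g. "" when $NSLOTS is not set
    return count if count >= 1 else default
```

A wrapper script could call this on its first command-line argument, for example ``parse_thread_count(sys.argv[1])``.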