<tool id="DimontPredictor" name="DimontPredictor" version="0.1" force_history_refresh="true">
<description>for predicting binding sites using a Dimont model</description>
<command>java -Xms256M -Xmx2G -jar \$JAR_PATH/DimontPredictorWeb.jar --run $script_file $summary $summary.id $__new_file_path__ $summary.extra_files_path</command>
<inputs>
<param type="text" size="40" name="DimontPredictor_jobname" label="Job name" value="" optional="true" help="Please enter a name for your job that should be used in the history (optional)">
</param>
<param type="data" format="xml" name="DimontPredictor_ps_Dimont" label="&lt;hr /&gt;Dimont" help="The trained Dimont classifier, i.e. the &quot;Dimont&quot; output of a previous Dimont run." value="" optional="false">
</param>

<param type="data" format="fasta" name="DimontPredictor_ps_Input_sequences" label="&lt;hr /&gt;Input sequences" help="The input sequences for de-novo motif discovery (can be uploaded using &quot;GetData&quot; -&gt; &quot;Upload File&quot;), annotated FastA format. The required format is described in the help section." value="" optional="false">
</param>

<param type="text" size="40" name="DimontPredictor_ps_Value_tag" label="Value tag" help="The tag for the value information in the FastA-annotation of the input file" value="" optional="false">
</param>

<param type="text" size="40" name="DimontPredictor_ps_Weighting_factor" label="Weighting factor" help="The value for weighting the data; either a value between 0 and 1, or a description relative to the standard deviation (e.g. +4sd)" value="0.2" optional="false">
</param>

<param type="float" name="DimontPredictor_ps_p_value" label="&lt;hr /&gt;p-value" help="The maximum p-value allowed for predicted binding sites" value="0.0010" optional="false">
<validator type="in_range" min="0.0" max="1.0" message="Value is not in the specified range [0.0, 1.0]."/></param>

</inputs>
<requirements>
<requirement type="set_environment">JAR_PATH</requirement>
<requirement type="binary" version="&gt;=1.6">java</requirement>
</requirements>
<configfiles>
<configfile name="script_file">
<DimontPredictor_ps_Dimont>
<value>
${DimontPredictor_ps_Dimont}</value>
<extension>
${DimontPredictor_ps_Dimont.ext}</extension>
</DimontPredictor_ps_Dimont>

<DimontPredictor_ps_Input_sequences>
<value>
${DimontPredictor_ps_Input_sequences}</value>
<extension>
${DimontPredictor_ps_Input_sequences.ext}</extension>
</DimontPredictor_ps_Input_sequences>

<DimontPredictor_ps_Value_tag>
${DimontPredictor_ps_Value_tag}</DimontPredictor_ps_Value_tag>

<DimontPredictor_ps_Weighting_factor>
${DimontPredictor_ps_Weighting_factor}</DimontPredictor_ps_Weighting_factor>

<DimontPredictor_ps_p_value>
${DimontPredictor_ps_p_value}</DimontPredictor_ps_p_value>

</configfile>
</configfiles>
<outputs>
<data format="html" name="summary" label="#if str($DimontPredictor_jobname) == '' then $tool.name + ' on ' + $on_string else $DimontPredictor_jobname#">
</data>
</outputs>
<tests>
<test>
<param name="DimontPredictor_jobname" value="Test" />
<param name="DimontPredictor_ps_Dimont" value="predictor_test.xml" ftype="xml" />
<param name="DimontPredictor_ps_Input_sequences" value="dimont_test.fasta" />
<param name="DimontPredictor_ps_Value_tag" label="Value tag" value="maxT" />
<param name="DimontPredictor_ps_Weighting_factor" value="0.2" />
<param name="DimontPredictor_ps_p_value" value="0.0010" />
<output name="summary" file="TestPred/TestPred_html.html" />
</test>
</tests>
<help>
**DimontPredictor** predicts binding sites in new data using a previously trained Dimont model. To train a Dimont model, see the tool "Dimont".

This tool may be useful if you, for instance, want to predict binding sites of previously discovered motifs in other data sets, or if you want to try different p-values for filtering predictions.

Input sequences must be supplied in an annotated FastA format, either as a file uploaded by the "Upload File" task in section "Get Data" of Galaxy or as a file generated using the "Dimont Data Extractor" tool.
In the annotation of each sequence, you need to provide a value that reflects the confidence that this sequence is bound by the factor of interest.
Such confidences may be peak statistics (e.g., number of fragments under a peak) for ChIP data or signal intensities for PBM data.

For instance, an annotated FastA file for ChIP-exo data could look like::

	> signal: 515
	ggccatgtgtatttttttaaatttccac...
	> signal: 199
	GGTCCCCTGGGAGGATGGGGACGTGCTG...
	...

where the confidence for the first two sequences amounts to 515 and 199, respectively.
The FastA comment may contain additional annotations of the format ``key1 : value1; key2: value2;...``.
We also provide an example_ input file.

Accordingly, you would need to set the parameter "Value tag" to ``signal`` for this input file.
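
To check an input file before uploading it, the annotation format described above can be read with a few lines of Python. This is only an illustrative sketch and not the parser used by Dimont; the function name ``parse_annotation`` is hypothetical, and it assumes that fields are separated by ``;`` and that the value under the chosen tag is numeric.

```python
def parse_annotation(header, value_tag):
    """Return the numeric value stored under value_tag in an annotated FastA header."""
    # Drop the leading FastA marker and split the comment into key:value fields.
    fields = header.lstrip(">").split(";")
    for field in fields:
        if ":" not in field:
            continue  # skip empty or malformed fields
        key, value = field.split(":", 1)
        if key.strip() == value_tag:
            return float(value.strip())
    # The requested tag does not occur in this header.
    raise KeyError(value_tag)


print(parse_annotation("> signal: 515", "signal"))  # prints 515.0
```

A header for which this sketch raises ``KeyError`` does not contain the tag that you intend to pass as "Value tag".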

The parameter "Weighting factor" defines the proportion of sequences that you expect to be bound by the targeted factor with high confidence. For ChIP data, the default value of ``0.2`` typically works well.
For PBM data, containing a large number of unspecific probes, this parameter should be set to a lower value, e.g. ``0.01``.

The parameter "p-value" defines a threshold on the p-values of predicted binding sites, and only binding sites with a lower p-value are reported by DimontPredictor.
The Dimont tool uses a p-value threshold of ``1E-3``, which is also the default value of DimontPredictor.

You can also install this web-application within your local Galaxy server. Instructions can be found at the Dimont_ page of Jstacs.
There you can also download a command line version of DimontPredictor.

If you use Dimont, please cite

\J. Grau, S. Posch, I. Grosse, and J. Keilwagen. A general approach for discriminative de-novo motif discovery from high-throughput data. *Nucleic Acids Research*, 41(21):e197, 2013.

If you experience problems using DimontPredictor, please contact_ us.

.. _example: http://www.jstacs.de/downloads/dimont-example.fa
.. _Dimont: http://jstacs.de/index.php/Dimont
.. _contact: mailto:grau@informatik.uni-halle.de
</help>
</tool>
