comparison RNAShapes.xml @ 3:7adb2518f6e9 draft default tip

planemo upload for repository https://github.com/bgruening/galaxytools/tree/master/tools/rna_tools/rna_shapes commit 1527e05bcd748a2b3cef22e0e356697066a55635
author rnateam
date Sat, 11 Nov 2017 15:07:41 -0500
parents f33190f18ee6
children
2:f33190f18ee6 3:7adb2518f6e9
1 <?xml version='1.0' encoding='UTF-8'?> 1 <tool id="RNAshapes" name="RNAshapes" version="3.3.0">
2 <tool id="RNAshapes" name="RNAshapes" version="3.2.2">
3 <description>RNA Secondary structure prediction</description> 2 <description>RNA Secondary structure prediction</description>
4 <macros> 3 <macros>
5 <token name="@EXECUTABLE@">RNAshapes</token> 4 <token name="@EXECUTABLE@">RNAshapes</token>
6 <macro name="macro_param_shape_level"> 5 <macro name="macro_param_shape_level">
7 <param name="param_shape_level" type="select" label="Shape Level" help="(--shapeLevel) 1: most concrete shape, ... 5: most abstract shape."> 6 <param name="param_shape_level" type="select" label="Shape Level" help="(--shapeLevel) 1: most concrete shape, ... 5: most abstract shape.">
63 </macro> 62 </macro>
64 <macro name="macro_param_structure_probs"> 63 <macro name="macro_param_structure_probs">
65 <param name="param_structure_probs" type="boolean" truevalue="1" falsevalue="0" label="Structure Probabilities" help="(--structureProbs) In addition to free energy also the probability of structures is calculated."/> 64 <param name="param_structure_probs" type="boolean" truevalue="1" falsevalue="0" label="Structure Probabilities" help="(--structureProbs) In addition to free energy also the probability of structures is calculated."/>
66 </macro> 65 </macro>
67 </macros> 66 </macros>
68
69 <requirements> 67 <requirements>
70 <requirement type="binary">@EXECUTABLE@</requirement> 68 <requirement type="binary">@EXECUTABLE@</requirement>
71 <requirement type="package" version="3.2.5">rnashapes</requirement> 69 <requirement type="package" version="3.3.0">rnashapes</requirement>
72 </requirements> 70 </requirements>
73 <stdio> 71 <command detect_errors="aggressive">
74 <exit_code range="1:"/> 72 <![CDATA[
75 <exit_code range=":-1"/>
76 <regex match="Error:"/>
77 <regex match="Exception:"/>
78 </stdio>
79
80 <command>
81 <![CDATA[
82 RNAshapes 73 RNAshapes
83 --mode $param_cond_mode.param_mode 74 --mode $param_cond_mode.param_mode
84 #if $param_cond_mode.param_mode != 'outside': 75 #if $param_cond_mode.param_mode != 'outside':
85 > $param_out 76 > $param_out
86 #end if 77 #end if
228 </test> 219 </test>
229 </tests> 220 </tests>
230 221
231 <help> 222 <help>
232 223
233 **What id does** 224 **What it does**
234 225
235 RNA secondary structure predictions 226 This tool predicts RNA secondary structures. RNAshape abstraction maps structures to a tree-like domain of shapes, retaining adjacency and nesting of structural features, but disregarding helix lengths. Shape abstraction integrates well with dynamic programming algorithms, and hence it can be applied during structure prediction rather than afterwards. This avoids exponential explosion and can still give us a non-heuristic and complete account of properties of the molecule's folding space.
236 227
237 ----- 228
238 229 **Input**
239 **modes** 230
240 231 RNA sequence(s): A (multiple) FASTA file, containing RNA primary sequences.
241 + **mfe**: Computes the single energetically most stable secondary structure for the given RNA sequence. Co-optimal results will be suppressed, i.e. should different prediction have the same best energy value, just an arbitrary one out of them will be reported. This resembles the function of the program "RNAfold" of the Vienna group. If you only use "mfe" mode, consider switching to RNAfold, because their implementation is much faster, due to sophisticated low level C optimisations. 232 RNA secondary structure: A Vienna dot-bracket formatted string, representing a seconday RNA structure.
242 + **subopt**: Often, the biological relevant structure is hidden among suboptimal predictions. In "subopt" mode, you can also inspect all suboptimal solutions up to a given threshold (see parameters --absoluteDeviation and --relativeDeviation). Duplicates might appear when using grammar "microstate", due to its semantic ambiguity according Vienna-Dot-Bracket strings. 233 RNA sequence: Exactly one RNA primary sequence.
243 + **shapes**: Output of "subopt" mode is crowded by many very similar answers, which make it hard to focus to the "important" changes. The abstract shape concept groups similar answers together and reports only the best answer within such a group. Due to abstraction, suboptimal analyses can be done more thorough, by ignoring boring differences. (see parameter --shapeLevel) 234 RNA family: A family of at least two potentially related RNA sequences. This is not an alignment, since sequences can have different lengths.
244 + **probs**: Structure probabilities are strictly correlated to their energy values. Grouped together into shape classes, their probabilities add up. Often a shape class with many members of worse energy becomes more probable than the shape containing the mfe structure but not much more members. 235
245 + **sample**: Probabilistic sampling based on partition function. This mode combines stochastic sampling with a-posteriori shape abstraction. A sample from the structure space holds M structures together with their shapes, on which classification is performed. The probability of a shape can then be approximated by its frequency in the sample. 236
246 + **cast**: This mode is the RNAcast approach. For a family of RNA sequences, this method independently enumerates the near-optimal abstract shape space, and predicts as the consensus an abstract shape common to all sequences. For each sequence, it delivers the thermodynamically best structure which has this common shape. Input is a multiple fasta file, which should contain at least two sequences. Output is sorted by "score" of common shapes, i.e. summed free energy of all sequences. R is the rank (= list position) of the shape in individual sequence analysis. 237 **Parameters**
247 + **eval**: Evaluates the free energy of an RNA molecule in fixed secondary structure, similar to RNAeval from the Vienna group. Multiple answers stem from semantic ambiguity of the underlying grammar. It might happen, that your given structure is not a structure for the sequence. Maybe your settings are too restrictive, e.g. not allowing lonely base-pairs (--allowLP). If you input a (multiple) FASTA file, RNAshapes assumes that exactly first half of the contents of each entry is RNA sequence, second half is the according structure. Whitespaces are ignored. 238
248 + **abstract**: Converts a Vienna-Dot-Bracket representation of a secondary structure into a shape string. 239 **Calculation Modes**
249 + **outside**: Applies the "outside"-algorithm to compute probabilities for all base pairs (i,j), based on the partition function. Output is a PostScript file, visualizing these probabilities as a "dot plot". The "dot plot" shows a matrix of squares with area proportional to the base pair probabilities in the upper right half. For each pair (i,j) with probability above --bppmThreshold there is a line of the form i j sqrt(p) ubox in the PostScript file, so that they can be easily extracted. 240
250 + **mea**: Finds the secondary structure with the maximal sum of base-pair probabilities (MEA=maximal expected accuracy). The equivalent Vienna Package name is the 'centroid secondary structure', defined as 'The centroid structure is the structure with the minimum total base-pair distance to all structures in the thermodynamic ensemble.'. 241 - **shapes**: Output of "subopt" mode is crowded by many very similar answers, which make it hard to focus to the "important" changes. The abstract shape concept groups similar answers together and reports only the best answer within such a group. Due to abstraction, suboptimal analyses can be done more thorough, by ignoring boring differences. (see parameter --shapeLevel)
251 242
252 ----- 243 - **mfe**: Computes the single energetically most stable secondary structure for the given RNA sequence. Co-optimal results will be suppressed, i.e. should different prediction have the same best energy value, just an arbitrary one out of them will be reported. This resembles the function of the program "RNAfold" of the Vienna group. If you only use "mfe" mode, consider switching to RNAfold, because their implementation is much faster, due to sophisticated low level C optimisations.
253 244
254 **grammar** 245 - **subopt**: Often, the biological relevant structure is hidden among suboptimal predictions. In "subopt" mode, you can also inspect all suboptimal solutions up to a given threshold (see parameters --absoluteDeviation and --relativeDeviation). Duplicates might appear when using grammar "microstate", due to its semantic ambiguity according Vienna-Dot-Bracket strings.
246
247 - **probs**: Structure probabilities are strictly correlated to their energy values. Grouped together into shape classes, their probabilities add up. Often a shape class with many members of worse energy becomes more probable than the shape containing the mfe structure but not much more members.
248
249 - **sample**: Probabilistic sampling based on partition function. This mode combines stochastic sampling with a-posteriori shape abstraction. A sample from the structure space holds M structures together with their shapes, on which classification is performed. The probability of a shape can then be approximated by its frequency in the sample.
250
251 - **cast**: This mode is the RNAcast approach. For a family of RNA sequences, this method independently enumerates the near-optimal abstract shape space, and predicts as the consensus an abstract shape common to all sequences. For each sequence, it delivers the thermodynamically best structure which has this common shape. Input is a multiple fasta file, which should contain at least two sequences. Output is sorted by "score" of common shapes, i.e. summed free energy of all sequences. R is the rank (= list position) of the shape in individual sequence analysis.
252
253 - **eval**: Evaluates the free energy of an RNA molecule in fixed secondary structure, similar to RNAeval from the Vienna group. Multiple answers stem from semantic ambiguity of the underlying grammar. It might happen, that your given structure is not a structure for the sequence. Maybe your settings are too restrictive, e.g. not allowing lonely base-pairs (--allowLP). If you input a (multiple) FASTA file, RNAshapes assumes that exactly first half of the contents of each entry is RNA sequence, second half is the according structure. Whitespaces are ignored.
254
255 - **abstract**: Converts a Vienna-Dot-Bracket representation of a secondary structure into a shape string.
256
257 - **outside**: Applies the "outside"-algorithm to compute probabilities for all base pairs (i,j), based on the partition function. Output is a PostScript file, visualizing these probabilities as a "dot plot". The "dot plot" shows a matrix of squares with area proportional to the base pair probabilities in the upper right half. For each pair (i,j) with probability above --bppmThreshold there is a line of the form i j sqrt(p) ubox in the PostScript file, so that they can be easily extracted.
258
259 - **mea**: Finds the secondary structure with the maximal sum of base-pair probabilities (MEA=maximal expected accuracy). The equivalent Vienna Package name is the 'centroid secondary structure', defined as 'The centroid structure is the structure with the minimum total base-pair distance to all structures in the thermodynamic ensemble.'.
260
261
262 **Window Size**
263
264 Activates window mode and computes substrings of size i for the input. After computation for the first i bases is done, the window is pushed j bases to the right and the next computation is startet. j is set by --windowIncrement. i must be a non-zero positive integer, smaller than the input length.
265
266
267 **windowIncrement**
268
269 If --windowSize is given, this parameter sets the offset for the next window to j bases. j must be a non-zero positive integer, smaller than --windowSize.
270
271
272 **Dangling End Energies**
255 273
256 How to treat "dangling end" energies for bases adjacent to helices in free ends and multi-loops. 274 How to treat "dangling end" energies for bases adjacent to helices in free ends and multi-loops.
257 275
258 + **nodangle**: (-d 0 in Vienna package) ignores dangling energies altogether. 276 - **nodangle**: (-d 0 in Vienna package) ignores dangling energies altogether.
259 + **overdangle**: (-d 2 in Vienna package) always dangles bases onto helices, even if they are part of neighboring helices themselves. Seems to be wrong, but could perform surprisingly well. 277 - **overdangle**: (-d 2 in Vienna package) always dangles bases onto helices, even if they are part of neighboring helices themselves. Seems to be wrong, but could perform surprisingly well.
260 + **microstate**: (-d 1 in Vienna package) correct optimisation of all dangling possibilities, unfortunately this results in an semantically ambiguous search space regarding Vienna-Dot-Bracket notations. 278 - **microstate**: (-d 1 in Vienna package) correct optimisation of all dangling possibilities, unfortunately this results in an semantically ambiguous search space regarding Vienna-Dot-Bracket notations.
261 + **macrostate**: (no correspondens in Vienna package) same as microstate, while staying unambiguous. Unfortunately, mfe computation violates Bellman's principle of optimality. Default is "macrostate". 279 - **macrostate**: (no correspondens in Vienna package) same as microstate, while staying unambiguous. Unfortunately, mfe computation violates Bellman's principle of optimality. Default is "macrostate".
262 280
263 -----
264
265 **windowSize**
266
267 Activates window mode and computes substrings of size i for the input. After computation for the first i bases is done, the window is pushed j bases to the right and the next computation is startet. j is set by --windowIncrement. i must be a non-zero positive integer, smaller than the input length.
268
269 **windowIncrement**
270
271 If --windowSize is given, this parameter sets the offset for the next window to j bases. j must be a non-zero positive integer, smaller than --windowSize.
272
273 -----
274 281
275 For more information, visit http://bibiserv2.cebitec.uni-bielefeld.de/rnashapes?id=rnashapes_rnashapes_manual_manual 282 For more information, visit http://bibiserv2.cebitec.uni-bielefeld.de/rnashapes?id=rnashapes_rnashapes_manual_manual
283
276 </help> 284 </help>
285
277 <citations> 286 <citations>
278 <citation type="doi">doi:10.1093/bioinformatics/btu649</citation> 287 <citation type="doi">doi:10.1093/bioinformatics/btu649</citation>
279 </citations> 288 </citations>
280 </tool> 289 </tool>
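
The window mode described in the help text above (--windowSize i, --windowIncrement j) can be illustrated with a minimal Python sketch. This is not RNAshapes code: the function name `iter_windows` is hypothetical, and the sketch assumes that only full-length windows are produced, since the help text does not say how a trailing partial window is handled.

```python
def iter_windows(sequence, window_size, window_increment):
    """Yield (start, substring) pairs for a sliding window over `sequence`.

    Hypothetical helper for illustration only. Assumption: a trailing
    window shorter than `window_size` is skipped.
    """
    # Constraints as stated in the help text:
    # i is a non-zero positive integer, smaller than the input length.
    if not 0 < window_size < len(sequence):
        raise ValueError("window_size must be > 0 and < len(sequence)")
    # j is a non-zero positive integer, smaller than the window size.
    if not 0 < window_increment < window_size:
        raise ValueError("window_increment must be > 0 and < window_size")

    # Push the window `window_increment` bases to the right each step.
    last_start = len(sequence) - window_size
    for start in range(0, last_start + 1, window_increment):
        yield start, sequence[start:start + window_size]


if __name__ == "__main__":
    for start, window in iter_windows("ACGUACGUAC", 4, 2):
        print(start, window)
```

With a window size of 4 and an increment of 2, consecutive windows overlap by two bases. Each substring would then be folded independently by the selected mode.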