# HG changeset patch
# User bjoern-gruening
# Date 1423823683 18000
# Node ID 6d37d9c1e9bbe6b32acbac31c847a8fe311daa4c
Imported from capsule None

diff -r 000000000000 -r 6d37d9c1e9bb RNAShapes.xml
--- /dev/null Thu Jan 01 00:00:00 1970 +0000
+++ b/RNAShapes.xml Fri Feb 13 05:34:43 2015 -0500
@@ -0,0 +1,263 @@
+ +
+ RNA Secondary structure prediction
+
+ RNAshapes
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ @EXECUTABLE@
+ RNAshapes
+ + + + + + + + +
+    RNAshapes
+    --mode $param_cond_mode.param_mode
+    #if $param_cond_mode.param_mode == 'outside':
+        --dotplot $param_out_dotplot
+    #else:
+        > $param_out
+    #end if
+    #if $varExists('param_cond_mode.param_shape_level'):
+        --shapeLevel $param_cond_mode.param_shape_level
+    #end if
+    #if $varExists('param_cond_mode.param_low_prob_filter'):
+        --lowProbFilter $param_cond_mode.param_low_prob_filter
+    #end if
+    #if $varExists('param_cond_mode.param_bppm_threshold'):
+        --bppmThreshold $param_cond_mode.param_bppm_threshold
+    #end if
+    #if $varExists('param_cond_mode.param_out_low_prob_filter'):
+        --outputLowProbFilter $param_cond_mode.param_out_low_prob_filter
+    #end if
+    #if $varExists('param_cond_mode.param_show_samples'):
+        --showSamples $param_cond_mode.param_show_samples
+    #end if
+    #if $varExists('param_cond_mode.param_prob_decimals'):
+        --probDecimals $param_cond_mode.param_prob_decimals
+    #end if
+    #if $varExists('param_cond_mode.param_num_samples'):
+        --numSamples $param_cond_mode.param_num_samples
+    #end if
+    #if $varExists('param_cond_mode.param_absolute_deviation') and $param_cond_mode.param_absolute_deviation:
+        --absoluteDeviation $param_cond_mode.param_absolute_deviation
+    #end if
+    #if $varExists('param_cond_mode.param_relative_deviation') and $param_cond_mode.param_relative_deviation:
+        --relativeDeviation $param_cond_mode.param_relative_deviation
+    #end if
+    #if $varExists('param_cond_mode.param_window_size') and
$param_cond_mode.param_window_size:
+        --windowSize $param_cond_mode.param_window_size
+    #end if
+    #if $varExists('param_cond_mode.param_window_increment') and $param_cond_mode.param_window_increment:
+        --windowIncrement $param_cond_mode.param_window_increment
+    #end if
+    #if $varExists('param_cond_mode.param_structure_probs'):
+        --structureProbs $param_cond_mode.param_structure_probs
+    #end if
+    --grammar $param_grammar
+    --allowLP $param_allow_lp
+    $param_in
+ + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + + +
+ (param_cond_mode['param_mode'] != 'outside')
+ +
+ (param_cond_mode['param_mode'] == 'outside')
+ + + +
+**What it does**
+
+RNA secondary structure predictions
+
+-----
+
+**modes**
+
++ **mfe**: Computes the single energetically most stable secondary structure for the given RNA sequence. Co-optimal results are suppressed, i.e. should different predictions share the same best energy value, only an arbitrary one of them is reported. This resembles the function of the program "RNAfold" of the Vienna group. If you only use "mfe" mode, consider switching to RNAfold, because its implementation is much faster due to sophisticated low-level C optimisations.
++ **subopt**: Often, the biologically relevant structure is hidden among suboptimal predictions. In "subopt" mode, you can also inspect all suboptimal solutions up to a given threshold (see parameters --absoluteDeviation and --relativeDeviation). Duplicates might appear when using grammar "microstate", due to its semantic ambiguity with respect to Vienna-Dot-Bracket strings.
++ **shapes**: Output of "subopt" mode is crowded by many very similar answers, which makes it hard to focus on the "important" changes. The abstract shape concept groups similar answers together and reports only the best answer within such a group.
Due to abstraction, suboptimal analyses can be done more thoroughly by ignoring boring differences (see parameter --shapeLevel).
++ **probs**: Structure probabilities are strictly correlated to their energy values. Grouped together into shape classes, their probabilities add up. Often a shape class with many members of worse energy becomes more probable than the shape containing the mfe structure but not many more members.
++ **sample**: Probabilistic sampling based on the partition function. This mode combines stochastic sampling with a-posteriori shape abstraction. A sample from the structure space holds M structures together with their shapes, on which classification is performed. The probability of a shape can then be approximated by its frequency in the sample.
++ **cast**: This mode is the RNAcast approach. For a family of RNA sequences, this method independently enumerates the near-optimal abstract shape space, and predicts as the consensus an abstract shape common to all sequences. For each sequence, it delivers the thermodynamically best structure which has this common shape. Input is a multiple FASTA file, which should contain at least two sequences. Output is sorted by "score" of common shapes, i.e. summed free energy of all sequences. R is the rank (= list position) of the shape in individual sequence analysis.
++ **eval**: Evaluates the free energy of an RNA molecule in fixed secondary structure, similar to RNAeval from the Vienna group. Multiple answers stem from semantic ambiguity of the underlying grammar. It might happen that the given structure is not a valid structure for the sequence. Maybe your settings are too restrictive, e.g. not allowing lonely base-pairs (--allowLP). If you input a (multiple) FASTA file, RNAshapes assumes that exactly the first half of the contents of each entry is the RNA sequence and the second half is the corresponding structure. Whitespace is ignored.
++ **abstract**: Converts a Vienna-Dot-Bracket representation of a secondary structure into a shape string.
++ **outside**: Applies the "outside"-algorithm to compute probabilities for all base pairs (i,j), based on the partition function. Output is a PostScript file, visualizing these probabilities as a "dot plot". The "dot plot" shows a matrix of squares with area proportional to the base pair probabilities in the upper right half. For each pair (i,j) with probability above --bppmThreshold there is a line of the form i j sqrt(p) ubox in the PostScript file, so that they can be easily extracted.
++ **mea**: Finds the secondary structure with the maximal sum of base-pair probabilities (MEA=maximal expected accuracy). The equivalent Vienna Package name is the 'centroid secondary structure', defined as 'The centroid structure is the structure with the minimum total base-pair distance to all structures in the thermodynamic ensemble.'
+
+-----
+
+**grammar**
+
+How to treat "dangling end" energies for bases adjacent to helices in free ends and multi-loops.
+
++ **nodangle**: (-d 0 in Vienna package) ignores dangling energies altogether.
++ **overdangle**: (-d 2 in Vienna package) always dangles bases onto helices, even if they are part of neighboring helices themselves. Seems to be wrong, but could perform surprisingly well.
++ **microstate**: (-d 1 in Vienna package) correct optimisation of all dangling possibilities; unfortunately, this results in a semantically ambiguous search space regarding Vienna-Dot-Bracket notations.
++ **macrostate**: (no correspondence in Vienna package) same as microstate, while staying unambiguous. Unfortunately, mfe computation violates Bellman's principle of optimality. Default is "macrostate".
+
+-----
+
+**windowSize**
+
+Activates window mode and computes substrings of size i for the input. After computation for the first i bases is done, the window is pushed j bases to the right and the next computation is started.
j is set by --windowIncrement. i must be a positive integer, smaller than the input length.
+
+**windowIncrement**
+
+If --windowSize is given, this parameter sets the offset for the next window to j bases. j must be a positive integer, smaller than --windowSize.
+
+-----
+
+For more information, visit http://bibiserv2.cebitec.uni-bielefeld.de/rnashapes?id=rnashapes_rnashapes_manual_manual
+
+
+ doi:10.1093/bioinformatics/btu649
+
+
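
The window mode described under **windowSize** and **windowIncrement** can be sketched in a few lines of Python. This is only an illustration of the sliding scheme from the help text, not RNAshapes code: the function name `window_slices` is hypothetical, and the sketch assumes that only full-size windows are produced, since the help text does not say how RNAshapes treats trailing bases that do not fill a window.

```python
def window_slices(sequence, window_size, window_increment):
    """Return the substrings a sliding window would visit.

    Hypothetical helper mirroring the help text: a window of
    `window_size` bases (i) is moved `window_increment` bases (j)
    to the right after each computation.
    """
    # Constraints stated in the help text for --windowSize and --windowIncrement.
    if not 0 < window_size < len(sequence):
        raise ValueError("window_size must be positive and smaller than the input length")
    if not 0 < window_increment < window_size:
        raise ValueError("window_increment must be positive and smaller than window_size")

    # Assumption: only complete windows are emitted; a trailing remainder is dropped.
    last_start = len(sequence) - window_size
    return [
        sequence[start:start + window_size]
        for start in range(0, last_start + 1, window_increment)
    ]


if __name__ == "__main__":
    for window in window_slices("ACGUACGUAC", 4, 2):
        print(window)
```

Each returned substring corresponds to one independent computation; because j is smaller than i, consecutive windows always overlap.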