comparison MUMmer/mummer_maxmatch.xml @ 0:61f30d177448 default tip

initial commit on Mummer toolsuite on toolshed
author eric
date Tue, 31 Mar 2015 14:19:49 +0200
-1:000000000000 0:61f30d177448
1 <tool id="mummer_maxmatch" name="MUMmer MaxMatch" version="0.9.alx" force_history_refresh="True">
2 <description>: Maximal exact sequence matching</description>
3 <command>
4 <!-- update this path to the installed location -->
5 $tool.cmd
6 #if $tool.cmd=="mummer":
7 $tool.cmd_extra
8 $tool.mum_ref_in
9 $tool.mum_q_in
10 #end if
11 #if $tool.cmd=="repeat-match":
12 -n $tool.rm_n
13 #if $tool.rm_E=="yes":
14 -E
15 #end if
16 $tool.cmd_extra
17 $tool.in_seq
18 #end if
19 #if $tool.cmd=="exact-tandems":
20 $tool.in_seq
21 $tool.et_minl
22 #end if
23 <!-- Unfortunately, the error state is somehow also set on successful jobs, so stderr is closed below to suppress it. -->
24 2&gt;&amp;-
25 > $out_tool
26
27 </command>
28 <inputs>
29 <conditional name="tool">
30 <param name="cmd" type="select" value="mummer" label="MUMmer maximal matching" help="Algorithms are run with default parameters (none). For specific args see help below" >
31 <option value="mummer">mummer</option>
32 <option value="repeat-match">repeat-match</option>
33 <option value="exact-tandems">exact-tandems</option>
34 </param>
35 <when value="mummer">
36 <param name="mum_ref_in" type="data" format="fasta" label="Reference FastA file" />
37 <param name="mum_q_in" type="data" format="fasta" label="Query (multi) FastA sequence" />
38 <param name="cmd_extra" type="text" size="40" value="" label="Extra cmd line options" help="See specific cmd line options below for each tool" />
39 </when>
40 <when value="repeat-match">
41 <param name="in_seq" type="data" format="fasta" label="FastA sequence file" />
42 <param name="rm_n" type="text" size="5" value="20" label="Minimum exact match length [-n]" />
43 <param name="rm_E" type="select" value="no" label="Use exhaustive (slow) search to find matches [-E]" >
44 <option value="no">No</option>
45 <option value="yes">Yes</option>
46 </param>
47 <param name="cmd_extra" type="text" size="40" value="" label="Extra cmd line options" help="-n and -E are configured above. More specific cmd line options in help below." />
48 </when>
49 <when value="exact-tandems">
50 <param name="in_seq" type="data" format="fasta" label="FastA sequence file" />
51 <param name="et_minl" type="text" size="5" value="20" label="Minimum length" />
52 </when>
53 </conditional>
54 </inputs>
55 <outputs>
56 <data name="out_tool" format="txt" label="Max exact match output" />
57 </outputs>
58 <requirements>
59 <!-- <requirement type="set_environment" version="3.23">MUMMER_PATH</requirement> -->
60 <requirement type="package" version="4.6.4">gnuplot</requirement>
61 <requirement type="package" version="3.23">mummer</requirement>
62 </requirements>
63 <tests>
64 <test>
65 </test>
66 </tests>
67 <help>
68 |
69
70
71 **Reference**
72 =============
73
74 - **MUMmer MaxExactMatch Galaxy tool wrapper:** Alex Bossers, CVI of Wageningen UR, The Netherlands.
75
76 - **MUMmer suite v3.22:** http://mummer.sourceforge.net
77
78 - **MUMmer tutorials:** http://mummer.sourceforge.net/examples/
79
80 Please do not use any of the command line options that modify prefixes or file names. They serve
81 no purpose within Galaxy and are likely to make the job fail.
82
83 If you found these tools/wrappers useful in your research, please acknowledge our work. If you improve
84 or modify the wrappers, please add yourself to the acknowledgement section rather than replacing the existing names :)
85
86
87
88 **MUMmer Maximal exact matching**
89 =================================
90
91 The heart of the MUMmer package is its suffix tree based maximal matching routines. These can be
92 used for repeat detection within a single sequence as is done by *repeat-match* and *exact-tandems*,
93 or can be used for the alignment of two or more sequences as is done by *mummer*.
94
95 Mummer
96 ------
97
98 mummer is a suffix tree algorithm designed to find maximal exact matches of some minimum length
99 between two input sequences. By default, mummer only reports maximal matches that are unique in
100 the entire set of reference sequences. The match lists produced by mummer can be used alone to
101 generate alignment dot plots, or can be passed on to the clustering algorithms for the identification
102 of longer non-exact regions of conservation. Because each match records its position in both
103 sequences and its length, the lists can also be passed to other programs for further clustering,
104 analysis, searching, etc.
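
To make the notion of a maximal exact match concrete, here is a minimal brute-force sketch in Python. It is not the suffix tree algorithm that mummer uses, and it does not apply mummer's default uniqueness filter; the function name and the 1-based position convention are assumptions made for illustration only.

```python
from typing import List, Tuple


def maximal_exact_matches(ref: str, query: str, min_len: int) -> List[Tuple[int, int, int]]:
    """Brute-force maximal exact matches as (ref_pos, query_pos, length).

    Positions are reported 1-based; that convention is an assumption of this sketch.
    """
    matches = []
    for i in range(len(ref)):
        for j in range(len(query)):
            # Skip starts that could be extended to the left: they lie inside
            # a longer match and are therefore not maximal.
            if i > 0 and j > 0 and ref[i - 1] == query[j - 1]:
                continue
            # Extend to the right for as long as the characters agree.
            length = 0
            while (
                len(ref) > i + length
                and len(query) > j + length
                and ref[i + length] == query[j + length]
            ):
                length += 1
            if length >= min_len:
                matches.append((i + 1, j + 1, length))
    return matches


# prints [(1, 2, 3), (4, 1, 4)]
print(maximal_exact_matches("ACGTACGT", "TACGA", 3))
```

The quadratic scan is only practical for short toy sequences; it is meant to show what "maximal" and "minimum length" mean, not how to compute matches at genome scale.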
105
106
107 Repeat-match
108 ------------
109
110 repeat-match is a suffix tree algorithm designed to find maximal exact repeats within a single input
111 sequence. It uses a similar algorithm to mummer, but altered slightly to find maximal exact matches
112 within a single sequence.
113
114 Output formatting varies depending on the command line parameters and the output can be quite large.
115 The standard output format that results from running repeat-match with default parameters is as follows:
116 ::
117
118 Long Exact Matches:
119 Start1 Start2 Length
120 4919485 4919506r 22
121
122 The three columns are the first position of the repeat, the second position of the repeat, and the
123 length of the repeat respectively. Reverse complement repeat positions are denoted by an 'r'
124 following the Start2 position, and are relative to the forward strand of the sequence.
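
For downstream processing, the three-column listing above can be read with a few lines of Python. This is a minimal sketch that assumes the default output layout shown in the sample; the `RepeatMatch` type and the `parse_repeat_match` function are hypothetical names, not part of MUMmer.

```python
from typing import List, NamedTuple


class RepeatMatch(NamedTuple):
    start1: int
    start2: int
    reverse: bool  # True when Start2 carried the trailing 'r'
    length: int


def parse_repeat_match(text: str) -> List[RepeatMatch]:
    """Parse the default three-column repeat-match listing."""
    matches = []
    for line in text.splitlines():
        fields = line.split()
        # Data rows have exactly three fields and begin with a number; this
        # skips the "Long Exact Matches:" line and the column header line.
        if len(fields) != 3 or not fields[0].isdigit():
            continue
        start2 = fields[1]
        matches.append(
            RepeatMatch(
                start1=int(fields[0]),
                start2=int(start2.rstrip("r")),
                reverse=start2.endswith("r"),
                length=int(fields[2]),
            )
        )
    return matches


example = """Long Exact Matches:
   Start1     Start2    Length
  4919485   4919506r        22
"""
print(parse_repeat_match(example))
```

Keeping the reverse-complement marker as a separate boolean leaves both start positions as plain integers, which makes sorting and filtering by position straightforward.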
125
126
127 Exact-tandems
128 -------------
129
130 exact-tandems is a wrapper script for the repeat-match program. It provides a list of exact tandem
131 repeats within a single input sequence. As with repeat-match, the sequence file should contain only
132 one sequence in FastA format; if it contains multiple sequences, only the first one is used. The
133 sequence may contain any set of upper and lowercase characters, so both DNA and protein sequences
134 are allowed, and matching is case insensitive. The minimum match length parameter should be a
135 positive integer; this value is passed to the repeat-match program via the -n option.
136
137 The output format of exact-tandems is as follows:
138 ::
139
140 Finding matches
141 Tandem repeats
142 Start Extent UnitLen Copies
143 416173 150 45 3.3
144
145 The four columns are the first position of the tandem, the extent of the repeat region, the length
146 of each tandem repeat unit, and the number of repeat units respectively.
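
The four-column listing can be parsed in the same spirit. The sketch below assumes the layout shown in the sample; `TandemRepeat`, `parse_exact_tandems` and `copies_from_extent` are hypothetical names. In the sample row, 150 / 45 is roughly 3.3, which suggests that Copies is Extent divided by UnitLen. That relationship is inferred from the sample only and is not stated in this help text.

```python
from typing import List, NamedTuple


class TandemRepeat(NamedTuple):
    start: int
    extent: int
    unit_len: int
    copies: float


def parse_exact_tandems(text: str) -> List[TandemRepeat]:
    """Parse the four-column exact-tandems listing."""
    tandems = []
    for line in text.splitlines():
        fields = line.split()
        # Data rows have exactly four fields and begin with a number; this
        # skips the progress lines and the column header line.
        if len(fields) != 4 or not fields[0].isdigit():
            continue
        tandems.append(
            TandemRepeat(int(fields[0]), int(fields[1]), int(fields[2]), float(fields[3]))
        )
    return tandems


def copies_from_extent(tandem: TandemRepeat) -> float:
    """Extent / UnitLen to one decimal place (assumed relationship, see above)."""
    return round(tandem.extent / tandem.unit_len, 1)


example = """Finding matches
Tandem repeats
   Start   Extent  UnitLen   Copies
  416173      150       45      3.3
"""
for tandem in parse_exact_tandems(example):
    print(tandem, copies_from_extent(tandem))
```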
147
148
149
150 **Manuals and CMD line options (specific for each tool!):**
151 ===========================================================
152
153 **Mummer**
154
155 http://mummer.sourceforge.net/manual/#mummer
156
157 **Repeat-match**
158
159 http://mummer.sourceforge.net/manual/#repeat
160
161 **Exact-tandems**
162
163 http://mummer.sourceforge.net/manual/#exact
164
165 |
166 |
167
168 </help>
169 </tool>
170