comparison GEMBASSY-1.0.3/doc/html/gshuffleseq.html @ 2:8947fca5f715 draft default tip
Uploaded
author | ktnyt |
---|---|
date | Fri, 26 Jun 2015 05:21:44 -0400 |
parents | 84a17b3fad1f |
children |
1:84a17b3fad1f | 2:8947fca5f715 |
---|---|
1 <!--START OF HEADER - DON'T ALTER --> | |
2 | |
3 <HTML> | |
4 <HEAD> | |
5 <TITLE> EMBOSS: gshuffleseq </TITLE> | |
6 </HEAD> | |
7 <BODY BGCOLOR="#FFFFFF" text="#000000"> | |
8 | |
9 | |
10 | |
11 <table align=center border=0 cellspacing=0 cellpadding=0> | |
12 <tr><td valign=top> | |
13 <A HREF="/" ONMOUSEOVER="self.status='Go to the EMBOSS home page';return true"><img border=0 src="http://soap.g-language.org/gembassy/emboss_explorer/manual/emboss_icon.jpg" alt="" width=150 height=48></a> | |
14 </td> | |
15 <td align=left valign=middle> | |
16 <b><font size="+6"> | |
17 gshuffleseq | |
18 </font></b> | |
19 </td></tr> | |
20 </table> | |
21 <br> | |
22 <p> | |
23 | |
24 | |
25 <!--END OF HEADER--> | |
26 | |
27 | |
28 | |
29 | |
30 | |
31 | |
32 <H2> Function </H2> | |
33 Create randomized sequence with conserved k-mer composition | |
34 <!-- | |
35 DON'T WRITE ANYTHING HERE. | |
36 IT IS DONE FOR YOU. | |
37 --> | |
38 | |
39 | |
40 | |
41 | |
42 <H2>Description</H2> | |
43 <p> | |
44 gshuffleseq shuffles and randomizes the given sequence, conserving the<br /> | |
45 nucleotide/peptide k-mer content of the original sequence.<br /> | |
46 <br /> | |
47 For k=1, i.e. shuffling a sequence while preserving single-nucleotide composition,<br /> | |
48 the Fisher-Yates algorithm is employed.<br /> | |
49 For k>1, shuffling preserves the counts of all n-mers for n=1 to k. For example,<br /> | |
50 k=3 preserves the triplet, doublet, and single-nucleotide composition.<br /> | |
51 Shuffling that preserves k-mer counts is non-trivial; it is solved with a<br /> | |
52 graph-theoretical approach using random Eulerian walks in the graph of<br /> | |
53 (k-1)-mers. See Jiang et al., Kandel et al., and Propp and Wilson for details<br /> | |
54 of this algorithm.<br /> | |
55 <br /> | |
56 The G-language SOAP service is provided by the<br /> | |
57 Institute for Advanced Biosciences, Keio University.<br /> | |
58 The original web service is located at the following URL:<br /> | |
59 <br /> | |
60 http://www.g-language.org/wiki/soap<br /> | |
61 <br /> | |
62 The WSDL (RPC/Encoded) file is located at:<br /> | |
63 <br /> | |
64 http://soap.g-language.org/g-language.wsdl<br /> | |
65 <br /> | |
66 Documentation on G-language Genome Analysis Environment methods is<br /> | |
67 provided at the Document Center:<br /> | |
68 <br /> | |
69 http://ws.g-language.org/gdoc/<br /> | |
70 <br /> | |
71 | |
72 </p> | |
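The k=1 case described above can be illustrated with a minimal Python sketch of the Fisher-Yates shuffle (in Durstenfeld's in-place form, as cited in the References). This is not the gshuffleseq source code; the function name `fisher_yates_shuffle`, the optional `rng` parameter, and the toy sequence are assumptions made for illustration only.

```python
import random
from collections import Counter


def fisher_yates_shuffle(seq, rng=None):
    """Return a shuffled copy of seq; single-residue composition is unchanged.

    Illustrative sketch only (hypothetical helper, not the gshuffleseq code).
    """
    if rng is None:
        rng = random.Random()
    items = list(seq)
    # Walk from the last index down to 1, swapping each element with a
    # uniformly chosen element at or before its position.
    for i in range(len(items) - 1, 0, -1):
        j = rng.randint(0, i)  # randint is inclusive on both ends
        items[i], items[j] = items[j], items[i]
    return "".join(items)


if __name__ == "__main__":
    original = "ATGCGCGATTAGC"  # toy sequence, not taken from any database
    shuffled = fisher_yates_shuffle(original, random.Random(0))
    print(shuffled)
    # The residue counts are identical before and after shuffling.
    print(Counter(shuffled) == Counter(original))  # True
```

Each residue is only ever swapped with another position, so the single-residue counts cannot change; only the order does. Preserving doublets or longer k-mers needs the Eulerian-walk approach cited above, which this sketch does not attempt.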
73 | |
74 <H2>Usage</H2> | |
75 | |
76 Here is a sample session with gshuffleseq | |
77 | |
78 <table width="90%"><tr><td bgcolor="#CCFFFF"><pre> | |
79 | |
80 % gshuffleseq tsw:hbb_human | |
81 Create randomized sequence with conserved k-mer composition | |
82 output sequence [hbb_human.fasta]: | |
83 | |
84 </pre></td></tr></table> | |
85 | |
86 Go to the <a href="#input">input files</a> for this example<br> | |
87 Go to the <a href="#output">output files</a> for this example<br><br> | |
88 | |
89 <h2>Command line arguments</h2> | |
90 | |
91 <table border cellspacing=0 cellpadding=3 bgcolor="#ccccff"> | |
92 <tr bgcolor="#FFFFCC"> | |
93 <th align="left">Qualifier</th> | |
94 <th align="left">Type</th> | |
95 <th align="left">Description</th> | |
96 <th align="left">Allowed values</th> | |
97 <th align="left">Default</th> | |
98 </tr> | |
99 | |
100 <tr bgcolor="#FFFFCC"> | |
101 <th align="left" colspan=5>Standard (Mandatory) qualifiers</th> | |
102 </tr> | |
103 | |
104 <tr bgcolor="#FFFFCC"> | |
105 <td>[-sequence]<br>(Parameter 1)</td> | |
106 <td>seqall</td> | |
107 <td>Sequence(s) filename and optional format, or reference (input USA)</td> | |
108 <td>Readable sequence(s)</td> | |
109 <td><b>Required</b></td> | |
110 </tr> | |
111 | |
112 <tr bgcolor="#FFFFCC"> | |
113 <td>[-outseq]<br>(Parameter 2)</td> | |
114 <td>seqout</td> | |
115 <td>Sequence filename and optional format (output USA)</td> | |
116 <td>Writeable sequence</td> | |
117 <td><i><*></i>.<i>format</i></td> | |
118 </tr> | |
119 | |
120 <tr bgcolor="#FFFFCC"> | |
121 <th align="left" colspan=5>Additional (Optional) qualifiers</th> | |
122 </tr> | |
123 | |
124 <tr> | |
125 <td colspan=5>(none)</td> | |
126 </tr> | |
127 | |
128 <tr bgcolor="#FFFFCC"> | |
129 <th align="left" colspan=5>Advanced (Unprompted) qualifiers</th> | |
130 </tr> | |
131 | |
132 <tr bgcolor="#FFFFCC"> | |
133 <td>-k</td> | |
134 <td>integer</td> | |
135 <td>Sequence k-mer to preserve composition</td> | |
136 <td>Any integer value</td> | |
137 <td>1</td> | |
138 </tr> | |
139 | |
140 </table> | |
141 | |
142 | |
143 <h2 id="input">Input file format</h2> | |
144 | |
145 <p> | |
146 The database definitions for the following commands are available at<br /> | |
147 http://soap.g-language.org/kbws/embossrc<br /> | |
148 <br /> | |
149 gshuffleseq reads one or more nucleotide or protein sequences.<br /> | |
150 <br /> | |
151 | |
152 </p> | |
153 | |
154 <h2 id="output">Output file format</h2> | |
155 | |
156 <p> | |
157 The output from gshuffleseq is a sequence file.<br /> | |
158 <br /> | |
159 File: hbb_human.fasta<br /> | |
160 <br /> | |
161 <table width="90%"><tr><td bgcolor="#CCFFCC"> | |
162 >HBB_HUMAN P68871 Hemoglobin subunit beta (Beta-globin) (Hemoglobin beta chain) (LVV-hemorphin-7)<br /> | |
163 KGWLDLVAGAAHFVRRLKMLLEVDWAAHEERVGTSNPNNALKNEAADVEVHSPTHVNPTQ<br /> | |
164 LVLVQVGFGTLHLQGVECPKPKPGGVALKPVAHLLAMKECTLVALGSDFYVDHGSDGEDK<br /> | |
165 GFKAYVLATSFFAYTNFLHGKVKHVLF<br /> | |
166 </td></tr></table> | |
167 | |
168 </p> | |
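One way to sanity-check a shuffled output such as the one above is to compare k-mer counts between the input and output sequences. The Python sketch below is not part of gshuffleseq; the helper names `kmer_counts` and `same_kmer_composition` and the toy sequences are assumptions chosen for illustration.

```python
from collections import Counter


def kmer_counts(seq, k):
    """Count every overlapping k-mer in seq (hypothetical helper)."""
    return Counter(seq[i:i + k] for i in range(len(seq) - k + 1))


def same_kmer_composition(a, b, k):
    """True if a and b share identical n-mer counts for every n from 1 to k."""
    return all(kmer_counts(a, n) == kmer_counts(b, n) for n in range(1, k + 1))


if __name__ == "__main__":
    original = "AGACA"   # toy sequences, not real data
    doublet_ok = "ACAGA"  # same singles and doublets, different triplets
    single_ok = "AAACG"   # same singles only

    print(same_kmer_composition(original, doublet_ok, 2))  # True
    print(same_kmer_composition(original, doublet_ok, 3))  # False
    print(same_kmer_composition(original, single_ok, 1))   # True
    print(same_kmer_composition(original, single_ok, 2))   # False
```

A sequence shuffled with a given -k value would be expected to pass this check for that k, while typically failing it for larger values.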
169 | |
170 <h2>Data files</h2> | |
171 | |
172 <p> | |
173 None. | |
174 </p> | |
175 | |
176 <h2>Notes</h2> | |
177 | |
178 <p> | |
179 None. | |
180 </p> | |
181 | |
182 <h2>References</h2> | |
183 | |
184 <pre> | |
185 Fisher R.A. and Yates F. (1938) "Example 12", Statistical Tables, London | |
186 | |
187 Durstenfeld R. (1964) "Algorithm 235: Random permutation", CACM 7(7):420 | |
188 | |
189 Jiang M., Anderson J., Gillespie J., and Mayne M. (2008) "uShuffle: | |
190 a useful tool for shuffling biological sequences while preserving the | |
191 k-let counts", BMC Bioinformatics 9:192 | |
192 | |
193 Kandel D., Matias Y., Unger R., and Winkler P. (1996) "Shuffling biological | |
194 sequences", Discrete Applied Mathematics 71(1-3):171-185 | |
195 | |
196 Propp J.G. and Wilson D.B. (1998) "How to get a perfectly random sample | |
197 from a generic Markov chain and generate a random spanning tree of a | |
198 directed graph", Journal of Algorithms 27(2):170-217 | |
199 | |
200 Arakawa, K., Mori, K., Ikeda, K., Matsuzaki, T., Kobayashi, Y., and | |
201 Tomita, M. (2003) G-language Genome Analysis Environment: A Workbench | |
202 for Nucleotide Sequence Data Mining, Bioinformatics, 19, 305-306. | |
203 | |
204 Arakawa, K. and Tomita, M. (2006) G-language System as a Platform for | |
205 large-scale analysis of high-throughput omics data, J. Pest Sci., | |
206 31, 7. | |
207 | |
208 Arakawa, K., Kido, N., Oshita, K., Tomita, M. (2010) G-language Genome | |
209 Analysis Environment with REST and SOAP Web Service Interfaces, | |
210 Nucleic Acids Res., 38, W700-W705. | |
211 | |
212 </pre> | |
213 | |
214 <h2>Warnings</h2> | |
215 | |
216 <p> | |
217 None. | |
218 </p> | |
219 | |
220 <h2>Diagnostic Error Messages</h2> | |
221 | |
222 <p> | |
223 None. | |
224 </p> | |
225 | |
226 <h2>Exit status</h2> | |
227 | |
228 <p> | |
229 It always exits with a status of 0. | |
230 </p> | |
231 | |
232 <h2>Known bugs</h2> | |
233 | |
234 <p> | |
235 None. | |
236 </p> | |
237 | |
238 <h2>See also</h2> | |
239 | |
240 <table border cellpadding=4 bgcolor="#FFFFF0"><tr><th>Program name</th> | |
241 <th>Description</th></tr> | |
242 | |
243 <tr> | |
244 <td><a href="shuffleseq.html">shuffleseq</a></td> | |
245 <td>Shuffles a set of sequences maintaining composition</td> | |
246 </tr> | |
247 | |
248 </table> | |
249 | |
250 <h2>Author(s)</h2> | |
251 | |
252 <pre> | |
253 Hidetoshi Itaya (celery@g-language.org) | |
254 Institute for Advanced Biosciences, Keio University | |
255 252-0882 Japan | |
256 | |
257 Kazuharu Arakawa (gaou@sfc.keio.ac.jp) | |
258 Institute for Advanced Biosciences, Keio University | |
259 252-0882 Japan</pre> | |
260 | |
261 <h2>History</h2> | |
262 | |
263 2012 - Written by Hidetoshi Itaya | |
264 | |
265 <h2>Target users</h2> | |
266 | |
267 This program is intended to be used by everyone and everything, from | |
268 naive users to embedded scripts. | |
269 | |
270 <h2>Comments</h2> | |
271 | |
272 None. | |
273 | |
274 </BODY> | |
275 </HTML> |