comparison GEMBASSY-1.0.3/doc/html/genret.html @ 2:8947fca5f715 draft default tip
Uploaded
| author | ktnyt |
|---|---|
| date | Fri, 26 Jun 2015 05:21:44 -0400 |
| parents | 84a17b3fad1f |
| children |
comparison
| 1:84a17b3fad1f | 2:8947fca5f715 |
|---|---|
| 1 <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" | |
| 2 "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd"> | |
| 3 <html xmlns='http://www.w3.org/1999/xhtml' xml:lang='en' lang='en'> | |
| 4 <head> | |
| 5 <title>EMBOSS: genret manual</title> | |
| 6 <link rel='stylesheet' type='text/css' href='/gembassy/emboss_explorer/style/emboss.css' /> | |
| 7 | |
| 8 </head> | |
| 9 <body> | |
| 10 <div id='manual'> | |
| 11 <!-- tfm output starts here --> | |
| 12 | |
| 13 | |
| 14 | |
| 15 | |
| 16 <table align=center border=0 cellspacing=0 cellpadding=0> | |
| 17 <tr><td valign=top> | |
| 18 <A HREF="/" ONMOUSEOVER="self.status='Go to the EMBOSS home page';return true"><img border=0 src="/gembassy/emboss_explorer/manual/emboss_icon.jpg" alt="" width=150 height=48></a> | |
| 19 </td> | |
| 20 <td align=left valign=middle> | |
| 21 <b><font size="+6"> | |
| 22 genret | |
| 23 </font></b> | |
| 24 </td></tr> | |
| 25 </table> | |
| 26 <br> | |
| 27 <p> | |
| 28 | |
| 29 | |
| 30 <!--END OF HEADER--> | |
| 31 | |
| 32 | |
| 33 | |
| 34 | |
| 35 | |
| 36 | |
| 37 <H2> Function </H2> | |
| 38 Retrieves various gene related information from genome flatfile | |
| 39 <!-- | |
| 40 DON'T WRITE ANYTHING HERE. | |
| 41 IT IS DONE FOR YOU. | |
| 42 --> | |
| 43 | |
| 44 | |
| 45 | |
| 46 | |
| 47 <H2>Description</H2> | |
| 48 <p> | |
| 49 genret reads in one or more genome flatfiles and retrieves various data from<br /> | |
| 50 the input file. It is a wrapper program for the G-language REST service,<br /> | |
| 51 where a method is specified by giving a string to the "method" qualifier. By<br /> | |
| 52 default, genret will parse the input file to retrieve the accession ID<br /> | |
| 53 (or name) of the genome to query the G-language REST service. By setting the<br /> | |
| 54 "accid" qualifier to false (or 0), genret will instead parse the sequence<br /> | |
| 55 and features of the genome to create a GenBank formatted flatfile and upload<br /> | |
| 56 the file to the G-language web server. Using the file uploaded, genret will<br /> | |
| 57 execute the method provided.<br /> | |
| 58 <br /> | |
| 59 genret is able to perform a variety of tasks, including the retrieval of<br /> | |
| 60 sequence upstream, downstream, or around the start or stop codon,<br /> | |
| 61 translated gene sequences, and search of gene data by keyword.<br /> | |
| 62 <br /> | |
| 63 Details on the G-language REST service are available from the wiki page<br /> | |
| 64 <br /> | |
| 65 http://www.g-language.org/wiki/rest<br /> | |
| 66 <br /> | |
| 67 Documentation on G-language Genome Analysis Environment methods is<br /> | |
| 68 provided at the Document Center<br /> | |
| 69 <br /> | |
| 70 http://ws.g-language.org/gdoc/<br /> | |
| 71 <br /> | |
| 72 | |
| 73 </p> | |
| 74 | |
| 75 <H2>Usage</H2> | |
| 76 | |
| 77 Here is a sample session with genret<br><br> | |
| 78 | |
| 79 Retrieving sequences upstream, downstream, or around the start/stop codons. | |
| 80 The following example shows the retrieval of sequence around the start | |
| 81 codons of all genes.<br><br> | |
| 82 | |
| 83 Genes to access are specified by regular expression. '*' stands for every | |
| 84 gene.<br><br> | |
| 85 | |
| 86 Available methods are:<br> | |
| 87 after_startcodon<br> | |
| 88 after_stopcodon<br> | |
| 89 around_startcodon<br> | |
| 90 around_stopcodon<br> | |
| 91 before_startcodon<br> | |
| 92 before_stopcodon<br><br> | |
| 93 | |
| 94 <table width="90%"><tr><td bgcolor="#CCFFFF"><pre> | |
| 95 | |
| 96 % genret | |
| 97 Retrieves various gene related information from genome flatfile | |
| 98 Input nucleotide sequence(s): refseqn:NC_000913 | |
| 99 Gene name(s) to lookup [*]: | |
| 100 Feature to access: around_startcodon | |
| 101 Full text output file [nc_000913.around_startcodon]: | |
| 102 | |
| 103 </pre></td></tr></table> | |
| 104 | |
| 105 Go to the <a href="#input">input files</a> for this example<br> | |
| 106 Go to the <a href="#output">output files</a> for this example<br><br> | |
| 107 | |
| 108 Example 2<br><br> | |
| 109 | |
| 110 Using flat text as target genes. The names can be separated by a space, comma, or vertical bar.<br><br> | |
| 111 | |
| 112 <table width="90%"><tr><td bgcolor="#CCFFFF"><pre> | |
| 113 | |
| 114 % genret | |
| 115 Retrieves various gene related information from genome flatfile | |
| 116 Input nucleotide sequence(s): refseqn:NC_000913 | |
| 117 List of gene name(s) to report [*]: recA,recB | |
| 118 Name of gene feature to access: translation | |
| 119 Sequence output file [nc_000913.translation.genret]: stdout | |
| 120 >recA | |
| 121 MAIDENKQKALAAALGQIEKQFGKGSIMRLGEDRSMDVETISTGSLSLDIALGAGGLPMGR | |
| 122 IVEIYGPESSGKTTLTLQVIAAAQREGKTCAFIDAEHALDPIYARKLGVDIDNLLCSQPDT | |
| 123 GEQALEICDALARSGAVDVIVVDSVAALTPKAEIEGEIGDSHMGLAARMMSQAMRKLAGNL | |
| 124 KQSNTLLIFINQIRMKIGVMFGNPETTTGGNALKFYASVRLDIRRIGAVKEGENVVGSETR | |
| 125 VKVVKNKIAAPFKQAEFQILYGEGINFYGELVDLGVKEKLIEKAGAWYSYKGEKIGQGKAN | |
| 126 ATAWLKDNPETAKEIEKKVRELLLSNPNSTPDFSVDDSEGVAETNEDF | |
| 127 >recB | |
| 128 MSDVAETLDPLRLPLQGERLIEASAGTGKTFTIAALYLRLLLGLGGSAAFPRPLTVEELLV | |
| 129 VTFTEAATAELRGRIRSNIHELRIACLRETTDNPLYERLLEEIDDKAQAAQWLLLAERQMD | |
| 130 EAAVFTIHGFCQRMLNLNAFESGMLFEQQLIEDESLLRYQACADFWRRHCYPLPREIAQVV | |
| 131 FETWKGPQALLRDINRYLQGEAPVIKAPPPDDETLASRHAQIVARIDTVKQQWRDAVGELD | |
| 132 ALIESSGIDRRKFNRSNQAKWIDKISAWAEEETNSYQLPESLEKFSQRFLEDRTKAGGETP | |
| 133 RHPLFEAIDQLLAEPLSIRDLVITRALAEIRETVAREKRRRGELGFDDMLSRLDSALRSES | |
| 134 GEVLAAAIRTRFPVAMIDEFQDTDPQQYRIFRRIWHHQPETALLLIGDPKQAIYAFRGADI | |
| 135 FTYMKARSEVHAHYTLDTNWRSAPGMVNSVNKLFSQTDDAFMFREIPFIPVKSAGKNQALR | |
| 136 FVFKGETQPAMKMWLMEGESCGVGDYQSTMAQVCAAQIRDWLQAGQRGEALLMNGDDARPV | |
| 137 RASDISVLVRSRQEAAQVRDALTLLEIPSVYLSNRDSVFETLEAQEMLWLLQAVMTPEREN | |
| 138 TLRSALATSMMGLNALDIETLNNDEHAWDVVVEEFDGYRQIWRKRGVMPMLRALMSARNIA | |
| 139 ENLLATAGGERRLTDILHISELLQEAGTQLESEHALVRWLSQHILEPDSNASSQQMRLESD | |
| 140 KHLVQIVTIHKSKGLEYPLVWLPFITNFRVQEQAFYHDRHSFEAVLDLNAAPESVDLAEAE | |
| 141 RLAEDLRLLYVALTRSVWHCSLGVAPLVRRRGDKKGDTDVHQSALGRLLQKGEPQDAAGLR | |
| 142 TCIEALCDDDIAWQTAQTGDNQPWQVNDVSTAELNAKTLQRLPGDNWRVTSYSGLQQRGHG | |
| 143 IAQDLMPRLDVDAAGVASVVEEPTLTPHQFPRGASPGTFLHSLFEDLDFTQPVDPNWVREK | |
| 144 LELGGFESQWEPVLTEWITAVLQAPLNETGVSLSQLSARNKQVEMEFYLPISEPLIASQLD | |
| 145 TLIRQFDPLSAGCPPLEFMQVRGMLKGFIDLVFRHEGRYYLLDYKSNWLGEDSSAYTQQAM | |
| 146 AAAMQAHRYDLQYQLYTLALHRYLRHRIADYDYEHHFGGVIYLFLRGVDKEHPQQGIYTTR | |
| 147 PNAGLIALMDEMFAGMTLEEA | |
| 148 | |
| 149 </pre></td></tr></table> | |
| 150 | |
| 151 Go to the <a href="#input">input files</a> for this example<br> | |
| 152 Go to the <a href="#output">output files</a> for this example<br><br> | |
| 153 | |
| 154 Example 3<br><br> | |
| 155 | |
| 156 Using a file with a list of gene names. | |
| 157 The following example will retrieve the strand direction for each gene | |
| 158 listed in the "gene_list.txt" file. Strings prefixed with an "@" or "list::" | |
| 159 will be interpreted as file names.<br><br> | |
| 160 | |
| 161 <table width="90%"><tr><td bgcolor="#CCFFFF"><pre> | |
| 162 | |
| 163 % genret | |
| 164 Retrieves various gene features from genome flatfile | |
| 165 Input nucleotide sequence(s): refseqn:NC_000913 | |
| 166 List of gene name(s) to report [*]: @gene_list.txt | |
| 167 Name of gene feature to access: direction | |
| 168 Full text output file [nc_000913.direction]: stdout | |
| 169 gene,direction | |
| 170 thrA,direct | |
| 171 thrB,direct | |
| 172 thrC,direct | |
| 173 | |
| 174 </pre></td></tr></table> | |
| 175 | |
| 176 Example 4<br><br> | |
| 177 | |
| 178 Retrieving translations of coding sequences.<br> | |
| 179 The following example will retrieve the translated protein sequence of | |
| 180 the "recA" gene.<br><br> | |
| 181 | |
| 182 <table width="90%"><tr><td bgcolor="#CCFFFF"><pre> | |
| 183 | |
| 184 % genret | |
| 185 Retrieves various gene related information from genome flatfile | |
| 186 Input nucleotide sequence(s): refseqn:NC_000913 | |
| 187 Gene name(s) to lookup [*]: recA | |
| 188 Feature to access: translation | |
| 189 Full text output file [nc_000913.translation]: stdout | |
| 190 >recA | |
| 191 MAIDENKQKALAAALGQIEKQFGKGSIMRLGEDRSMDVETISTGSLSLDIALGAGGLPMGR | |
| 192 IVEIYGPESSGKTTLTLQVIAAAQREGKTCAFIDAEHALDPIYARKLGVDIDNLLCSQPDT | |
| 193 GEQALEICDALARSGAVDVIVVDSVAALTPKAEIEGEIGDSHMGLAARMMSQAMRKLAGNL | |
| 194 KQSNTLLIFINQIRMKIGVMFGNPETTTGGNALKFYASVRLDIRRIGAVKEGENVVGSETR | |
| 195 VKVVKNKIAAPFKQAEFQILYGEGINFYGELVDLGVKEKLIEKAGAWYSYKGEKIGQGKAN | |
| 196 ATAWLKDNPETAKEIEKKVRELLLSNPNSTPDFSVDDSEGVAETNEDF | |
| 197 | |
| 198 </pre></td></tr></table> | |
| 199 | |
| 200 Go to the <a href="#input">input files</a> for this example<br> | |
| 201 Go to the <a href="#output">output files</a> for this example<br><br> | |
| 202 | |
| 203 Example 5<br><br> | |
| 204 | |
| 205 Retrieving feature information of the genes.<br> | |
| 206 The following example will retrieve the start positions for each gene. | |
| 207 The values for the keys in GenBank format are available for retrieval | |
| 208 (e.g. start, end, direction, GO*, etc.).<br> | |
| 209 Positions are returned as 1-based values.<br><br> | |
| 210 | |
| 211 <table width="90%"><tr><td bgcolor="#CCFFFF"><pre> | |
| 212 | |
| 213 % genret | |
| 214 Retrieves various gene related information from genome flatfile | |
| 215 Input nucleotide sequence(s): refseqn:NC_000913 | |
| 216 Gene name(s) to lookup [*]: | |
| 217 Feature to access: start | |
| 218 Full text output file [nc_000913.start]: | |
| 219 | |
| 220 </pre></td></tr></table> | |
| 221 | |
| 222 Go to the <a href="#input">input files</a> for this example<br> | |
| 223 Go to the <a href="#output">output files</a> for this example<br><br> | |
| 224 | |
| 225 Example 6<br><br> | |
| 226 | |
| 227 Passing extra arguments to the methods.<br> | |
| 228 The following example shows the retrieval of 30 base pairs around the | |
| 229 start codon of the "recA" gene. By default, the "around_startcodon" method | |
| 230 returns 200 base pairs around the start codon. Using the "-argument" | |
| 231 qualifier allows the user to change this value.<br><br> | |
| 232 | |
| 233 <table width="90%"><tr><td bgcolor="#CCFFFF"><pre> | |
| 234 | |
| 235 % genret refseqn:NC_000913 recA around_startcodon -argument 30,30 stdout | |
| 236 Retrieves various gene features from genome flatfile | |
| 237 >recA | |
| 238 ccggtattacccggcatgacaggagtaaaaatggctatcgacgaaaacaaacagaaagcgt | |
| 239 tg | |
| 240 | |
| 241 </pre></td></tr></table> | |
| 242 | |
| 243 Go to the <a href="#input">input files</a> for this example<br> | |
| 244 Go to the <a href="#output">output files</a> for this example<br><br> | |
| 245 | |
| 246 Example 7<br><br> | |
| 247 | |
| 248 Re-annotating a flatfile. | |
| 249 genret supports re-annotation of a genome flatfile via the Restauro-G | |
| 250 service developed by our team. | |
| 251 The original software is available at [<a href="http://restauro-g.iab.keio.ac.jp/">http://restauro-g.iab.keio.ac.jp</a>].<br><br> | |
| 252 | |
| 253 <table width="90%"><tr><td bgcolor="#CCFFFF"><pre> | |
| 254 | |
| 255 % genret refseqn:NC_000913 '*' annotate nc_000913-annotate.gbk | |
| 256 Retrieves various gene features from genome flatfile | |
| 257 | |
| 258 </pre></td></tr></table> | |
| 259 | |
| 260 Go to the <a href="#input">input files</a> for this example<br> | |
| 261 Go to the <a href="#output">output files</a> for this example<br><br> | |
| 262 | |
| 263 <h2>Command line arguments</h2> | |
| 264 | |
| 265 <table border cellspacing=0 cellpadding=3 bgcolor="#ccccff"> | |
| 266 <tr bgcolor="#FFFFCC"> | |
| 267 <th align="left">Qualifier</th> | |
| 268 <th align="left">Type</th> | |
| 269 <th align="left">Description</th> | |
| 270 <th align="left">Allowed values</th> | |
| 271 <th align="left">Default</th> | |
| 272 </tr> | |
| 273 | |
| 274 <tr bgcolor="#FFFFCC"> | |
| 275 <th align="left" colspan=5>Standard (Mandatory) qualifiers</th> | |
| 276 </tr> | |
| 277 | |
| 278 <tr bgcolor="#FFFFCC"> | |
| 279 <td>[-sequence]<br>(Parameter 1)</td> | |
| 280 <td>seqall</td> | |
| 281 <td>Nucleotide sequence(s) filename and optional format, or reference (input USA)</td> | |
| 282 <td>Readable sequence(s)</td> | |
| 283 <td><b>Required</b></td> | |
| 284 </tr> | |
| 285 | |
| 286 <tr bgcolor="#FFFFCC"> | |
| 287 <td>[-gene]<br>(Parameter 2)</td> | |
| 288 <td>string</td> | |
| 289 <td>List of gene name(s) to report</td> | |
| 290 <td>Any string</td> | |
| 291 <td>*</td> | |
| 292 </tr> | |
| 293 | |
| 294 <tr bgcolor="#FFFFCC"> | |
| 295 <td>[-access]<br>(Parameter 3)</td> | |
| 296 <td>string</td> | |
| 297 <td>Name of gene feature to access</td> | |
| 298 <td>Any word</td> | |
| 299 <td> </td> | |
| 300 </tr> | |
| 301 | |
| 302 <tr bgcolor="#FFFFCC"> | |
| 303 <td>[-outfile]<br>(Parameter 4)</td> | |
| 304 <td>outfile</td> | |
| 305 <td>Sequence output file</td> | |
| 306 <td>Output file</td> | |
| 307 <td><i><*></i>.genret</td> | |
| 308 </tr> | |
| 309 | |
| 310 <tr bgcolor="#FFFFCC"> | |
| 311 <th align="left" colspan=5>Additional (Optional) qualifiers</th> | |
| 312 </tr> | |
| 313 | |
| 314 <tr> | |
| 315 <td colspan=5>(none)</td> | |
| 316 </tr> | |
| 317 | |
| 318 <tr bgcolor="#FFFFCC"> | |
| 319 <th align="left" colspan=5>Advanced (Unprompted) qualifiers</th> | |
| 320 </tr> | |
| 321 | |
| 322 <tr bgcolor="#FFFFCC"> | |
| 323 <td>-argument</td> | |
| 324 <td>string</td> | |
| 325 <td>Extra arguments to pass to method</td> | |
| 326 <td>Any string</td> | |
| 327 <td> </td> | |
| 328 </tr> | |
| 329 | |
| 330 <tr bgcolor="#FFFFCC"> | |
| 331 <td>-[no]accid</td> | |
| 332 <td>boolean</td> | |
| 333 <td>Include to use sequence accession ID as query</td> | |
| 334 <td>Boolean value Yes/No</td> | |
| 335 <td>Yes</td> | |
| 336 </tr> | |
| 337 | |
| 338 </table> | |
| 339 | |
| 340 | |
| 341 <h2 id="input">Input file format</h2> | |
| 342 | |
| 343 <p> | |
| 344 Database definitions for the examples are included in the embossrc_template<br /> | |
| 345 file of the Keio Bioinformatics Web Service (KBWS) package.<br /> | |
| 346 <br /> | |
| 347 Input files for usage example 3<br /> | |
| 348 <br /> | |
| 349 File: gene_list.txt<br /> | |
| 350 <br /> | |
| 351 thrA<br /> | |
| 352 thrB<br /> | |
| 353 thrC<br /> | |
| 354 <br /> | |
| 355 | |
| 356 </p> | |
| 357 | |
| 358 <h2 id="output">Output file format</h2> | |
| 359 | |
| 360 <p> | |
| 361 Output files for usage example 1<br /> | |
| 362 <br /> | |
| 363 File: nc_000913.around_startcodon<br /> | |
| 364 <br /> | |
| 365 <table width="90%"><tr><td bgcolor="#CCFFCC"> | |
| 366 >thrL<br /> | |
| 367 cgtgagtaaattaaaattttattgacttaggtcactaaatactttaaccaatataggcata<br /> | |
| 368 gcgcacagacagataaaaattacagagtacacaacatccatgaaacgcattagcaccacca<br /> | |
| 369 ttaccaccaccatcaccattaccacaggtaacggtgcgggctgacgcgtacaggaaacaca<br /> | |
| 370 gaaaaaagcccgcacctgac<br /> | |
| 371 >thrA<br /> | |
| 372 aggtaacggtgcgggctgacgcgtacaggaaacacagaaaaaagcccgcacctgacagtgc<br /> | |
| 373 gggctttttttttcgaccaaaggtaacgaggtaacaaccatgcgagtgttgaagttcggcg<br /> | |
| 374 gtacatcagtggcaaatgcagaacgttttctgcgtgttgccgatattctggaaagcaatgc<br /> | |
| 375 caggcaggggcaggtggcca<br /> | |
| 376 <br /> | |
| 377 <font color=red>[Part of this file has been deleted for brevity]</font><br /> | |
| 378 <br /> | |
| 379 >yjjY<br /> | |
| 380 tgcatgtttgctacctaaattgccaactaaatcgaaacaggaagtacaaaagtccctgacc<br /> | |
| 381 tgcctgatgcatgctgcaaattaacatgatcggcgtaacatgactaaagtacgtaattgcg<br /> | |
| 382 ttcttgatgcactttccatcaacgtcaacaacatcattagcttggtcgtgggtactttccc<br /> | |
| 383 tcaggacccgacagtgtcaa<br /> | |
| 384 >yjtD<br /> | |
| 385 tttttctgcgacttacgttaagaatttgtaaattcgcaccgcgtaataagttgacagtgat<br /> | |
| 386 cacccggttcgcggttatttgatcaagaagagtggcaatatgcgtataacgattattctgg<br /> | |
| 387 tcgcacccgccagagcagaaaatattggggcagcggcgcgggcaatgaaaacgatggggtt<br /> | |
| 388 tagcgatctgcggattgtcg<br /> | |
| 389 </td></tr></table> | |
| 390 <br /> | |
| 391 Output files for usage example 5<br /> | |
| 392 <br /> | |
| 393 File: nc_000913.start<br /> | |
| 394 <br /> | |
| 395 <table width="90%"><tr><td bgcolor="#CCFFCC"> | |
| 396 gene,start<br /> | |
| 397 thrL,190<br /> | |
| 398 thrA,337<br /> | |
| 399 thrB,2801<br /> | |
| 400 thrC,3734<br /> | |
| 401 yaaX,5234<br /> | |
| 402 yaaA,5683<br /> | |
| 403 yaaJ,6529<br /> | |
| 404 talB,8238<br /> | |
| 405 mog,9306<br /> | |
| 406 <br /> | |
| 407 <font color=red>[Part of this file has been deleted for brevity]</font><br /> | |
| 408 <br /> | |
| 409 yjjX,4631256<br /> | |
| 410 ytjC,4631820<br /> | |
| 411 rob,4632464<br /> | |
| 412 creA,4633544<br /> | |
| 413 creB,4634030<br /> | |
| 414 creC,4634719<br /> | |
| 415 creD,4636201<br /> | |
| 416 arcA,4637613<br /> | |
| 417 yjjY,4638425<br /> | |
| 418 yjtD,4638965<br /> | |
| 419 </td></tr></table><br /> | |
| 420 | |
| 421 Output files for usage example 7<br /> | |
| 422 <br /> | |
| 423 File: nc_000913-annotate.gbk<br /> | |
| 424 <br /> | |
| 425 <table width="90%"><tr><td bgcolor="#CCFFCC"> | |
| 426 LOCUS NC_000913 4639675 bp DNA circular BCT 25-OCT-2010<br /> | |
| 427 DEFINITION Escherichia coli str. K-12 substr. MG1655 chromosome, complete<br /> | |
| 428 genome.<br /> | |
| 429 ACCESSION NC_000913<br /> | |
| 430 VERSION NC_000913.2 GI:49175990<br /> | |
| 431 DBLINK Project: 57779<br /> | |
| 432 KEYWORDS .<br /> | |
| 433 SOURCE Escherichia coli str. K-12 substr. MG1655<br /> | |
| 434 ORGANISM Escherichia coli str. K-12 substr. MG1655<br /> | |
| 435 Bacteria; Proteobacteria; Gammaproteobacteria; Enterobacteriales;<br /> | |
| 436 <br /> | |
| 437 <font color="red">[Part of this file has been deleted for brevity]</font><br /> | |
| 438 <br /> | |
| 439 CDS 2801..3733<br /> | |
| 440 /EC_number="2.7.1.39"<br /> | |
| 441 /codon_start="1"<br /> | |
| 442 /db_xref="GI:16127997"<br /> | |
| 443 /db_xref="ASAP:ABE-0000010"<br /> | |
| 444 /db_xref="UniProtKB/Swiss-Prot:P00547"<br /> | |
| 445 /db_xref="ECOCYC:EG10999"<br /> | |
| 446 /db_xref="EcoGene:EG10999"<br /> | |
| 447 /db_xref="GeneID:947498"<br /> | |
| 448 /function="enzyme; Amino acid biosynthesis: Threonine"<br /> | |
| 449 /function="1.5.1.8 metabolism; building block<br /> | |
| 450 biosynthesis; amino acids; threonine"<br /> | |
| 451 /function="7.1 location of gene products; cytoplasm"<br /> | |
| 452 /gene="thrB"<br /> | |
| 453 /gene_synonym="ECK0003; JW0002"<br /> | |
| 454 /locus_tag="b0003"<br /> | |
| 455 /note="GO_component: GO:0005737 - cytoplasm; GO_process:<br /> | |
| 456 GO:0009088 - threonine biosynthetic process"<br /> | |
| 457 /product="homoserine kinase"<br /> | |
| 458 /protein_id="NP_414544.1"<br /> | |
| 459 /rs_com="FUNCTION: Catalyzes the ATP-dependent<br /> | |
| 460 phosphorylation of L- homoserine to L-homoserine<br /> | |
| 461 phosphate (By similarity)."<br /> | |
| 462 /rs_com="CATALYTIC ACTIVITY: ATP + L-homoserine = ADP +<br /> | |
| 463 O-phospho-L- homoserine."<br /> | |
| 464 /rs_com="PATHWAY: Amino-acid biosynthesis; L-threonine<br /> | |
| 465 biosynthesis; L- threonine from L-aspartate: step 4/5."<br /> | |
| 466 /rs_com="SUBCELLULAR LOCATION: Cytoplasm (Potential)."<br /> | |
| 467 /rs_com="SIMILARITY: Belongs to the GHMP kinase family.<br /> | |
| 468 Homoserine kinase subfamily."<br /> | |
| 469 /rs_des="RecName: Full=Homoserine kinase; Short=HK;<br /> | |
| 470 Short=HSK; EC=2.7.1.39;"<br /> | |
| 471 /rs_protein="Level 1: similar to KHSE_ECODH 1.7e-180"<br /> | |
| 472 /rs_xr="EMBL; CP000948; ACB01208.1; -; Genomic_DNA."<br /> | |
| 473 /rs_xr="RefSeq; YP_001728986.1; -."<br /> | |
| 474 /rs_xr="ProteinModelPortal; B1XBC8; -."<br /> | |
| 475 /rs_xr="SMR; B1XBC8; 2-308."<br /> | |
| 476 /rs_xr="EnsemblBacteria; EBESCT00000012034;<br /> | |
| 477 EBESCP00000011562; EBESCG00000011096."<br /> | |
| 478 /rs_xr="GeneID; 6058639; -."<br /> | |
| 479 /rs_xr="GenomeReviews; CP000948_GR; ECDH10B_0003."<br /> | |
| 480 /rs_xr="KEGG; ecd:ECDH10B_0003; -."<br /> | |
| 481 /rs_xr="HOGENOM; HBG646290; -."<br /> | |
| 482 /rs_xr="OMA; GSAHADN; -."<br /> | |
| 483 /rs_xr="ProtClustDB; PRK01212; -."<br /> | |
| 484 /rs_xr="BioCyc; ECOL316385:ECDH10B_0003-MONOMER; -."<br /> | |
| 485 /rs_xr="GO; GO:0005737; C:cytoplasm;<br /> | |
| 486 IEA:UniProtKB-SubCell."<br /> | |
| 487 /rs_xr="GO; GO:0005524; F:ATP binding; IEA:UniProtKB-KW."<br /> | |
| 488 /rs_xr="GO; GO:0004413; F:homoserine kinase activity;<br /> | |
| 489 IEA:EC."<br /> | |
| 490 /rs_xr="GO; GO:0009088; P:threonine biosynthetic process;<br /> | |
| 491 IEA:UniProtKB-KW."<br /> | |
| 492 /rs_xr="HAMAP; MF_00384; Homoser_kinase; 1; -."<br /> | |
| 493 /rs_xr="InterPro; IPR006204; GHMP_kinase."<br /> | |
| 494 /rs_xr="InterPro; IPR013750; GHMP_kinase_C."<br /> | |
| 495 /rs_xr="InterPro; IPR006203; GHMP_knse_ATP-bd_CS."<br /> | |
| 496 /rs_xr="InterPro; IPR000870; Homoserine_kin."<br /> | |
| 497 /rs_xr="InterPro; IPR020568; Ribosomal_S5_D2-typ_fold."<br /> | |
| 498 /rs_xr="InterPro; IPR014721;<br /> | |
| 499 Ribosomal_S5_D2-typ_fold_subgr."<br /> | |
| 500 /rs_xr="Gene3D; G3DSA:3.30.230.10;<br /> | |
| 501 Ribosomal_S5_D2-type_fold; 1."<br /> | |
| 502 /rs_xr="Pfam; PF08544; GHMP_kinases_C; 1."<br /> | |
| 503 /rs_xr="Pfam; PF00288; GHMP_kinases_N; 1."<br /> | |
| 504 /rs_xr="PIRSF; PIRSF000676; Homoser_kin; 1."<br /> | |
| 505 /rs_xr="PRINTS; PR00958; HOMSERKINASE."<br /> | |
| 506 /rs_xr="SUPFAM; SSF54211; Ribosomal_S5_D2-typ_fold; 1."<br /> | |
| 507 /rs_xr="TIGRFAMs; TIGR00191; thrB; 1."<br /> | |
| 508 /rs_xr="PROSITE; PS00627; GHMP_KINASES_ATP; 1."<br /> | |
| 509 /transl_table="11"<br /> | |
| 510 /translation="MVKVYAPASSANMSVGFDVLGAAVTPVDGALLGDVVTVEAAETF<br /> | |
| 511 SLNNLGRFADKLPSEPRENIVYQCWERFCQELGKQIPVAMTLEKNMPIGSGLGSSACS<br /> | |
| 512 VVAALMAMNEHCGKPLNDTRLLALMGELEGRISGSIHYDNVAPCFLGGMQLMIEENDI<br /> | |
| 513 ISQQVPGFDEWLWVLAYPGIKVSTAEARAILPAQYRRQDCIAHGRHLAGFIHACYSRQ<br /> | |
| 514 PELAAKLMKDVIAEPYRERLLPGFRQARQAVAEIGAVASGISGSGPTLFALCDKPETA<br /> | |
| 515 QRVADWLGKNYLQNQEGFVHICRLDTAGARVLEN"<br /> | |
| 516 <br /> | |
| 517 <font color="red">[Part of this file has been deleted for brevity]</font><br /> | |
| 518 <br /> | |
| 519 4639201 gcgcagtcgg gcgaaatatc attactacgc cacgccagtt gaactggtgc cgctgttaga<br /> | |
| 520 4639261 ggaaaaatct tcatggatga gccatgccgc gctggtgttt ggtcgcgaag attccgggtt<br /> | |
| 521 4639321 gactaacgaa gagttagcgt tggctgacgt tcttactggt gtgccgatgg tggcggatta<br /> | |
| 522 4639381 tccttcgctc aatctggggc aggcggtgat ggtctattgc tatcaattag caacattaat<br /> | |
| 523 4639441 acaacaaccg gcgaaaagtg atgcaacggc agaccaacat caactgcaag ctttacgcga<br /> | |
| 524 4639501 acgagccatg acattgctga cgactctggc agtggcagat gacataaaac tggtcgactg<br /> | |
| 525 4639561 gttacaacaa cgcctggggc ttttagagca acgagacacg gcaatgttgc accgtttgct<br /> | |
| 526 4639621 gcatgatatt gaaaaaaata tcaccaaata aaaaacgcct tagtaagtat ttttc<br /> | |
| 527 //<br /> | |
| 528 </td></tr></table> | |
| 529 | |
| 530 </p> | |
| 531 | |
| 532 <h2>Data files</h2> | |
| 533 | |
| 534 <p> | |
| 535 None. | |
| 536 </p> | |
| 537 | |
| 538 <h2>Notes</h2> | |
| 539 | |
| 540 <p> | |
| 541 None. | |
| 542 </p> | |
| 543 | |
| 544 <h2>References</h2> | |
| 545 | |
| 546 <pre> | |
| 547 Arakawa, K., Mori, K., Ikeda, K., Matsuzaki, T., Kobayashi, Y., and | |
| 548 Tomita, M. (2003) G-language Genome Analysis Environment: A Workbench | |
| 549 for Nucleotide Sequence Data Mining, Bioinformatics, 19, 305-306. | |
| 550 | |
| 551 Arakawa, K. and Tomita, M. (2006) G-language System as a Platform for | |
| 552 large-scale analysis of high-throughput omics data, J. Pest Sci., | |
| 553 31, 7. | |
| 554 | |
| 555 Arakawa, K., Kido, N., Oshita, K., Tomita, M. (2010) G-language Genome | |
| 556 Analysis Environment with REST and SOAP Web Service Interfaces, | |
| 557 Nucleic Acids Res., 38, W700-W705. | |
| 558 | |
| 559 </pre> | |
| 560 | |
| 561 <h2>Warnings</h2> | |
| 562 | |
| 563 <p> | |
| 564 None. | |
| 565 </p> | |
| 566 | |
| 567 <h2>Diagnostic Error Messages</h2> | |
| 568 | |
| 569 <p> | |
| 570 None. | |
| 571 </p> | |
| 572 | |
| 573 <h2>Exit status</h2> | |
| 574 | |
| 575 <p> | |
| 576 It always exits with a status of 0. | |
| 577 </p> | |
| 578 | |
| 579 <h2>Known bugs</h2> | |
| 580 | |
| 581 <p> | |
| 582 None. | |
| 583 </p> | |
| 584 | |
| 585 <h2>See also</h2> | |
| 586 | |
| 587 <table border cellpadding=4 bgcolor="#FFFFF0"><tr><th>Program name</th> | |
| 588 <th>Description</th></tr> | |
| 589 | |
| 590 <tr> | |
| 591 <td><a href="entret">entret</a></td> | |
| 592 <td>Retrieve sequence entries from flatfile databases and files</td> | |
| 593 </tr><tr> | |
| 594 <td><a href="seqret">seqret</a></td> | |
| 595 <td>Read and write (return) sequences</td> | |
| 596 </tr> | |
| 597 | |
| 598 </table> | |
| 599 | |
| 600 <h2>Author(s)</h2> | |
| 601 | |
| 602 <pre> | |
| 603 Hidetoshi Itaya (celery@g-language.org) | |
| 604 Institute for Advanced Biosciences, Keio University | |
| 605 252-0882 Japan | |
| 606 | |
| 607 Kazuharu Arakawa (gaou@sfc.keio.ac.jp) | |
| 608 Institute for Advanced Biosciences, Keio University | |
| 609 252-0882 Japan</pre> | |
| 610 | |
| 611 <h2>History</h2> | |
| 612 | |
| 613 2012 - Written by Hidetoshi Itaya | |
| 614 | |
| 615 <h2>Target users</h2> | |
| 616 | |
| 617 This program is intended to be used by everyone and everything, from | |
| 618 naive users to embedded scripts. | |
| 619 | |
| 620 <h2>Comments</h2> | |
| 621 | |
| 622 None. | |
| 623 | |
| 624 | |
| 625 </HTML> | |
| 626 | |
| 627 <!-- tfm output ends here --> | |
| 628 </div> | |
| 629 </body> | |
| 630 </html> |

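The qualifiers listed under "Command line arguments" and the comma-separated output shown for the "start" and "direction" features lend themselves to scripting. The following is a minimal Python sketch, not part of GEMBASSY: the helper names `build_genret_command` and `parse_feature_table` are hypothetical, and it assumes that genret accepts its parameters by qualifier name (`-sequence`, `-gene`, `-access`, `-outfile`), as EMBOSS programs generally do, and that `-noaccid` is the negated form of the `-[no]accid` qualifier.

```python
import csv
import io


def build_genret_command(sequence, gene="*", access="start",
                         outfile="stdout", argument=None, accid=True):
    """Assemble an argv list for a non-interactive genret call.

    Only qualifiers from the manual's "Command line arguments" table
    are used; nothing is executed here.
    """
    cmd = ["genret",
           "-sequence", sequence,
           "-gene", gene,
           "-access", access,
           "-outfile", outfile]
    if argument is not None:
        # Extra arguments passed through to the method, e.g. "30,30"
        cmd += ["-argument", argument]
    if not accid:
        # Assumed negated form of the boolean -[no]accid qualifier
        cmd.append("-noaccid")
    return cmd


def parse_feature_table(text):
    """Parse 'gene,<feature>' output into a {gene: value} dict.

    The first row is treated as the header and skipped; values are
    kept as strings because features such as 'direction' are not numeric.
    """
    reader = csv.reader(io.StringIO(text.strip()))
    next(reader, None)  # skip the header row, e.g. "gene,start"
    return {row[0]: row[1] for row in reader if len(row) >= 2}
```

The list returned by `build_genret_command` can be handed to `subprocess.run(cmd, capture_output=True, text=True)` on a machine where genret is installed, and the captured standard output passed to `parse_feature_table`. Building the command as a list rather than a shell string avoids quoting problems with the `*` gene pattern.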