comparison GEMBASSY-1.0.3/doc/text/gbasecounter.txt @ 0:8300eb051bea draft

Initial upload
author ktnyt
date Fri, 26 Jun 2015 05:19:29 -0400
1 gbasecounter
2 Function
3
4 Creates a position weight matrix of oligomers around start codon
5
6 Description
7
8 This function creates a position weight matrix (PWM) of
9 oligomers of specified length around the start codon of all
10 genes in the given genome.
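The counting step described above can be sketched in Python. This is
only an illustration, not the G-language implementation: the function
name, the 0-based forward-strand start positions, and the sign
convention (negative offsets are upstream) are all assumptions made
for this example.

```python
from collections import Counter, defaultdict


def count_oligomers(genome, starts, patlen=3, upstream=30, downstream=30):
    """Count oligomers at each offset relative to a set of start positions.

    genome -- DNA sequence as a string
    starts -- 0-based indices of the first base of each start codon
              (forward strand only; an assumption of this sketch)
    Returns {offset: Counter({oligomer: count})}, where offset 0 is the
    first base of the start codon and negative offsets are upstream.
    """
    counts = defaultdict(Counter)
    for start in starts:
        for offset in range(-upstream, downstream + 1):
            pos = start + offset
            # Skip windows that would run off either end of the sequence.
            if pos < 0 or pos + patlen > len(genome):
                continue
            counts[offset][genome[pos:pos + patlen].lower()] += 1
    return counts


if __name__ == "__main__":
    toy = "ccatgaaatgc"
    table = count_oligomers(toy, starts=[2, 7], upstream=2, downstream=1)
    for offset in sorted(table):
        print(offset, dict(table[offset]))
```

The defaults mirror the -patlen, -upstream and -downstream defaults
listed under "Command line arguments". Reverse-strand genes and
circular genomes are deliberately left out to keep the sketch short.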
11
12 G-language SOAP service is provided by the
13 Institute for Advanced Biosciences, Keio University.
14 The original web service is located at the following URL:
15
16 http://www.g-language.org/wiki/soap
17
18 WSDL(RPC/Encoded) file is located at:
19
20 http://soap.g-language.org/g-language.wsdl
21
22 Documentation on G-language Genome Analysis Environment methods is
23 provided at the Document Center:
24
25 http://ws.g-language.org/gdoc/
26
27 Usage
28
29 Here is a sample session with gbasecounter
30
31 % gbasecounter refseqn:NC_000913
32 Creates a position weight matrix of oligomers around start codon
33 Weight matrix output file [nc_000913.gbasecounter]:
34
38 Command line arguments
39
40 Standard (Mandatory) qualifiers:
41 [-sequence] seqall Nucleotide sequence(s) filename and optional
42 format, or reference (input USA)
43 [-outfile] outfile [*.gbasecounter] Weight matrix output file
44
45 Additional (Optional) qualifiers: (none)
46 Advanced (Unprompted) qualifiers:
47 -position selection [start] Either 'start' (around start codon)
48 or 'end' (around stop codon) to create the
49 PWM
50 -patlen integer [3] Length of oligomer to count (Any integer
51 value)
52 -upstream integer [30] Length upstream of specified position
53 to create PWM (Any integer value)
54 -downstream integer [30] Length downstream of specified position
55 to create PWM (Any integer value)
56 -[no]accid boolean [Y] Include to use sequence accession ID as
57 query
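For scripted use, the qualifiers listed above can be assembled into an
argument list. The following is a minimal sketch: the helper name
build_gbasecounter_command is hypothetical, and only qualifiers
documented in this file are used.

```python
def build_gbasecounter_command(sequence, outfile, position="start",
                               patlen=3, upstream=30, downstream=30,
                               accid=True):
    """Return an argv list for gbasecounter (does not run the program)."""
    if position not in ("start", "end"):
        raise ValueError("position must be 'start' or 'end'")
    return [
        "gbasecounter",
        "-sequence", sequence,
        "-outfile", outfile,
        "-position", position,
        "-patlen", str(patlen),
        "-upstream", str(upstream),
        "-downstream", str(downstream),
        # -[no]accid is a boolean qualifier with a negated form.
        "-accid" if accid else "-noaccid",
        # -auto turns off interactive prompts.
        "-auto",
    ]


if __name__ == "__main__":
    cmd = build_gbasecounter_command("refseqn:NC_000913",
                                     "nc_000913.gbasecounter",
                                     position="end")
    print(" ".join(cmd))
```

The list can be passed to subprocess.run() on a machine where
gbasecounter is installed. Because the program is documented to always
exit with status 0, a calling script would need to check that the
output file exists rather than rely on the return code.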
58
59 Associated qualifiers:
60
61 "-sequence" associated qualifiers
62 -sbegin1 integer Start of each sequence to be used
63 -send1 integer End of each sequence to be used
64 -sreverse1 boolean Reverse (if DNA)
65 -sask1 boolean Ask for begin/end/reverse
66 -snucleotide1 boolean Sequence is nucleotide
67 -sprotein1 boolean Sequence is protein
68 -slower1 boolean Make lower case
69 -supper1 boolean Make upper case
70 -scircular1 boolean Sequence is circular
71 -sformat1 string Input sequence format
72 -iquery1 string Input query fields or ID list
73 -ioffset1 integer Input start position offset
74 -sdbname1 string Database name
75 -sid1 string Entryname
76 -ufo1 string UFO features
77 -fformat1 string Features format
78 -fopenfile1 string Features file name
79
80 "-outfile" associated qualifiers
81 -odirectory2 string Output directory
82
83 General qualifiers:
84 -auto boolean Turn off prompts
85 -stdout boolean Write first file to standard output
86 -filter boolean Read first file from standard input, write
87 first file to standard output
88 -options boolean Prompt for standard and additional values
89 -debug boolean Write debug output to program.dbg
90 -verbose boolean Report some/full command line options
91 -help boolean Report command line options and exit. More
92 information on associated and general
93 qualifiers can be found with -help -verbose
94 -warning boolean Report warnings
95 -error boolean Report errors
96 -fatal boolean Report fatal errors
97 -die boolean Report dying program messages
98 -version boolean Report version number and exit
99
100 Input file format
101
102 The database definitions for the following commands are available at
103 http://soap.g-language.org/kbws/embossrc
104
105 gbasecounter reads one or more nucleotide sequences.
106
107 Output file format
108
109 gbasecounter writes its output as comma-separated rows in a plain text file.
110
111 File: nc_000913.gbasecounter
112
113 Sequence: NC_000913
114 Pattern,30,29,28,27,26,25,24,23,22,21,20,19,18,17,16,15,14,13,12,11,10,9,8,7,6,5,4,3,2,1,0,-1,-2,-3,-4,-5,-6,-7,-8,-9,-10,-11,-12,-13,-14,-15,-16,-17,-18,-19,-20,-21,-22,-23,-24,-25,-26,-27,-28,-29,-30
115 aaa,0,1,199,111,104,139,94,103,99,44,42,26,75,103,107,95,107,103,102,82,91,71,73,81,86,80,74,74,78,65,69,65,31,41,68,51,61,83,55,67,92,55,71,89,60,77,100,59,87,123,97,105,141,83,117,180,154,203,262,2,0
116 aac,2,0,0,63,104,56,67,64,28,34,22,12,17,37,43,59,61,71,54,42,62,59,63,52,56,61,48,55,56,52,38,30,34,54,36,42,43,33,49,49,36,43,58,37,53,62,46,47,79,38,52,72,58,52,89,74,83,91,68,2,1
117 aag,0,0,17,46,38,57,56,44,25,44,43,170,162,125,92,70,61,50,42,46,21,22,43,39,29,35,39,34,28,26,30,25,9,43,31,12,55,33,13,66,21,21,50,30,21,55,31,21,47,38,16,55,35,23,63,96,31,51,71,0,0
118 aat,1,565,4,56,124,45,83,74,63,42,24,24,20,27,59,71,54,74,66,71,67,52,58,77,61,52,57,49,56,71,61,34,33,24,40,38,30,43,46,25,48,56,35,58,51,33,47,71,46,70,77,60,74,74,73,83,69,61,110,0,1
119 aca,0,1,92,73,39,69,39,24,31,31,16,19,34,64,61,63,65,56,42,60,45,66,38,45,46,41,49,40,51,43,39,20,34,29,23,26,28,34,35,26,35,39,30,28,48,26,28,53,35,36,59,42,53,46,64,56,62,44,55,0,0
120 acc,2,2,0,81,37,19,28,19,15,8,12,7,7,14,22,27,30,24,31,23,30,27,34,27,30,22,25,42,34,29,25,41,23,32,44,19,32,51,21,19,50,23,24,52,30,31,56,25,31,55,30,25,35,30,32,53,20,21,48,0,2
121 acg,0,0,21,38,23,38,32,25,13,18,12,15,34,29,34,37,25,31,25,34,30,20,22,24,40,22,24,30,34,29,25,29,25,34,41,23,32,25,36,44,28,32,40,32,23,28,40,30,25,36,39,32,28,40,38,39,45,30,33,0,0
122 act,0,0,1,57,35,14,30,29,21,9,6,9,9,10,17,38,28,35,30,37,41,46,38,43,39,31,31,31,30,32,27,18,55,24,20,32,16,25,32,24,31,44,14,33,43,12,35,60,24,40,58,19,36,71,22,44,46,13,45,3,1
123
124 [Part of this file has been deleted for brevity]
125
126 tcg,0.000,0.000,0.347,0.255,0.301,0.764,0.347,0.232,0.162,0.093,0.093,0.278,0.347,0.370,0.370,0.440,0.556,0.394,0.486,0.440,0.417,0.347,0.370,0.463,0.417,0.695,0.394,0.671,0.533,0.579,0.602,0.347,0.695,1.598,0.556,0.648,1.366,0.394,0.463,1.505,0.579,0.810,1.320,0.278,0.810,1.065,0.533,0.579,0.972,0.255,0.787,1.158,0.440,0.787,0.602,0.255,0.625,0.463,0.347,0.000,0.000
127 tct,0.000,0.046,0.000,0.671,0.764,0.394,0.278,0.347,0.278,0.116,0.116,0.162,0.255,0.162,0.486,0.648,0.533,0.625,0.741,0.718,0.903,0.834,0.880,0.857,0.741,0.857,0.671,0.648,0.857,0.695,0.625,0.440,0.880,0.463,0.556,1.111,0.509,0.579,1.227,0.556,0.370,1.135,0.671,0.648,1.250,0.834,0.509,1.273,0.440,0.718,0.972,1.042,0.648,0.926,0.533,0.625,0.556,0.185,1.690,0.000,0.000
128 tga,0.000,0.000,2.315,0.463,1.227,1.297,1.088,0.949,0.625,0.417,1.065,0.903,1.737,1.667,1.042,1.158,1.366,1.320,1.227,1.158,0.926,1.459,1.181,0.810,1.366,0.972,0.972,1.111,0.764,0.787,1.227,0.000,1.598,1.250,0.000,1.482,1.181,0.000,1.459,1.389,0.000,1.783,1.297,0.000,1.505,1.482,0.023,1.343,1.690,0.000,1.690,1.204,0.000,1.389,0.949,0.000,2.408,0.996,0.000,0.023,24.311
129 tgc,0.023,0.000,0.000,0.394,0.996,0.579,0.787,0.556,0.208,0.185,0.208,0.116,0.278,0.324,0.394,0.834,0.486,0.394,0.718,0.556,0.509,0.857,0.509,0.625,0.810,0.741,0.695,0.834,0.625,0.787,1.158,0.347,1.158,1.621,0.394,1.667,1.204,0.347,1.551,1.320,0.417,1.088,1.065,0.232,1.320,1.042,0.139,1.204,0.996,0.208,0.996,0.602,0.139,0.648,0.764,0.069,0.857,0.394,0.023,0.000,7.803
130 tgg,0.000,0.023,0.069,0.208,0.370,0.509,0.486,0.417,0.394,0.671,1.343,1.713,1.621,1.482,0.810,0.834,0.718,0.301,0.463,0.509,0.509,0.741,0.579,0.509,0.625,0.486,0.509,0.625,0.625,0.533,0.857,0.996,0.718,1.968,1.042,0.880,1.760,0.671,0.949,1.459,0.556,0.787,0.903,0.718,0.695,1.273,0.533,0.440,0.648,0.880,0.417,0.718,0.648,0.278,0.625,0.463,0.440,0.486,0.116,0.023,11.021
131 tgt,0.023,0.880,0.023,0.533,1.135,0.301,0.440,0.602,0.417,0.208,0.232,0.185,0.185,0.278,0.370,0.440,0.533,0.556,0.648,0.764,0.509,0.926,0.579,0.718,0.880,0.695,0.718,0.741,0.741,0.579,0.625,0.278,1.158,0.857,0.278,0.972,0.718,0.324,0.926,0.695,0.463,1.111,0.834,0.162,1.482,0.787,0.278,1.065,0.695,0.278,1.042,0.695,0.208,0.903,0.718,0.139,0.857,0.232,0.093,0.023,7.340
132 tta,0.000,0.000,6.506,0.648,0.810,1.829,1.320,0.602,0.486,0.509,0.255,0.347,0.301,0.834,1.320,1.459,1.412,1.667,1.644,1.852,1.667,1.574,1.366,1.042,1.204,1.621,1.505,1.227,1.436,1.088,1.273,1.343,0.486,1.158,1.042,0.440,1.135,1.389,0.370,1.273,1.574,0.486,1.875,1.505,0.463,1.991,1.875,0.533,2.362,2.061,0.324,2.084,2.200,0.509,1.505,1.320,0.463,1.366,0.648,0.000,0.069
133 ttc,0.000,0.000,0.000,0.648,0.417,0.695,0.764,0.347,0.301,0.278,0.208,0.023,0.232,0.533,0.718,0.718,0.903,1.042,1.158,0.880,1.158,1.065,0.903,0.834,1.343,0.996,0.926,0.810,0.741,0.834,1.042,0.926,0.579,1.088,0.695,0.695,1.297,0.741,0.741,1.111,0.926,0.787,1.366,0.695,0.857,1.412,0.648,0.834,1.111,0.440,0.602,1.250,1.019,1.135,0.787,0.440,0.880,0.509,0.370,0.000,0.000
134 ttg,0.857,0.023,0.255,0.394,0.556,1.111,0.533,0.463,0.417,0.185,0.232,0.533,0.602,1.042,0.718,0.695,1.135,0.972,0.857,0.926,0.787,0.671,1.320,0.695,0.903,1.204,0.880,0.764,0.926,0.741,0.718,1.019,0.347,1.551,1.042,0.370,2.014,0.834,0.463,2.061,0.880,0.278,2.014,0.857,0.208,2.593,0.741,0.278,1.922,0.764,0.417,2.130,0.834,0.208,1.111,0.394,0.093,1.111,0.417,0.000,0.023
135 ttt,0.023,0.440,0.093,1.598,1.181,1.320,1.829,1.343,0.648,0.370,0.394,0.278,0.185,0.440,1.135,1.574,1.667,1.945,2.315,2.362,2.431,2.501,2.107,2.362,1.806,2.014,2.292,2.014,1.598,1.760,1.829,1.389,1.505,1.042,1.343,1.297,0.926,1.528,1.574,1.227,1.482,1.737,1.389,1.667,1.922,1.389,1.945,1.922,1.343,1.806,1.760,1.389,2.014,1.760,1.065,0.949,1.111,0.625,1.227,0.023,0.023
136
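A file in this layout can be read with a few lines of Python. The
sketch below is based only on the excerpt above: it assumes that each
table starts with a "Pattern" header row listing the positions, that
every following row is an oligomer and its per-position values, and
that a file may hold more than one such table. The function name
parse_gbasecounter is hypothetical.

```python
import csv
import io


def parse_gbasecounter(text):
    """Parse gbasecounter output into a list of tables.

    Each table is a dict with:
      "positions" -- list of int positions from the "Pattern" header row
      "rows"      -- {oligomer: {position: value}}
    Lines without commas (blank lines, "Sequence: ..." lines) are skipped.
    """
    tables = []
    current = None
    for row in csv.reader(io.StringIO(text)):
        if len(row) < 2:
            continue
        if row[0] == "Pattern":
            # A header row opens a new table.
            current = {"positions": [int(p) for p in row[1:]], "rows": {}}
            tables.append(current)
        elif current is not None:
            values = [float(v) for v in row[1:]]
            # zip() stops at the shorter list, so ragged rows do not fail.
            current["rows"][row[0]] = dict(zip(current["positions"], values))
    return tables


if __name__ == "__main__":
    sample = "Sequence: X\nPattern,1,0,-1\naaa,0,1,199\naac,2,0,0\n"
    for table in parse_gbasecounter(sample):
        print(table["positions"], table["rows"])
```

Values are read as floats so that integer rows and decimal rows, both
of which appear in the excerpt, go through the same code path.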
137
138 Data files
139
140 None.
141
142 Notes
143
144 None.
145
146 References
147
148 Arakawa, K., Mori, K., Ikeda, K., Matsuzaki, T., Kobayashi, Y., and
149 Tomita, M. (2003) G-language Genome Analysis Environment: A Workbench
150 for Nucleotide Sequence Data Mining, Bioinformatics, 19, 305-306.
151
152 Arakawa, K. and Tomita, M. (2006) G-language System as a Platform for
153 large-scale analysis of high-throughput omics data, J. Pestic. Sci.,
154 31, 7.
155
156 Arakawa, K., Kido, N., Oshita, K., and Tomita, M. (2010) G-language Genome
157 Analysis Environment with REST and SOAP Web Service Interfaces,
158 Nucleic Acids Res., 38, W700-W705.
159
160 Warnings
161
162 None.
163
164 Diagnostic Error Messages
165
166 None.
167
168 Exit status
169
170 It always exits with a status of 0.
171
172 Known bugs
173
174 None.
175
176 See also
177
178 gbasezvalue Extracts conserved oligomers per position using Z-score
179 gviewcds Displays a graph of nucleotide contents around start and stop
180 codons
181
182 Author(s)
183
184 Hidetoshi Itaya (celery@g-language.org)
185 Institute for Advanced Biosciences, Keio University
186 252-0882 Japan
187
188 Kazuharu Arakawa (gaou@sfc.keio.ac.jp)
189 Institute for Advanced Biosciences, Keio University
190 252-0882 Japan
191
192 History
193
194 2012 - Written by Hidetoshi Itaya
195 2013 - Fixed by Hidetoshi Itaya
196
197 Target users
198
199 This program is intended to be used by everyone and everything, from
200 naive users to embedded scripts.
201
202 Comments
203
204 None.
205