gsvalue
Function

Calculate the strength of selected codon usage bias (S)

Description

gsvalue calculates the strength of selected codon usage bias (S), also
known as Sharp's S index. The index uses four codon pairs in which both
codons are recognized by the same tRNA anticodon: Phe (UUC and UUU),
Ile (AUC and AUU), Tyr (UAC and UAU), and Asn (AAC and AAU). Because the
former codon in each pair has stronger Watson-Crick pairing, selection
towards the former codon can be observed in highly expressed genes. The
S index is the weighted average of this bias, giving an overall value
for a genome that indicates its strength of selected codon usage bias.
See Sharp et al. (2005) for details.
Sharp originally defined the highly expressed gene group as 40 genes:
tufA, tsf, fusA, rplA-rplF, rplI-rplT, and rpsB-rpsT. Because the
identification of these genes is not convenient for computational
automation, this method by default uses ribosomal proteins as the highly
expressed gene group, as used by Vieira-Silva and Rocha (2010).
However, Sharp's gene group can optionally be used with the -sharp option.
With this option, all of the 40 genes must be named accordingly in the given
genome file.
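
The weighted average described above can be sketched in Python. This is only an illustration, not the gsvalue source: it assumes that the bias for each codon pair is the log-odds of the favoured codon in highly expressed genes relative to all genes (the formulation attributed here to Sharp et al. (2005)), and that each pair is weighted by its codon count in the highly expressed set. The names sharp_gene_names and s_index, the DNA spelling of the codons (T for U), and the toy counts are all hypothetical.

```python
import math
import string

# Codon pairs from the description, written as DNA (T instead of U):
# (codon expected to be favoured by selection, alternative codon).
CODON_PAIRS = {
    "Phe": ("TTC", "TTT"),
    "Ile": ("ATC", "ATT"),
    "Tyr": ("TAC", "TAT"),
    "Asn": ("AAC", "AAT"),
}


def sharp_gene_names():
    """Expand 'tufA, tsf, fusA, rplA-rplF, rplI-rplT, rpsB-rpsT' into gene names."""
    letters = string.ascii_uppercase

    def span(prefix, first, last):
        return [prefix + c for c in letters[letters.index(first):letters.index(last) + 1]]

    return (["tufA", "tsf", "fusA"] + span("rpl", "A", "F")
            + span("rpl", "I", "T") + span("rps", "B", "T"))


def s_index(high_counts, all_counts):
    """Weighted average of per-pair log-odds bias (assumed formulation).

    high_counts and all_counts map codon -> count in the highly expressed
    genes and in all genes. Every count used here must be non-zero.
    """
    weighted_sum = 0.0
    total_weight = 0
    for favoured, other in CODON_PAIRS.values():
        odds_high = high_counts[favoured] / high_counts[other]
        odds_all = all_counts[favoured] / all_counts[other]
        weight = high_counts[favoured] + high_counts[other]  # assumed weighting
        weighted_sum += weight * math.log(odds_high / odds_all)
        total_weight += weight
    return weighted_sum / total_weight


# Toy counts (made up): a 2:1 preference in highly expressed genes, none overall.
high = {c: n for f, o in CODON_PAIRS.values() for c, n in ((f, 200), (o, 100))}
background = {c: 1000 for pair in CODON_PAIRS.values() for c in pair}
print(len(sharp_gene_names()))      # number of genes in Sharp's group
print(s_index(high, background))    # positive when the favoured codons are enriched
```

Expanding the gene ranges gives 3 + 6 + 12 + 19 = 40 names, which matches the group size stated above. A positive result means the favoured codons are enriched in the highly expressed set; identical usage in both sets gives 0.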

G-language SOAP service is provided by the
Institute for Advanced Biosciences, Keio University.
The original web service is located at the following URL:

http://www.g-language.org/wiki/soap

The WSDL (RPC/Encoded) file is located at:

http://soap.g-language.org/g-language.wsdl

Documentation on G-language Genome Analysis Environment methods is
provided at the Document Center:

http://ws.g-language.org/gdoc/

Usage

Here is a sample session with gsvalue:

% gsvalue refseqn:NC_000913
Calculate the strength of selected codon usage bias (S)
Codon usage output file [nc_000913.gsvalue]:

Command line arguments

Standard (Mandatory) qualifiers:
  [-sequence]          seqall     Nucleotide sequence(s) filename and optional
                                  format, or reference (input USA)
  [-outfile]           outfile    [*.gsvalue] Codon usage output file

Additional (Optional) qualifiers: (none)
Advanced (Unprompted) qualifiers:
   -sharp              boolean    [N] Include to use the 40 genes used by
                                  Sharp instead of ribosomal proteins
   -[no]accid          boolean    [Y] Include to use sequence accession ID as
                                  query

Associated qualifiers:

   "-sequence" associated qualifiers
   -sbegin1            integer    Start of each sequence to be used
   -send1              integer    End of each sequence to be used
   -sreverse1          boolean    Reverse (if DNA)
   -sask1              boolean    Ask for begin/end/reverse
   -snucleotide1       boolean    Sequence is nucleotide
   -sprotein1          boolean    Sequence is protein
   -slower1            boolean    Make lower case
   -supper1            boolean    Make upper case
   -scircular1         boolean    Sequence is circular
   -sformat1           string     Input sequence format
   -iquery1            string     Input query fields or ID list
   -ioffset1           integer    Input start position offset
   -sdbname1           string     Database name
   -sid1               string     Entryname
   -ufo1               string     UFO features
   -fformat1           string     Features format
   -fopenfile1         string     Features file name

   "-outfile" associated qualifiers
   -odirectory2        string     Output directory

General qualifiers:
   -auto               boolean    Turn off prompts
   -stdout             boolean    Write first file to standard output
   -filter             boolean    Read first file from standard input, write
                                  first file to standard output
   -options            boolean    Prompt for standard and additional values
   -debug              boolean    Write debug output to program.dbg
   -verbose            boolean    Report some/full command line options
   -help               boolean    Report command line options and exit. More
                                  information on associated and general
                                  qualifiers can be found with -help -verbose
   -warning            boolean    Report warnings
   -error              boolean    Report errors
   -fatal              boolean    Report fatal errors
   -die                boolean    Report dying program messages
   -version            boolean    Report version number and exit
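
For scripted use, the qualifiers listed above can be assembled into a non-interactive command line. The Python sketch below only builds the argument list; the helper names build_gsvalue_command and run_gsvalue are hypothetical, and actually running the command assumes that gsvalue is installed and on the PATH.

```python
import subprocess


def build_gsvalue_command(sequence, outfile, sharp=False):
    """Return an argument list for a non-interactive gsvalue run."""
    # -auto turns off prompts, so the output file name is taken from -outfile.
    command = ["gsvalue", "-sequence", sequence, "-outfile", outfile, "-auto"]
    if sharp:
        # Use Sharp's 40 genes instead of ribosomal proteins.
        command.append("-sharp")
    return command


def run_gsvalue(sequence, outfile, sharp=False):
    """Run gsvalue; requires gsvalue to be installed and on the PATH."""
    return subprocess.run(build_gsvalue_command(sequence, outfile, sharp), check=True)


print(build_gsvalue_command("refseqn:NC_000913", "nc_000913.gsvalue", sharp=True))
```

Passing the arguments as a list avoids shell quoting problems with sequence references such as refseqn:NC_000913.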

Input file format

The database definitions for the following commands are available at
http://soap.g-language.org/kbws/embossrc

gsvalue reads one or more nucleotide sequences.

Output file format

The output from gsvalue is written to a plain text file.

File: nc_000913.gsvalue

Sequence: NC_000913 S-value: 1.23467100598485
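
An output file in this layout can be read back with a short script. The Python sketch below assumes one "Sequence: ... S-value: ..." record per line, as in the sample above; the function name parse_gsvalue_output is hypothetical.

```python
import re

# Matches lines such as "Sequence: NC_000913 S-value: 1.23467100598485".
LINE_PATTERN = re.compile(r"Sequence:\s*(\S+)\s+S-value:\s*(\S+)")


def parse_gsvalue_output(text):
    """Return a dict mapping sequence name -> S value (float)."""
    results = {}
    for line in text.splitlines():
        match = LINE_PATTERN.search(line)
        if match:
            results[match.group(1)] = float(match.group(2))
    return results


sample = "Sequence: NC_000913 S-value: 1.23467100598485\n"
print(parse_gsvalue_output(sample))
```

Lines that do not match the pattern are skipped, so blank lines in the file are harmless.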

Data files

None.

Notes

None.

References

Sharp PM et al. (2005) "Variation in the strength of selected codon usage
bias among bacteria", Nucleic Acids Research, 33(4):1141-1153

Vieira-Silva S and Rocha EPC (2010) "The systemic imprint of growth and its
uses in ecological (meta)genomics", PLoS Genetics, 6(1):e1000808

Arakawa, K., Mori, K., Ikeda, K., Matsuzaki, T., Kobayashi, Y., and
Tomita, M. (2003) G-language Genome Analysis Environment: A Workbench
for Nucleotide Sequence Data Mining, Bioinformatics, 19, 305-306.

Arakawa, K. and Tomita, M. (2006) G-language System as a Platform for
large-scale analysis of high-throughput omics data, J. Pest Sci.,
31, 7.

Arakawa, K., Kido, N., Oshita, K., Tomita, M. (2010) G-language Genome
Analysis Environment with REST and SOAP Web Service Interfaces,
Nucleic Acids Res., 38, W700-W705.

Warnings

None.

Diagnostic Error Messages

None.

Exit status

It always exits with a status of 0.

Known bugs

None.

See also

gcbi         Calculates the codon bias index (CBI)
gdelta_enc   Calculates the codon usage bias related to translation optimization (delta ENC)
gicdi        Calculates the intrinsic codon deviation index (ICDI)

Author(s)

Hidetoshi Itaya (celery@g-language.org)
Institute for Advanced Biosciences, Keio University
252-0882 Japan

Kazuharu Arakawa (gaou@sfc.keio.ac.jp)
Institute for Advanced Biosciences, Keio University
252-0882 Japan

History

2012 - Written by Hidetoshi Itaya
2013 - Fixed by Hidetoshi Itaya

Target users

This program is intended to be used by everyone and everything, from
naive users to embedded scripts.

Comments

None.