comparison GEMBASSY-1.0.3/doc/text/gcodoncompiler.txt @ 0:8300eb051bea draft

Initial upload
author ktnyt
date Fri, 26 Jun 2015 05:19:29 -0400
1 gcodoncompiler
2 Function
3
4 Calculate various kinds of amino acid and codon usage data
5
6 Description
7
8 gcodoncompiler calculates various kinds of amino acid and codon usage data.
9 The following values are calculable:
10 A0: Absolute amino acid frequency
11 A1: Relative amino acid frequency
12 C0: Absolute codon frequency
13 C1: Relative codon frequency in a complete sequence
14 C2: Relative codon frequency in each amino acid
15 C3: Relative synonymous codon usage
16 C4: Relative adaptiveness
17 C5: Maximum or minor codon
18
19 For amino acids absent from a gene, the C2-C3 values are not calculated.
20 When the corresponding R* option is used instead, these values are estimated
21 on the assumption that alternative synonymous codons are used with equal frequency.
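
To illustrate how C0, C2, C3 and C4 relate to one another, the Python sketch below derives them from the absolute counts for a single amino acid. It uses the commonly cited definitions of relative frequency, RSCU and relative adaptiveness; these are assumed, not confirmed by this document, to match what gcodoncompiler reports. The function name `codon_usage` and the `assume_equal` flag (a guess at the R* behaviour) are hypothetical. The counts are the phenylalanine columns from the NC_000913 sample output shown under "Output file format".

```python
def codon_usage(counts, assume_equal=False):
    """Derive C2-, C3- and C4-style values for one amino acid.

    counts maps each synonymous codon to its absolute frequency (C0).
    If the amino acid is absent (all counts zero), return None, or,
    with assume_equal=True, values for equal synonymous codon usage
    (an assumed reading of the R* options).
    """
    n = len(counts)
    total = sum(counts.values())
    if total == 0:
        if not assume_equal:
            return None
        return {codon: {"C2": 1.0 / n, "C3": 1.0, "C4": 1.0} for codon in counts}
    top = max(counts.values())
    return {
        codon: {
            "C2": c / total,      # relative frequency within the amino acid
            "C3": c * n / total,  # RSCU: observed / expected under equal usage
            "C4": c / top,        # relative adaptiveness: ratio to the most used codon
        }
        for codon, c in counts.items()
    }


phe = {"ttc": 21841, "ttt": 29334}  # Fttc and Fttt from the sample output
usage = codon_usage(phe)
for codon, values in usage.items():
    print(codon, {name: round(value, 3) for name, value in values.items()})
```

Under these definitions the C2 values of an amino acid sum to 1, the C3 values sum to the number of synonymous codons, and the most used codon has a C4 of 1.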
22
23 The G-language SOAP service is provided by the
24 Institute for Advanced Biosciences, Keio University.
25 The original web service is located at the following URL:
26
27 http://www.g-language.org/wiki/soap
28
29 The WSDL (RPC/Encoded) file is located at:
30
31 http://soap.g-language.org/g-language.wsdl
32
33 Documentation on G-language Genome Analysis Environment methods is
34 provided at the Document Center:
35
36 http://ws.g-language.org/gdoc/
37
38 Usage
39
40 Here is a sample session with gcodoncompiler
41
42 % gcodoncompiler refseqn:NC_000913
43 Calculate various kinds of amino acid and codon usage data
44 Codon usage output file [nc_000913.gcodoncompiler]:
45
48
49 Command line arguments
50
51 Standard (Mandatory) qualifiers:
52 [-sequence] seqall Nucleotide sequence(s) filename and optional
53 format, or reference (input USA)
54 [-outfile] outfile [*.gcodoncompiler] Codon usage output file
55
56 Additional (Optional) qualifiers: (none)
57 Advanced (Unprompted) qualifiers:
58 -translate boolean [N] Include to translate using standard
59 codon table
60 -startcodon boolean [N] Include to include start codon
61 -stopcodon boolean [N] Include to include stop codon
62 -delkey string [[^ACDEFGHIKLMNPQRSTVWYacgtU]] Regular
63 expression to delete key (i.e. amino acids
64 and nucleotides) (Any string)
65 -data menu [R0] Kinds of codon usage data. R*
66 hypothesizes amino acids which are not
67 present in the gene (Values: A0 (Absolute
68 amino acid frequency ('AA')); A1 (Relative
69 amino acid frequency ('RAAU')); C0 (Absolute
70 codon frequency ('AF')); C1 (Relative codon
71 frequency in a complete sequence); C2
72 (Relative codon frequency in each amino acid
73 ('RF')); C3 (Relative synonymous codon
74 usage ('RSCU')); C4 (Relative adaptiveness);
75 i.e., ratio to maximum of minor codon ('W')
76 C5 (Maximum (1) or minor (0) codon); R0
77 (Absolute codon frequency ('AF')); R1
78 (Relative codon frequency in a complete
79 sequence); R2 (Relative codon frequency in
80 each amino acid ('RF')); R3 (Relative
81 synonymous codon usage ('RSCU')); R4
82 (Relative adaptiveness); i.e., ratio to
83 maximum of minor codon ('W') R5 (Maximum (1)
84 or minor (0) codon))
85 -[no]accid boolean [Y] Include to use sequence accession ID as
86 query
87
88 Associated qualifiers:
89
90 "-sequence" associated qualifiers
91 -sbegin1 integer Start of each sequence to be used
92 -send1 integer End of each sequence to be used
93 -sreverse1 boolean Reverse (if DNA)
94 -sask1 boolean Ask for begin/end/reverse
95 -snucleotide1 boolean Sequence is nucleotide
96 -sprotein1 boolean Sequence is protein
97 -slower1 boolean Make lower case
98 -supper1 boolean Make upper case
99 -scircular1 boolean Sequence is circular
100 -sformat1 string Input sequence format
101 -iquery1 string Input query fields or ID list
102 -ioffset1 integer Input start position offset
103 -sdbname1 string Database name
104 -sid1 string Entryname
105 -ufo1 string UFO features
106 -fformat1 string Features format
107 -fopenfile1 string Features file name
108
109 "-outfile" associated qualifiers
110 -odirectory2 string Output directory
111
112 General qualifiers:
113 -auto boolean Turn off prompts
114 -stdout boolean Write first file to standard output
115 -filter boolean Read first file from standard input, write
116 first file to standard output
117 -options boolean Prompt for standard and additional values
118 -debug boolean Write debug output to program.dbg
119 -verbose boolean Report some/full command line options
120 -help boolean Report command line options and exit. More
121 information on associated and general
122 qualifiers can be found with -help -verbose
123 -warning boolean Report warnings
124 -error boolean Report errors
125 -fatal boolean Report fatal errors
126 -die boolean Report dying program messages
127 -version boolean Report version number and exit
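
As a sketch of how the qualifiers above combine in a script, the Python snippet below assembles a gcodoncompiler command line and runs it only if the program is found on PATH. The helper name `build_command` and the output file name `nc_000913.rscu` are hypothetical; the qualifiers themselves (-sequence, -data, -translate, -outfile, -auto) are those listed above.

```python
import shutil
import subprocess


def build_command(usa, data="R0", translate=False, outfile=None):
    """Assemble a gcodoncompiler argument list from documented qualifiers."""
    # -auto turns off prompts so the call can run unattended.
    cmd = ["gcodoncompiler", "-sequence", usa, "-data", data, "-auto"]
    if translate:
        cmd.append("-translate")
    if outfile is not None:
        cmd += ["-outfile", outfile]
    return cmd


# Request RSCU values (C3) for the sequence used in the sample session.
cmd = build_command("refseqn:NC_000913", data="C3", outfile="nc_000913.rscu")
print(" ".join(cmd))

# Run only when the program is actually installed and on PATH.
if shutil.which(cmd[0]):
    subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting problems, which matters for values such as the -delkey regular expression.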
128
129 Input file format
130
131 The database definitions for the following commands are available at
132 http://soap.g-language.org/kbws/embossrc
133
134 gcodoncompiler reads one or more nucleotide sequences.
135
136 Output file format
137
138 gcodoncompiler writes its output to a plain text file.
139
140 File: nc_000913.gcodoncompiler
141
142 Sequence: NC_000913
143 Agca,Agcc,Agcg,Agct,Ctgc,Ctgt,Dgac,Dgat,Egaa,Egag,Fttc,Fttt,Ggga,Gggc,Gggg,Gggt,Hcac,Hcat,Iata,Iatc,Iatt,Kaaa,Kaag,Lcta,Lctc,Lctg,Lctt,Ltta,Lttg,Matg,Naac,Naat,Pcca,Pccc,Pccg,Pcct,Qcaa,Qcag,Raga,Ragg,Rcga,Rcgc,Rcgg,Rcgt,Sagc,Sagt,Stca,Stcc,Stcg,Stct,Taca,Tacc,Tacg,Tact,Utga,Vgta,Vgtc,Vgtg,Vgtt,Wtgg,Ytac,Ytat,locus_tag
144 26551,33911,44924,20010,8486,6707,25234,42161,52362,23474,21841,29334,10226,39395,14472,32678,12830,16952,5356,33359,40221,44272,13398,5079,14709,70441,14410,18097,17936,32971,28329,22786,11063,7142,30994,9130,20216,38169,2495,1366,4529,29308,6991,27864,21132,11323,9159,11332,11759,10992,8979,31001,18989,11581,3,14337,20240,34499,24056,20071,16088,21069,
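
The layout above (a "Sequence:" line, a comma-separated header whose keys join a one-letter amino acid code with a codon, and one row of values) can be read with a few lines of Python. This is a minimal sketch that assumes a single sequence record per file, as in the example; the function name `parse_output` is hypothetical, and the embedded text is an abbreviated excerpt (alanine columns only) of the sample output.

```python
import csv
import io

# Abbreviated excerpt of the sample output: alanine columns only.
sample = """Sequence: NC_000913
Agca,Agcc,Agcg,Agct,locus_tag
26551,33911,44924,20010,
"""


def parse_output(text):
    """Return (sequence name, {(amino acid, codon): value}) for one record."""
    lines = [line for line in text.splitlines() if line.strip()]
    name = lines[0].split(":", 1)[1].strip()
    header, row = csv.reader(io.StringIO("\n".join(lines[1:3])))
    record = {}
    for key, value in zip(header, row):
        if key == "locus_tag":
            continue  # trailing column, empty in the sample
        amino_acid, codon = key[0], key[1:]
        record[(amino_acid, codon)] = float(value)
    return name, record


name, record = parse_output(sample)
print(name, record[("A", "gcg")])
```

Values are converted to float so that the same reader also handles the fractional data kinds.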
145
146
147 Data files
148
149 None.
150
151 Notes
152
153 None.
154
155 References
156
157 Arakawa, K., Mori, K., Ikeda, K., Matsuzaki, T., Kobayashi, Y., and
158 Tomita, M. (2003) G-language Genome Analysis Environment: A Workbench
159 for Nucleotide Sequence Data Mining, Bioinformatics, 19, 305-306.
160
161 Arakawa, K. and Tomita, M. (2006) G-language System as a Platform for
162 large-scale analysis of high-throughput omics data, J. Pestic. Sci.,
163 31, 7.
164
165 Arakawa, K., Kido, N., Oshita, K., and Tomita, M. (2010) G-language Genome
166 Analysis Environment with REST and SOAP Web Service Interfaces,
167 Nucleic Acids Res., 38, W700-W705.
168
169 Warnings
170
171 None.
172
173 Diagnostic Error Messages
174
175 None.
176
177 Exit status
178
179 It always exits with a status of 0.
180
181 Known bugs
182
183 None.
184
185 See also
186
187 gaminoinfo Prints out basic amino acid sequence statistics
188 gaaui Calculates various indices of amino acid usage
189
190 Author(s)
191
192 Hidetoshi Itaya (celery@g-language.org)
193 Institute for Advanced Biosciences, Keio University
194 252-0882 Japan
195
196 Kazuharu Arakawa (gaou@sfc.keio.ac.jp)
197 Institute for Advanced Biosciences, Keio University
198 252-0882 Japan
199
200 History
201
202 2012 - Written by Hidetoshi Itaya
203 2013 - Fixed by Hidetoshi Itaya
204
205 Target users
206
207 This program is intended to be used by everyone and everything, from
208 naive users to embedded scripts.
209
210 Comments
211
212 None.
213