comparison GEMBASSY-1.0.3/doc/text/gbaserelativeentropy.txt @ 0:8300eb051bea draft

Initial upload
author ktnyt
date Fri, 26 Jun 2015 05:19:29 -0400
1 gbaserelativeentropy
2 Function
3
4 Calculates and graphs the sequence conservation using Kullback-Leibler divergence (relative entropy)
5
6 Description
7
8 This function calculates and graphs the sequence conservation in regions
9 around the start/stop codons using Kullback-Leibler divergence (relative
10 entropy). In realistic conditions, background nucleotide composition
11 (e.g. G+C content) varies among species. Kullback-Leibler divergence
12 measures conservation relative to that background, reducing background noise.
13
14 The relative entropy H at position i, with observed distribution P(i) and
15 background distribution pi, is calculated as follows (sum over symbols j):
16 H(P(i)||pi) = sum(P(i,j) * log(2, P(i,j)/pi(j)))
17
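The formula above can be sketched in Python as follows. This is a minimal illustration of the Kullback-Leibler calculation only, not the G-language implementation: the function name `relative_entropy`, the dict-based distributions, and the convention that zero-frequency symbols contribute nothing are assumptions made for this example.

```python
import math

def relative_entropy(p, background):
    """Return sum_j P(i,j) * log2(P(i,j) / pi(j)) for one position i.

    p and background map each symbol j to a probability.
    Symbols with P(i,j) == 0 are skipped (0 * log 0 is taken as 0).
    """
    total = 0.0
    for symbol, p_ij in p.items():
        if p_ij > 0:
            total += p_ij * math.log2(p_ij / background[symbol])
    return total

# Hypothetical columns compared against a uniform background.
uniform = {"A": 0.25, "C": 0.25, "G": 0.25, "T": 0.25}
conserved = {"A": 1.0, "C": 0.0, "G": 0.0, "T": 0.0}
print(relative_entropy(conserved, uniform))  # 2.0 (fully conserved position)
print(relative_entropy(uniform, uniform))    # 0.0 (identical to background)
```

Under these assumptions, a position that matches the background scores 0, and the score grows as the position departs from the background composition.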
18 The G-language SOAP service is provided by the
19 Institute for Advanced Biosciences, Keio University.
20 The original web service is located at the following URL:
21
22 http://www.g-language.org/wiki/soap
23
24 The WSDL (RPC/Encoded) file is located at:
25
26 http://soap.g-language.org/g-language.wsdl
27
28 Documentation on G-language Genome Analysis Environment methods is
29 provided at the Document Center:
30
31 http://ws.g-language.org/gdoc/
32
33 Usage
34
35 Here is a sample session with gbaserelativeentropy
36
37 % gbaserelativeentropy refseqn:NC_000913
38 Calculates and graphs the sequence conservation using Kullback-Leibler
39 divergence (relative entropy)
40 Program compseq output file (optional) [nc_000913.gbaserelativeentropy]:
41
44
45 Example 2
46
47 % gbaserelativeentropy refseqn:NC_000913 -plot -graph png
48 Calculates and graphs the sequence conservation using Kullback-Leibler
49 divergence (relative entropy)
50 Created gbaserelativeentropy.1.png
51
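When the sample sessions above are driven from a script, the command line can be assembled programmatically. The sketch below is an assumption-laden illustration: `build_command` is a hypothetical helper, it only uses qualifiers listed in this document (-position, -upstream, -downstream, -plot, -graph), and actually running the result requires EMBOSS and GEMBASSY to be installed.

```python
import shlex

def build_command(sequence, plot=False, graph=None, position="start",
                  upstream=30, downstream=30):
    """Assemble an argv list for gbaserelativeentropy from documented qualifiers."""
    argv = ["gbaserelativeentropy", sequence,
            "-position", position,
            "-upstream", str(upstream),
            "-downstream", str(downstream)]
    if plot:
        argv.append("-plot")
    if graph is not None:
        argv += ["-graph", graph]
    return argv

# Mirrors Example 2, with the documented defaults spelled out explicitly.
argv = build_command("refseqn:NC_000913", plot=True, graph="png")
print(shlex.join(argv))
# To execute it (not done here):
#   import subprocess; subprocess.run(argv, check=True)
```

Passing the arguments as a list avoids shell quoting problems; `shlex.join` is used only to display the command.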
54
55 Command line arguments
56
57 Calculates and graphs the sequence conservation using Kullback-Leibler
58 divergence (relative entropy)
59 Version: EMBOSS:6.5.7.0 GEMBASSY:1.0.1
60
61 Standard (Mandatory) qualifiers (* if not always prompted):
62 [-sequence] seqall Nucleotide sequence(s) filename and optional
63 format, or reference (input USA)
64 * -graph xygraph [$EMBOSS_GRAPHICS value, or x11] Graph type
65 (ps, hpgl, hp7470, hp7580, meta, cps, x11,
66 tek, tekt, none, data, xterm, png, gif, svg)
67 * -outfile outfile [*.gbaserelativeentropy] Program compseq
68 output file (optional)
69
70 Additional (Optional) qualifiers: (none)
71 Advanced (Unprompted) qualifiers:
72 -position selection [start] Either 'start' (around start codon)
73 or 'end' (around stop codon) to create the
74 PWM
75 -patlen integer [3] Length of oligomer to count (Any integer
76 value)
77 -upstream integer [30] Length upstream of specified position
78 to create PWM (Any integer value)
79 -downstream integer [30] Length downstream of specified position
80 to create PWM (Any integer value)
81 -[no]accid boolean [Y] Include to use sequence accession ID as
82 query
83 -plot toggle [N] Include to plot result
84
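The -position, -upstream and -downstream qualifiers together define a window around each start or stop codon. The following sketch shows one way such a window could be turned into per-position nucleotide frequencies. It is not the program's own code: `window_frequencies` is a hypothetical name, coordinates are assumed to be 0-based on the forward strand, offsets are assumed to run from -upstream to +downstream inclusive, and single nucleotides are counted rather than oligomers of length -patlen.

```python
from collections import Counter

def window_frequencies(genome, anchors, upstream=30, downstream=30):
    """Per-offset nucleotide frequencies around each anchor coordinate.

    genome  : nucleotide string (forward strand only, for simplicity)
    anchors : 0-based coordinates of the first base of each start/stop codon
    Returns {offset: {nucleotide: frequency}} for offsets -upstream..downstream.
    """
    counts = {off: Counter() for off in range(-upstream, downstream + 1)}
    for anchor in anchors:
        for off in counts:
            pos = anchor + off
            if 0 <= pos < len(genome):  # skip offsets that fall off the sequence
                counts[off][genome[pos]] += 1
    freqs = {}
    for off, counter in counts.items():
        total = sum(counter.values())
        freqs[off] = {nt: n / total for nt, n in counter.items()} if total else {}
    return freqs

# Toy sequence with two hypothetical "ATG" start codons at coordinates 3 and 11.
toy = "CCGATGAAACCATGTTT"
freqs = window_frequencies(toy, [3, 11], upstream=2, downstream=2)
print(freqs[0])   # {'A': 1.0}
print(freqs[-1])  # {'G': 0.5, 'C': 0.5}
```

Each per-offset distribution can then be compared with a background composition to obtain one value per position.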
85 Associated qualifiers:
86
87 "-sequence" associated qualifiers
88 -sbegin1 integer Start of each sequence to be used
89 -send1 integer End of each sequence to be used
90 -sreverse1 boolean Reverse (if DNA)
91 -sask1 boolean Ask for begin/end/reverse
92 -snucleotide1 boolean Sequence is nucleotide
93 -sprotein1 boolean Sequence is protein
94 -slower1 boolean Make lower case
95 -supper1 boolean Make upper case
96 -scircular1 boolean Sequence is circular
97 -sformat1 string Input sequence format
98 -iquery1 string Input query fields or ID list
99 -ioffset1 integer Input start position offset
100 -sdbname1 string Database name
101 -sid1 string Entryname
102 -ufo1 string UFO features
103 -fformat1 string Features format
104 -fopenfile1 string Features file name
105
106 "-graph" associated qualifiers
107 -gprompt boolean Graph prompting
108 -gdesc string Graph description
109 -gtitle string Graph title
110 -gsubtitle string Graph subtitle
111 -gxtitle string Graph x axis title
112 -gytitle string Graph y axis title
113 -goutfile string Output file for non interactive displays
114 -gdirectory string Output directory
115
116 "-outfile" associated qualifiers
117 -odirectory string Output directory
118
119 General qualifiers:
120 -auto boolean Turn off prompts
121 -stdout boolean Write first file to standard output
122 -filter boolean Read first file from standard input, write
123 first file to standard output
124 -options boolean Prompt for standard and additional values
125 -debug boolean Write debug output to program.dbg
126 -verbose boolean Report some/full command line options
127 -help boolean Report command line options and exit. More
128 information on associated and general
129 qualifiers can be found with -help -verbose
130 -warning boolean Report warnings
131 -error boolean Report errors
132 -fatal boolean Report fatal errors
133 -die boolean Report dying program messages
134 -version boolean Report version number and exit
135
136 Input file format
137
138 The database definitions for the following commands are available at
139 http://soap.g-language.org/kbws/embossrc
140
141 gbaserelativeentropy reads one or more nucleotide sequences.
142
143 Output file format
144
145 The output from gbaserelativeentropy is written to a plain text file or
146 to the EMBOSS graphics device.
147
148 File: nc_000913.gbaserelativeentropy
149
150 Sequence: NC_000913
151 -30,-0.46682
152 -29,-0.46265
153 -28,-0.45732
154 -27,-0.45704
155 -26,-0.44692
156 -25,-0.44396
157 -24,-0.43528
158 -23,-0.43419
159 -22,-0.42518
160
161 [Part of this file has been deleted for brevity]
162
163 21,-0.40010
164 22,-0.41772
165 23,-0.42503
166 24,-0.39675
167 25,-0.43091
168 26,-0.43196
169 27,-0.40576
170 28,-0.43387
171 29,-0.41228
172 30,-0.38869
173
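The text output shown above is a "Sequence:" header followed by comma-separated position,value rows, so it is easy to load for further analysis. The parser below is a minimal sketch that assumes exactly that layout; `parse_relative_entropy` is a hypothetical helper, and the sample rows are copied from the excerpt above.

```python
def parse_relative_entropy(text):
    """Parse 'position,value' rows grouped under 'Sequence: <id>' headers."""
    results = {}
    current = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank separator lines
        if line.startswith("Sequence:"):
            current = line.split(":", 1)[1].strip()
            results[current] = []
        elif current is not None and "," in line:
            position, value = line.split(",", 1)
            results[current].append((int(position), float(value)))
    return results

sample = """Sequence: NC_000913
-30,-0.46682
-29,-0.46265
30,-0.38869
"""
parsed = parse_relative_entropy(sample)
print(parsed["NC_000913"][0])  # (-30, -0.46682)
```

Keeping the rows as (position, value) tuples preserves the file order, which makes plotting or comparing several sequences straightforward.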
174
175 Data files
176
177 None.
178
179 Notes
180
181 None.
182
183 References
184
185 Arakawa, K., Mori, K., Ikeda, K., Matsuzaki, T., Kobayashi, Y., and
186 Tomita, M. (2003) G-language Genome Analysis Environment: A Workbench
187 for Nucleotide Sequence Data Mining, Bioinformatics, 19, 305-306.
188
189 Arakawa, K. and Tomita, M. (2006) G-language System as a Platform for
190 large-scale analysis of high-throughput omics data, J. Pest Sci.,
191 31, 7.
192
193 Arakawa, K., Kido, N., Oshita, K., and Tomita, M. (2010) G-language Genome
194 Analysis Environment with REST and SOAP Web Service Interfaces,
195 Nucleic Acids Res., 38, W700-W705.
196
197 Warnings
198
199 None.
200
201 Diagnostic Error Messages
202
203 None.
204
205 Exit status
206
207 It always exits with a status of 0.
208
209 Known bugs
210
211 None.
212
213 See also
214
215 gbase_entropy Calculates and graphs the sequence conservation
216 using Shannon uncertainty (entropy)
217 gbase_information_content Calculates and graphs the sequence conservation
218 using information content
219
220 Author(s)
221
222 Hidetoshi Itaya (celery@g-language.org)
223 Institute for Advanced Biosciences, Keio University
224 252-0882 Japan
225
226 Kazuharu Arakawa (gaou@sfc.keio.ac.jp)
227 Institute for Advanced Biosciences, Keio University
228 252-0882 Japan
229
230 History
231
232 2012 - Written by Hidetoshi Itaya
233 2013 - Fixed by Hidetoshi Itaya
234
235 Target users
236
237 This program is intended to be used by everyone and everything, from
238 naive users to embedded scripts.
239
240 Comments
241
242 None.
243