annotate tools/rgenetics/rgGRR.py @ 0:9071e359b9a3

Uploaded
author xuebing
date Fri, 09 Mar 2012 19:37:19 -0500
parents
children
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
rev   line source
0
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
1 """
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
2 # july 2009: Need to see outliers so need to draw them last?
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
3 # could use clustering on the zscores to guess real relationships for unrelateds
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
4 # but definitely need to draw last
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
5 # added MAX_SHOW_ROWS to limit the length of the main report page
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
6 # Changes for Galaxy integration
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
7 # added more robust knuth method for one pass mean and sd
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
8 # no difference really - let's use scipy.mean() and scipy.std() instead...
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
9 # fixed labels and changed to .xls for outlier reports so can open in excel
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
10 # interesting - with a few hundred subjects, 5k gives good resolution
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
11 # and 100k gives better but not by much
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
12 # TODO remove non autosomal markers
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
13 # TODO it would be best if label had the zmean and zsd as these are what matter for
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
14 # outliers rather than the group mean/sd
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
15 # mods to rgGRR.py from channing CVS which John Ziniti has rewritten to produce SVG plots
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
16 # to make a Galaxy tool - we need the table of mean and SD for interesting pairs, the SVG and the log
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
17 # so the result should be an HTML file
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
18
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
19 # rgIBS.py
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
20 # use a random subset of markers for a quick ibs
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
21 # to identify sample dups and closely related subjects
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
22 # try snpMatrix and plink and see which one works best for us?
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
23 # abecasis grr plots mean*sd for every subject to show clusters
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
24 # mods june 23 rml to avoid non-autosomal markers
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
25 # we seem to be distinguishing parent-child by gender - 2 clouds!
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
26
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
27
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
28 snpMatrix from David Clayton has:
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
29 ibs.stats function to calculate the identity-by-state stats of a group of samples
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
30 Description
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
31 Given a snp.matrix-class or a X.snp.matrix-class object with N samples, calculates some statistics
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
32 about the relatedness of every pair of samples within.
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
33
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
34 Usage
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
35 ibs.stats(x)
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
36 8 ibs.stats
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
37 Arguments
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
38 x a snp.matrix-class or a X.snp.matrix-class object containing N samples
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
39 Details
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
40 No-calls are excluded from consideration here.
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
41 Value
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
42 A data.frame containing N(N - 1)/2 rows, where the row names are the sample name pairs separated
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
43 by a comma, and the columns are:
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
44 Count count of identical calls, exclusing no-calls
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
45 Fraction fraction of identical calls comparied to actual calls being made in both samples
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
46 Warning
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
47 In some applications, it may be preferable to subset a (random) selection of SNPs first - the
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
48 calculation
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
49 time increases as N(N - 1)M/2 . Typically for N = 800 samples and M = 3000 SNPs, the
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
50 calculation time is about 1 minute. A full GWA scan could take hours, and quite unnecessary for
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
51 simple applications such as checking for duplicate or related samples.
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
52 Note
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
53 This is mostly written to find mislabelled and/or duplicate samples.
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
54 Illumina indexes their SNPs in alphabetical order so the mitochondria SNPs comes first - for most
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
55 purpose it is undesirable to use these SNPs for IBS purposes.
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
56 TODO: Worst-case S4 subsetting seems to make 2 copies of a large object, so one might want to
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
57 subset before rbind(), etc; a future version of this routine may contain a built-in subsetting facility
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
58 """
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
59 import sys,os,time,random,string,copy,optparse
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
60
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
61 try:
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
62 set
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
63 except NameError:
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
64 from Sets import Set as set
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
65
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
66 from rgutils import timenow,plinke
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
67
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
68 import plinkbinJZ
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
69
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
70
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
71 opts = None
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
72 verbose = False
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
73
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
74 showPolygons = False
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
75
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
76 class NullDevice:
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
77 def write(self, s):
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
78 pass
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
79
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
80 tempstderr = sys.stderr # save
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
81 #sys.stderr = NullDevice()
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
82 # need to avoid blather about deprecation and other strange stuff from scipy
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
83 # the current galaxy job runner assumes that
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
84 # the job is in error if anything appears on sys.stderr
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
85 # grrrrr. James wants to keep it that way instead of using the
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
86 # status flag for some strange reason. Presumably he doesn't use R or (in this case, scipy)
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
87 import numpy
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
88 import scipy
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
89 from scipy import weave
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
90
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
91
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
92 sys.stderr=tempstderr
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
93
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
94
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
95 PROGNAME = os.path.split(sys.argv[0])[-1]
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
96 X_AXIS_LABEL = 'Mean Alleles Shared'
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
97 Y_AXIS_LABEL = 'SD Alleles Shared'
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
98 LEGEND_ALIGN = 'topleft'
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
99 LEGEND_TITLE = 'Relationship'
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
100 DEFAULT_SYMBOL_SIZE = 1.0 # default symbol size
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
101 DEFAULT_SYMBOL_SIZE = 0.5 # default symbol size
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
102
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
103 ### Some colors for R/rpy
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
104 R_BLACK = 1
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
105 R_RED = 2
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
106 R_GREEN = 3
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
107 R_BLUE = 4
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
108 R_CYAN = 5
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
109 R_PURPLE = 6
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
110 R_YELLOW = 7
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
111 R_GRAY = 8
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
112
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
113 ### ... and some point-styles
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
114
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
115 ###
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
116 PLOT_HEIGHT = 600
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
117 PLOT_WIDTH = 1150
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
118
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
119
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
120 #SVG_COLORS = ('black', 'darkblue', 'blue', 'deepskyblue', 'firebrick','maroon','crimson')
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
121 #SVG_COLORS = ('cyan','dodgerblue','mediumpurple', 'fuchsia', 'red','gold','gray')
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
122 SVG_COLORS = ('cyan','dodgerblue','mediumpurple','forestgreen', 'lightgreen','gold','gray')
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
123 # dupe,parentchild,sibpair,halfsib,parents,unrel,unkn
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
124 #('orange', 'red', 'green', 'chartreuse', 'blue', 'purple', 'gray')
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
125
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
126 OUTLIERS_HEADER_list = ['Mean','Sdev','ZMean','ZSdev','FID1','IID1','FID2','IID2','RelMean_M','RelMean_SD','RelSD_M','RelSD_SD','PID1','MID1','PID2','MID2','Ped']
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
127 OUTLIERS_HEADER = '\t'.join(OUTLIERS_HEADER_list)
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
128 TABLE_HEADER='fid1_iid1\tfid2_iid2\tmean\tsdev\tzmean\tzsdev\tgeno\trelcode\tpid1\tmid1\tpid2\tmid2\n'
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
129
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
130
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
131 ### Relationship codes, text, and lookups/mappings
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
132 N_RELATIONSHIP_TYPES = 7
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
133 REL_DUPE, REL_PARENTCHILD, REL_SIBS, REL_HALFSIBS, REL_RELATED, REL_UNRELATED, REL_UNKNOWN = range(N_RELATIONSHIP_TYPES)
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
134 REL_LOOKUP = {
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
135 REL_DUPE: ('dupe', R_BLUE, 1),
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
136 REL_PARENTCHILD: ('parentchild', R_YELLOW, 1),
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
137 REL_SIBS: ('sibpairs', R_RED, 1),
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
138 REL_HALFSIBS: ('halfsibs', R_GREEN, 1),
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
139 REL_RELATED: ('parents', R_PURPLE, 1),
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
140 REL_UNRELATED: ('unrelated', R_CYAN, 1),
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
141 REL_UNKNOWN: ('unknown', R_GRAY, 1),
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
142 }
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
143 OUTLIER_STDEVS = {
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
144 REL_DUPE: 2,
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
145 REL_PARENTCHILD: 2,
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
146 REL_SIBS: 2,
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
147 REL_HALFSIBS: 2,
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
148 REL_RELATED: 2,
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
149 REL_UNRELATED: 3,
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
150 REL_UNKNOWN: 2,
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
151 }
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
152 # note now Z can be passed in
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
153
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
154 REL_STATES = [REL_LOOKUP[r][0] for r in range(N_RELATIONSHIP_TYPES)]
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
155 REL_COLORS = SVG_COLORS
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
156 REL_POINTS = [REL_LOOKUP[r][2] for r in range(N_RELATIONSHIP_TYPES)]
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
157
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
158 DEFAULT_MAX_SAMPLE_SIZE = 10000
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
159
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
160 REF_COUNT_HOM1 = 3
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
161 REF_COUNT_HET = 2
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
162 REF_COUNT_HOM2 = 1
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
163 MISSING = 0
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
164 MAX_SHOW_ROWS = 100 # framingham has millions - delays showing output page - so truncate and explain
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
165 MARKER_PAIRS_PER_SECOND_SLOW = 15000000.0
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
166 MARKER_PAIRS_PER_SECOND_FAST = 70000000.0
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
167
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
168
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
169 galhtmlprefix = """<?xml version="1.0" encoding="utf-8" ?>
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
170 <!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd">
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
171 <html xmlns="http://www.w3.org/1999/xhtml" xml:lang="en" lang="en">
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
172 <head>
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
173 <meta http-equiv="Content-Type" content="text/html; charset=utf-8" />
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
174 <meta name="generator" content="Galaxy %s tool output - see http://g2.trac.bx.psu.edu/" />
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
175 <title></title>
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
176 <link rel="stylesheet" href="/static/style/base.css" type="text/css" />
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
177 </head>
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
178 <body>
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
179 <div class="document">
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
180 """
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
181
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
182
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
183 SVG_HEADER = '''<?xml version="1.0" standalone="no"?>
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
184 <!DOCTYPE svg PUBLIC "-//W3C//DTD SVG 1.2//EN" "http://www.w3.org/Graphics/SVG/1.2/DTD/svg12.dtd">
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
185
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
186 <svg width="1280" height="800"
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
187 xmlns="http://www.w3.org/2000/svg" version="1.2"
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
188 xmlns:xlink="http://www.w3.org/1999/xlink" viewBox="0 0 1280 800" onload="init()">
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
189
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
190 <script type="text/ecmascript" xlink:href="/static/scripts/checkbox_and_radiobutton.js"/>
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
191 <script type="text/ecmascript" xlink:href="/static/scripts/helper_functions.js"/>
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
192 <script type="text/ecmascript" xlink:href="/static/scripts/timer.js"/>
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
193 <script type="text/ecmascript">
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
194 <![CDATA[
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
195 var checkBoxes = new Array();
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
196 var radioGroupBandwidth;
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
197 var colours = ['%s','%s','%s','%s','%s','%s','%s'];
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
198 function init() {
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
199 var style = {"font-family":"Arial,Helvetica", "fill":"black", "font-size":12};
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
200 var dist = 12;
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
201 var yOffset = 4;
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
202
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
203 //A checkBox for each relationship type dupe,parentchild,sibpair,halfsib,parents,unrel,unkn
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
204 checkBoxes["dupe"] = new checkBox("dupe","checkboxes",20,40,"cbRect","cbCross",true,"Duplicate",style,dist,yOffset,undefined,hideShowLayer);
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
205 checkBoxes["parentchild"] = new checkBox("parentchild","checkboxes",20,60,"cbRect","cbCross",true,"Parent-Child",style,dist,yOffset,undefined,hideShowLayer);
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
206 checkBoxes["sibpairs"] = new checkBox("sibpairs","checkboxes",20,80,"cbRect","cbCross",true,"Sib-pairs",style,dist,yOffset,undefined,hideShowLayer);
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
207 checkBoxes["halfsibs"] = new checkBox("halfsibs","checkboxes",20,100,"cbRect","cbCross",true,"Half-sibs",style,dist,yOffset,undefined,hideShowLayer);
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
208 checkBoxes["parents"] = new checkBox("parents","checkboxes",20,120,"cbRect","cbCross",true,"Parents",style,dist,yOffset,undefined,hideShowLayer);
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
209 checkBoxes["unrelated"] = new checkBox("unrelated","checkboxes",20,140,"cbRect","cbCross",true,"Unrelated",style,dist,yOffset,undefined,hideShowLayer);
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
210 checkBoxes["unknown"] = new checkBox("unknown","checkboxes",20,160,"cbRect","cbCross",true,"Unknown",style,dist,yOffset,undefined,hideShowLayer);
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
211
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
212 }
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
213
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
214 function hideShowLayer(id, status, label) {
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
215 var vis = "hidden";
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
216 if (status) {
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
217 vis = "visible";
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
218 }
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
219 document.getElementById(id).setAttributeNS(null, 'visibility', vis);
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
220 }
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
221
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
222 function showBTT(evt, rel, mm, dm, md, dd, n, mg, dg, lg, hg) {
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
223 var x = parseInt(evt.pageX)-250;
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
224 var y = parseInt(evt.pageY)-110;
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
225 switch(rel) {
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
226 case 0:
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
227 fill = colours[rel];
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
228 relt = "dupe";
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
229 break;
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
230 case 1:
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
231 fill = colours[rel];
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
232 relt = "parentchild";
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
233 break;
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
234 case 2:
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
235 fill = colours[rel];
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
236 relt = "sibpairs";
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
237 break;
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
238 case 3:
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
239 fill = colours[rel];
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
240 relt = "halfsibs";
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
241 break;
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
242 case 4:
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
243 fill = colours[rel];
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
244 relt = "parents";
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
245 break;
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
246 case 5:
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
247 fill = colours[rel];
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
248 relt = "unrelated";
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
249 break;
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
250 case 6:
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
251 fill = colours[rel];
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
252 relt = "unknown";
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
253 break;
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
254 default:
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
255 fill = "cyan";
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
256 relt = "ERROR_CODE: "+rel;
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
257 }
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
258
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
259 document.getElementById("btRel").textContent = "GROUP: "+relt;
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
260 document.getElementById("btMean").textContent = "mean="+mm+" +/- "+dm;
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
261 document.getElementById("btSdev").textContent = "sdev="+dm+" +/- "+dd;
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
262 document.getElementById("btPair").textContent = "npairs="+n;
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
263 document.getElementById("btGeno").textContent = "ngenos="+mg+" +/- "+dg+" (min="+lg+", max="+hg+")";
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
264 document.getElementById("btHead").setAttribute('fill', fill);
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
265
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
266 var tt = document.getElementById("btTip");
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
267 tt.setAttribute("transform", "translate("+x+","+y+")");
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
268 tt.setAttribute('visibility', 'visible');
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
269 }
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
270
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
271 function showOTT(evt, rel, s1, s2, mean, sdev, ngeno, rmean, rsdev) {
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
272 var x = parseInt(evt.pageX)-150;
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
273 var y = parseInt(evt.pageY)-180;
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
274
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
275 switch(rel) {
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
276 case 0:
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
277 fill = colours[rel];
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
278 relt = "dupe";
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
279 break;
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
280 case 1:
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
281 fill = colours[rel];
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
282 relt = "parentchild";
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
283 break;
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
284 case 2:
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
285 fill = colours[rel];
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
286 relt = "sibpairs";
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
287 break;
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
288 case 3:
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
289 fill = colours[rel];
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
290 relt = "halfsibs";
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
291 break;
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
292 case 4:
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
293 fill = colours[rel];
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
294 relt = "parents";
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
295 break;
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
296 case 5:
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
297 fill = colours[rel];
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
298 relt = "unrelated";
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
299 break;
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
300 case 6:
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
301 fill = colours[rel];
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
302 relt = "unknown";
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
303 break;
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
304 default:
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
305 fill = "cyan";
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
306 relt = "ERROR_CODE: "+rel;
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
307 }
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
308
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
309 document.getElementById("otRel").textContent = "PAIR: "+relt;
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
310 document.getElementById("otS1").textContent = "s1="+s1;
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
311 document.getElementById("otS2").textContent = "s2="+s2;
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
312 document.getElementById("otMean").textContent = "mean="+mean;
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
313 document.getElementById("otSdev").textContent = "sdev="+sdev;
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
314 document.getElementById("otGeno").textContent = "ngenos="+ngeno;
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
315 document.getElementById("otRmean").textContent = "relmean="+rmean;
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
316 document.getElementById("otRsdev").textContent = "relsdev="+rsdev;
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
317 document.getElementById("otHead").setAttribute('fill', fill);
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
318
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
319 var tt = document.getElementById("otTip");
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
320 tt.setAttribute("transform", "translate("+x+","+y+")");
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
321 tt.setAttribute('visibility', 'visible');
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
322 }
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
323
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
324 function hideBTT(evt) {
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
325 document.getElementById("btTip").setAttributeNS(null, 'visibility', 'hidden');
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
326 }
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
327
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
328 function hideOTT(evt) {
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
329 document.getElementById("otTip").setAttributeNS(null, 'visibility', 'hidden');
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
330 }
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
331
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
332 ]]>
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
333 </script>
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
334 <defs>
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
335 <!-- symbols for check boxes -->
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
336 <symbol id="cbRect" overflow="visible">
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
337 <rect x="-5" y="-5" width="10" height="10" fill="white" stroke="dimgray" stroke-width="1" cursor="pointer"/>
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
338 </symbol>
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
339 <symbol id="cbCross" overflow="visible">
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
340 <g pointer-events="none" stroke="black" stroke-width="1">
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
341 <line x1="-3" y1="-3" x2="3" y2="3"/>
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
342 <line x1="3" y1="-3" x2="-3" y2="3"/>
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
343 </g>
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
344 </symbol>
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
345 </defs>
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
346
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
347 <desc>Developer Works Dynamic Scatter Graph Scaling Example</desc>
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
348
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
349 <!-- Now Draw the main X and Y axis -->
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
350 <g style="stroke-width:1.0; stroke:black; shape-rendering:crispEdges">
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
351 <!-- X Axis top and bottom -->
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
352 <path d="M 100 100 L 1250 100 Z"/>
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
353 <path d="M 100 700 L 1250 700 Z"/>
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
354
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
355 <!-- Y Axis left and right -->
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
356 <path d="M 100 100 L 100 700 Z"/>
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
357 <path d="M 1250 100 L 1250 700 Z"/>
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
358 </g>
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
359
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
360 <g transform="translate(100,100)">
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
361
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
362 <!-- Grid Lines -->
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
363 <g style="fill:none; stroke:#dddddd; stroke-width:1; stroke-dasharray:2,2; text-anchor:end; shape-rendering:crispEdges">
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
364
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
365 <!-- Vertical grid lines -->
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
366 <line x1="125" y1="0" x2="115" y2="600" />
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
367 <line x1="230" y1="0" x2="230" y2="600" />
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
368 <line x1="345" y1="0" x2="345" y2="600" />
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
369 <line x1="460" y1="0" x2="460" y2="600" />
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
370 <line x1="575" y1="0" x2="575" y2="600" style="stroke-dasharray:none;" />
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
371 <line x1="690" y1="0" x2="690" y2="600" />
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
372 <line x1="805" y1="0" x2="805" y2="600" />
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
373 <line x1="920" y1="0" x2="920" y2="600" />
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
374 <line x1="1035" y1="0" x2="1035" y2="600" />
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
375
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
376 <!-- Horizontal grid lines -->
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
377 <line x1="0" y1="60" x2="1150" y2="60" />
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
378 <line x1="0" y1="120" x2="1150" y2="120" />
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
379 <line x1="0" y1="180" x2="1150" y2="180" />
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
380 <line x1="0" y1="240" x2="1150" y2="240" />
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
381 <line x1="0" y1="300" x2="1150" y2="300" style="stroke-dasharray:none;" />
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
382 <line x1="0" y1="360" x2="1150" y2="360" />
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
383 <line x1="0" y1="420" x2="1150" y2="420" />
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
384 <line x1="0" y1="480" x2="1150" y2="480" />
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
385 <line x1="0" y1="540" x2="1150" y2="540" />
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
386 </g>
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
387
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
388 <!-- Legend -->
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
389 <g style="fill:black; stroke:none" font-size="12" font-family="Arial" transform="translate(25,25)">
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
390 <rect width="160" height="270" style="fill:none; stroke:black; shape-rendering:crispEdges" />
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
391 <text x="5" y="20" style="fill:black; stroke:none;" font-size="13" font-weight="bold">Given Pair Relationship</text>
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
392 <rect x="120" y="35" width="10" height="10" fill="%s" stroke="%s" stroke-width="1" cursor="pointer"/>
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
393 <rect x="120" y="55" width="10" height="10" fill="%s" stroke="%s" stroke-width="1" cursor="pointer"/>
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
394 <rect x="120" y="75" width="10" height="10" fill="%s" stroke="%s" stroke-width="1" cursor="pointer"/>
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
395 <rect x="120" y="95" width="10" height="10" fill="%s" stroke="%s" stroke-width="1" cursor="pointer"/>
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
396 <rect x="120" y="115" width="10" height="10" fill="%s" stroke="%s" stroke-width="1" cursor="pointer"/>
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
397 <rect x="120" y="135" width="10" height="10" fill="%s" stroke="%s" stroke-width="1" cursor="pointer"/>
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
398 <rect x="120" y="155" width="10" height="10" fill="%s" stroke="%s" stroke-width="1" cursor="pointer"/>
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
399 <text x="15" y="195" style="fill:black; stroke:none" font-size="12" font-family="Arial" >Zscore gt 15</text>
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
400 <circle cx="125" cy="192" r="6" style="stroke:red; fill:gold; fill-opacity:1.0; stroke-width:1;"/>
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
401 <text x="15" y="215" style="fill:black; stroke:none" font-size="12" font-family="Arial" >Zscore 4 to 15</text>
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
402 <circle cx="125" cy="212" r="3" style="stroke:gold; fill:gold; fill-opacity:1.0; stroke-width:1;"/>
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
403 <text x="15" y="235" style="fill:black; stroke:none" font-size="12" font-family="Arial" >Zscore lt 4</text>
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
404 <circle cx="125" cy="232" r="2" style="stroke:gold; fill:gold; fill-opacity:1.0; stroke-width:1;"/>
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
405 <g id="checkboxes">
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
406 </g>
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
407 </g>
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
408
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
409
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
410 <g style='fill:black; stroke:none' font-size="17" font-family="Arial">
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
411 <!-- X Axis Labels -->
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
412 <text x="480" y="660">Mean Alleles Shared</text>
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
413 <text x="0" y="630" >1.0</text>
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
414 <text x="277" y="630" >1.25</text>
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
415 <text x="564" y="630" >1.5</text>
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
416 <text x="842" y="630" >1.75</text>
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
417 <text x="1140" y="630" >2.0</text>
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
418 </g>
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
419
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
420 <g transform="rotate(270)" style="fill:black; stroke:none" font-size="17" font-family="Arial">
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
421 <!-- Y Axis Labels -->
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
422 <text x="-350" y="-40">SD Alleles Shared</text>
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
423 <text x="-20" y="-10" >1.0</text>
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
424 <text x="-165" y="-10" >0.75</text>
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
425 <text x="-310" y="-10" >0.5</text>
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
426 <text x="-455" y="-10" >0.25</text>
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
427 <text x="-600" y="-10" >0.0</text>
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
428 </g>
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
429
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
430 <!-- Plot Title -->
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
431 <g style="fill:black; stroke:none" font-size="18" font-family="Arial">
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
432 <text x="425" y="-30">%s</text>
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
433 </g>
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
434
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
435 <!-- One group/layer of points for each relationship type -->
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
436 '''
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
437
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
438 SVG_FOOTER = '''
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
439 <!-- End of Data -->
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
440 </g>
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
441 <g id="btTip" visibility="hidden" style="stroke-width:1.0; fill:black; stroke:none;" font-size="10" font-family="Arial">
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
442 <rect width="250" height="110" style="fill:silver" rx="2" ry="2"/>
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
443 <rect id="btHead" width="250" height="20" rx="2" ry="2" />
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
444 <text id="btRel" y="14" x="85">unrelated</text>
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
445 <text id="btMean" y="40" x="4">mean=1.5 +/- 0.04</text>
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
446 <text id="btSdev" y="60" x="4">sdev=0.7 +/- 0.03</text>
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
447 <text id="btPair" y="80" x="4">npairs=1152</text>
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
448 <text id="btGeno" y="100" x="4">ngenos=4783 +/- 24 (min=1000, max=5000)</text>
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
449 </g>
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
450
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
451 <g id="otTip" visibility="hidden" style="stroke-width:1.0; fill:black; stroke:none;" font-size="10" font-family="Arial">
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
452 <rect width="150" height="180" style="fill:silver" rx="2" ry="2"/>
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
453 <rect id="otHead" width="150" height="20" rx="2" ry="2" />
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
454 <text id="otRel" y="14" x="40">sibpairs</text>
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
455 <text id="otS1" y="40" x="4">s1=fid1,iid1</text>
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
456 <text id="otS2" y="60" x="4">s2=fid2,iid2</text>
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
457 <text id="otMean" y="80" x="4">mean=1.82</text>
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
458 <text id="otSdev" y="100" x="4">sdev=0.7</text>
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
459 <text id="otGeno" y="120" x="4">ngeno=4487</text>
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
460 <text id="otRmean" y="140" x="4">relmean=1.85</text>
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
461 <text id="otRsdev" y="160" x="4">relsdev=0.65</text>
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
462 </g>
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
463 </svg>
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
464 '''
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
465
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
466
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
467 DEFAULT_MAX_SAMPLE_SIZE = 5000
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
468
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
469 REF_COUNT_HOM1 = 3
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
470 REF_COUNT_HET = 2
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
471 REF_COUNT_HOM2 = 1
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
472 MISSING = 0
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
473
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
474 MARKER_PAIRS_PER_SECOND_SLOW = 15000000
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
475 MARKER_PAIRS_PER_SECOND_FAST = 70000000
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
476
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
477 POLYGONS = {
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
478 REL_UNRELATED: ((1.360, 0.655), (1.385, 0.730), (1.620, 0.575), (1.610, 0.505)),
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
479 REL_HALFSIBS: ((1.630, 0.500), (1.630, 0.550), (1.648, 0.540), (1.648, 0.490)),
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
480 REL_SIBS: ((1.660, 0.510), (1.665, 0.560), (1.820, 0.410), (1.820, 0.390)),
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
481 REL_PARENTCHILD: ((1.650, 0.470), (1.650, 0.490), (1.750, 0.440), (1.750, 0.420)),
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
482 REL_DUPE: ((1.970, 0.000), (1.970, 0.150), (2.000, 0.150), (2.000, 0.000)),
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
483 }
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
484
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
485 def distance(point1, point2):
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
486 """ Calculate the distance between two points
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
487 """
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
488 (x1,y1) = [float(d) for d in point1]
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
489 (x2,y2) = [float(d) for d in point2]
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
490 dx = abs(x1 - x2)
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
491 dy = abs(y1 - y2)
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
492 return math.sqrt(dx**2 + dy**2)
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
493
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
494 def point_inside_polygon(x, y, poly):
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
495 """ Determine if a point (x,y) is inside a given polygon or not
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
496 poly is a list of (x,y) pairs.
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
497
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
498 Taken from: http://www.ariel.com.au/a/python-point-int-poly.html
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
499 """
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
500
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
501 n = len(poly)
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
502 inside = False
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
503
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
504 p1x,p1y = poly[0]
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
505 for i in range(n+1):
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
506 p2x,p2y = poly[i % n]
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
507 if y > min(p1y,p2y):
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
508 if y <= max(p1y,p2y):
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
509 if x <= max(p1x,p2x):
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
510 if p1y != p2y:
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
511 xinters = (y-p1y)*(p2x-p1x)/(p2y-p1y)+p1x
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
512 if p1x == p2x or x <= xinters:
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
513 inside = not inside
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
514 p1x,p1y = p2x,p2y
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
515 return inside
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
516
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
517 def readMap(pedfile):
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
518 """
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
519 """
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
520 mapfile = pedfile.replace('.ped', '.map')
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
521 marker_list = []
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
522 if os.path.exists(mapfile):
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
523 print 'readMap: %s' % (mapfile)
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
524 fh = file(mapfile, 'r')
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
525 for line in fh:
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
526 marker_list.append(line.strip().split())
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
527 fh.close()
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
528 print 'readMap: %s markers' % (len(marker_list))
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
529 return marker_list
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
530
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
531 def calcMeanSD(useme):
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
532 """
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
533 A numerically stable algorithm is given below. It also computes the mean.
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
534 This algorithm is due to Knuth,[1] who cites Welford.[2]
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
535 n = 0
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
536 mean = 0
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
537 M2 = 0
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
538
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
539 foreach x in data:
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
540 n = n + 1
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
541 delta = x - mean
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
542 mean = mean + delta/n
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
543 M2 = M2 + delta*(x - mean) // This expression uses the new value of mean
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
544 end for
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
545
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
546 variance_n = M2/n
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
547 variance = M2/(n - 1)
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
548 """
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
549 mean = 0.0
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
550 M2 = 0.0
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
551 sd = 0.0
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
552 n = len(useme)
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
553 if n > 1:
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
554 for i,x in enumerate(useme):
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
555 delta = x - mean
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
556 mean = mean + delta/(i+1) # knuth uses n+=1 at start
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
557 M2 = M2 + delta*(x - mean) # This expression uses the new value of mean
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
558 variance = M2/(n-1) # assume is sample so lose 1 DOF
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
559 sd = pow(variance,0.5)
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
560 return mean,sd
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
561
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
562
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
563 def doIBSpy(ped=None,basename='',outdir=None,logf=None,
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
564 nrsSamples=10000,title='title',pdftoo=0,Zcutoff=2.0):
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
565 #def doIBS(pedName, title, nrsSamples=None, pdftoo=False):
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
566 """ started with snpmatrix but GRR uses actual IBS counts and sd's
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
567 """
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
568 repOut = [] # text strings to add to the html display
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
569 refallele = {}
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
570 tblf = '%s_table.xls' % (title)
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
571 tbl = file(os.path.join(outdir,tblf), 'w')
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
572 tbl.write(TABLE_HEADER)
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
573 svgf = '%s.svg' % (title)
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
574 svg = file(os.path.join(outdir,svgf), 'w')
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
575
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
576 nMarkers = len(ped._markers)
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
577 if nMarkers < 5:
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
578 print sys.stderr, '### ERROR - %d is too few markers for reliable estimation in %s - terminating' % (nMarkers,PROGNAME)
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
579 sys.exit(1)
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
580 nSubjects = len(ped._subjects)
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
581 nrsSamples = min(nMarkers, nrsSamples)
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
582 if opts and opts.use_mito:
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
583 markers = range(nMarkers)
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
584 nrsSamples = min(len(markers), nrsSamples)
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
585 sampleIndexes = sorted(random.sample(markers, nrsSamples))
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
586 else:
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
587 autosomals = ped.autosomal_indices()
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
588 nrsSamples = min(len(autosomals), nrsSamples)
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
589 sampleIndexes = sorted(random.sample(autosomals, nrsSamples))
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
590
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
591 print ''
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
592 print 'Getting random.sample of %s from %s total' % (nrsSamples, nMarkers)
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
593 npairs = (nSubjects*(nSubjects-1))/2 # total rows in table
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
594 newfiles=[svgf,tblf]
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
595 explanations = ['rgGRR Plot (requires SVG)','Mean by SD alleles shared - %d rows' % npairs]
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
596 # these go with the output file links in the html file
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
597 s = 'Reading genotypes for %s subjects and %s markers\n' % (nSubjects, nrsSamples)
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
598 logf.write(s)
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
599 minUsegenos = nrsSamples/2 # must have half?
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
600 nGenotypes = nSubjects*nrsSamples
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
601 stime = time.time()
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
602 emptyRows = set()
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
603 genos = numpy.zeros((nSubjects, nrsSamples), dtype=int)
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
604 for s in xrange(nSubjects):
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
605 nValid = 0
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
606 #getGenotypesByIndices(self, s, mlist, format)
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
607 genos[s] = ped.getGenotypesByIndices(s, sampleIndexes, format='ref')
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
608 nValid = sum([1 for g in genos[s] if g])
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
609 if not nValid:
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
610 emptyRows.add(s)
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
611 sub = ped.getSubject(s)
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
612 print 'All missing for row %d (%s)' % (s, sub)
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
613 logf.write('All missing for row %d (%s)\n' % (s, sub))
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
614 rtime = time.time() - stime
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
615 if verbose:
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
616 print '@@Read %s genotypes in %s seconds' % (nGenotypes, rtime)
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
617
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
618
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
619 ### Now the expensive part. For each pair of subjects, we get the mean number
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
620 ### and standard deviation of shared alleles over all of the markers where both
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
621 ### subjects have a known genotype. Identical subjects should have mean shared
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
622 ### alleles very close to 2.0 with a standard deviation very close to 0.0.
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
623 tot = nSubjects*(nSubjects-1)/2
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
624 nprog = tot/10
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
625 nMarkerpairs = tot * nrsSamples
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
626 estimatedTimeSlow = nMarkerpairs/MARKER_PAIRS_PER_SECOND_SLOW
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
627 estimatedTimeFast = nMarkerpairs/MARKER_PAIRS_PER_SECOND_FAST
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
628
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
629 pairs = []
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
630 pair_data = {}
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
631 means = [] ## Mean IBS for each pair
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
632 ngenoL = [] ## Count of comparable genotypes for each pair
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
633 sdevs = [] ## Standard dev for each pair
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
634 rels = [] ## A relationship code for each pair
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
635 zmeans = [0.0 for x in xrange(tot)] ## zmean score for each pair for the relgroup
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
636 zstds = [0.0 for x in xrange(tot)] ## zstd score for each pair for the relgrp
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
637 skip = set()
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
638 ndone = 0 ## How many have been done so far
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
639
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
640 logf.write('Calculating %d pairs...\n' % (tot))
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
641 logf.write('Estimated time is %2.2f to %2.2f seconds ...\n' % (estimatedTimeFast, estimatedTimeSlow))
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
642
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
643 t1sum = 0
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
644 t2sum = 0
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
645 t3sum = 0
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
646 now = time.time()
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
647 scache = {}
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
648 _founder_cache = {}
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
649 C_CODE = """
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
650 #include "math.h"
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
651 int i;
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
652 int sumibs = 0;
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
653 int ssqibs = 0;
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
654 int ngeno = 0;
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
655 float mean = 0;
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
656 float M2 = 0;
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
657 float delta = 0;
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
658 float sdev=0;
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
659 float variance=0;
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
660 for (i=0; i<nrsSamples; i++) {
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
661 int a1 = g1[i];
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
662 int a2 = g2[i];
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
663 if (a1 != 0 && a2 != 0) {
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
664 ngeno += 1;
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
665 int shared = 2-abs(a1-a2);
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
666 delta = shared - mean;
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
667 mean = mean + delta/ngeno;
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
668 M2 += delta*(shared-mean);
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
669 // yes that second time, the updated mean is used see calcmeansd above;
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
670 //printf("%d %d %d %d %d %d\\n", i, a1, a2, ngeno, shared, squared);
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
671 }
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
672 }
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
673 if (ngeno > 1) {
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
674 variance = M2/(ngeno-1);
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
675 sdev = sqrt(variance);
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
676 //printf("OK: %d %3.2f %3.2f\\n", ngeno, mean, sdev);
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
677 }
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
678 //printf("%d %d %d %1.2f %1.2f\\n", ngeno, sumibs, ssqibs, mean, sdev);
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
679 result[0] = ngeno;
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
680 result[1] = mean;
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
681 result[2] = sdev;
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
682 return_val = ngeno;
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
683 """
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
684 started = time.time()
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
685 for s1 in xrange(nSubjects):
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
686 if s1 in emptyRows:
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
687 continue
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
688 (fid1,iid1,did1,mid1,sex1,phe1,iid1,d_sid1,m_sid1) = scache.setdefault(s1, ped.getSubject(s1))
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
689
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
690 isFounder1 = _founder_cache.setdefault(s1, (did1==mid1))
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
691 g1 = genos[s1]
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
692
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
693 for s2 in xrange(s1+1, nSubjects):
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
694 if s2 in emptyRows:
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
695 continue
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
696 t1s = time.time()
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
697
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
698 (fid2,iid2,did2,mid2,sex2,phe2,iid2,d_sid2,m_sid2) = scache.setdefault(s2, ped.getSubject(s2))
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
699
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
700 g2 = genos[s2]
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
701 isFounder2 = _founder_cache.setdefault(s2, (did2==mid2))
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
702
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
703 # Determine the relationship for this pair
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
704 relcode = REL_UNKNOWN
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
705 if (fid2 == fid1):
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
706 if iid1 == iid2:
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
707 relcode = REL_DUPE
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
708 elif (did2 == did1) and (mid2 == mid1) and did1 != mid1:
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
709 relcode = REL_SIBS
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
710 elif (iid1 == mid2) or (iid1 == did2) or (iid2 == mid1) or (iid2 == did1):
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
711 relcode = REL_PARENTCHILD
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
712 elif (str(did1) != '0' and (did2 == did1)) or (str(mid1) != '0' and (mid2 == mid1)):
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
713 relcode = REL_HALFSIBS
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
714 else:
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
715 # People in the same family should be marked as some other
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
716 # form of related. In general, these people will have a
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
717 # pretty random spread of similarity. This distinction is
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
718 # probably not very useful most of the time
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
719 relcode = REL_RELATED
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
720 else:
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
721 ### Different families
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
722 relcode = REL_UNRELATED
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
723
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
724 t1e = time.time()
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
725 t1sum += t1e-t1s
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
726
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
727
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
728 ### Calculate sum(2-abs(a1-a2)) and sum((2-abs(a1-a2))**2) and count
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
729 ### the number of contributing genotypes. These values are not actually
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
730 ### calculated here, but instead are looked up in a table for speed.
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
731 ### FIXME: This is still too slow ...
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
732 result = [0.0, 0.0, 0.0]
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
733 ngeno = weave.inline(C_CODE, ['g1', 'g2', 'nrsSamples', 'result'])
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
734 if ngeno >= minUsegenos:
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
735 _, mean, sdev = result
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
736 means.append(mean)
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
737 sdevs.append(sdev)
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
738 ngenoL.append(ngeno)
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
739 pairs.append((s1, s2))
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
740 rels.append(relcode)
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
741 else:
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
742 skip.add(ndone) # signal no comparable genotypes for this pair
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
743 ndone += 1
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
744 t2e = time.time()
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
745 t2sum += t2e-t1e
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
746 t3e = time.time()
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
747 t3sum += t3e-t2e
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
748
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
749 logme = [ 'T1: %s' % (t1sum), 'T2: %s' % (t2sum), 'T3: %s' % (t3sum),'TOT: %s' % (t3e-now),
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
750 '%s pairs with no (or not enough) comparable genotypes (%3.1f%%)' % (len(skip),
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
751 float(len(skip))/float(tot)*100)]
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
752 logf.write('%s\n' % '\t'.join(logme))
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
753 ### Calculate mean and standard deviation of scores on a per relationship
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
754 ### type basis, allowing us to flag outliers for each particular relationship
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
755 ### type
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
756 relstats = {}
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
757 relCounts = {}
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
758 outlierFiles = {}
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
759 for relCode, relInfo in REL_LOOKUP.items():
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
760 relName, relColor, relStyle = relInfo
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
761 useme = [means[x] for x in xrange(len(means)) if rels[x] == relCode]
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
762 relCounts[relCode] = len(useme)
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
763 mm = scipy.mean(useme)
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
764 ms = scipy.std(useme)
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
765 useme = [sdevs[x] for x in xrange(len(sdevs)) if rels[x] == relCode]
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
766 sm = scipy.mean(useme)
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
767 ss = scipy.std(useme)
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
768 relstats[relCode] = {'sd':(sm,ss), 'mean':(mm,ms)}
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
769 s = 'Relstate %s (n=%d): mean(mean)=%3.2f sdev(mean)=%3.2f, mean(sdev)=%3.2f sdev(sdev)=%3.2f\n' % \
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
770 (relName,relCounts[relCode], mm, ms, sm, ss)
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
771 logf.write(s)
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
772
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
773 ### now fake z scores for each subject like abecasis recommends max(|zmu|,|zsd|)
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
774 ### within each group, for each pair, z=(groupmean-pairmean)/groupsd
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
775 available = len(means)
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
776 logf.write('%d pairs are available of %d\n' % (available, tot))
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
777 ### s = '\nOutliers:\nrelationship\tzmean\tzsd\tped1\tped2\tmean\tsd\trmeanmean\trmeansd\trsdmean\trsdsd\n'
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
778 ### logf.write(s)
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
779 pairnum = 0
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
780 offset = 0
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
781 nOutliers = 0
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
782 cexs = []
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
783 outlierRecords = dict([(r, []) for r in range(N_RELATIONSHIP_TYPES)])
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
784 zsdmax = 0
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
785 for s1 in range(nSubjects):
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
786 if s1 in emptyRows:
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
787 continue
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
788 (fid1,iid1,did1,mid1,sex1,aff1,ok1,d_sid1,m_sid1) = scache[s1]
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
789 for s2 in range(s1+1, nSubjects):
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
790 if s2 in emptyRows:
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
791 continue
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
792 if pairnum not in skip:
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
793 ### Get group stats for this relationship
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
794 (fid2,iid2,did2,mid2,sex2,aff2,ok2,d_sid2,m_sid2) = scache[s2]
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
795 try:
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
796 r = rels[offset]
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
797 except IndexError:
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
798 logf.write('###OOPS offset %d available %d pairnum %d len(rels) %d', offset, available, pairnum, len(rels))
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
799 notfound = ('?',('?','0','0'))
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
800 relInfo = REL_LOOKUP.get(r,notfound)
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
801 relName, relColor, relStyle = relInfo
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
802 rmm,rmd = relstats[r]['mean'] # group mean, group meansd alleles shared
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
803 rdm,rdd = relstats[r]['sd'] # group sdmean, group sdsd alleles shared
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
804
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
805 try:
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
806 zsd = (sdevs[offset] - rdm)/rdd # distance from group mean in group sd units
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
807 except:
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
808 zsd = 1
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
809 if abs(zsd) > zsdmax:
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
810 zsdmax = zsd # keep for sort scaling
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
811 try:
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
812 zmean = (means[offset] - rmm)/rmd # distance from group mean
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
813 except:
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
814 zmean = 1
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
815 zmeans[offset] = zmean
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
816 zstds[offset] = zsd
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
817 pid=(s1,s2)
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
818 zrad = max(zsd,zmean)
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
819 if zrad < 4:
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
820 zrad = 2
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
821 elif 4 < zrad < 15:
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
822 zrad = 3 # to 9
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
823 else: # > 15 6=24+
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
824 zrad=zrad/4
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
825 zrad = min(zrad,6) # scale limit
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
826 zrad = max(2,max(zsd,zmean)) # as > 2, z grows
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
827 pair_data[pid] = (zmean,zsd,r,zrad)
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
828 if max(zsd,zmean) > Zcutoff: # is potentially interesting
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
829 mean = means[offset]
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
830 sdev = sdevs[offset]
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
831 outlierRecords[r].append((mean, sdev, zmean, zsd, fid1, iid1, fid2, iid2, rmm, rmd, rdm, rdd,did1,mid1,did2,mid2))
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
832 nOutliers += 1
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
833 tbl.write('%s_%s\t%s_%s\t%f\t%f\t%f\t%f\t%d\t%s\t%s\t%s\t%s\t%s\n' % \
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
834 (fid1, iid1, fid2, iid2, mean, sdev, zmean,zsd, ngeno, relName, did1,mid1,did2,mid2))
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
835 offset += 1
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
836 pairnum += 1
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
837 logf.write( 'Outliers: %s\n' % (nOutliers))
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
838
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
839 ### Write outlier files for each relationship type
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
840 repOut.append('<h2>Outliers in tab delimited files linked above are also listed below</h2>')
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
841 lzsd = round(numpy.log10(zsdmax)) + 1
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
842 scalefactor = 10**lzsd
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
843 for relCode, relInfo in REL_LOOKUP.items():
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
844 relName, _, _ = relInfo
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
845 outliers = outlierRecords[relCode]
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
846 if not outliers:
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
847 continue
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
848 outliers = [(scalefactor*int(abs(x[3]))+ int(abs(x[2])),x) for x in outliers] # decorate
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
849 outliers.sort()
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
850 outliers.reverse() # largest deviation first
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
851 outliers = [x[1] for x in outliers] # undecorate
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
852 nrows = len(outliers)
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
853 truncated = 0
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
854 if nrows > MAX_SHOW_ROWS:
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
855 s = '<h3>%s outlying pairs (top %d of %d) from %s</h3><table border="0" cellpadding="3">' % \
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
856 (relName,MAX_SHOW_ROWS,nrows,title)
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
857 truncated = nrows - MAX_SHOW_ROWS
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
858 else:
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
859 s = '<h3>%s outlying pairs (n=%d) from %s</h3><table border="0" cellpadding="3">' % (relName,nrows,title)
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
860 repOut.append(s)
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
861 fhname = '%s_rgGRR_%s_outliers.xls' % (title, relName)
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
862 fhpath = os.path.join(outdir,fhname)
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
863 fh = open(fhpath, 'w')
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
864 newfiles.append(fhname)
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
865 explanations.append('%s Outlier Pairs %s, N=%d, Cutoff SD=%f' % (relName,title,len(outliers),Zcutoff))
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
866 fh.write(OUTLIERS_HEADER)
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
867 s = ''.join(['<th>%s</th>' % x for x in OUTLIERS_HEADER_list])
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
868 repOut.append('<tr align="center">%s</tr>' % s)
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
869 for n,rec in enumerate(outliers):
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
870 #(mean, sdev, zmean, zsd, fid1, iid1, fid2, iid2, rmm, rmd, rdm, rdd) = rec
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
871 s = '%f\t%f\t%f\t%f\t%s\t%s\t%s\t%s\t%f\t%f\t%f\t%f\t%s\t%s\t%s\t%s\t' % tuple(rec)
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
872 fh.write('%s%s\n' % (s,relName))
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
873 # (mean, sdev, zmean, zsd, fid1, iid1, fid2, iid2, rmm, rmd, rdm, rdd, did1,mid1,did2,mid2))
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
874 s = '''<td>%f</td><td>%f</td><td>%f</td><td>%f</td><td>%s</td><td>%s</td>
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
875 <td>%s</td><td>%s</td><td>%f</td><td>%f</td><td>%f</td><td>%f</td><td>%s</td><td>%s</td><td>%s</td><td>%s</td>''' % tuple(rec)
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
876 s = '%s<td>%s</td>' % (s,relName)
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
877 if n < MAX_SHOW_ROWS:
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
878 repOut.append('<tr align="center">%s</tr>' % s)
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
879 if truncated > 0:
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
880 repOut.append('<H2>WARNING: %d rows truncated - see outlier file for all %d rows</H2>' % (truncated,
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
881 nrows))
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
882 fh.close()
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
883 repOut.append('</table><p>')
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
884
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
885 ### Now, draw the plot in jpeg and svg formats, and optionally in the PDF format
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
886 ### if requested
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
887 logf.write('Plotting ...')
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
888 pointColors = [REL_COLORS[rel] for rel in rels]
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
889 pointStyles = [REL_POINTS[rel] for rel in rels]
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
890
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
891 mainTitle = '%s (%s subjects, %d snp)' % (title, nSubjects, nrsSamples)
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
892 svg.write(SVG_HEADER % (SVG_COLORS[0],SVG_COLORS[1],SVG_COLORS[2],SVG_COLORS[3],SVG_COLORS[4],
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
893 SVG_COLORS[5],SVG_COLORS[6],SVG_COLORS[0],SVG_COLORS[0],SVG_COLORS[1],SVG_COLORS[1],
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
894 SVG_COLORS[2],SVG_COLORS[2],SVG_COLORS[3],SVG_COLORS[3],SVG_COLORS[4],SVG_COLORS[4],
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
895 SVG_COLORS[5],SVG_COLORS[5],SVG_COLORS[6],SVG_COLORS[6],mainTitle))
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
896 #rpy.r.jpeg(filename='%s.jpg' % (title), width=1600, height=1200, pointsize=12, quality=100, bg='white')
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
897 #rpy.r.par(mai=(1,1,1,0.5))
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
898 #rpy.r('par(xaxs="i",yaxs="i")')
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
899 #rpy.r.plot(means, sdevs, main=mainTitle, ylab=Y_AXIS_LABEL, xlab=X_AXIS_LABEL, cex=cexs, col=pointColors, pch=pointStyles, xlim=(0,2), ylim=(0,2))
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
900 #rpy.r.legend(LEGEND_ALIGN, legend=REL_STATES, pch=REL_POINTS, col=REL_COLORS, title=LEGEND_TITLE)
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
901 #rpy.r.grid(nx=10, ny=10, col='lightgray', lty='dotted')
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
902 #rpy.r.dev_off()
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
903
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
904 ### We will now go through each relationship type to partition plot points
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
905 ### into "bulk" and "outlier" groups. Bulk points will represent common
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
906 ### mean/sdev pairs and will cover the majority of the points in the plot --
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
907 ### they will use generic tooltip informtion about all of the pairs
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
908 ### represented by that point. "Outlier" points will be uncommon pairs,
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
909 ### with very specific information in their tooltips. It would be nice to
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
910 ### keep hte total number of plotted points in the SVG representation to
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
911 ### ~10000 (certainly less than 100000?)
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
912 pointMap = {}
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
913 orderedRels = [y[1] for y in reversed(sorted([(relCounts.get(x, 0),x) for x in REL_LOOKUP.keys()]))]
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
914 # do we really want this? I want out of zone points last and big
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
915 for relCode in orderedRels:
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
916 svgColor = SVG_COLORS[relCode]
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
917 relName, relColor, relStyle = REL_LOOKUP[relCode]
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
918 svg.write('<g id="%s" style="stroke:%s; fill:%s; fill-opacity:1.0; stroke-width:1;" cursor="pointer">\n' % (relName, svgColor, svgColor))
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
919 pMap = pointMap.setdefault(relCode, {})
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
920 nPoints = 0
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
921 rpairs=[]
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
922 rgenos=[]
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
923 rmeans=[]
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
924 rsdevs=[]
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
925 rz = []
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
926 for x,rel in enumerate(rels): # all pairs
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
927 if rel == relCode:
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
928 s1,s2 = pairs[x]
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
929 pid=(s1,s2)
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
930 zmean,zsd,r,zrad = pair_data[pid][:4]
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
931 rpairs.append(pairs[x])
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
932 rgenos.append(ngenoL[x])
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
933 rmeans.append(means[x])
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
934 rsdevs.append(sdevs[x])
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
935 rz.append(zrad)
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
936 ### Now add the svg point group for this relationship to the svg file
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
937 for x in range(len(rmeans)):
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
938 svgX = '%d' % ((rmeans[x] - 1.0) * PLOT_WIDTH) # changed so mean scale is 1-2
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
939 svgY = '%d' % (PLOT_HEIGHT - (rsdevs[x] * PLOT_HEIGHT)) # changed so sd scale is 0-1
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
940 s1, s2 = rpairs[x]
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
941 (fid1,uid1,did1,mid1,sex1,phe1,iid1,d_sid1,m_sid1) = scache[s1]
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
942 (fid2,uid2,did2,mid2,sex2,phe2,iid2,d_sid2,m_sid2) = scache[s2]
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
943 ngenos = rgenos[x]
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
944 nPoints += 1
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
945 point = pMap.setdefault((svgX, svgY), [])
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
946 point.append((rmeans[x], rsdevs[x], fid1, iid1, did1, mid1, fid2, iid2, did2, mid2, ngenos,rz[x]))
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
947 for (svgX, svgY) in pMap:
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
948 points = pMap[(svgX, svgY)]
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
949 svgX = int(svgX)
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
950 svgY = int(svgY)
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
951 if len(points) > 1:
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
952 mmean,dmean = calcMeanSD([p[0] for p in points])
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
953 msdev,dsdev = calcMeanSD([p[1] for p in points])
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
954 mgeno,dgeno = calcMeanSD([p[-1] for p in points])
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
955 mingeno = min([p[-1] for p in points])
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
956 maxgeno = max([p[-1] for p in points])
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
957 svg.write("""<circle cx="%d" cy="%d" r="2"
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
958 onmouseover="showBTT(evt, %d, %1.2f, %1.2f, %1.2f, %1.2f, %d, %d, %d, %d, %d)"
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
959 onmouseout="hideBTT(evt)" />\n""" % (svgX, svgY, relCode, mmean, dmean, msdev, dsdev, len(points), mgeno, dgeno, mingeno, maxgeno))
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
960 else:
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
961 mean, sdev, fid1, iid1, did1, mid1, fid2, iid2, did2, mid2, ngenos, zrad = points[0][:12]
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
962 rmean = float(relstats[relCode]['mean'][0])
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
963 rsdev = float(relstats[relCode]['sd'][0])
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
964 if zrad < 4:
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
965 zrad = 2
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
966 elif 4 < zrad < 9:
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
967 zrad = 3 # to 9
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
968 else: # > 9 5=15+
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
969 zrad=zrad/3
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
970 zrad = min(zrad,5) # scale limit
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
971 if zrad <= 3:
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
972 svg.write('<circle cx="%d" cy="%d" r="%s" onmouseover="showOTT(evt, %d, \'%s,%s,%s,%s\', \'%s,%s,%s,%s\', %1.2f, %1.2f, %s, %1.2f, %1.2f)" onmouseout="hideOTT(evt)" />\n' % (svgX, svgY, zrad, relCode, fid1, iid1, did1, mid1, fid2, iid2, did2, mid2, mean, sdev, ngenos, rmean, rsdev))
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
973 else: # highlight pairs a long way from expectation by outlining circle in red
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
974 svg.write("""<circle cx="%d" cy="%d" r="%s" style="stroke:red; fill:%s; fill-opacity:1.0; stroke-width:1;"
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
975 onmouseover="showOTT(evt, %d, \'%s,%s,%s,%s\', \'%s,%s,%s,%s\', %1.2f, %1.2f, %s, %1.2f, %1.2f)"
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
976 onmouseout="hideOTT(evt)" />\n""" % \
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
977 (svgX, svgY, zrad, svgColor, relCode, fid1, iid1, did1, mid1, fid2, iid2, did2, mid2, mean, sdev, ngenos, rmean, rsdev))
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
978 svg.write('</g>\n')
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
979
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
980 ### Create a pdf as well if indicated on the command line
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
981 ### WARNING! for framingham share, with about 50M pairs, this is a 5.5GB pdf!
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
982 ## if pdftoo:
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
983 ## pdfname = '%s.pdf' % (title)
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
984 ## rpy.r.pdf(pdfname, 6, 6)
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
985 ## rpy.r.par(mai=(1,1,1,0.5))
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
986 ## rpy.r('par(xaxs="i",yaxs="i")')
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
987 ## rpy.r.plot(means, sdevs, main='%s, %d snp' % (title, nSamples), ylab=Y_AXIS_LABEL, xlab=X_AXIS_LABEL, cex=cexs, col=pointColors, pch=pointStyles, xlim=(0,2), ylim=(0,2))
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
988 ## rpy.r.legend(LEGEND_ALIGN, legend=REL_STATES, pch=REL_POINTS, col=REL_COLORS, title=LEGEND_TITLE)
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
989 ## rpy.r.grid(nx=10, ny=10, col='lightgray', lty='dotted')
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
990 ## rpy.r.dev_off()
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
991
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
992 ### Draw polygons
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
993 if showPolygons:
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
994 svg.write('<g id="polygons" cursor="pointer">\n')
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
995 for rel, poly in POLYGONS.items():
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
996 points = ' '.join(['%s,%s' % ((p[0]-1.0)*float(PLOT_WIDTH), (PLOT_HEIGHT - p[1]*PLOT_HEIGHT)) for p in poly])
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
997 svg.write('<polygon points="%s" fill="transparent" style="stroke:%s; stroke-width:1"/>\n' % (points, SVG_COLORS[rel]))
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
998 svg.write('</g>\n')
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
999
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
1000
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
1001 svg.write(SVG_FOOTER)
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
1002 svg.close()
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
1003 return newfiles,explanations,repOut
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
1004
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
1005 def doIBS(n=100):
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
1006 """parse parameters from galaxy
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
1007 expect 'input pbed path' 'basename' 'outpath' 'title' 'logpath' 'n'
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
1008 <command interpreter="python">
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
1009 rgGRR.py $i.extra_files_path/$i.metadata.base_name "$i.metadata.base_name"
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
1010 '$out_file1' '$out_file1.files_path' "$title1" '$n' '$Z'
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
1011 </command>
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
1012
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
1013 """
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
1014 u="""<command interpreter="python">
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
1015 rgGRR.py $i.extra_files_path/$i.metadata.base_name "$i.metadata.base_name"
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
1016 '$out_file1' '$out_file1.files_path' "$title1" '$n' '$Z'
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
1017 </command>
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
1018 """
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
1019
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
1020
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
1021 if len(sys.argv) < 7:
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
1022 print >> sys.stdout, 'Need pbed inpath, basename, out_htmlname, outpath, title, logpath, nSNP, Zcutoff on command line please'
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
1023 print >> sys.stdout, u
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
1024 sys.exit(1)
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
1025 ts = '%s%s' % (string.punctuation,string.whitespace)
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
1026 ptran = string.maketrans(ts,'_'*len(ts))
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
1027 inpath = sys.argv[1]
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
1028 ldinpath = os.path.split(inpath)[0]
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
1029 basename = sys.argv[2]
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
1030 outhtml = sys.argv[3]
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
1031 newfilepath = sys.argv[4]
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
1032 title = sys.argv[5].translate(ptran)
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
1033 logfname = 'Log_%s.txt' % title
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
1034 logpath = os.path.join(newfilepath,logfname) # log was a child - make part of html extra_files_path zoo
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
1035 n = int(sys.argv[6])
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
1036 try:
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
1037 Zcutoff = float(sys.argv[7])
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
1038 except:
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
1039 Zcutoff = 2.0
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
1040 try:
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
1041 os.makedirs(newfilepath)
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
1042 except:
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
1043 pass
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
1044 logf = file(logpath,'w')
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
1045 efp,ibase_name = os.path.split(inpath) # need to use these for outputs in files_path
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
1046 ped = plinkbinJZ.BPed(inpath)
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
1047 ped.parse(quick=True)
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
1048 if ped == None:
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
1049 print >> sys.stderr, '## doIBSpy problem - cannot open %s or %s - cannot run' % (ldreduced,basename)
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
1050 sys.exit(1)
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
1051 newfiles,explanations,repOut = doIBSpy(ped=ped,basename=basename,outdir=newfilepath,
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
1052 logf=logf,nrsSamples=n,title=title,pdftoo=0,Zcutoff=Zcutoff)
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
1053 logf.close()
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
1054 logfs = file(logpath,'r').readlines()
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
1055 lf = file(outhtml,'w')
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
1056 lf.write(galhtmlprefix % PROGNAME)
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
1057 # this is a mess. todo clean up - should each datatype have it's own directory? Yes
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
1058 # probably. Then titles are universal - but userId libraries are separate.
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
1059 s = '<div>Output from %s run at %s<br>\n' % (PROGNAME,timenow())
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
1060 lf.write('<h4>%s</h4>\n' % s)
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
1061 fixed = ["'%s'" % x for x in sys.argv] # add quotes just in case
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
1062 s = 'If you need to rerun this analysis, the command line was\n<pre>%s</pre>\n</div>' % (' '.join(fixed))
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
1063 lf.write(s)
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
1064 # various ways of displaying svg - experiments related to missing svg mimetype on test (!)
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
1065 #s = """<object data="%s" type="image/svg+xml" width="%d" height="%d">
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
1066 # <embed src="%s" type="image/svg+xml" width="%d" height="%d" />
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
1067 # </object>""" % (newfiles[0],PLOT_WIDTH,PLOT_HEIGHT,newfiles[0],PLOT_WIDTH,PLOT_HEIGHT)
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
1068 s = """ <embed src="%s" type="image/svg+xml" width="%d" height="%d" />""" % (newfiles[0],PLOT_WIDTH,PLOT_HEIGHT)
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
1069 #s = """ <iframe src="%s" type="image/svg+xml" width="%d" height="%d" />""" % (newfiles[0],PLOT_WIDTH,PLOT_HEIGHT)
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
1070 lf.write(s)
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
1071 lf.write('<div><h4>Click the links below to save output files and plots</h4><br><ol>\n')
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
1072 for i in range(len(newfiles)):
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
1073 if i == 0:
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
1074 lf.write('<li><a href="%s" type="image/svg+xml" >%s</a></li>\n' % (newfiles[i],explanations[i]))
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
1075 else:
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
1076 lf.write('<li><a href="%s">%s</a></li>\n' % (newfiles[i],explanations[i]))
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
1077 flist = os.listdir(newfilepath)
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
1078 for fname in flist:
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
1079 if not fname in newfiles:
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
1080 lf.write('<li><a href="%s">%s</a></li>\n' % (fname,fname))
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
1081 lf.write('</ol></div>')
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
1082 lf.write('<div>%s</div>' % ('\n'.join(repOut))) # repOut is a list of tables
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
1083 lf.write('<div><hr><h3>Log from this job (also stored in %s)</h3><pre>%s</pre><hr></div>' % (logfname,''.join(logfs)))
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
1084 lf.write('</body></html>\n')
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
1085 lf.close()
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
1086 logf.close()
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
1087
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
1088 if __name__ == '__main__':
9071e359b9a3 Uploaded
xuebing
parents:
diff changeset
1089 doIBS()