annotate bam_to_bigwig/bam_to_wiggle.py @ 1:0ff100a057ef

Migrated tool version 0.0.2 from old tool shed archive to new tool shed repository
author brad-chapman
date Tue, 07 Jun 2011 16:26:46 -0400
parents d2c1af657010
children
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
rev   line source
0
d2c1af657010 Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
brad-chapman
parents:
diff changeset
1 #!/usr/bin/env python
d2c1af657010 Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
brad-chapman
parents:
diff changeset
2 """Convert BAM files to BigWig file format in a specified region.
d2c1af657010 Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
brad-chapman
parents:
diff changeset
3
d2c1af657010 Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
brad-chapman
parents:
diff changeset
4 Usage:
d2c1af657010 Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
brad-chapman
parents:
diff changeset
5 bam_to_wiggle.py <BAM file> [<YAML config>]
d2c1af657010 Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
brad-chapman
parents:
diff changeset
6 [--outfile=<output file name>
d2c1af657010 Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
brad-chapman
parents:
diff changeset
7 --chrom=<chrom>
d2c1af657010 Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
brad-chapman
parents:
diff changeset
8 --start=<start>
d2c1af657010 Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
brad-chapman
parents:
diff changeset
9 --end=<end>]
d2c1af657010 Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
brad-chapman
parents:
diff changeset
10
d2c1af657010 Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
brad-chapman
parents:
diff changeset
11 chrom start and end are optional, in which case they default to everything.
d2c1af657010 Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
brad-chapman
parents:
diff changeset
12
d2c1af657010 Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
brad-chapman
parents:
diff changeset
13 The config file is in YAML format and specifies the location of the wigToBigWig
d2c1af657010 Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
brad-chapman
parents:
diff changeset
14 program from UCSC:
d2c1af657010 Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
brad-chapman
parents:
diff changeset
15
d2c1af657010 Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
brad-chapman
parents:
diff changeset
16 program:
d2c1af657010 Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
brad-chapman
parents:
diff changeset
17 ucsc_bigwig: wigToBigWig
d2c1af657010 Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
brad-chapman
parents:
diff changeset
18
d2c1af657010 Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
brad-chapman
parents:
diff changeset
19 If not specified, these will be assumed to be present in the system path.
d2c1af657010 Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
brad-chapman
parents:
diff changeset
20
d2c1af657010 Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
brad-chapman
parents:
diff changeset
21 The script requires:
d2c1af657010 Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
brad-chapman
parents:
diff changeset
22 pysam (http://code.google.com/p/pysam/)
d2c1af657010 Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
brad-chapman
parents:
diff changeset
23 wigToBigWig from UCSC (http://hgdownload.cse.ucsc.edu/admin/exe/)
1
0ff100a057ef Migrated tool version 0.0.2 from old tool shed archive to new tool shed repository
brad-chapman
parents: 0
diff changeset
24 If a configuration file is used, then PyYAML is also required (http://pyyaml.org/)
0
d2c1af657010 Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
brad-chapman
parents:
diff changeset
25 """
d2c1af657010 Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
brad-chapman
parents:
diff changeset
26 import os
d2c1af657010 Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
brad-chapman
parents:
diff changeset
27 import sys
d2c1af657010 Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
brad-chapman
parents:
diff changeset
28 import subprocess
1
0ff100a057ef Migrated tool version 0.0.2 from old tool shed archive to new tool shed repository
brad-chapman
parents: 0
diff changeset
29 import tempfile
0
d2c1af657010 Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
brad-chapman
parents:
diff changeset
30 from optparse import OptionParser
1
0ff100a057ef Migrated tool version 0.0.2 from old tool shed archive to new tool shed repository
brad-chapman
parents: 0
diff changeset
31 from contextlib import contextmanager, closing
0
d2c1af657010 Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
brad-chapman
parents:
diff changeset
32
d2c1af657010 Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
brad-chapman
parents:
diff changeset
33 import pysam
d2c1af657010 Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
brad-chapman
parents:
diff changeset
34
d2c1af657010 Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
brad-chapman
parents:
diff changeset
35 def main(bam_file, config_file=None, chrom='all', start=0, end=None,
d2c1af657010 Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
brad-chapman
parents:
diff changeset
36 outfile=None):
d2c1af657010 Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
brad-chapman
parents:
diff changeset
37 if config_file:
1
0ff100a057ef Migrated tool version 0.0.2 from old tool shed archive to new tool shed repository
brad-chapman
parents: 0
diff changeset
38 import yaml
0
d2c1af657010 Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
brad-chapman
parents:
diff changeset
39 with open(config_file) as in_handle:
d2c1af657010 Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
brad-chapman
parents:
diff changeset
40 config = yaml.load(in_handle)
d2c1af657010 Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
brad-chapman
parents:
diff changeset
41 else:
d2c1af657010 Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
brad-chapman
parents:
diff changeset
42 config = {"program": {"ucsc_bigwig" : "wigToBigWig"}}
d2c1af657010 Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
brad-chapman
parents:
diff changeset
43 if outfile is None:
d2c1af657010 Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
brad-chapman
parents:
diff changeset
44 outfile = "%s.bigwig" % os.path.splitext(bam_file)[0]
d2c1af657010 Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
brad-chapman
parents:
diff changeset
45 if start > 0:
d2c1af657010 Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
brad-chapman
parents:
diff changeset
46 start = int(start) - 1
d2c1af657010 Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
brad-chapman
parents:
diff changeset
47 if end is not None:
d2c1af657010 Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
brad-chapman
parents:
diff changeset
48 end = int(end)
d2c1af657010 Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
brad-chapman
parents:
diff changeset
49 regions = [(chrom, start, end)]
1
0ff100a057ef Migrated tool version 0.0.2 from old tool shed archive to new tool shed repository
brad-chapman
parents: 0
diff changeset
50 if os.path.abspath(bam_file) == os.path.abspath(outfile):
0ff100a057ef Migrated tool version 0.0.2 from old tool shed archive to new tool shed repository
brad-chapman
parents: 0
diff changeset
51 sys.stderr.write("Bad arguments, input and output files are the same.\n")
0ff100a057ef Migrated tool version 0.0.2 from old tool shed archive to new tool shed repository
brad-chapman
parents: 0
diff changeset
52 sys.exit(1)
0ff100a057ef Migrated tool version 0.0.2 from old tool shed archive to new tool shed repository
brad-chapman
parents: 0
diff changeset
53 if not (os.path.exists(outfile) and os.path.getsize(outfile) > 0):
0ff100a057ef Migrated tool version 0.0.2 from old tool shed archive to new tool shed repository
brad-chapman
parents: 0
diff changeset
54 #Use a temp file to avoid any possiblity of not having write permission
0ff100a057ef Migrated tool version 0.0.2 from old tool shed archive to new tool shed repository
brad-chapman
parents: 0
diff changeset
55 out_handle = tempfile.NamedTemporaryFile(delete=False)
0ff100a057ef Migrated tool version 0.0.2 from old tool shed archive to new tool shed repository
brad-chapman
parents: 0
diff changeset
56 wig_file = out_handle.name
0ff100a057ef Migrated tool version 0.0.2 from old tool shed archive to new tool shed repository
brad-chapman
parents: 0
diff changeset
57 with closing(out_handle):
0
d2c1af657010 Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
brad-chapman
parents:
diff changeset
58 chr_sizes, wig_valid = write_bam_track(bam_file, regions, config, out_handle)
d2c1af657010 Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
brad-chapman
parents:
diff changeset
59 try:
d2c1af657010 Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
brad-chapman
parents:
diff changeset
60 if wig_valid:
d2c1af657010 Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
brad-chapman
parents:
diff changeset
61 convert_to_bigwig(wig_file, chr_sizes, config, outfile)
d2c1af657010 Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
brad-chapman
parents:
diff changeset
62 finally:
d2c1af657010 Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
brad-chapman
parents:
diff changeset
63 os.remove(wig_file)
d2c1af657010 Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
brad-chapman
parents:
diff changeset
64
d2c1af657010 Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
brad-chapman
parents:
diff changeset
65 @contextmanager
d2c1af657010 Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
brad-chapman
parents:
diff changeset
66 def indexed_bam(bam_file, config):
d2c1af657010 Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
brad-chapman
parents:
diff changeset
67 if not os.path.exists(bam_file + ".bai"):
d2c1af657010 Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
brad-chapman
parents:
diff changeset
68 pysam.index(bam_file)
d2c1af657010 Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
brad-chapman
parents:
diff changeset
69 sam_reader = pysam.Samfile(bam_file, "rb")
d2c1af657010 Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
brad-chapman
parents:
diff changeset
70 yield sam_reader
d2c1af657010 Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
brad-chapman
parents:
diff changeset
71 sam_reader.close()
d2c1af657010 Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
brad-chapman
parents:
diff changeset
72
d2c1af657010 Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
brad-chapman
parents:
diff changeset
73 def write_bam_track(bam_file, regions, config, out_handle):
d2c1af657010 Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
brad-chapman
parents:
diff changeset
74 out_handle.write("track %s\n" % " ".join(["type=wiggle_0",
d2c1af657010 Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
brad-chapman
parents:
diff changeset
75 "name=%s" % os.path.splitext(os.path.split(bam_file)[-1])[0],
d2c1af657010 Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
brad-chapman
parents:
diff changeset
76 "visibility=full",
d2c1af657010 Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
brad-chapman
parents:
diff changeset
77 ]))
d2c1af657010 Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
brad-chapman
parents:
diff changeset
78 is_valid = False
d2c1af657010 Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
brad-chapman
parents:
diff changeset
79 with indexed_bam(bam_file, config) as work_bam:
1
0ff100a057ef Migrated tool version 0.0.2 from old tool shed archive to new tool shed repository
brad-chapman
parents: 0
diff changeset
80 sizes = zip(work_bam.references, work_bam.lengths)
0
d2c1af657010 Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
brad-chapman
parents:
diff changeset
81 if len(regions) == 1 and regions[0][0] == "all":
1
0ff100a057ef Migrated tool version 0.0.2 from old tool shed archive to new tool shed repository
brad-chapman
parents: 0
diff changeset
82 regions = [(name, 0, length) for name, length in sizes]
0
d2c1af657010 Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
brad-chapman
parents:
diff changeset
83 for chrom, start, end in regions:
1
0ff100a057ef Migrated tool version 0.0.2 from old tool shed archive to new tool shed repository
brad-chapman
parents: 0
diff changeset
84 if end is None and chrom in work_bam.references:
0ff100a057ef Migrated tool version 0.0.2 from old tool shed archive to new tool shed repository
brad-chapman
parents: 0
diff changeset
85 end = work_bam.lengths[work_bam.references.index(chrom)]
0
d2c1af657010 Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
brad-chapman
parents:
diff changeset
86 assert end is not None, "Could not find %s in header" % chrom
d2c1af657010 Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
brad-chapman
parents:
diff changeset
87 out_handle.write("variableStep chrom=%s\n" % chrom)
d2c1af657010 Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
brad-chapman
parents:
diff changeset
88 for col in work_bam.pileup(chrom, start, end):
d2c1af657010 Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
brad-chapman
parents:
diff changeset
89 out_handle.write("%s %s\n" % (col.pos+1, col.n))
d2c1af657010 Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
brad-chapman
parents:
diff changeset
90 is_valid = True
d2c1af657010 Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
brad-chapman
parents:
diff changeset
91 return sizes, is_valid
d2c1af657010 Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
brad-chapman
parents:
diff changeset
92
d2c1af657010 Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
brad-chapman
parents:
diff changeset
93 def convert_to_bigwig(wig_file, chr_sizes, config, bw_file=None):
d2c1af657010 Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
brad-chapman
parents:
diff changeset
94 if not bw_file:
d2c1af657010 Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
brad-chapman
parents:
diff changeset
95 bw_file = "%s.bigwig" % (os.path.splitext(wig_file)[0])
d2c1af657010 Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
brad-chapman
parents:
diff changeset
96 size_file = "%s-sizes.txt" % (os.path.splitext(wig_file)[0])
d2c1af657010 Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
brad-chapman
parents:
diff changeset
97 with open(size_file, "w") as out_handle:
d2c1af657010 Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
brad-chapman
parents:
diff changeset
98 for chrom, size in chr_sizes:
d2c1af657010 Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
brad-chapman
parents:
diff changeset
99 out_handle.write("%s\t%s\n" % (chrom, size))
d2c1af657010 Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
brad-chapman
parents:
diff changeset
100 try:
d2c1af657010 Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
brad-chapman
parents:
diff changeset
101 cl = [config["program"]["ucsc_bigwig"], wig_file, size_file, bw_file]
d2c1af657010 Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
brad-chapman
parents:
diff changeset
102 subprocess.check_call(cl)
d2c1af657010 Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
brad-chapman
parents:
diff changeset
103 finally:
d2c1af657010 Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
brad-chapman
parents:
diff changeset
104 os.remove(size_file)
d2c1af657010 Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
brad-chapman
parents:
diff changeset
105 return bw_file
d2c1af657010 Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
brad-chapman
parents:
diff changeset
106
d2c1af657010 Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
brad-chapman
parents:
diff changeset
107 if __name__ == "__main__":
d2c1af657010 Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
brad-chapman
parents:
diff changeset
108 parser = OptionParser()
d2c1af657010 Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
brad-chapman
parents:
diff changeset
109 parser.add_option("-o", "--outfile", dest="outfile")
d2c1af657010 Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
brad-chapman
parents:
diff changeset
110 parser.add_option("-c", "--chrom", dest="chrom")
d2c1af657010 Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
brad-chapman
parents:
diff changeset
111 parser.add_option("-s", "--start", dest="start")
d2c1af657010 Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
brad-chapman
parents:
diff changeset
112 parser.add_option("-e", "--end", dest="end")
d2c1af657010 Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
brad-chapman
parents:
diff changeset
113 (options, args) = parser.parse_args()
d2c1af657010 Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
brad-chapman
parents:
diff changeset
114 if len(args) not in [1, 2]:
d2c1af657010 Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
brad-chapman
parents:
diff changeset
115 print "Incorrect arguments"
d2c1af657010 Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
brad-chapman
parents:
diff changeset
116 print __doc__
d2c1af657010 Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
brad-chapman
parents:
diff changeset
117 sys.exit()
d2c1af657010 Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
brad-chapman
parents:
diff changeset
118 kwargs = dict(
d2c1af657010 Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
brad-chapman
parents:
diff changeset
119 outfile=options.outfile,
d2c1af657010 Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
brad-chapman
parents:
diff changeset
120 chrom=options.chrom or 'all',
d2c1af657010 Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
brad-chapman
parents:
diff changeset
121 start=options.start or 0,
d2c1af657010 Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
brad-chapman
parents:
diff changeset
122 end=options.end)
d2c1af657010 Migrated tool version 0.0.1 from old tool shed archive to new tool shed repository
brad-chapman
parents:
diff changeset
123 main(*args, **kwargs)