annotate defuse_trinity_analysis.py @ 13:3a4876d01c7e draft
Uploaded
author | jjohnson |
---|---|
date | Tue, 12 Dec 2017 10:01:42 -0500 |
parents | b22f8634ff84 |
children |
rev | line source |
---|---|
11
b22f8634ff84
planemo upload for repository https://github.com/jj-umn/galaxytools/tree/master/defuse commit 23b94b5747c6956360cd2eca0a07a669929ea141-dirty
jjohnson
parents:
diff
changeset
|
#!/usr/bin/env python
"""
#
#------------------------------------------------------------------------------
# University of Minnesota
# Copyright 2014, Regents of the University of Minnesota
#------------------------------------------------------------------------------
# Author:
#
# James E Johnson
#
#------------------------------------------------------------------------------
"""


"""
This tool takes the deFuse results.tsv tab-delimited file and Trinity output,
and creates a tabular report.

Would it be possible to create 2 additional files from the deFuse-Trinity comparison program?
One containing all the Trinity records matched to deFuse records (with the deFuse ID number),
and the other with the ORF records matching back to the Trinity records in the first file?

M045_Report.csv
"","deFuse_subset.count","deFuse.gene_name1","deFuse.gene_name2","deFuse.span_count","deFuse.probability","deFuse.gene_chromosome1","deFuse.gene_location1","deFuse.gene_chromosome2","deFuse.gene_location2","deFuse_subset.type"
"1",1,"Rps6","Dennd4c",7,0.814853504,"4","coding","4","coding","TIC "



OS03_Matched_Rev.csv
"count","gene1","gene2","breakpoint","fusion","Trinity_transcript_ID","Trinity_transcript","ID1","protein"

"","deFuse.splitr_sequence","deFuse.gene_chromosome1","deFuse.gene_chromosome2","deFuse.gene_location1","deFuse.gene_location2","deFuse.gene_name1","deFuse.gene_name2","deFuse.span_count","deFuse.probability","word1","word2","fusion_part_1","fusion_part_2","fusion_point","fusion_point_rc","count","transcript"

"""

import sys,re,os.path,math
import textwrap
import optparse
from optparse import OptionParser

revcompl = lambda x: ''.join([{'A':'T','C':'G','G':'C','T':'A','a':'t','c':'g','g':'c','t':'a','N':'N','n':'n'}[B] for B in x][::-1])
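As a usage illustration of the reverse-complement helper above, here is a minimal standalone sketch. The name `reverse_complement` and the sample sequence are illustrative assumptions, not part of the tool. The lookup table mirrors the one in `revcompl`, so any character outside A, C, G, T, N (either case) raises a `KeyError`.

```python
# Illustrative sketch only: `reverse_complement` is a hypothetical name,
# written to mirror the behaviour of the revcompl lambda above.
COMPLEMENT = {'A': 'T', 'C': 'G', 'G': 'C', 'T': 'A', 'N': 'N',
              'a': 't', 'c': 'g', 'g': 'c', 't': 'a', 'n': 'n'}


def reverse_complement(seq):
    """Complement each base and return the result in reverse order."""
    return ''.join(COMPLEMENT[base] for base in reversed(seq))


print(reverse_complement("ATGCn"))  # nGCAT
```

Because unknown characters raise rather than pass through silently, callers that may see ambiguity codes need to catch the error, which is what `test_rcomplement` further down does.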

codon_map = {"UUU":"F", "UUC":"F", "UUA":"L", "UUG":"L",
             "UCU":"S", "UCC":"S", "UCA":"S", "UCG":"S",
             "UAU":"Y", "UAC":"Y", "UAA":"*", "UAG":"*",
             "UGU":"C", "UGC":"C", "UGA":"*", "UGG":"W",
             "CUU":"L", "CUC":"L", "CUA":"L", "CUG":"L",
             "CCU":"P", "CCC":"P", "CCA":"P", "CCG":"P",
             "CAU":"H", "CAC":"H", "CAA":"Q", "CAG":"Q",
             "CGU":"R", "CGC":"R", "CGA":"R", "CGG":"R",
             "AUU":"I", "AUC":"I", "AUA":"I", "AUG":"M",
             "ACU":"T", "ACC":"T", "ACA":"T", "ACG":"T",
             "AAU":"N", "AAC":"N", "AAA":"K", "AAG":"K",
             "AGU":"S", "AGC":"S", "AGA":"R", "AGG":"R",
             "GUU":"V", "GUC":"V", "GUA":"V", "GUG":"V",
             "GCU":"A", "GCC":"A", "GCA":"A", "GCG":"A",
             "GAU":"D", "GAC":"D", "GAA":"E", "GAG":"E",
             "GGU":"G", "GGC":"G", "GGA":"G", "GGG":"G",}

def translate(seq) :
    # Translate a DNA or RNA sequence codon by codon, starting at offset 0.
    # Codons not found in codon_map (for example those containing N) become 'X'.
    rna = seq.upper().replace('T','U')
    aa = []
    for i in range(0,len(rna) - 2, 3):
        codon = rna[i:i+3]
        aa.append(codon_map.get(codon, 'X'))
    return ''.join(aa)
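The translation step above can be sketched as a standalone example. The `frame` parameter and the names `translate_frame`, `BASES`, `AMINO_ACIDS`, and `CODON_TABLE` are assumptions added for illustration; the original `translate` always starts at offset 0. The table is generated from the standard genetic code in UCAG order, which yields the same 64 entries as the `codon_map` literal.

```python
from itertools import product

# Standard genetic code, with codons enumerated in UCAG order.
# Hypothetical names; the entries match the codon_map literal above.
BASES = "UCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {''.join(codon): aa
               for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def translate_frame(seq, frame=0):
    """Translate seq starting at offset `frame` (assumed to be 0, 1, or 2).

    Unknown codons (for example ones containing N) are reported as 'X',
    and a trailing partial codon is ignored.
    """
    rna = seq.upper().replace('T', 'U')[frame:]
    return ''.join(CODON_TABLE.get(rna[i:i + 3], 'X')
                   for i in range(0, len(rna) - 2, 3))


# Show all three forward reading frames of a made-up sequence.
for frame in range(3):
    print(frame, translate_frame("ATGGCCTAAGG", frame))
```

Translating the reverse strand would amount to passing the reverse complement of the sequence through the same function.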

def get_stop_codons(seq) :
    rna = seq.upper().replace('T','U')
    stop_codons = []
    for i in range(0,len(rna) - 2, 3):
        codon = rna[i:i+3]
        aa = codon_map[codon] if codon in codon_map else 'X'
        if aa == '*':
            stop_codons.append(codon)
    return stop_codons

def read_fasta(fp):
    name, seq = None, []
    for line in fp:
        line = line.rstrip()
        if line.startswith(">"):
            if name: yield (name, ''.join(seq))
            name, seq = line, []
        else:
            seq.append(line)
    if name: yield (name, ''.join(seq))
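A short usage sketch of the FASTA generator above, reading from an in-memory file. The sample records are made up for illustration, and the function is repeated here only so the snippet runs on its own. The yielded name is the full header line, including the leading `>`.

```python
import io


def read_fasta(fp):
    """Yield (header, sequence) pairs from an open FASTA file object."""
    name, seq = None, []
    for line in fp:
        line = line.rstrip()
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if name:
                yield (name, ''.join(seq))
            name, seq = line, []
        else:
            seq.append(line)
    # Emit the final record once the input is exhausted.
    if name:
        yield (name, ''.join(seq))


# Hypothetical two-record input; multi-line sequences are concatenated.
sample = io.StringIO(">contig1 len=6\nACG\nTTT\n>contig2\nGG\n")
records = list(read_fasta(sample))
for header, sequence in records:
    print(header, len(sequence))
```

Since the function is a generator, a large FASTA file can be scanned one record at a time without loading it fully into memory.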


def test_rcomplement(seq, target):
    # Return True if the reverse complement of seq occurs in target.
    try:
        return revcompl(seq) in target
    except (KeyError, TypeError):
        # Unknown base in seq, or seq/target is not a string.
        return False

def test_reverse(seq,target):
    # Relies on a module-level `options` object with a test_reverse attribute.
    return options.test_reverse and seq and seq[::-1] in target

def cmp_alphanumeric(s1,s2):
    # Compare strings chunk by chunk: digit runs numerically, letter runs lexically.
    if s1 == s2:
        return 0
    a1 = re.findall(r"\d+|[a-zA-Z]+",s1)
    a2 = re.findall(r"\d+|[a-zA-Z]+",s2)
    for i in range(min(len(a1),len(a2))):
        if a1[i] == a2[i]:
            continue
        if a1[i].isdigit() and a2[i].isdigit():
            return int(a1[i]) - int(a2[i])
        return 1 if a1[i] > a2[i] else -1
    return len(a1) - len(a2)
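One way to use a comparison function like the one above for sorting is `functools.cmp_to_key`. The sketch below is an assumption about usage, not a description of how the tool itself calls it; the chromosome-style names are made up, and the comparator is repeated so the snippet is self-contained.

```python
import re
from functools import cmp_to_key


def cmp_alphanumeric(s1, s2):
    """Compare digit runs numerically and letter runs lexically."""
    if s1 == s2:
        return 0
    a1 = re.findall(r"\d+|[a-zA-Z]+", s1)
    a2 = re.findall(r"\d+|[a-zA-Z]+", s2)
    for i in range(min(len(a1), len(a2))):
        if a1[i] == a2[i]:
            continue
        if a1[i].isdigit() and a2[i].isdigit():
            return int(a1[i]) - int(a2[i])
        return 1 if a1[i] > a2[i] else -1
    return len(a1) - len(a2)


# Hypothetical input: plain string sorting would place "chr10" before "chr2".
names = ["chr10", "chr2", "chrX", "chr1"]
ordered = sorted(names, key=cmp_to_key(cmp_alphanumeric))
print(ordered)  # ['chr1', 'chr2', 'chr10', 'chrX']
```

`cmp_to_key` wraps the two-argument comparator as a key function, so the same comparator works with `sorted` and `list.sort` wherever a `key` argument is accepted.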
b22f8634ff84
planemo upload for repository https://github.com/jj-umn/galaxytools/tree/master/defuse commit 23b94b5747c6956360cd2eca0a07a669929ea141-dirty
jjohnson
parents:
diff
changeset
|
114 |
b22f8634ff84
planemo upload for repository https://github.com/jj-umn/galaxytools/tree/master/defuse commit 23b94b5747c6956360cd2eca0a07a669929ea141-dirty
jjohnson
parents:
diff
changeset
|
115 def parse_defuse_results(inputFile): |
b22f8634ff84
planemo upload for repository https://github.com/jj-umn/galaxytools/tree/master/defuse commit 23b94b5747c6956360cd2eca0a07a669929ea141-dirty
jjohnson
parents:
diff
changeset
|
116 defuse_results = [] |
b22f8634ff84
planemo upload for repository https://github.com/jj-umn/galaxytools/tree/master/defuse commit 23b94b5747c6956360cd2eca0a07a669929ea141-dirty
jjohnson
parents:
diff
changeset
|
117 columns = [] |
b22f8634ff84
planemo upload for repository https://github.com/jj-umn/galaxytools/tree/master/defuse commit 23b94b5747c6956360cd2eca0a07a669929ea141-dirty
jjohnson
parents:
diff
changeset
|
118 coltype_int = ['expression1', 'expression2', 'gene_start1', 'gene_start2', 'gene_end1', 'gene_end2', 'genomic_break_pos1', 'genomic_break_pos2', 'breakpoint_homology', 'span_count', 'splitr_count', 'splice_score'] |
b22f8634ff84
planemo upload for repository https://github.com/jj-umn/galaxytools/tree/master/defuse commit 23b94b5747c6956360cd2eca0a07a669929ea141-dirty
jjohnson
parents:
diff
changeset
|
119 coltype_float = ['probability'] |
b22f8634ff84
planemo upload for repository https://github.com/jj-umn/galaxytools/tree/master/defuse commit 23b94b5747c6956360cd2eca0a07a669929ea141-dirty
jjohnson
parents:
diff
changeset
|
120 coltype_yn = [ 'orf', 'exonboundaries', 'read_through', 'interchromosomal', 'adjacent', 'altsplice', 'deletion', 'eversion', 'inversion'] |
b22f8634ff84
planemo upload for repository https://github.com/jj-umn/galaxytools/tree/master/defuse commit 23b94b5747c6956360cd2eca0a07a669929ea141-dirty
jjohnson
parents:
diff
changeset
|
121 try: |
b22f8634ff84
planemo upload for repository https://github.com/jj-umn/galaxytools/tree/master/defuse commit 23b94b5747c6956360cd2eca0a07a669929ea141-dirty
jjohnson
parents:
diff
changeset
|
122 for linenum,line in enumerate(inputFile): |
b22f8634ff84
planemo upload for repository https://github.com/jj-umn/galaxytools/tree/master/defuse commit 23b94b5747c6956360cd2eca0a07a669929ea141-dirty
jjohnson
parents:
diff
changeset
|
123 ## print >> sys.stderr, "%d: %s\n" % (linenum,line) |
b22f8634ff84
planemo upload for repository https://github.com/jj-umn/galaxytools/tree/master/defuse commit 23b94b5747c6956360cd2eca0a07a669929ea141-dirty
jjohnson
parents:
diff
changeset
|
124 fields = line.strip().split('\t') |
b22f8634ff84
planemo upload for repository https://github.com/jj-umn/galaxytools/tree/master/defuse commit 23b94b5747c6956360cd2eca0a07a669929ea141-dirty
jjohnson
parents:
diff
changeset
|
125 if line.startswith('cluster_id'): |
b22f8634ff84
planemo upload for repository https://github.com/jj-umn/galaxytools/tree/master/defuse commit 23b94b5747c6956360cd2eca0a07a669929ea141-dirty
jjohnson
parents:
diff
changeset
|
126 columns = fields |
b22f8634ff84
planemo upload for repository https://github.com/jj-umn/galaxytools/tree/master/defuse commit 23b94b5747c6956360cd2eca0a07a669929ea141-dirty
jjohnson
parents:
diff
changeset
|
127 ## print >> sys.stderr, "columns: %s\n" % columns |
b22f8634ff84
planemo upload for repository https://github.com/jj-umn/galaxytools/tree/master/defuse commit 23b94b5747c6956360cd2eca0a07a669929ea141-dirty
jjohnson
parents:
diff
changeset
|
128 continue |
b22f8634ff84
planemo upload for repository https://github.com/jj-umn/galaxytools/tree/master/defuse commit 23b94b5747c6956360cd2eca0a07a669929ea141-dirty
jjohnson
parents:
diff
changeset
|
129 elif fields and len(fields) == len(columns): |
b22f8634ff84
planemo upload for repository https://github.com/jj-umn/galaxytools/tree/master/defuse commit 23b94b5747c6956360cd2eca0a07a669929ea141-dirty
jjohnson
parents:
diff
changeset
|
130 cluster_id = fields[columns.index('cluster_id')] |
b22f8634ff84
planemo upload for repository https://github.com/jj-umn/galaxytools/tree/master/defuse commit 23b94b5747c6956360cd2eca0a07a669929ea141-dirty
jjohnson
parents:
diff
changeset
|
131 cluster = dict() |
b22f8634ff84
planemo upload for repository https://github.com/jj-umn/galaxytools/tree/master/defuse commit 23b94b5747c6956360cd2eca0a07a669929ea141-dirty
jjohnson
parents:
diff
changeset
|
132 flags = [] |
b22f8634ff84
planemo upload for repository https://github.com/jj-umn/galaxytools/tree/master/defuse commit 23b94b5747c6956360cd2eca0a07a669929ea141-dirty
jjohnson
parents:
diff
changeset
|
133 defuse_results.append(cluster) |
b22f8634ff84
planemo upload for repository https://github.com/jj-umn/galaxytools/tree/master/defuse commit 23b94b5747c6956360cd2eca0a07a669929ea141-dirty
jjohnson
parents:
diff
changeset
|
134 for i,v in enumerate(columns): |
b22f8634ff84
planemo upload for repository https://github.com/jj-umn/galaxytools/tree/master/defuse commit 23b94b5747c6956360cd2eca0a07a669929ea141-dirty
jjohnson
parents:
diff
changeset
|
135 if v in coltype_int: |
b22f8634ff84
planemo upload for repository https://github.com/jj-umn/galaxytools/tree/master/defuse commit 23b94b5747c6956360cd2eca0a07a669929ea141-dirty
jjohnson
parents:
diff
changeset
|
136 cluster[v] = int(fields[i]) |
b22f8634ff84
planemo upload for repository https://github.com/jj-umn/galaxytools/tree/master/defuse commit 23b94b5747c6956360cd2eca0a07a669929ea141-dirty
jjohnson
parents:
diff
changeset
|
137 elif v in coltype_float: |
b22f8634ff84
planemo upload for repository https://github.com/jj-umn/galaxytools/tree/master/defuse commit 23b94b5747c6956360cd2eca0a07a669929ea141-dirty
jjohnson
parents:
diff
changeset
|
138 cluster[v] = float(fields[i]) |
b22f8634ff84
planemo upload for repository https://github.com/jj-umn/galaxytools/tree/master/defuse commit 23b94b5747c6956360cd2eca0a07a669929ea141-dirty
jjohnson
parents:
diff
changeset
|
139 elif v in coltype_yn: |
140 cluster[v] = fields[i] == 'Y' |
141 if cluster[v]: |
142 flags.append(columns[i]) |
143 else: |
144 cluster[v] = fields[i] |
145 cluster['flags'] = ','.join(flags) |
146 except Exception as e: |
147 sys.stderr.write("failed to read cluster_dict: %s\n" % e) |
148 sys.exit(1) |
149 return defuse_results |
150 |
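The per-column type conversion above can be exercised on a tiny in-memory table. The sketch below restates that logic in a self-contained form; it is not the tool's own function. The name `parse_rows` and the three small column sets are assumptions chosen for illustration (the script's real `coltype_int`, `coltype_float` and `coltype_yn` lists are defined earlier in the file).

```python
import io

# Illustrative column sets (an assumed subset; the script defines the full lists).
COLTYPE_INT = {'cluster_id', 'span_count'}
COLTYPE_FLOAT = {'probability'}
COLTYPE_YN = {'adjacent', 'deletion'}


def parse_rows(handle):
    """Parse a tab-separated table with a header row into typed dicts."""
    columns = handle.readline().rstrip('\n').split('\t')
    rows = []
    for line in handle:
        fields = line.rstrip('\n').split('\t')
        row = {}
        flags = []
        for name, value in zip(columns, fields):
            if name in COLTYPE_INT:
                row[name] = int(value)
            elif name in COLTYPE_FLOAT:
                row[name] = float(value)
            elif name in COLTYPE_YN:
                # 'Y'/'N' columns become booleans; true ones are collected as flags.
                row[name] = value == 'Y'
                if row[name]:
                    flags.append(name)
            else:
                row[name] = value
        row['flags'] = ','.join(flags)
        rows.append(row)
    return rows


example = io.StringIO(
    u"cluster_id\tgene_name1\tspan_count\tprobability\tadjacent\tdeletion\n"
    u"7\tBCR\t12\t0.93\tN\tY\n"
)
parsed = parse_rows(example)
```

Using `zip(columns, fields)` means a short row silently yields fewer keys instead of raising `IndexError`; whether that is preferable to failing fast depends on how much the input is trusted.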
151 ## deFuse params to the mapping application? |
152 |
153 def __main__(): |
154 #Parse Command Line |
155 parser = optparse.OptionParser() |
156 # files |
157 parser.add_option( '-i', '--input', dest='input', default=None, help='The input defuse results.tsv file (else read from stdin)' ) |
158 parser.add_option( '-t', '--transcripts', dest='transcripts', default=None, help='Trinity transcripts' ) |
159 parser.add_option( '-p', '--peptides', dest='peptides', default=None, help='Trinity ORFs' ) |
160 parser.add_option( '-o', '--output', dest='output', default=None, help='The output report (else write to stdout)' ) |
161 parser.add_option( '-m', '--matched', dest='matched', default=None, help='The output matched report' ) |
162 parser.add_option( '-a', '--transcript_alignment', dest='transcript_alignment', default=None, help='The output alignment file' ) |
163 parser.add_option( '-A', '--orf_alignment', dest='orf_alignment', default=None, help='The output ORF alignment file' ) |
164 parser.add_option( '-N', '--nbases', dest='nbases', type='int', default=12, help='Number of bases on either side of the fusion to compare' ) |
165 parser.add_option( '-L', '--min_pep_len', dest='min_pep_len', type='int', default=100, help='Minimum length of peptide to report' ) |
166 parser.add_option( '-T', '--ticdist', dest='ticdist', type='int', default=1000000, help='Maximum intrachromosomal distance to be classified as a transcription-induced chimera (TIC)' ) |
167 parser.add_option( '-P', '--prior_aa', dest='prior_aa', type='int', default=11, help='Number of protein AAs to show preceding the fusion point' ) |
168 parser.add_option( '-I', '--incomplete_orfs', dest='incomplete_orfs', action='store_true', default=False, help='Count incomplete ORFs' ) |
169 parser.add_option( '-O', '--orf_type', dest='orf_type', action='append', default=['complete','5prime_partial'], choices=['complete','5prime_partial','3prime_partial','internal'], help='ORF types to report' ) |
170 parser.add_option( '-r', '--readthrough', dest='readthrough', type='int', default=3, help='Number of stop_codons to read through' ) |
171 # min_orf_len |
172 # split_na_len |
173 # tic_len = 1000000 |
174 # prior |
175 # deFuse direction reversed |
176 # in frame ? |
177 # contain known protein elements |
178 # what protein change |
179 # trinity provides full transcript, defuse doesn't show full |
180 #parser.add_option( '-r', '--reference', dest='reference', default=None, help='The genomic reference fasta' ) |
181 #parser.add_option( '-g', '--gtf', dest='gtf', default=None, help='The genomic reference gtf feature file') |
182 (options, args) = parser.parse_args() |
183 |
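One `optparse` behaviour is worth keeping in mind for the `--orf_type` option: with `action='append'` and a non-empty list default, values given on the command line are appended to the default list rather than replacing it. The sketch below shows this with a stand-alone parser, followed by one possible workaround (a `None` default that is filled in after parsing). The workaround and the name `DEFAULT_ORF_TYPES` are suggestions for illustration, not what the script does.

```python
import optparse

ORF_CHOICES = ['complete', '5prime_partial', '3prime_partial', 'internal']
DEFAULT_ORF_TYPES = ['complete', '5prime_partial']  # hypothetical constant

# Same shape as the script's option: list default plus action='append'.
parser = optparse.OptionParser()
parser.add_option('-O', '--orf_type', dest='orf_type', action='append',
                  default=['complete', '5prime_partial'], choices=ORF_CHOICES)
options, _ = parser.parse_args(['-O', 'internal'])
appended = list(options.orf_type)  # defaults are still present

# Possible workaround: default to None and apply the defaults afterwards.
parser2 = optparse.OptionParser()
parser2.add_option('-O', '--orf_type', dest='orf_type', action='append',
                   default=None, choices=ORF_CHOICES)
options2, _ = parser2.parse_args(['-O', 'internal'])
replaced = options2.orf_type or list(DEFAULT_ORF_TYPES)
options3, _ = parser2.parse_args([])
fallback = options3.orf_type or list(DEFAULT_ORF_TYPES)
```

With the first parser, `-O internal` widens the selection to three ORF types; with the second, it narrows it to exactly the types requested.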
184 # results.tsv input |
185 if options.input is not None: |
186 try: |
187 inputPath = os.path.abspath(options.input) |
188 inputFile = open(inputPath, 'r') |
189 except Exception as e: |
190 sys.stderr.write("failed to open input file %s: %s\n" % (options.input, e)) |
191 sys.exit(2) |
192 else: |
193 inputFile = sys.stdin |
194 # report output |
195 if options.output is not None: |
196 try: |
197 outputPath = os.path.abspath(options.output) |
198 outputFile = open(outputPath, 'w') |
199 except Exception as e: |
200 sys.stderr.write("failed to open output file %s: %s\n" % (options.output, e)) |
201 sys.exit(3) |
202 else: |
203 outputFile = sys.stdout |
204 outputTxFile = None |
205 outputOrfFile = None |
206 if options.transcript_alignment: |
207 try: |
208 outputTxFile = open(options.transcript_alignment,'w') |
209 except Exception as e: |
210 sys.stderr.write("failed to open transcript alignment file %s: %s\n" % (options.transcript_alignment, e)) |
211 sys.exit(3) |
212 if options.orf_alignment: |
213 try: |
214 outputOrfFile = open(options.orf_alignment,'w') |
215 except Exception as e: |
216 sys.stderr.write("failed to open ORF alignment file %s: %s\n" % (options.orf_alignment, e)) |
217 sys.exit(3) |
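The four open-or-fall-back blocks above share one shape: use the named file if a path was given, otherwise a default stream (or nothing). As a sketch only, that shape could be folded into a small helper. The name `open_or_default` is hypothetical and does not appear in the script, and error reporting is left to the caller.

```python
import os
import sys
import tempfile


def open_or_default(path, mode, default_stream):
    """Return an open file for path, or default_stream when path is None."""
    if path is None:
        return default_stream
    return open(os.path.abspath(path), mode)


# Usage: a named report file, then the stdout fallback.
tmp_dir = tempfile.mkdtemp()
report_path = os.path.join(tmp_dir, 'report.tsv')
out = open_or_default(report_path, 'w', sys.stdout)
out.write('cluster_id\n')
out.close()
fallback = open_or_default(None, 'w', sys.stdout)
```

Passing `None` as `default_stream` covers the optional alignment outputs, which have no fallback stream.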
218 # Add percent match after transcript |
219 #report_fields = ['gene_name1','gene_name2','span_count','probability','gene_chromosome1','gene_location1','gene_chromosome2','gene_location2','fusion_type','Transcript','coverage','Protein','flags','alignments1','alignments2'] |
220 report_fields = ['cluster_id','gene_name1','gene_name2','span_count','probability','genomic_bkpt1','gene_location1','genomic_bkpt2','gene_location2','fusion_type','Transcript','coverage','Protein','flags','alignments1','alignments2'] |
221 report_colnames = {'gene_name1':'Gene 1','gene_name2':'Gene 2','span_count':'Span cnt','probability':'Probability','gene_chromosome1':'From Chr','gene_location1':'Fusion point','gene_chromosome2':'To Chr','gene_location2':'Fusion point', 'cluster_id':'cluster_id', 'splitr_sequence':'splitr_sequence', 'splitr_count':'splitr_count', 'splitr_span_pvalue':'splitr_span_pvalue', 'splitr_pos_pvalue':'splitr_pos_pvalue', 'splitr_min_pvalue':'splitr_min_pvalue', 'adjacent':'adjacent', 'altsplice':'altsplice', 'break_adj_entropy1':'break_adj_entropy1', 'break_adj_entropy2':'break_adj_entropy2', 'break_adj_entropy_min':'break_adj_entropy_min', 'breakpoint_homology':'breakpoint_homology', 'breakseqs_estislands_percident':'breakseqs_estislands_percident', 'cdna_breakseqs_percident':'cdna_breakseqs_percident', 'deletion':'deletion', 'est_breakseqs_percident':'est_breakseqs_percident', 'eversion':'eversion', 'exonboundaries':'exonboundaries', 'expression1':'expression1', 'expression2':'expression2', 'gene1':'gene1', 'gene2':'gene2', 'gene_align_strand1':'gene_align_strand1', 'gene_align_strand2':'gene_align_strand2', 'gene_end1':'gene_end1', 'gene_end2':'gene_end2', 'gene_start1':'gene_start1', 'gene_start2':'gene_start2', 'gene_strand1':'gene_strand1', 'gene_strand2':'gene_strand2', 'genome_breakseqs_percident':'genome_breakseqs_percident', 'genomic_break_pos1':'genomic_break_pos1', 'genomic_break_pos2':'genomic_break_pos2', 'genomic_strand1':'genomic_strand1', 'genomic_strand2':'genomic_strand2', 'interchromosomal':'interchromosomal', 'interrupted_index1':'interrupted_index1', 'interrupted_index2':'interrupted_index2', 'inversion':'inversion', 'library_name':'library_name', 'max_map_count':'max_map_count', 'max_repeat_proportion':'max_repeat_proportion', 'mean_map_count':'mean_map_count', 'min_map_count':'min_map_count', 'num_multi_map':'num_multi_map', 'num_splice_variants':'num_splice_variants', 'orf':'orf', 'read_through':'read_through', 
'repeat_proportion1':'repeat_proportion1', 'repeat_proportion2':'repeat_proportion2', 'span_coverage1':'span_coverage1', 'span_coverage2':'span_coverage2', 'span_coverage_max':'span_coverage_max', 'span_coverage_min':'span_coverage_min', 'splice_score':'splice_score', 'splicing_index1':'splicing_index1', 'splicing_index2':'splicing_index2', 'fusion_type':'Type', 'coverage':'fusion%','Transcript':'Transcript?','Protein':'Protein?','flags':'descriptions','fwd_seq':'fusion','alignments1':'alignments1','alignments2':'alignments2','genomic_bkpt1':'From Chr', 'genomic_bkpt2':'To Chr'} |
222 |
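How `report_fields` and `report_colnames` are consumed lies outside this excerpt. One plausible use, shown here purely as an assumption, is to look up a display label for each field when writing the report header. The sketch works on shortened copies of both structures and a hypothetical helper named `header_line`.

```python
# Shortened copies of the two structures, for illustration only.
report_fields = ['cluster_id', 'gene_name1', 'gene_name2', 'span_count', 'fusion_type']
report_colnames = {
    'cluster_id': 'cluster_id',
    'gene_name1': 'Gene 1',
    'gene_name2': 'Gene 2',
    'span_count': 'Span cnt',
    'fusion_type': 'Type',
}


def header_line(fields, colnames):
    """Join display labels with tabs, falling back to the raw field key."""
    return '\t'.join(colnames.get(field, field) for field in fields)


header = header_line(report_fields, report_colnames)
```

Note that the full mapping assigns the same label to more than one key (for example `'Fusion point'` for both `gene_location1` and `gene_location2`), so a header built this way is not guaranteed to have unique column names.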
223 ## Read defuse results |
224 fusions = parse_defuse_results(inputFile) |
225 ## Create a field with the nbases (default 12) nt before and after the fusion point. |
226 ## Create a field with the reverse complement of that fusion point field (24 nt by default). |
227 ## Add fusion type field (inter, intra, TIC) |
228 for i,fusion in enumerate(fusions): |
229 fusion['ordinal'] = i + 1 |
230 fusion['genomic_bkpt1'] = "%s:%d" % (fusion['gene_chromosome1'], fusion['genomic_break_pos1']) |
231 fusion['genomic_bkpt2'] = "%s:%d" % (fusion['gene_chromosome2'], fusion['genomic_break_pos2']) |
232 fusion['alignments1'] = "%s%s%s" % (fusion['genomic_strand1'], fusion['gene_strand1'], fusion['gene_align_strand1']) |
233 fusion['alignments2'] = "%s%s%s" % (fusion['genomic_strand2'], fusion['gene_strand2'], fusion['gene_align_strand2']) |
234 split_seqs = fusion['splitr_sequence'].split('|') |
235 fusion['split_seqs'] = split_seqs |
237 fusion['split_seq_lens'] = [len(split_seqs[0]),len(split_seqs[1])] |
238 fusion['split_max_lens'] = [len(split_seqs[0]),len(split_seqs[1])] |
239 fwd_off = min(abs(options.nbases),len(split_seqs[0])) |
240 rev_off = min(abs(options.nbases),len(split_seqs[1])) |
241 fusion['fwd_off'] = fwd_off |
242 fusion['rev_off'] = rev_off |
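243 fwd_seq = split_seqs[0][len(split_seqs[0]) - fwd_off:] + split_seqs[1][:rev_off] # explicit start index: [-0:] would return the whole sequence when fwd_off is 0 |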
244 rev_seq = revcompl(fwd_seq) |
245 fusion['fwd_seq'] = fwd_seq |
246 fusion['rev_seq'] = rev_seq |
247 fusion_type = 'inter' if fusion['gene_chromosome1'] != fusion['gene_chromosome2'] else ('intra' if abs(fusion['genomic_break_pos1'] - fusion['genomic_break_pos2']) > options.ticdist else 'TIC') |
248 fusion['fusion_type'] = fusion_type |
249 fusion['transcripts'] = dict() |
250 fusion['Transcript'] = 'No' |
251 fusion['coverage'] = 0 |
252 fusion['Protein'] = 'No' |
253 # print >> sys.stdout, "%4d\t%6s\t%s\t%s\t%s\t%s\t%s" % (i,fusion['cluster_id'],fwd_seq,rev_seq,fusion_type,fusion['gene_name1'],fusion['gene_name2']) |
254 inputFile.close() |
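The junction key built in the loop above (the tail of the first split-read segment joined to the head of the second, plus its reverse complement) can be tried in isolation. The sketch below is a minimal stand-alone version under stated assumptions: `reverse_complement` stands in for the script's own `revcompl`, which is defined outside this excerpt, and `junction` is a hypothetical name.

```python
COMPLEMENT = {'A': 'T', 'C': 'G', 'G': 'C', 'T': 'A', 'N': 'N'}


def reverse_complement(seq):
    """Reverse-complement a DNA string; unknown bases become 'N'."""
    return ''.join(COMPLEMENT.get(base, 'N') for base in reversed(seq.upper()))


def junction(splitr_sequence, nbases=12):
    """Return (forward, reverse-complement) keys spanning the '|' breakpoint."""
    left, right = splitr_sequence.split('|', 1)
    fwd_off = min(abs(nbases), len(left))
    rev_off = min(abs(nbases), len(right))
    # Explicit start index so that fwd_off == 0 yields an empty left part.
    fwd = left[len(left) - fwd_off:] + right[:rev_off]
    return fwd, reverse_complement(fwd)


fwd_seq, rev_seq = junction('AAACCCGGGTTT|ACGTACGT', nbases=4)
empty_fwd, empty_rev = junction('AAAA|CCCC', nbases=0)
```

A sequence without a `|` separator raises `ValueError` at the tuple unpacking here, which makes a malformed `splitr_sequence` visible early.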
255 |
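The fusion-type rule used in the loop above is a nested conditional expression, which is easy to misread. Written out as a function it looks like the sketch below; `classify_fusion` is a hypothetical name, and the default `ticdist` mirrors the `--ticdist` option default of 1000000.

```python
def classify_fusion(chrom1, pos1, chrom2, pos2, ticdist=1000000):
    """Classify a fusion as 'inter', 'intra' or 'TIC'."""
    if chrom1 != chrom2:
        return 'inter'   # breakpoints on different chromosomes
    if abs(pos1 - pos2) > ticdist:
        return 'intra'   # same chromosome, farther apart than ticdist
    return 'TIC'         # same chromosome, within ticdist


kinds = [
    classify_fusion('9', 133589268, '22', 23632600),
    classify_fusion('1', 1000, '1', 5000000),
    classify_fusion('1', 1000, '1', 500000),
    classify_fusion('1', 0, '1', 1000000),
]
```

The comparison is strict, so a pair of breakpoints exactly `ticdist` apart is still classified as `TIC`.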
256 ## Process Trinity data and compare to deFuse |
257 matched_transcripts = dict() |
258 matched_orfs = dict() |
259 transcript_orfs = dict() |
260 fusions_with_transcripts = set() |
261 fusions_with_orfs = set() |
b22f8634ff84
planemo upload for repository https://github.com/jj-umn/galaxytools/tree/master/defuse commit 23b94b5747c6956360cd2eca0a07a669929ea141-dirty
jjohnson
parents:
diff
changeset
|
262 ## fusion['transcripts'][tx_id] { revcompl:?, bkpt:n, seq1: , seq2: , match1:n, match2:n} |
b22f8634ff84
planemo upload for repository https://github.com/jj-umn/galaxytools/tree/master/defuse commit 23b94b5747c6956360cd2eca0a07a669929ea141-dirty
jjohnson
parents:
diff
changeset
|
263 n = 0 |
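The per-transcript record described in the comment above can be illustrated with a small standalone sketch. It is written for Python 3 with made-up sequences. `locate_breakpoint` is a hypothetical helper, `revcompl` is a stand-in for the script's own function, and `fwd_off` is assumed to be the offset of the breakpoint inside `fwd_seq`. The sketch also assumes that `rev_seq` is the reverse complement of `fwd_seq`, so searching the reverse-complemented transcript for `fwd_seq` covers both orientations.

```python
def revcompl(seq):
    """Reverse-complement a DNA string (illustrative stand-in)."""
    comp = {'A': 'T', 'C': 'G', 'G': 'C', 'T': 'A', 'N': 'N'}
    return ''.join(comp[base] for base in reversed(seq.upper()))


def locate_breakpoint(seq, fwd_seq, fwd_off):
    """Return a record shaped like fusion['transcripts'][tx_id], or None.

    The transcript is tried as given first, then reverse-complemented.
    """
    for is_revcompl, oriented in ((False, seq), (True, revcompl(seq))):
        pos = oriented.find(fwd_seq)
        if pos >= 0:
            bkpt = pos + fwd_off  # breakpoint position in the oriented transcript
            return {
                'revcompl': is_revcompl,
                'bkpt': bkpt,
                'seq1': oriented[:bkpt],  # transcript sequence left of the breakpoint
                'seq2': oriented[bkpt:],  # transcript sequence right of the breakpoint
            }
    return None


tx = "GGACGTACGTTTGGCCAACC"      # made-up transcript
record = locate_breakpoint(tx, "ACGTTTGG", 4)
flipped = locate_breakpoint(revcompl(tx), "ACGTTTGG", 4)
```

Both calls report the same breakpoint and flanking sequences; only the `revcompl` flag differs, which is what lets later steps compare the transcript against the deFuse split sequences in a single orientation.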
  if options.transcripts:
    with open(options.transcripts) as fp:
      for tx_full_id, seq in read_fasta(fp):
        n += 1
        for i,fusion in enumerate(fusions):
          if fusion['fwd_seq'] in seq or fusion['rev_seq'] in seq:
            fusions_with_transcripts.add(i)
            fusion['Transcript'] = 'Yes'
            tx_id = tx_full_id.lstrip('>').split()[0]
            matched_transcripts[tx_full_id] = seq
            fusion['transcripts'][tx_id] = dict()
            fusion['transcripts'][tx_id]['seq'] = seq
            fusion['transcripts'][tx_id]['full_id'] = tx_full_id
            ## locate the breakpoint within the transcript
            pos = seq.find(fusion['fwd_seq'])
            if pos >= 0:
              tx_bkpt = pos + fusion['fwd_off']
              # fusion['transcripts'][tx_full_id] = tx_bkpt
              if tx_bkpt > fusion['split_max_lens'][0]:
                fusion['split_max_lens'][0] = tx_bkpt
              len2 = len(seq) - tx_bkpt
              if len2 > fusion['split_max_lens'][1]:
                fusion['split_max_lens'][1] = len2
              fusion['transcripts'][tx_id]['bkpt'] = tx_bkpt
              fusion['transcripts'][tx_id]['revcompl'] = False
              fusion['transcripts'][tx_id]['seq1'] = seq[:tx_bkpt]
              fusion['transcripts'][tx_id]['seq2'] = seq[tx_bkpt:]
            else:
              pos = seq.find(fusion['rev_seq'])
              tx_bkpt = pos + fusion['rev_off']
              # fusion['transcripts'][tx_full_id] = -tx_bkpt
              if tx_bkpt > fusion['split_max_lens'][1]:
                fusion['split_max_lens'][1] = tx_bkpt
              len2 = len(seq) - tx_bkpt
              if len2 > fusion['split_max_lens'][0]:
                fusion['split_max_lens'][0] = len2
              ## rev_seq matched: reverse-complement the transcript and locate fwd_seq in it
              rseq = revcompl(seq)
              pos = rseq.find(fusion['fwd_seq'])
              tx_bkpt = pos + fusion['fwd_off']
              fusion['transcripts'][tx_id]['bkpt'] = tx_bkpt
              fusion['transcripts'][tx_id]['revcompl'] = True
              fusion['transcripts'][tx_id]['seq1'] = rseq[:tx_bkpt]
              fusion['transcripts'][tx_id]['seq2'] = rseq[tx_bkpt:]
            ## match1: bases that agree walking backwards from the breakpoint (end of seq1)
            fseq = fusion['split_seqs'][0]
            tseq = fusion['transcripts'][tx_id]['seq1']
            mlen = min(len(fseq),len(tseq))
            fusion['transcripts'][tx_id]['match1'] = mlen
            for j in range(1,mlen+1):
              if fseq[-j] != tseq[-j]:
                fusion['transcripts'][tx_id]['match1'] = j - 1
                break
            ## match2: bases that agree walking forwards from the breakpoint (start of seq2)
            fseq = fusion['split_seqs'][1]
            tseq = fusion['transcripts'][tx_id]['seq2']
            mlen = min(len(fseq),len(tseq))
            fusion['transcripts'][tx_id]['match2'] = mlen
            for j in range(mlen):
              if fseq[j] != tseq[j]:
                fusion['transcripts'][tx_id]['match2'] = j
                break
            ## percent of the deFuse split sequence matched, truncated to one decimal place
            # coverage = math.floor(float(fusion['transcripts'][tx_id]['match1'] + fusion['transcripts'][tx_id]['match2']) * 100. / len(fusion['split_seqs'][0]+fusion['split_seqs'][1]))
            coverage = int((fusion['transcripts'][tx_id]['match1'] + fusion['transcripts'][tx_id]['match2']) * 1000. / len(fusion['split_seqs'][0]+fusion['split_seqs'][1])) * .1
            # print >> sys.stderr, "%s\t%d\t%d\t%d\%s\t\t%d\t%d\t%d\t%d" % (tx_id,fusion['transcripts'][tx_id]['match1'],fusion['transcripts'][tx_id]['match2'],len(fusion['split_seqs'][0]+fusion['split_seqs'][1]),coverage,len( fusion['split_seqs'][0]),len(fusion['transcripts'][tx_id]['seq1']),len(fusion['split_seqs'][1]),len(fusion['transcripts'][tx_id]['seq2']))
            fusion['coverage'] = max(coverage,fusion['coverage'])
    print >> sys.stdout, "fusions_with_transcripts: %d %s\n matched_transcripts: %d" % (len(fusions_with_transcripts),fusions_with_transcripts,len(matched_transcripts))
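The match and coverage arithmetic used in the transcript comparison can be checked on a toy case. The following is a minimal Python 3 sketch, not the script's code: `match_lengths` and `coverage_pct` are hypothetical names, and the sequences are made up. It assumes, as the loops above suggest, that the left split sequence is compared backwards from the breakpoint and the right split sequence forwards from it.

```python
def match_lengths(split_seqs, seq1, seq2):
    """Count bases agreeing with the transcript on each side of the breakpoint."""
    left, right = split_seqs
    match1 = 0
    # Left side: walk backwards from the breakpoint.
    for a, b in zip(reversed(left), reversed(seq1)):
        if a != b:
            break
        match1 += 1
    match2 = 0
    # Right side: walk forwards from the breakpoint.
    for a, b in zip(right, seq2):
        if a != b:
            break
        match2 += 1
    return match1, match2


def coverage_pct(match1, match2, split_seqs):
    """Percent of the split sequence matched, truncated to 0.1 percent."""
    total = len(split_seqs[0]) + len(split_seqs[1])
    return int((match1 + match2) * 1000.0 / total) * 0.1


split_seqs = ("AAACGT", "TTGGAA")   # made-up deFuse split sequences
m1, m2 = match_lengths(split_seqs, "GGACGT", "TTGGCC")
cov = coverage_pct(m1, m2, split_seqs)
```

Here four bases match on each side of the breakpoint, so 8 of 12 bases are covered and the truncated coverage is roughly 66.6 percent. Multiplying by `1000.` before `int()` and then by `.1` is what keeps one decimal place while truncating rather than rounding.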
  ##for i,fusion in enumerate(fusions):
  ##  print >> sys.stdout, "%4d\t%6s\t%s\t%s\t%s\t%s\t%s\t%s" % (i,fusion['cluster_id'],fusion['fwd_seq'],fusion['rev_seq'],fusion['fusion_type'],fusion['gene_name1'],fusion['gene_name2'], fusion['transcripts'])
  ## Process ORFs and compare to matched deFuse and Trinity data.
  ## Proteins must be at least 100 aa long, starting at the first "M" and must end with an "*".
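The ORF step relies on information carried in each peptide FASTA header: the ORF type, the transcript id, and the ORF coordinates and strand on that transcript. The sketch below shows one way to pull those fields out in Python 3. It is illustrative only: `ORF_HEADER` and `parse_orf_header` are hypothetical names, and the header layout is assumed from the two example headers quoted in this script's comments.

```python
import re

# Assumed layout: ">orf_id ... type:<type> ... <tx_id>:<start>-<end>(<strand>)"
ORF_HEADER = re.compile(
    r'^>?(?P<orf_id>\S+)\s.*\stype:(?P<orf_type>\S+)\s.*\s'
    r'(?P<tx_id>\S+):(?P<tx_start>\d+)-(?P<tx_end>\d+)\((?P<tx_strand>[+-])\)'
)


def parse_orf_header(header):
    """Return the header fields as a dict, or None if the layout differs."""
    m = ORF_HEADER.match(header)
    if m is None:
        return None
    fields = m.groupdict()
    fields['tx_start'] = int(fields['tx_start'])
    fields['tx_end'] = int(fields['tx_end'])
    return fields


complete = parse_orf_header(
    '>m.196252 g.196252 ORF g.196252 m.196252 type:complete len:237 (+) '
    'comp100000_c5_seq2:315-1025(+)')
partial = parse_orf_header(
    '>m.134565 g.134565 ORF g.134565 m.134565 type:5prime_partial len:126 (-) '
    'comp98702_c1_seq21:52-429(-)')

# Keep only ORFs whose type is in an allowed list (values here are examples).
allowed_types = ['complete']
kept = [h for h in (complete, partial) if h and h['orf_type'] in allowed_types]
```

Returning `None` for headers that do not fit the assumed layout lets a caller skip them instead of failing on a missing match.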
  if options.peptides:
    with open(options.peptides) as fp:
      for orf_full_id, seq in read_fasta(fp):
        n += 1
        if len(seq) < options.min_pep_len:
          continue
        orf_type = re.match(r'^.* type:(\S+) .*$',orf_full_id).groups()[0]
        ## if not seq[-1] == '*' and not options.incomplete_orfs:
        ## if not orf_type 'complete' and not options.incomplete_orfs:
        if orf_type not in options.orf_type:
          continue
        for i,fusion in enumerate(fusions):
          if len(fusion['transcripts']) > 0:
            for tx_id in fusion['transcripts']:
              ## >m.196252 g.196252 ORF g.196252 m.196252 type:complete len:237 (+) comp100000_c5_seq2:315-1025(+)
              ## >m.134565 g.134565 ORF g.134565 m.134565 type:5prime_partial len:126 (-) comp98702_c1_seq21:52-429(-)
              if tx_id+':' not in orf_full_id:
                continue
              ## escape regex metacharacters in the transcript id before matching
              m = re.match(r"^.*%s:(\d+)-(\d+)[(]([+-])[)].*" % re.escape(tx_id),orf_full_id)
              if m:
                if not m.groups() or len(m.groups()) < 3 or m.groups()[0] is None:
                  print >> sys.stderr, "Error:\n%s\n%s\n" % (tx_id,orf_full_id)
                orf_id = orf_full_id.lstrip('>').split()[0]
                if tx_id not in transcript_orfs:
                  transcript_orfs[tx_id] = []
                alignments = "%s%s%s %s%s%s" % (fusion['genomic_strand1'], fusion['gene_strand1'], fusion['gene_align_strand1'], fusion['genomic_strand2'], fusion['gene_strand2'], fusion['gene_align_strand2'])
                # print >> sys.stdout, "%d %s bkpt:%d %s rc:%s (%s) %s" % (fusion['ordinal'], tx_id, int(fusion['transcripts'][tx_id]['bkpt']), str(m.groups()), str(fusion['transcripts'][tx_id]['revcompl']), alignments, orf_full_id)
                start = seq.find('M')
                pep_len = len(seq)
                if pep_len - start < options.min_pep_len:
                  continue
                orf_dict = dict()
                transcript_orfs[tx_id].append(orf_dict)
                fusions_with_orfs.add(i)
                matched_orfs[orf_full_id] = seq
                fusion['Protein'] = 'Yes'
                tx_start = int(m.groups()[0])
                tx_end = int(m.groups()[1])
                tx_strand = m.groups()[2]
                tx_bkpt = fusion['transcripts'][tx_id]['bkpt']
                orf_dict['orf_id'] = orf_id
                orf_dict['tx_start'] = tx_start
                orf_dict['tx_end'] = tx_end
                orf_dict['tx_strand'] = tx_strand
                orf_dict['tx_bkpt'] = tx_bkpt
                orf_dict['seq'] = seq[:start].lower() + seq[start:] if start > 0 else seq
                ## >m.208656 g.208656 ORF g.208656 m.208656 type:5prime_partial len:303 (+) comp100185_c2_seq9:2-910(+)
                ## translate(tx34[1:910])
                ## translate(tx34[1:2048])
                ## comp99273_c1_seq1 len=3146 (-2772)
                ## >m.158338 g.158338 ORF g.158338 m.158338 type:complete len:785 (-) comp99273_c1_seq1:404-2758(-)
                ## translate(tx[-2758:-403])
                ## comp100185_c2_seq9 len=2048 (904)
                ## novel protein sequence
                ## find first novel AA
                ## get prior n AAs
                ## get novel AA seq thru n stop codons
                ### tx_seq = matched_transcripts[tx_full_id] if tx_bkpt >= 0 else revcompl(tx_seq)
                tx_seq = fusion['transcripts'][tx_id]['seq']
                orf_dict['tx_seq'] = tx_seq
                novel_tx_seq = tx_seq[tx_start - 1:] if tx_strand == '+' else revcompl(tx_seq[:tx_end])
                read_thru_pep = translate(novel_tx_seq)
                # fusion['transcripts'][tx_id]['revcompl'] = True
                # tx_bkpt = fusion['transcripts'][tx_id]['bkpt']
                # bkpt_aa_pos = tx_bkpt - tx_start - 1
                # bkpt_aa_pos = (tx_bkpt - tx_start - 1) / 3 if tx_strand == '+' else tx_end
                # print >> sys.stdout, "%s\n%s" % (seq,read_thru_pep)
                stop_codons = get_stop_codons(novel_tx_seq)
                if options.readthrough:
                  readthrough = options.readthrough + 1
                  read_thru_pep = '*'.join(read_thru_pep.split('*')[:readthrough])
b22f8634ff84
planemo upload for repository https://github.com/jj-umn/galaxytools/tree/master/defuse commit 23b94b5747c6956360cd2eca0a07a669929ea141-dirty
jjohnson
parents:
diff
changeset
|
402 stop_codons = stop_codons[:readthrough] |
b22f8634ff84
planemo upload for repository https://github.com/jj-umn/galaxytools/tree/master/defuse commit 23b94b5747c6956360cd2eca0a07a669929ea141-dirty
jjohnson
parents:
diff
changeset
|
403 orf_dict['read_thru_pep'] = read_thru_pep |
b22f8634ff84
planemo upload for repository https://github.com/jj-umn/galaxytools/tree/master/defuse commit 23b94b5747c6956360cd2eca0a07a669929ea141-dirty
jjohnson
parents:
diff
changeset
|
404 orf_dict['stop_codons'] = ','.join(stop_codons) |
    print >> sys.stdout, "fusions_with_orfs: %d %s\n matched_orfs: %d" % (len(fusions_with_orfs),fusions_with_orfs,len(matched_orfs))
  ## Alignments 3 columns, seq columns padded out to longest seq, UPPERCASE_match diffs lowercase
  ### defuse_id pre_split_seq post_split_seq
  ### trinity_id pre_split_seq post_split_seq
  ## Transcripts alignment output
  ## Peptide alignment output
  ## Write reports
  ## OS03_Matched_Rev.csv
  ## "count","gene1","gene2","breakpoint","fusion","Trinity_transcript_ID","Trinity_transcript","ID1","protein"
  if options.transcripts and options.matched:
    #match_fields = ['ordinal','gene_name1','gene_name2','fwd_seq']
    outputMatchFile = open(options.matched,'w')
    #print >> outputMatchFile, '\t'.join(["#fusion_id","cluster_id","gene1","gene2","breakpoint","fusion","Trinity_transcript_ID","Trinity_transcript","Trinity_ORF_Transcript","Trinity_ORF_ID","protein","read_through","stop_codons"])
    print >> outputMatchFile, '\t'.join(["#fusion_id","cluster_id","gene1","gene2","breakpoint","fusion","Trinity_transcript_ID","Trinity_transcript","Trinity_ORF_Transcript","Trinity_ORF_ID","protein","stop_codons"])
    for i,fusion in enumerate(fusions):
      if len(fusion['transcripts']) > 0:
        for tx_id in fusion['transcripts'].keys():
          if tx_id in transcript_orfs:
            for orf_dict in transcript_orfs[tx_id]:
              if 'tx_seq' not in orf_dict:
                # No read-through data was recorded for this ORF; report it and skip the row
                # instead of failing with a KeyError when the fields are built below.
                print >> sys.stderr, "orf_dict %s" % orf_dict
                continue
              #fields = [str(fusion['ordinal']),str(fusion['cluster_id']),fusion['gene_name1'],fusion['gene_name2'],fusion['fwd_seq'],fusion['splitr_sequence'],tx_id, fusion['transcripts'][tx_id]['seq1']+'|'+fusion['transcripts'][tx_id]['seq2'],orf_dict['tx_seq'],orf_dict['orf_id'],orf_dict['seq'],orf_dict['read_thru_pep'],orf_dict['stop_codons']]
              fields = [str(fusion['ordinal']),str(fusion['cluster_id']),fusion['gene_name1'],fusion['gene_name2'],fusion['fwd_seq'],fusion['splitr_sequence'],tx_id, fusion['transcripts'][tx_id]['seq1']+'|'+fusion['transcripts'][tx_id]['seq2'],orf_dict['tx_seq'],orf_dict['orf_id'],orf_dict['read_thru_pep'],orf_dict['stop_codons']]
              print >> outputMatchFile, '\t'.join(fields)
    outputMatchFile.close()
  if options.transcripts and options.transcript_alignment:
    if outputTxFile:
      id_fields = ['gene_name1','alignments1','gene_name2','alignments2','span_count','probability','gene_chromosome1','gene_location1','gene_chromosome2','gene_location2','fusion_type','Transcript','Protein','flags']
      fa_width = 80
      for i,fusion in enumerate(fusions):
        if len(fusion['transcripts']) > 0:
          alignments1 = "%s%s%s" % (fusion['genomic_strand1'], fusion['gene_strand1'], fusion['gene_align_strand1'])
          alignments2 = "%s%s%s" % (fusion['genomic_strand2'], fusion['gene_strand2'], fusion['gene_align_strand2'])
          alignments = "%s%s%s %s%s%s" % (fusion['genomic_strand1'], fusion['gene_strand1'], fusion['gene_align_strand1'], fusion['genomic_strand2'], fusion['gene_strand2'], fusion['gene_align_strand2'])
          fusion_id = "%s (%s) %s" % (i + 1,alignments,' '.join([str(fusion[x]) for x in report_fields]))
          for tx_id in fusion['transcripts'].keys():
            m1 = fusion['transcripts'][tx_id]['match1']
            f_seq1 = fusion['split_seqs'][0][:-m1].lower() + fusion['split_seqs'][0][-m1:]
            t_seq1 = fusion['transcripts'][tx_id]['seq1'][:-m1].lower() + fusion['transcripts'][tx_id]['seq1'][-m1:]
            if len(f_seq1) > len(t_seq1):
              t_seq1 = t_seq1.rjust(len(f_seq1),'.')
            elif len(f_seq1) < len(t_seq1):
              f_seq1 = f_seq1.rjust(len(t_seq1),'.')
            m2 = fusion['transcripts'][tx_id]['match2']
            f_seq2 = fusion['split_seqs'][1][:m2] + fusion['split_seqs'][1][m2:].lower()
            t_seq2 = fusion['transcripts'][tx_id]['seq2'][:m2] + fusion['transcripts'][tx_id]['seq2'][m2:].lower()
            if len(f_seq2) > len(t_seq2):
              t_seq2 = t_seq2.ljust(len(f_seq2),'.')
            elif len(f_seq2) < len(t_seq2):
              f_seq2 = f_seq2.ljust(len(t_seq2),'.')
            print >> outputTxFile, ">%s\n%s\n%s" % (fusion_id,'\n'.join(textwrap.wrap(f_seq1,fa_width)),'\n'.join(textwrap.wrap(f_seq2,fa_width)))
            print >> outputTxFile, "%s bkpt:%d rev_compl:%s\n%s\n%s" % (fusion['transcripts'][tx_id]['full_id'],fusion['transcripts'][tx_id]['bkpt'],str(fusion['transcripts'][tx_id]['revcompl']),'\n'.join(textwrap.wrap(t_seq1,fa_width)),'\n'.join(textwrap.wrap(t_seq2,fa_width)))
  """
  if options.peptides and options.orf_alignment:
    pass
  """
  print >> outputFile,"%s\t%s" % ('#','\t'.join([report_colnames[x] for x in report_fields]))
  for i,fusion in enumerate(fusions):
    print >> outputFile,"%s\t%s" % (i + 1,'\t'.join([str(fusion[x]) for x in report_fields]))

if __name__ == "__main__": __main__()
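The read-through truncation and the alignment display in the script above are easier to follow in isolation. The block below is a minimal sketch of both, not part of the script. The function names (`truncate_read_through`, `mark_left`, `mark_right`, `pad_pair`) are hypothetical, and the code is written to run under Python 3 as well as Python 2. One deliberate difference from the script: `mark_left` slices at `len(seq) - m` rather than `-m`, so a match length of 0 lowercases the whole sequence. That is an assumption about the intended behavior.

```python
def truncate_read_through(peptide, readthrough):
    """Keep the peptide through `readthrough` additional stop codons ('*').

    Mirrors '*'.join(pep.split('*')[:readthrough + 1]) from the script.
    A falsy `readthrough` leaves the peptide untouched, as the script does.
    """
    if not readthrough:
        return peptide
    return '*'.join(peptide.split('*')[:readthrough + 1])


def mark_left(seq, m):
    """Lowercase everything except the last m bases (those next to the breakpoint)."""
    keep = max(len(seq) - m, 0)  # assumption: m == 0 means nothing matched
    return seq[:keep].lower() + seq[keep:]


def mark_right(seq, m):
    """Lowercase everything after the first m bases (those next to the breakpoint)."""
    return seq[:m] + seq[m:].lower()


def pad_pair(a, b, left_side):
    """Pad the shorter sequence with '.' so both have the same width.

    Left-of-breakpoint sequences are right-justified and right-of-breakpoint
    sequences are left-justified, so the breakpoint columns line up.
    """
    width = max(len(a), len(b))
    if left_side:
        return a.rjust(width, '.'), b.rjust(width, '.')
    return a.ljust(width, '.'), b.ljust(width, '.')


if __name__ == "__main__":
    print(truncate_read_through("MKT*AVL*GG*P", 1))
    f1, t1 = pad_pair(mark_left("ACGTACGT", 3), mark_left("TTACGT", 3), True)
    f2, t2 = pad_pair(mark_right("ACGTACGT", 2), mark_right("ACGTAC", 2), False)
    print(f1 + "|" + f2)
    print(t1 + "|" + t2)
```

Justifying toward the breakpoint keeps the matched (uppercase) bases of the deFuse and Trinity sequences in the same columns, which is what makes the two-row output readable.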