annotate jb2_GFF/GFFParser.py @ 18:2e6c48910819 draft
planemo upload for repository https://github.com/usegalaxy-eu/temporary-tools/tree/master/jbrowse2 commit a572525c0f1d7b4dbae1c3aaa4e748aa019d8347
author | fubar |
---|---|
date | Mon, 29 Jan 2024 02:34:43 +0000 |
parents | 4c201a3d4755 |
children |
rev | line source |
---|---|
17
4c201a3d4755
planemo upload for repository https://github.com/usegalaxy-eu/temporary-tools/tree/master/jbrowse2 commit a37bfdfc108501b11c7b2aa15efb1bd16f0c4b66
fubar
"""Parse GFF files into features attached to Biopython SeqRecord objects.

This deals with GFF3 formatted files, a tab delimited format for storing
sequence features and annotations:

http://www.sequenceontology.org/gff3.shtml

It will also deal with older GFF versions (GTF/GFF2):

http://www.sanger.ac.uk/Software/formats/GFF/GFF_Spec.shtml
http://mblab.wustl.edu/GTF22.html

The implementation utilizes map/reduce parsing of GFF using Disco. Disco
(http://discoproject.org) is a Map-Reduce framework for Python utilizing
Erlang for parallelization. The code works on a single processor without
Disco using the same architecture.
"""
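The single-processor map/reduce layout described in the module docstring can be sketched roughly as follows. This is an illustrative outline only, not the module's own code: `map_line`, `reduce_lines` and `parse_lines` are hypothetical names, the dictionary keys are assumptions, and the 0-based half-open location is an assumed convention chosen to match Biopython-style coordinates.

```python
import json
from collections import defaultdict


def map_line(line):
    """Map step (hypothetical helper): one GFF line -> list of (key, JSON) pairs."""
    line = line.strip()
    if not line or line.startswith("#"):
        return []  # skip blank lines, comments and directives
    parts = line.split("\t")
    if len(parts) != 9:
        return []  # not a nine-column feature line
    seqid, source, ftype, start, end, score, strand, phase, attrs = parts
    info = {
        "rec_id": seqid,
        "type": ftype,
        # Assumption: convert 1-based inclusive GFF coordinates
        # to a 0-based, half-open interval.
        "location": [int(start) - 1, int(end)],
        "strand": strand,
        "attributes": attrs,
    }
    # Serializing to JSON keeps the map output transportable between workers.
    return [("feature", json.dumps(info))]


def reduce_lines(pairs):
    """Reduce step (hypothetical helper): group features by record id."""
    by_record = defaultdict(list)
    for key, value in pairs:
        if key == "feature":
            item = json.loads(value)
            by_record[item["rec_id"]].append(item)
    return dict(by_record)


def parse_lines(lines):
    """Run map then reduce in a single process, with no Disco involved."""
    pairs = []
    for line in lines:
        pairs.extend(map_line(line))
    return reduce_lines(pairs)


demo = [
    "##gff-version 3",
    "chr1\texample\tgene\t100\t200\t.\t+\t.\tID=gene1",
    "chr1\texample\tmRNA\t100\t200\t.\t+\t.\tID=mrna1;Parent=gene1",
    "chr2\texample\tgene\t5\t50\t.\t-\t.\tID=gene2",
]
result = parse_lines(demo)
```

Because the map step only ever sees one line and emits plain JSON strings, the same two functions could in principle be handed to a distributed runner or called in a local loop, which is the property the docstring refers to.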
import os
import copy
import json
import re
import collections
import io
import itertools
import warnings
import six
from six.moves import urllib

from Bio.SeqRecord import SeqRecord
from Bio import SeqFeature
from Bio import SeqIO
from Bio import BiopythonDeprecationWarning

import disco

# Make defaultdict compatible with versions of Python older than 2.5
try:
    collections.defaultdict
except AttributeError:
    import _utils

    collections.defaultdict = _utils.defaultdict

unknown_seq_avail = False
try:
    from Bio.Seq import UnknownSeq

    unknown_seq_avail = True
except ImportError:
    # UnknownSeq has been removed starting with Biopython 1.81
    from Bio.Seq import _UndefinedSequenceData
    from Bio.Seq import Seq


warnings.simplefilter("ignore", BiopythonDeprecationWarning)
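The `_split_keyvals` helper nested inside `_gff_line_map` in this listing separates GFF3 `key=value` attributes from GFF2/GTF `key "value"` attributes. A simplified standalone sketch of that idea is shown here. `split_attributes` is a hypothetical name, and the sketch deliberately omits details of the real helper: it does not re-merge quoted GFF3 values that were split at semicolons, and it leaves keys without a value as empty lists.

```python
import re
from collections import defaultdict

# A GFF3 attribute column starts with a word followed directly by "=".
GFF3_KEY = re.compile(r"\w+=")


def split_attributes(text):
    """Split a GFF attribute column into {key: [values]} (illustrative only)."""
    quals = defaultdict(list)
    text = text.strip()
    if text.endswith(";"):
        text = text[:-1]  # tolerate a stray trailing semicolon
    if not text:
        return {}
    # Try the " ; " separator first so semicolons inside quoted
    # GFF2 values survive, then fall back to a bare ";".
    parts = text.split(" ; ")
    if len(parts) == 1:
        parts = [p.strip() for p in text.split(";")]
    if GFF3_KEY.match(parts[0]):
        for part in parts:
            key, _, val = part.partition("=")
            quals[key].extend(v for v in val.split(",") if v)
    else:
        for part in parts:
            key, _, val = part.strip().partition(" ")
            val = val.strip()
            if len(val) >= 2 and val[0] == '"' and val[-1] == '"':
                quals[key].append(val[1:-1])  # quoted value: keep it whole
            elif val:
                quals[key].extend(v for v in val.split(",") if v)
    return dict(quals)


gff3 = split_attributes("count=9;gene=amx-2;sequence=SAGE:aacggagccg")
gff2 = split_attributes('Sequence "Y74C9A" ; Note "Clone Y74C9A; Genbank AC024206"')
gtf = split_attributes('name "fgenesh1_pg.C_chr_1000003"; transcriptId 869')
```

Quoted values are appended whole so that commas or semicolons inside the quotes are not treated as separators, while unquoted values are split on commas into multiple entries.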


def _gff_line_map(line, params):
    """Map part of Map-Reduce; parses a line of GFF into a dictionary.

    Given an input line from a GFF file, this:
    - decides if the line passes our filtering limits
    - if so:
        - breaks it into component elements
        - determines the type of attribute (flat, parent, child or annotation)
        - generates a dictionary of GFF info which can be serialized as JSON
    """

    def _merge_keyvals(parts):
        """Merge key-values escaped by quotes
        that are improperly split at semicolons."""
        out = []
        for i, p in enumerate(parts):
            if (
                i > 0
                and len(p) == 1
                and p[0].endswith('"')
                and not p[0].startswith('"')
            ):
                if out[-1][-1].startswith('"'):
                    prev_p = out.pop(-1)
                    to_merge = prev_p[-1]
                    prev_p[-1] = "%s; %s" % (to_merge, p[0])
                    out.append(prev_p)
            else:
                out.append(p)
        return out

    gff3_kw_pat = re.compile(r"\w+=")

    def _split_keyvals(keyval_str):
        """Split key-value pairs in a GFF2, GTF and GFF3 compatible way.

        GFF3 has key value pairs like:
          count=9;gene=amx-2;sequence=SAGE:aacggagccg
        GFF2 and GTF have:
          Sequence "Y74C9A" ; Note "Clone Y74C9A; Genbank AC024206"
          name "fgenesh1_pg.C_chr_1000003"; transcriptId 869
        """
        quals = collections.defaultdict(list)
        if keyval_str is None:
            return quals
        # ensembl GTF has a stray semi-colon at the end
        if keyval_str[-1] == ";":
            keyval_str = keyval_str[:-1]
        # GFF2/GTF has a semi-colon with at least one space after it.
        # It can have spaces on both sides; wormbase does this.
        # GFF3 works with no spaces.
        # Split at the first one we can recognize as working
        parts = keyval_str.split(" ; ")
        if len(parts) == 1:
            parts = [x.strip() for x in keyval_str.split(";")]
        # check if we have GFF3 style key-vals (with =)
        is_gff2 = True
        if gff3_kw_pat.match(parts[0]):
            is_gff2 = False
            key_vals = _merge_keyvals([p.split("=") for p in parts])
        # otherwise, we are separated by a space with a key as the first item
        else:
            pieces = []
            for p in parts:
                # fix misplaced semi-colons in keys in some GFF2 files
                if p and p[0] == ";":
                    p = p[1:]
                pieces.append(p.strip().split(" "))
            key_vals = [(p[0], " ".join(p[1:])) for p in pieces]
        for item in key_vals:
            # standard in-spec items are key=value
            if len(item) == 2:
                key, val = item
            # out-of-spec files can have just key values. We set an empty value
            # which will be changed to true later to standardize.
            else:
                assert len(item) == 1, item
                key = item[0]
                val = ""
            # remove quotes in GFF2 files
            quoted = False
            if len(val) > 0 and val[0] == '"' and val[-1] == '"':
                quoted = True
                val = val[1:-1]
            if val:
                if quoted:
                    quals[key].append(val)
                else:
                    quals[key].extend(
                        [v for v in val.split(",") if v]
                    )
            # if we don't have a value, make this a key=True/False style
            # attribute
fubar
parents:
diff
changeset
|
151 else: |
4c201a3d4755
planemo upload for repository https://github.com/usegalaxy-eu/temporary-tools/tree/master/jbrowse2 commit a37bfdfc108501b11c7b2aa15efb1bd16f0c4b66
fubar
parents:
diff
changeset
|
152 quals[key].append("true") |
4c201a3d4755
planemo upload for repository https://github.com/usegalaxy-eu/temporary-tools/tree/master/jbrowse2 commit a37bfdfc108501b11c7b2aa15efb1bd16f0c4b66
fubar
parents:
diff
changeset
|
153 for key, vals in quals.items(): |
4c201a3d4755
planemo upload for repository https://github.com/usegalaxy-eu/temporary-tools/tree/master/jbrowse2 commit a37bfdfc108501b11c7b2aa15efb1bd16f0c4b66
fubar
parents:
diff
changeset
|
154 quals[key] = [urllib.parse.unquote(v) for v in vals] |
4c201a3d4755
planemo upload for repository https://github.com/usegalaxy-eu/temporary-tools/tree/master/jbrowse2 commit a37bfdfc108501b11c7b2aa15efb1bd16f0c4b66
fubar
parents:
diff
changeset
|
155 return quals, is_gff2 |
4c201a3d4755
planemo upload for repository https://github.com/usegalaxy-eu/temporary-tools/tree/master/jbrowse2 commit a37bfdfc108501b11c7b2aa15efb1bd16f0c4b66
fubar
parents:
diff
changeset
|
156 |
4c201a3d4755
planemo upload for repository https://github.com/usegalaxy-eu/temporary-tools/tree/master/jbrowse2 commit a37bfdfc108501b11c7b2aa15efb1bd16f0c4b66
fubar
parents:
diff
changeset
|
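The value handling at the end of `_split_keyvals` can be tried in isolation. The sketch below is a standalone approximation, not part of this module: `split_attribute_values` is a hypothetical name, and it assumes the key/value pairs have already been separated. It shows quoted GFF2 values kept whole, unquoted values split on commas, bare keys stored as "true", and URL-escaped characters decoded.

```python
import collections
import urllib.parse


def split_attribute_values(key_vals):
    """Hypothetical helper: normalise (key, value) pairs into lists of values."""
    quals = collections.defaultdict(list)
    for key, val in key_vals:
        # GFF2/GTF values may be wrapped in double quotes; keep those whole
        quoted = False
        if len(val) > 0 and val[0] == '"' and val[-1] == '"':
            quoted = True
            val = val[1:-1]
        if val:
            if quoted:
                quals[key].append(val)
            else:
                # unquoted values are comma-separated lists; drop empty items
                quals[key].extend(v for v in val.split(",") if v)
        else:
            # a key without a value becomes a flag-style attribute
            quals[key].append("true")
    # decode URL escapes such as %3A after splitting
    return {k: [urllib.parse.unquote(v) for v in vs] for k, vs in quals.items()}


example = split_attribute_values(
    [
        ("Note", '"Clone Y74C9A; Genbank AC024206"'),
        ("Alias", "a,b,,c"),
        ("Dbxref", "GO%3A0005634"),
        ("pseudo", ""),
    ]
)
print(example)
```

Decoding after the comma split means an escaped comma (`%2C`) inside a value is not treated as a separator.
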
    def _nest_gff2_features(gff_parts):
        """Provide nesting of GFF2 transcript parts with transcript IDs.

        exons and coding sequences are mapped to a parent with a transcript_id
        in GFF2. This is implemented differently at different genome centers
        and this function attempts to resolve that and map things to the GFF3
        way of doing them.
        """
        # map protein or transcript ids to a parent
        for transcript_id in [
            "transcript_id",
            "transcriptId",
            "proteinId",
        ]:
            try:
                gff_parts["quals"]["Parent"] = gff_parts["quals"][
                    transcript_id
                ]
                break
            except KeyError:
                pass
        # case for WormBase GFF -- everything labelled as Transcript or CDS
        for flat_name in ["Transcript", "CDS"]:
            if flat_name in gff_parts["quals"]:
                # parent types
                if gff_parts["type"] in [flat_name]:
                    if not gff_parts["id"]:
                        gff_parts["id"] = gff_parts["quals"][
                            flat_name
                        ][0]
                        gff_parts["quals"]["ID"] = [gff_parts["id"]]
                # children types
                elif gff_parts["type"] in [
                    "intron",
                    "exon",
                    "three_prime_UTR",
                    "coding_exon",
                    "five_prime_UTR",
                    "CDS",
                    "stop_codon",
                    "start_codon",
                ]:
                    gff_parts["quals"]["Parent"] = gff_parts["quals"][
                        flat_name
                    ]
                break

        return gff_parts

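To make the first step of `_nest_gff2_features` concrete, here is a minimal sketch of promoting a GTF-style transcript identifier to a GFF3-style `Parent` qualifier. The function name `promote_transcript_id` and the sample qualifiers are assumptions made for illustration; the WormBase `Transcript`/`CDS` handling is deliberately left out.

```python
def promote_transcript_id(
    quals, id_keys=("transcript_id", "transcriptId", "proteinId")
):
    """Hypothetical helper: copy the first matching ID key to 'Parent'."""
    for id_key in id_keys:
        if id_key in quals:
            # the child feature now points at its transcript, GFF3 style
            quals["Parent"] = quals[id_key]
            break
    return quals


# Invented qualifiers for a GTF exon line
gtf_exon = {"gene_id": ["g1"], "transcript_id": ["t1"], "exon_number": ["1"]}
nested = promote_transcript_id(dict(gtf_exon))
print(nested["Parent"])

# A feature with none of the known keys is left untouched
untouched = promote_transcript_id({"Note": ["no transcript here"]})
print("Parent" in untouched)
```
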
    strand_map = {"+": 1, "-": -1, "?": None, None: None}
    line = line.strip()
    if line[:2] == "##":
        return [("directive", line[2:])]
    elif line and line[0] != "#":
        parts = line.split("\t")
        should_do = True
        if params.limit_info:
            for limit_name, limit_values in params.limit_info.items():
                cur_id = tuple(
                    [parts[i] for i in params.filter_info[limit_name]]
                )
                if cur_id not in limit_values:
                    should_do = False
                    break
        if should_do:
            assert len(parts) >= 8, line
            # not python2.4 compatible but easier to understand
            # gff_parts = [(None if p == '.' else p) for p in parts]
            gff_parts = []
            for p in parts:
                if p == ".":
                    gff_parts.append(None)
                else:
                    gff_parts.append(p)
            gff_info = dict()
            # collect all of the base qualifiers for this item
            if len(parts) > 8:
                quals, is_gff2 = _split_keyvals(gff_parts[8])
            else:
                quals, is_gff2 = collections.defaultdict(list), False
            gff_info["is_gff2"] = is_gff2
            if gff_parts[1]:
                quals["source"].append(gff_parts[1])
            if gff_parts[5]:
                quals["score"].append(gff_parts[5])
            if gff_parts[7]:
                quals["phase"].append(gff_parts[7])
            gff_info["quals"] = dict(quals)
            gff_info["rec_id"] = gff_parts[0]
            # if we are describing a location, then we are a feature
            if gff_parts[3] and gff_parts[4]:
                gff_info["location"] = [
                    int(gff_parts[3]) - 1,
                    int(gff_parts[4]),
                ]
                gff_info["type"] = gff_parts[2]
                gff_info["id"] = quals.get("ID", [""])[0]
                gff_info["strand"] = strand_map.get(
                    gff_parts[6], None
                )
                if is_gff2:
                    gff_info = _nest_gff2_features(gff_info)
                # features that have parents need to link so we can pick up
                # the relationship
                if "Parent" in gff_info["quals"]:
                    # check for self referential parent/child relationships
                    # remove the ID, which is not useful
                    for p in gff_info["quals"]["Parent"]:
                        if p == gff_info["id"]:
                            gff_info["id"] = ""
                            del gff_info["quals"]["ID"]
                            break
                    final_key = "child"
                elif gff_info["id"]:
                    final_key = "parent"
                # Handle flat features
                else:
                    final_key = "feature"
            # otherwise, associate these annotations with the full record
            else:
                final_key = "annotation"
            if params.jsonify:
                return [(final_key, json.dumps(gff_info))]
            else:
                return [(final_key, gff_info)]
    return []


285 def _gff_line_reduce(map_results, out, params): |
4c201a3d4755
planemo upload for repository https://github.com/usegalaxy-eu/temporary-tools/tree/master/jbrowse2 commit a37bfdfc108501b11c7b2aa15efb1bd16f0c4b66
fubar
parents:
diff
changeset
|
286 """Reduce part of Map-Reduce; combines results of parsed features.""" |
4c201a3d4755
planemo upload for repository https://github.com/usegalaxy-eu/temporary-tools/tree/master/jbrowse2 commit a37bfdfc108501b11c7b2aa15efb1bd16f0c4b66
fubar
parents:
diff
changeset
|
287 final_items = dict() |
4c201a3d4755
planemo upload for repository https://github.com/usegalaxy-eu/temporary-tools/tree/master/jbrowse2 commit a37bfdfc108501b11c7b2aa15efb1bd16f0c4b66
fubar
parents:
diff
changeset
|
288 for gff_type, final_val in map_results: |
4c201a3d4755
planemo upload for repository https://github.com/usegalaxy-eu/temporary-tools/tree/master/jbrowse2 commit a37bfdfc108501b11c7b2aa15efb1bd16f0c4b66
fubar
parents:
diff
changeset
|
289 if params.jsonify and gff_type not in ["directive"]: |
4c201a3d4755
planemo upload for repository https://github.com/usegalaxy-eu/temporary-tools/tree/master/jbrowse2 commit a37bfdfc108501b11c7b2aa15efb1bd16f0c4b66
fubar
parents:
diff
changeset
|
290 final_val = json.loads(final_val) |
4c201a3d4755
planemo upload for repository https://github.com/usegalaxy-eu/temporary-tools/tree/master/jbrowse2 commit a37bfdfc108501b11c7b2aa15efb1bd16f0c4b66
fubar
parents:
diff
changeset
|
291 try: |
4c201a3d4755
planemo upload for repository https://github.com/usegalaxy-eu/temporary-tools/tree/master/jbrowse2 commit a37bfdfc108501b11c7b2aa15efb1bd16f0c4b66
fubar
parents:
diff
changeset
|
292 final_items[gff_type].append(final_val) |
4c201a3d4755
planemo upload for repository https://github.com/usegalaxy-eu/temporary-tools/tree/master/jbrowse2 commit a37bfdfc108501b11c7b2aa15efb1bd16f0c4b66
fubar
parents:
diff
changeset
|
293 except KeyError: |
4c201a3d4755
planemo upload for repository https://github.com/usegalaxy-eu/temporary-tools/tree/master/jbrowse2 commit a37bfdfc108501b11c7b2aa15efb1bd16f0c4b66
fubar
parents:
diff
changeset
|
294 final_items[gff_type] = [final_val] |
4c201a3d4755
planemo upload for repository https://github.com/usegalaxy-eu/temporary-tools/tree/master/jbrowse2 commit a37bfdfc108501b11c7b2aa15efb1bd16f0c4b66
fubar
parents:
diff
changeset
|
295 for key, vals in final_items.items(): |
4c201a3d4755
planemo upload for repository https://github.com/usegalaxy-eu/temporary-tools/tree/master/jbrowse2 commit a37bfdfc108501b11c7b2aa15efb1bd16f0c4b66
fubar
parents:
diff
changeset
|
296 if params.jsonify: |
4c201a3d4755
planemo upload for repository https://github.com/usegalaxy-eu/temporary-tools/tree/master/jbrowse2 commit a37bfdfc108501b11c7b2aa15efb1bd16f0c4b66
fubar
parents:
diff
changeset
|
297 vals = json.dumps(vals) |
4c201a3d4755
planemo upload for repository https://github.com/usegalaxy-eu/temporary-tools/tree/master/jbrowse2 commit a37bfdfc108501b11c7b2aa15efb1bd16f0c4b66
fubar
parents:
diff
changeset
|
298 out.add(key, vals) |
4c201a3d4755
planemo upload for repository https://github.com/usegalaxy-eu/temporary-tools/tree/master/jbrowse2 commit a37bfdfc108501b11c7b2aa15efb1bd16f0c4b66
fubar
parents:
diff
changeset
|
299 |
4c201a3d4755
planemo upload for repository https://github.com/usegalaxy-eu/temporary-tools/tree/master/jbrowse2 commit a37bfdfc108501b11c7b2aa15efb1bd16f0c4b66
fubar
parents:
diff
changeset
|
300 |
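The grouping performed by the reduce step can be tried in isolation. The following is a minimal sketch, not the module's own code: `Params`, `ListCollector`, and `reduce_sketch` are hypothetical stand-ins, and the shape of the "annotation" values is invented for illustration. It assumes only what the function above shows, namely that `params` exposes a `jsonify` flag and `out` exposes an `add(key, vals)` method.

```python
import json
from collections import namedtuple

# Hypothetical stand-in for the parser's params object (only `jsonify` is used).
Params = namedtuple("Params", ["jsonify"])


class ListCollector:
    """Hypothetical collector exposing the add(key, vals) method used by `out`."""

    def __init__(self):
        self.items = []

    def add(self, key, vals):
        self.items.append((key, vals))


def reduce_sketch(map_results, out, params):
    """Group mapped values by their type key, mirroring the reduce step."""
    grouped = {}
    for gff_type, value in map_results:
        # Directives travel as raw strings, so they are never JSON-decoded.
        if params.jsonify and gff_type != "directive":
            value = json.loads(value)
        grouped.setdefault(gff_type, []).append(value)
    for key, vals in grouped.items():
        out.add(key, json.dumps(vals) if params.jsonify else vals)


# Illustrative map output: one directive and two annotation dictionaries.
mapped = [
    ("directive", "gff-version 3"),
    ("annotation", {"rec_id": "chr1"}),
    ("annotation", {"rec_id": "chr2"}),
]
out = ListCollector()
reduce_sketch(mapped, out, Params(jsonify=False))
grouped = dict(out.items)
```

Here `setdefault` replaces the try/except `KeyError` idiom; both produce one list per key, in first-seen order.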
class _MultiIDRemapper:
    """Provide an ID remapping for cases where a parent has a non-unique ID.

    Real life GFF3 cases have non-unique ID attributes, which we fix here
    by using the unique sequence region to assign children to the right
    parent.
    """

    def __init__(self, base_id, all_parents):
        self._base_id = base_id
        self._parents = all_parents

    def remap_id(self, feature_dict):
        rstart, rend = feature_dict["location"]
        for index, parent in enumerate(self._parents):
            pstart, pend = parent["location"]
            if rstart >= pstart and rend <= pend:
                if index > 0:
                    return "%s_%s" % (self._base_id, index + 1)
                else:
                    return self._base_id
        # if we haven't found a location match but the parent is unambiguous,
        # return its ID
        if len(self._parents) == 1:
            return self._base_id
        raise ValueError(
            "Did not find remapped ID location: %s, %s, %s"
            % (
                self._base_id,
                [p["location"] for p in self._parents],
                feature_dict["location"],
            )
        )

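The containment rule used for remapping can be illustrated with a standalone function. This is a sketch under stated assumptions: `remap_by_location` is a hypothetical name, the coordinates are made up, and features are represented as plain dictionaries with a `"location"` pair, as in the class above.

```python
def remap_by_location(base_id, parents, feature):
    """Return the ID of the first parent whose span contains the feature."""
    fstart, fend = feature["location"]
    for index, parent in enumerate(parents):
        pstart, pend = parent["location"]
        if fstart >= pstart and fend <= pend:
            # The first parent keeps the base ID; later ones get a 1-based suffix.
            return base_id if index == 0 else "%s_%s" % (base_id, index + 1)
    # No containing parent: only safe to fall back when there is a single parent.
    if len(parents) == 1:
        return base_id
    raise ValueError("No parent of %s contains %s" % (base_id, feature["location"]))


# Two parents that share the (hypothetical) ID "gene1" at different locations.
parents = [{"location": (0, 100)}, {"location": (500, 900)}]
first_id = remap_by_location("gene1", parents, {"location": (10, 20)})
second_id = remap_by_location("gene1", parents, {"location": (600, 650)})
```

A child inside the first parent keeps `gene1`, while a child inside the second parent is assigned `gene1_2`. A child that falls outside every parent raises `ValueError` when more than one parent shares the ID.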
class _AbstractMapReduceGFF:
    """Base class providing general GFF parsing for local and remote classes.

    This class should be subclassed to provide a concrete class to parse
    GFF under specific conditions. These classes need to implement
    the _gff_process function, which returns a dictionary of SeqRecord
    information.
    """

    def __init__(self, create_missing=True):
        """Initialize GFF parser

        create_missing - If True, create blank records for GFF ids not in
        the base_dict. If False, an error will be raised.
        """
        self._create_missing = create_missing
        self._map_fn = _gff_line_map
        self._reduce_fn = _gff_line_reduce
        self._examiner = GFFExaminer()

    def _gff_process(self, gff_files, limit_info, target_lines=None):
        raise NotImplementedError("Derived class must define")

    def parse(self, gff_files, base_dict=None, limit_info=None):
        """Parse a GFF file, returning an iterator of SeqRecords.

        limit_info - A dictionary specifying the regions of the GFF file
        which should be extracted. This allows only relevant portions of a file
        to be parsed.

        base_dict - A base dictionary of SeqRecord objects which may be
        pre-populated with sequences and other features. The new features from
        the GFF file will be added to this dictionary.
        """
        for rec in self.parse_in_parts(
            gff_files, base_dict, limit_info
        ):
            yield rec

    def parse_in_parts(
        self,
        gff_files,
        base_dict=None,
        limit_info=None,
        target_lines=None,
    ):
        """Parse a specified region of a GFF file, returning info as generated.

        target_lines -- The number of lines in the file which should be used
        for each partial parse. This should be determined based on available
        memory.
        """
        for results in self.parse_simple(
            gff_files, limit_info, target_lines
        ):
            if base_dict is None:
                cur_dict = dict()
            else:
                cur_dict = copy.deepcopy(base_dict)
            cur_dict = self._results_to_features(cur_dict, results)
            all_ids = list(cur_dict.keys())
            all_ids.sort()
            for cur_id in all_ids:
                yield cur_dict[cur_id]

    def parse_simple(
        self, gff_files, limit_info=None, target_lines=1
    ):
        """Simple parse which does not build or nest features.

        This returns a simple dictionary representation of each line in the
        GFF file.
        """
        # gracefully handle a single file passed
        if not isinstance(gff_files, (list, tuple)):
            gff_files = [gff_files]
        limit_info = self._normalize_limit_info(limit_info)
        for results in self._gff_process(
            gff_files, limit_info, target_lines
        ):
            yield results

    def _normalize_limit_info(self, limit_info):
        """Turn all limit information into tuples for identical comparisons."""
        final_limit_info = {}
        if limit_info:
            for key, values in limit_info.items():
                final_limit_info[key] = []
                for v in values:
                    if isinstance(v, str):
                        final_limit_info[key].append((v,))
                    else:
                        final_limit_info[key].append(tuple(v))
        return final_limit_info

4c201a3d4755
planemo upload for repository https://github.com/usegalaxy-eu/temporary-tools/tree/master/jbrowse2 commit a37bfdfc108501b11c7b2aa15efb1bd16f0c4b66
fubar
parents:
diff
changeset
|
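The normalization above can be tried in isolation. The following is a minimal standalone sketch, not part of the parser: the function name `normalize_limit_info` and the sample dictionary are illustrative assumptions, and only the string-versus-sequence handling mirrors the method.

```python
def normalize_limit_info(limit_info):
    """Standalone sketch: turn every limit value into a tuple.

    Plain strings become 1-tuples; any other sequence is converted with
    tuple(), so values can later be compared with ==.
    """
    normalized = {}
    for key, values in (limit_info or {}).items():
        normalized[key] = [
            (v,) if isinstance(v, str) else tuple(v) for v in values
        ]
    return normalized


# Hypothetical input mixing plain strings and (source, type) pairs.
limits = {
    "gff_id": ["chr1"],
    "gff_source_type": [["Coding_transcript", "gene"]],
}
result = normalize_limit_info(limits)
```

Converting everything to tuples means a list from user input and a tuple built while parsing compare equal, which is presumably the point of the "identical comparisons" wording in the docstring.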
    def _results_to_features(self, base, results):
        """Add parsed dictionaries of results to Biopython SeqFeatures."""
        base = self._add_annotations(
            base, results.get("annotation", [])
        )
        for feature in results.get("feature", []):
            (_, base) = self._add_toplevel_feature(base, feature)
        base = self._add_parent_child_features(
            base, results.get("parent", []), results.get("child", [])
        )
        base = self._add_seqs(base, results.get("fasta", []))
        base = self._add_directives(
            base, results.get("directive", [])
        )
        return base

    def _add_directives(self, base, directives):
        """Handle any directives or meta-data in the GFF file.

        Relevant items are added as annotation meta-data to each record.
        """
        dir_keyvals = collections.defaultdict(list)
        for directive in directives:
            parts = directive.split()
            if len(parts) > 1:
                key = parts[0]
                if len(parts) == 2:
                    val = parts[1]
                else:
                    val = tuple(parts[1:])
                # specific directives that need special handling
                if (
                    key == "sequence-region"
                ):  # convert to Python 0-based coordinates
                    if len(val) == 2:  # handle regions missing contig
                        val = (int(val[0]) - 1, int(val[1]))
                    elif len(val) == 3:
                        val = (val[0], int(val[1]) - 1, int(val[2]))
                dir_keyvals[key].append(val)
        for key, vals in dir_keyvals.items():
            for rec in base.values():
                self._add_ann_to_rec(rec, key, vals)
        return base

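The coordinate handling for `sequence-region` can be sketched on its own. This is an illustrative helper, not the parser's API: the name `parse_sequence_region` is hypothetical, and it assumes the directive text arrives with the leading `##` already stripped, as the key comparison in the method suggests.

```python
def parse_sequence_region(directive):
    """Sketch: convert a sequence-region directive to 0-based coordinates.

    GFF3 positions are 1-based and inclusive; subtracting 1 from the start
    and keeping the end gives a Python-style half-open interval.
    """
    parts = directive.split()
    if not parts or parts[0] != "sequence-region":
        return None
    fields = parts[1:]
    if len(fields) == 2:  # region without a contig name
        return (int(fields[0]) - 1, int(fields[1]))
    if len(fields) == 3:  # contig, start, end
        return (fields[0], int(fields[1]) - 1, int(fields[2]))
    return None


with_contig = parse_sequence_region("sequence-region chr1 1 1000")
without_contig = parse_sequence_region("sequence-region 1 1000")
```

Unlike the method, this sketch checks the number of whitespace-separated fields rather than `len(val)`, so a two-character single field is not mistaken for a start/end pair.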
    def _get_matching_record_id(self, base, find_id):
        """Find a matching base record with the test identifier, handling tricky cases.

        NCBI IDs https://en.wikipedia.org/wiki/FASTA_format#NCBI_identifiers
        """
        # Straight matches for identifiers
        if find_id in base:
            return find_id
        # NCBI style IDs in find_id
        elif find_id and find_id.find("|") > 0:
            for test_id in [
                x.strip() for x in find_id.split("|")[1:]
            ]:
                if test_id and test_id in base:
                    return test_id
        # NCBI style IDs in base IDs
        else:
            for base_id in base.keys():
                if base_id.find("|") > 0:
                    for test_id in [
                        x.strip() for x in base_id.split("|")[1:]
                    ]:
                        if test_id and test_id == find_id:
                            return base_id
        return None

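To see the three matching cases side by side, here is a standalone sketch that follows the same branch order using a plain dictionary. The function name `match_record_id` and the sample identifiers are assumptions for illustration only.

```python
def match_record_id(base, find_id):
    """Sketch of the lookup order: exact match, then pipe-delimited parts."""
    if find_id in base:
        return find_id
    if find_id and find_id.find("|") > 0:
        # NCBI-style ID in the query: try each field after the first.
        for test_id in (x.strip() for x in find_id.split("|")[1:]):
            if test_id and test_id in base:
                return test_id
    else:
        # NCBI-style IDs among the stored keys: compare their fields.
        for base_id in base:
            if base_id.find("|") > 0:
                for test_id in (x.strip() for x in base_id.split("|")[1:]):
                    if test_id and test_id == find_id:
                        return base_id
    return None


# Hypothetical records keyed by a plain accession and by an NCBI-style ID.
plain = {"NC_000913.3": "record-a"}
piped = {"gi|556503834|ref|NC_000913.3|": "record-b"}

from_piped_query = match_record_id(plain, "gi|556503834|ref|NC_000913.3|")
from_piped_key = match_record_id(piped, "NC_000913.3")
no_match = match_record_id(plain, "chrX")
```

As in the method, a query that contains `|` but matches nothing returns `None` without scanning the stored keys for pipe-delimited IDs.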
    def _add_seqs(self, base, recs):
        """Add sequence information contained in the GFF3 to records."""
        for rec in recs:
            match_id = self._get_matching_record_id(base, rec.id)
            if match_id:
                base[match_id].seq = rec.seq
            else:
                base[rec.id] = rec
        return base

    def _add_parent_child_features(self, base, parents, children):
        """Add nested features with parent child relationships."""
        multi_remap = self._identify_dup_ids(parents)
        # add children features
        children_prep = collections.defaultdict(list)
        for child_dict in children:
            child_feature = self._get_feature(child_dict)
            for pindex, pid in enumerate(
                child_feature.qualifiers["Parent"]
            ):
                if pid in multi_remap:
                    pid = multi_remap[pid].remap_id(child_dict)
                    child_feature.qualifiers["Parent"][pindex] = pid
                children_prep[pid].append(
                    (child_dict["rec_id"], child_feature)
                )
        children = dict(children_prep)
        # add children to parents that exist
        for cur_parent_dict in parents:
            cur_id = cur_parent_dict["id"]
            if cur_id in multi_remap:
                cur_parent_dict["id"] = multi_remap[cur_id].remap_id(
                    cur_parent_dict
                )
            cur_parent, base = self._add_toplevel_feature(
                base, cur_parent_dict
            )
            cur_parent, children = self._add_children_to_parent(
                cur_parent, children
            )
        # create parents for children without them (GFF2 or split/bad files)
        while len(children) > 0:
            parent_id, cur_children = next(
                itertools.islice(children.items(), 1)
            )
            # one child, do not nest it
            if len(cur_children) == 1:
                rec_id, child = cur_children[0]
                loc = (child.location.start, child.location.end)
                rec, base = self._get_rec(
                    base, dict(rec_id=rec_id, location=loc)
                )
                rec.features.append(child)
                del children[parent_id]
            else:
                cur_parent, base = self._add_missing_parent(
                    base, parent_id, cur_children
                )
                cur_parent, children = self._add_children_to_parent(
                    cur_parent, children
                )
        return base

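The grouping step at the start of the method, and the later split between children whose parent exists and children whose parent is missing, can be sketched with plain dictionaries. Everything here is a simplified assumption: the helper names and the dictionary layout are hypothetical, and no Biopython objects or ID remapping are involved.

```python
import collections


def group_children_by_parent(children):
    """Sketch: index child IDs under every parent ID they reference."""
    grouped = collections.defaultdict(list)
    for child in children:
        for parent_id in child["Parent"]:
            grouped[parent_id].append(child["id"])
    return dict(grouped)


def split_orphans(grouped, parent_ids):
    """Sketch: separate groups with a known parent from those without."""
    known = set(parent_ids)
    attached = {p: kids for p, kids in grouped.items() if p in known}
    orphans = {p: kids for p, kids in grouped.items() if p not in known}
    return attached, orphans


# Hypothetical features: exon2 is shared by two transcripts, and cds1
# points at a parent that never appears in the file.
children = [
    {"id": "exon1", "Parent": ["mRNA1"]},
    {"id": "exon2", "Parent": ["mRNA1", "mRNA2"]},
    {"id": "cds1", "Parent": ["missing"]},
]
grouped = group_children_by_parent(children)
attached, orphans = split_orphans(grouped, ["mRNA1", "mRNA2"])
```

In the method itself the orphan groups are handled in the final `while` loop: a lone child is appended directly to its record, and a larger group gets a synthesized parent.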
    def _identify_dup_ids(self, parents):
        """Identify duplicated ID attributes in potential nested parents.

        According to the GFF3 spec, ID attributes are supposed to be unique
        within a file, but this is not always true in practice. This looks
        for duplicates and provides unique IDs sorted by location.
        """
        multi_ids = collections.defaultdict(list)
        for parent in parents:
            multi_ids[parent["id"]].append(parent)
        multi_ids = [
            (mid, ps)
            for (mid, ps) in multi_ids.items()
            # keep only IDs that occur on more than one parent
            if len(ps) > 1
4c201a3d4755
planemo upload for repository https://github.com/usegalaxy-eu/temporary-tools/tree/master/jbrowse2 commit a37bfdfc108501b11c7b2aa15efb1bd16f0c4b66
fubar
parents:
diff
changeset
|
578 ] |
4c201a3d4755
planemo upload for repository https://github.com/usegalaxy-eu/temporary-tools/tree/master/jbrowse2 commit a37bfdfc108501b11c7b2aa15efb1bd16f0c4b66
fubar
parents:
diff
changeset
|
579 multi_remap = dict() |
4c201a3d4755
planemo upload for repository https://github.com/usegalaxy-eu/temporary-tools/tree/master/jbrowse2 commit a37bfdfc108501b11c7b2aa15efb1bd16f0c4b66
fubar
parents:
diff
changeset
|
580 for mid, parents in multi_ids: |
4c201a3d4755
planemo upload for repository https://github.com/usegalaxy-eu/temporary-tools/tree/master/jbrowse2 commit a37bfdfc108501b11c7b2aa15efb1bd16f0c4b66
fubar
parents:
diff
changeset
|
581 multi_remap[mid] = _MultiIDRemapper(mid, parents) |
4c201a3d4755
planemo upload for repository https://github.com/usegalaxy-eu/temporary-tools/tree/master/jbrowse2 commit a37bfdfc108501b11c7b2aa15efb1bd16f0c4b66
fubar
parents:
diff
changeset
|
582 return multi_remap |
4c201a3d4755
planemo upload for repository https://github.com/usegalaxy-eu/temporary-tools/tree/master/jbrowse2 commit a37bfdfc108501b11c7b2aa15efb1bd16f0c4b66
fubar
parents:
diff
changeset
|
583 |
    def _add_children_to_parent(self, cur_parent, children):
        """Recursively add children to parent features."""
        if cur_parent.id in children:
            cur_children = children[cur_parent.id]
            ready_children = []
            for _, cur_child in cur_children:
                cur_child, _ = self._add_children_to_parent(
                    cur_child, children
                )
                ready_children.append(cur_child)
            # Support Biopython features for 1.62+
            # CompoundLocations and pre-1.62
            if not hasattr(SeqFeature, "CompoundLocation"):
                cur_parent.location_operator = "join"
            for cur_child in ready_children:
                cur_parent.sub_features.append(cur_child)
            del children[cur_parent.id]
        return cur_parent, children

    def _add_annotations(self, base, anns):
        """Add annotation data from the GFF file to records."""
        # add these as a list of annotations, checking not to overwrite
        # current values
        for ann in anns:
            rec, base = self._get_rec(base, ann)
            for key, vals in ann["quals"].items():
                self._add_ann_to_rec(rec, key, vals)
        return base

    def _add_ann_to_rec(self, rec, key, vals):
        """Add a key/value annotation to the given SeqRecord."""
        if key in rec.annotations:
            try:
                rec.annotations[key].extend(vals)
            except AttributeError:
                rec.annotations[key] = [rec.annotations[key]] + vals
        else:
            rec.annotations[key] = vals

    def _get_rec(self, base, info_dict):
        """Retrieve a record to add features to."""
        max_loc = info_dict.get("location", (0, 1))[1]
        match_id = self._get_matching_record_id(
            base, info_dict["rec_id"]
        )
        if match_id:
            cur_rec = base[match_id]
            # update generated unknown sequences
            if unknown_seq_avail and isinstance(
                cur_rec.seq, UnknownSeq
            ):
                cur_rec.seq._length = max(
                    [max_loc, cur_rec.seq._length]
                )
            elif not unknown_seq_avail and isinstance(
                cur_rec.seq._data, _UndefinedSequenceData
            ):
                cur_rec.seq._data._length = max(
                    [max_loc, cur_rec.seq._data._length]
                )
            return cur_rec, base
        elif self._create_missing:
            if unknown_seq_avail:
                new_rec = SeqRecord(
                    UnknownSeq(max_loc), info_dict["rec_id"]
                )
            else:
                new_rec = SeqRecord(
                    Seq(None, length=max_loc), info_dict["rec_id"]
                )
            base[info_dict["rec_id"]] = new_rec
            return new_rec, base
        else:
            raise KeyError(
                "Did not find matching record in %s for %s"
                % (base.keys(), info_dict)
            )

    def _add_missing_parent(self, base, parent_id, cur_children):
        """Add a new feature that is missing from the GFF file."""
        base_rec_id = list(set(c[0] for c in cur_children))
        child_strands = list(
            set(c[1].location.strand for c in cur_children)
        )
        inferred_strand = (
            child_strands[0] if len(child_strands) == 1 else None
        )
        assert len(base_rec_id) > 0
        feature_dict = dict(
            id=parent_id,
            strand=inferred_strand,
            type="inferred_parent",
            quals=dict(ID=[parent_id]),
            rec_id=base_rec_id[0],
        )
        coords = [
            (c.location.start, c.location.end)
            for r, c in cur_children
        ]
        feature_dict["location"] = (
            min([c[0] for c in coords]),
            max([c[1] for c in coords]),
        )
        return self._add_toplevel_feature(base, feature_dict)

    def _add_toplevel_feature(self, base, feature_dict):
        """Add a toplevel non-nested feature to the appropriate record."""
        new_feature = self._get_feature(feature_dict)
        rec, base = self._get_rec(base, feature_dict)
        rec.features.append(new_feature)
        return new_feature, base

    def _get_feature(self, feature_dict):
        """Retrieve a Biopython feature from our dictionary representation."""
        # location = SeqFeature.FeatureLocation(*feature_dict['location'])
        rstart, rend = feature_dict["location"]
        new_feature = SeqFeature.SeqFeature(
            SeqFeature.SimpleLocation(
                start=rstart, end=rend, strand=feature_dict["strand"]
            ),
            feature_dict["type"],
            id=feature_dict["id"],
        )
        # Support for Biopython 1.68 and above, which removed sub_features
        if not hasattr(new_feature, "sub_features"):
            new_feature.sub_features = []
        new_feature.qualifiers = feature_dict["quals"]
        return new_feature

4c201a3d4755
planemo upload for repository https://github.com/usegalaxy-eu/temporary-tools/tree/master/jbrowse2 commit a37bfdfc108501b11c7b2aa15efb1bd16f0c4b66
fubar
parents:
diff
changeset
|
    def _parse_fasta(self, in_handle):
        """Parse FASTA sequence information contained in the GFF3 file."""
        return list(SeqIO.parse(in_handle, "fasta"))


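The method above hands the rest of the handle to Biopython once the `##FASTA` directive has been seen. As a rough, stdlib-only illustration of what "embedded FASTA" means in a GFF3 file, the sketch below collects the sequences that follow the directive. The function name `read_embedded_fasta` and the sample data are hypothetical, and this is a simplified stand-in, not Biopython's `SeqIO.parse`.

```python
from io import StringIO


def read_embedded_fasta(handle):
    """Collect (identifier, sequence) pairs from the lines after ##FASTA."""
    records = []
    name, chunks = None, []
    in_fasta = False
    for raw in handle:
        line = raw.strip()
        if not in_fasta:
            # Skip feature lines until the FASTA directive is reached.
            in_fasta = line == "##FASTA"
            continue
        if line.startswith(">"):
            if name is not None:
                records.append((name, "".join(chunks)))
            # The identifier is the first word of the header line.
            name, chunks = line[1:].split()[0], []
        elif line and name is not None:
            chunks.append(line)
    if name is not None:
        records.append((name, "".join(chunks)))
    return records


# Hypothetical miniature GFF3 file with one feature and one sequence.
example = StringIO(
    "##gff-version 3\n"
    "chr1\t.\tgene\t1\t8\t.\t+\t.\tID=gene1\n"
    "##FASTA\n"
    ">chr1 example sequence\n"
    "ACGT\n"
    "ACGT\n"
)
fasta_records = read_embedded_fasta(example)
```

In the parser itself the sequences come back as `SeqRecord` objects rather than plain tuples.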
class _GFFParserLocalOut:
    """Provide a collector for local GFF MapReduce file parsing."""

    def __init__(self, smart_breaks=False):
        self._items = dict()
        self._smart_breaks = smart_breaks
        self._missing_keys = collections.defaultdict(int)
        self._last_parent = None
        self.can_break = True
        self.num_lines = 0

    def add(self, key, vals):
        if self._smart_breaks:
            # For GFF3 input (anything that is not GFF2) we expect Parent
            # attributes, so a break is only safe once no parents are missing.
            if key == "directive":
                if vals[0] == "#":
                    self.can_break = True
                self._last_parent = None
            elif not vals[0].get("is_gff2", False):
                self._update_missing_parents(key, vals)
                self.can_break = len(self._missing_keys) == 0
            # GFF2: break once a stretch of child features has ended
            elif key != "child":
                self.can_break = True
                self._last_parent = None
            # GFF2: within a long run of child features, break only where
            # the parent changes
            else:
                cur_parent = vals[0]["quals"]["Parent"][0]
                if self._last_parent:
                    self.can_break = cur_parent != self._last_parent
                self._last_parent = cur_parent
        self.num_lines += 1
        try:
            self._items[key].extend(vals)
        except KeyError:
            self._items[key] = vals

    def _update_missing_parents(self, key, vals):
        # Track which parent IDs have been referenced but not yet seen, so
        # we can decide whether it is safe to break here. If this proves too
        # costly, we can go back to never breaking in the middle of children.
        if key == "child":
            for val in vals:
                for p_id in val["quals"]["Parent"]:
                    self._missing_keys[p_id] += 1
        for val in vals:
            try:
                del self._missing_keys[val["quals"]["ID"][0]]
            except KeyError:
                pass

    def has_items(self):
        return len(self._items) > 0

    def get_results(self):
        self._last_parent = None
        return self._items


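The parent bookkeeping in `_update_missing_parents` can be tried out in isolation. The sketch below is a hypothetical stand-alone version (the class name `MissingParentTracker` and the sample features are made up for illustration). It uses `dict.get` and `pop` where the collector catches `KeyError`, but the counting idea is the same: a child registers the parents it points to, and any feature carrying an ID clears that ID.

```python
import collections


class MissingParentTracker:
    """Hypothetical stand-alone version of the parent bookkeeping."""

    def __init__(self):
        self.missing = collections.defaultdict(int)

    def update(self, key, vals):
        # Children register every parent ID they point to.
        if key == "child":
            for val in vals:
                for parent_id in val["quals"]["Parent"]:
                    self.missing[parent_id] += 1
        # Any feature with an ID clears that ID from the pending set.
        for val in vals:
            ids = val["quals"].get("ID")
            if ids:
                self.missing.pop(ids[0], None)

    @property
    def can_break(self):
        return len(self.missing) == 0


tracker = MissingParentTracker()
exon = {"quals": {"ID": ["exon1"], "Parent": ["mRNA1"]}}
mrna = {"quals": {"ID": ["mRNA1"]}}

tracker.update("child", [exon])
break_after_child = tracker.can_break   # mRNA1 referenced but not yet seen
tracker.update("parent", [mrna])
break_after_parent = tracker.can_break  # mRNA1 has now been seen
```

Here the child arrives before its parent, so a break is refused until the parent line has been added.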
class GFFParser(_AbstractMapReduceGFF):
    """Local GFF parser providing standardized parsing of GFF files."""

    def __init__(self, line_adjust_fn=None, create_missing=True):
        _AbstractMapReduceGFF.__init__(
            self, create_missing=create_missing
        )
        self._line_adjust_fn = line_adjust_fn

    def _gff_process(self, gff_files, limit_info, target_lines):
        """Process GFF addition without any parallelization.

        In addition to limit filtering, this accepts a target_lines argument
        giving the number of lines to parse before returning results.
        This allows partial parsing of a file to prevent memory issues.
        """
        line_gen = self._file_line_generator(gff_files)
        for out in self._lines_to_out_info(
            line_gen, limit_info, target_lines
        ):
            yield out

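To illustrate the idea behind `target_lines`, the following sketch batches an iterable of lines and only cuts a batch once it is large enough and a caller-supplied predicate says the boundary is safe. The names `batch_lines` and `can_break` are hypothetical, and the sketch works on plain strings rather than parsed features, so it should be read as an outline of the chunking idea and not as the parser's own logic.

```python
def batch_lines(lines, target_lines, can_break=lambda line: True):
    """Yield lists of lines, each at least target_lines long where possible."""
    batch = []
    for line in lines:
        batch.append(line)
        # Only cut a batch once it is big enough and the boundary is safe.
        if len(batch) >= target_lines and can_break(line):
            yield batch
            batch = []
    # Whatever is left over forms a final, possibly shorter batch.
    if batch:
        yield batch


records = ["gene1", "exon1a", "exon1b", "gene2", "exon2a"]
batches = list(batch_lines(records, target_lines=2))
```

Because results are yielded batch by batch, a caller can process and discard each batch before the next one is read, which is what keeps memory use bounded on large files.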
    def _file_line_generator(self, gff_files):
        """Generate single lines from a set of GFF files."""
        for gff_file in gff_files:
            # Accept either an open handle or a path; only close what we open
            if hasattr(gff_file, "read"):
                need_close = False
                in_handle = gff_file
            else:
                need_close = True
                in_handle = open(gff_file)
            while True:
                line = in_handle.readline()
                if not line:
                    break
                yield line
            if need_close:
                in_handle.close()

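The handle-or-path pattern used by `_file_line_generator` can be sketched on its own as follows. This is a variation under stated assumptions, not the method above: the name `iter_lines` is hypothetical, and it relies on a `with` block and plain iteration where the method uses `readline` and a `need_close` flag.

```python
import io
import os
import tempfile


def iter_lines(sources):
    """Yield lines from a mix of file paths and open handles."""
    for source in sources:
        if hasattr(source, "read"):
            # The caller owns this handle, so leave it open afterwards.
            for line in source:
                yield line
        else:
            # We opened the file, so the with block closes it for us.
            with open(source) as handle:
                for line in handle:
                    yield line


# Hypothetical inputs: one in-memory handle and one file on disk.
with tempfile.NamedTemporaryFile("w", suffix=".gff", delete=False) as tmp:
    tmp.write("chr1\t.\tgene\t1\t8\t.\t+\t.\tID=gene1\n")
in_memory = io.StringIO("##gff-version 3\n")
lines = list(iter_lines([in_memory, tmp.name]))
os.unlink(tmp.name)
```

Handles passed in by the caller stay open, while files opened from a path are closed as soon as their lines are exhausted.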
    def _lines_to_out_info(self, line_iter, limit_info=None, target_lines=None):
        """Generate SeqRecord and SeqFeatures from GFF file lines."""
        params = self._examiner._get_local_params(limit_info)
        out_info = _GFFParserLocalOut(
            (target_lines is not None and target_lines > 1)
        )
        found_seqs = False
        for line in line_iter:
            results = self._map_fn(line, params)
            # Apply the optional per-line adjustment to non-directive results
            if self._line_adjust_fn and results:
                if results[0][0] not in ["directive"]:
                    results = [
                        (
                            results[0][0],
                            self._line_adjust_fn(results[0][1]),
                        )
                    ]
            self._reduce_fn(results, out_info, params)
            # Hand back a batch once enough lines are collected and a break
            # is safe, then start a fresh collector
            if (
                target_lines
                and out_info.num_lines >= target_lines
                and out_info.can_break
            ):
                yield out_info.get_results()
                out_info = _GFFParserLocalOut(
                    (target_lines is not None and target_lines > 1)
                )
            # Stop feature parsing at the FASTA directive
            if (
                results
                and results[0][0] == "directive"
                and results[0][1] == "FASTA"
            ):
                found_seqs = True
                break

853 class FakeHandle: |
4c201a3d4755
planemo upload for repository https://github.com/usegalaxy-eu/temporary-tools/tree/master/jbrowse2 commit a37bfdfc108501b11c7b2aa15efb1bd16f0c4b66
fubar
parents:
diff
changeset
|
854 def __init__(self, line_iter): |
4c201a3d4755
planemo upload for repository https://github.com/usegalaxy-eu/temporary-tools/tree/master/jbrowse2 commit a37bfdfc108501b11c7b2aa15efb1bd16f0c4b66
fubar
parents:
diff
changeset
|
855 self._iter = line_iter |
4c201a3d4755
planemo upload for repository https://github.com/usegalaxy-eu/temporary-tools/tree/master/jbrowse2 commit a37bfdfc108501b11c7b2aa15efb1bd16f0c4b66
fubar
parents:
diff
changeset
|
856 |
4c201a3d4755
planemo upload for repository https://github.com/usegalaxy-eu/temporary-tools/tree/master/jbrowse2 commit a37bfdfc108501b11c7b2aa15efb1bd16f0c4b66
fubar
parents:
diff
changeset
|
857 def __iter__(self): |
4c201a3d4755
planemo upload for repository https://github.com/usegalaxy-eu/temporary-tools/tree/master/jbrowse2 commit a37bfdfc108501b11c7b2aa15efb1bd16f0c4b66
fubar
parents:
diff
changeset
|
858 return self |
4c201a3d4755
planemo upload for repository https://github.com/usegalaxy-eu/temporary-tools/tree/master/jbrowse2 commit a37bfdfc108501b11c7b2aa15efb1bd16f0c4b66
fubar
parents:
diff
changeset
|
859 |
4c201a3d4755
planemo upload for repository https://github.com/usegalaxy-eu/temporary-tools/tree/master/jbrowse2 commit a37bfdfc108501b11c7b2aa15efb1bd16f0c4b66
fubar
parents:
diff
changeset
|
860 def __next__(self): |
4c201a3d4755
planemo upload for repository https://github.com/usegalaxy-eu/temporary-tools/tree/master/jbrowse2 commit a37bfdfc108501b11c7b2aa15efb1bd16f0c4b66
fubar
parents:
diff
changeset
|
861 return next(self._iter) |
4c201a3d4755
planemo upload for repository https://github.com/usegalaxy-eu/temporary-tools/tree/master/jbrowse2 commit a37bfdfc108501b11c7b2aa15efb1bd16f0c4b66
fubar
parents:
diff
changeset
|
862 |
4c201a3d4755
planemo upload for repository https://github.com/usegalaxy-eu/temporary-tools/tree/master/jbrowse2 commit a37bfdfc108501b11c7b2aa15efb1bd16f0c4b66
fubar
parents:
diff
changeset
|
863 next = __next__ |
4c201a3d4755
planemo upload for repository https://github.com/usegalaxy-eu/temporary-tools/tree/master/jbrowse2 commit a37bfdfc108501b11c7b2aa15efb1bd16f0c4b66
fubar
parents:
diff
changeset
|
864 |
4c201a3d4755
planemo upload for repository https://github.com/usegalaxy-eu/temporary-tools/tree/master/jbrowse2 commit a37bfdfc108501b11c7b2aa15efb1bd16f0c4b66
fubar
parents:
diff
changeset
|
865 def read(self, size=-1): |
4c201a3d4755
planemo upload for repository https://github.com/usegalaxy-eu/temporary-tools/tree/master/jbrowse2 commit a37bfdfc108501b11c7b2aa15efb1bd16f0c4b66
fubar
parents:
diff
changeset
|
866 if size < 0: |
4c201a3d4755
planemo upload for repository https://github.com/usegalaxy-eu/temporary-tools/tree/master/jbrowse2 commit a37bfdfc108501b11c7b2aa15efb1bd16f0c4b66
fubar
parents:
diff
changeset
|
867 return "".join(x for x in self._iter) |
4c201a3d4755
planemo upload for repository https://github.com/usegalaxy-eu/temporary-tools/tree/master/jbrowse2 commit a37bfdfc108501b11c7b2aa15efb1bd16f0c4b66
fubar
parents:
diff
changeset
|
868 elif size == 0: |
4c201a3d4755
planemo upload for repository https://github.com/usegalaxy-eu/temporary-tools/tree/master/jbrowse2 commit a37bfdfc108501b11c7b2aa15efb1bd16f0c4b66
fubar
parents:
diff
changeset
|
869 return "" # Used by Biopython to sniff unicode vs bytes |
4c201a3d4755
planemo upload for repository https://github.com/usegalaxy-eu/temporary-tools/tree/master/jbrowse2 commit a37bfdfc108501b11c7b2aa15efb1bd16f0c4b66
fubar
parents:
diff
changeset
|
870 else: |
4c201a3d4755
planemo upload for repository https://github.com/usegalaxy-eu/temporary-tools/tree/master/jbrowse2 commit a37bfdfc108501b11c7b2aa15efb1bd16f0c4b66
fubar
parents:
diff
changeset
|
871 raise NotImplementedError |
4c201a3d4755
planemo upload for repository https://github.com/usegalaxy-eu/temporary-tools/tree/master/jbrowse2 commit a37bfdfc108501b11c7b2aa15efb1bd16f0c4b66
fubar
parents:
diff
changeset
|
872 |
4c201a3d4755
planemo upload for repository https://github.com/usegalaxy-eu/temporary-tools/tree/master/jbrowse2 commit a37bfdfc108501b11c7b2aa15efb1bd16f0c4b66
fubar
parents:
diff
changeset
|
873 def readline(self): |
4c201a3d4755
planemo upload for repository https://github.com/usegalaxy-eu/temporary-tools/tree/master/jbrowse2 commit a37bfdfc108501b11c7b2aa15efb1bd16f0c4b66
fubar
parents:
diff
changeset
|
874 try: |
4c201a3d4755
planemo upload for repository https://github.com/usegalaxy-eu/temporary-tools/tree/master/jbrowse2 commit a37bfdfc108501b11c7b2aa15efb1bd16f0c4b66
fubar
parents:
diff
changeset
|
875 return next(self._iter) |
4c201a3d4755
planemo upload for repository https://github.com/usegalaxy-eu/temporary-tools/tree/master/jbrowse2 commit a37bfdfc108501b11c7b2aa15efb1bd16f0c4b66
fubar
parents:
diff
changeset
|
876 except StopIteration: |
4c201a3d4755
planemo upload for repository https://github.com/usegalaxy-eu/temporary-tools/tree/master/jbrowse2 commit a37bfdfc108501b11c7b2aa15efb1bd16f0c4b66
fubar
parents:
diff
changeset
|
877 return "" |

        if found_seqs:
            fasta_recs = self._parse_fasta(FakeHandle(line_iter))
            out_info.add("fasta", fasta_recs)
        if out_info.has_items():
            yield out_info.get_results()
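
The hand-off above relies on a line iterator that has already been consumed up to the FASTA directive, so only the sequence lines are left for the FASTA reader. A minimal standalone sketch of that idea follows. The class name `LineIterHandle` and the sample lines are hypothetical and are not part of this module.

```python
class LineIterHandle:
    """Expose an iterator of text lines through a small file-like interface."""

    def __init__(self, line_iter):
        self._iter = iter(line_iter)

    def __iter__(self):
        return self

    def __next__(self):
        return next(self._iter)

    def read(self, size=-1):
        # Only "read everything" and the zero-length probe are supported.
        if size < 0:
            return "".join(self._iter)
        if size == 0:
            return ""
        raise NotImplementedError("partial reads are not supported")

    def readline(self):
        # An empty string signals end of input, as with real file objects.
        return next(self._iter, "")


# Hypothetical input: annotation directives followed by embedded FASTA.
lines = iter(["##gff-version 3\n", "##FASTA\n", ">chr1\n", "ACGT\n"])
for line in lines:
    if line.startswith("##FASTA"):
        break

# The iterator is now positioned just after the directive.
handle = LineIterHandle(lines)
first = handle.readline()
rest = handle.read()
```

Because the wrapper shares the underlying iterator, nothing is buffered or re-read; whatever the first loop consumed is gone from the handle.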


class DiscoGFFParser(_AbstractMapReduceGFF):
    """GFF Parser with parallelization (http://discoproject.org)."""

    def __init__(self, disco_host, create_missing=True):
        """Initialize parser.

        disco_host - Web reference to a Disco host which will be used for
        parallelizing the GFF reading job.
        """
        _AbstractMapReduceGFF.__init__(
            self, create_missing=create_missing
        )
        self._disco_host = disco_host

    def _gff_process(self, gff_files, limit_info, target_lines=None):
        """Process GFF addition, using Disco to parallelize the process."""
        assert target_lines is None, "Cannot split parallelized jobs"
        # make these imports local; only need them when using disco
        import json

        import disco

        # absolute path names unless they are special disco files
        full_files = []
        for f in gff_files:
            if f.split(":")[0] != "disco":
                full_files.append(os.path.abspath(f))
            else:
                full_files.append(f)
        results = disco.job(
            self._disco_host,
            name="gff_reader",
            input=full_files,
            params=disco.Params(
                limit_info=limit_info,
                jsonify=True,
                filter_info=self._examiner._filter_info,
            ),
            required_modules=["json", "collections", "re"],
            map=self._map_fn,
            reduce=self._reduce_fn,
        )
        processed = dict()
        for out_key, out_val in disco.result_iterator(results):
            processed[out_key] = json.loads(out_val)
        yield processed


def parse(
    gff_files, base_dict=None, limit_info=None, target_lines=None
):
    """Parse GFF files into SeqRecords and SeqFeatures."""
    parser = GFFParser()
    for rec in parser.parse_in_parts(
        gff_files, base_dict, limit_info, target_lines
    ):
        yield rec


def parse_simple(gff_files, limit_info=None):
    """Parse GFF files line by line, yielding a dictionary of parts per feature."""
    parser = GFFParser()
    for rec in parser.parse_simple(gff_files, limit_info=limit_info):
        if "child" in rec:
            assert "parent" not in rec
            yield rec["child"][0]
        elif "parent" in rec:
            yield rec["parent"][0]
        elif "feature" in rec:
            yield rec["feature"][0]
        # ignore directive lines
        else:
            assert "directive" in rec
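
The dispatch above assumes each parsed record is a dictionary with exactly one of the keys `child`, `parent`, `feature` or `directive`, each mapping to a one-item list. A small standalone sketch of the same flattening, run on hand-written records, is given here. The function name `flatten_records` and the sample dictionaries are hypothetical and only approximate what the parser produces.

```python
def flatten_records(records):
    """Yield the single payload from child, parent or feature records."""
    for rec in records:
        for key in ("child", "parent", "feature"):
            if key in rec:
                yield rec[key][0]
                break
        else:
            # No feature-like key matched, so this must be a directive line.
            assert "directive" in rec


# Hypothetical records in the assumed one-key, one-item shape.
sample = [
    {"directive": ["gff-version 3"]},
    {"parent": [{"id": "gene1", "type": "gene"}]},
    {"child": [{"id": "mRNA1", "type": "mRNA"}]},
    {"feature": [{"id": "", "type": "region"}]},
]
flat = list(flatten_records(sample))
```

Directive records produce no output, so callers see one dictionary per feature line and nothing else.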


def _file_or_handle(fn):
    """Decorator to accept either an open text handle or a file name."""

    def _file_or_handle_inside(*args, **kwargs):
        in_file = args[1]
        if hasattr(in_file, "read"):
            need_close = False
            in_handle = in_file
            if six.PY3 and not isinstance(in_handle, io.TextIOBase):
                raise TypeError(
                    "input handle must be opened in text mode"
                )
        else:
            need_close = True
            in_handle = open(in_file)
        args = (args[0], in_handle) + args[2:]
        # Close a handle opened here even if the wrapped call raises.
        try:
            out = fn(*args, **kwargs)
        finally:
            if need_close:
                in_handle.close()
        return out

    return _file_or_handle_inside
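
The decorator above expects to wrap a method whose first argument after `self` is either a path or an open text handle. A minimal Python 3 sketch of the same pattern follows, without the `six` check and with `functools.wraps` added to keep the wrapped method's name. The names `file_or_handle` and `LineCounter` are hypothetical and are not part of this module.

```python
import functools
import io
import os
import tempfile


def file_or_handle(fn):
    """Let a method accept either a path or an open text handle."""

    @functools.wraps(fn)
    def wrapper(self, in_file, *args, **kwargs):
        if hasattr(in_file, "read"):
            if not isinstance(in_file, io.TextIOBase):
                raise TypeError("input handle must be opened in text mode")
            # The caller owns this handle, so it is left open.
            return fn(self, in_file, *args, **kwargs)
        # A path was given: open it here and always close it afterwards.
        with open(in_file) as in_handle:
            return fn(self, in_handle, *args, **kwargs)

    return wrapper


class LineCounter:
    @file_or_handle
    def count(self, in_handle):
        # Consume the handle eagerly, before the decorator closes it.
        return sum(1 for _ in in_handle)


counter = LineCounter()
from_handle = counter.count(io.StringIO("a\nb\n"))

# Write a throwaway file to exercise the path branch.
with tempfile.NamedTemporaryFile("w", suffix=".gff", delete=False) as tmp:
    tmp.write("x\ny\nz\n")
from_path = counter.count(tmp.name)
os.remove(tmp.name)
```

One caveat applies to any decorator of this shape: if the wrapped method is a generator, the file is closed as soon as the generator object is returned, before any line is read, so wrapped methods should finish reading before they return.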
4c201a3d4755
planemo upload for repository https://github.com/usegalaxy-eu/temporary-tools/tree/master/jbrowse2 commit a37bfdfc108501b11c7b2aa15efb1bd16f0c4b66
fubar
parents:
diff
changeset
|
979 |
4c201a3d4755
planemo upload for repository https://github.com/usegalaxy-eu/temporary-tools/tree/master/jbrowse2 commit a37bfdfc108501b11c7b2aa15efb1bd16f0c4b66
fubar
parents:
diff
changeset
|
980 |
4c201a3d4755
planemo upload for repository https://github.com/usegalaxy-eu/temporary-tools/tree/master/jbrowse2 commit a37bfdfc108501b11c7b2aa15efb1bd16f0c4b66
fubar
parents:
diff
changeset
|
981 class GFFExaminer: |
4c201a3d4755
planemo upload for repository https://github.com/usegalaxy-eu/temporary-tools/tree/master/jbrowse2 commit a37bfdfc108501b11c7b2aa15efb1bd16f0c4b66
fubar
parents:
diff
changeset
|
982 """Provide high level details about a GFF file to refine parsing. |
4c201a3d4755
planemo upload for repository https://github.com/usegalaxy-eu/temporary-tools/tree/master/jbrowse2 commit a37bfdfc108501b11c7b2aa15efb1bd16f0c4b66
fubar
parents:
diff
changeset
|
983 |
4c201a3d4755
planemo upload for repository https://github.com/usegalaxy-eu/temporary-tools/tree/master/jbrowse2 commit a37bfdfc108501b11c7b2aa15efb1bd16f0c4b66
fubar
parents:
diff
changeset
|
984 GFF is a spec and is provided by many different centers. Real life files |
4c201a3d4755
planemo upload for repository https://github.com/usegalaxy-eu/temporary-tools/tree/master/jbrowse2 commit a37bfdfc108501b11c7b2aa15efb1bd16f0c4b66
fubar
parents:
diff
changeset
|
985 will present the same information in slightly different ways. Becoming |
4c201a3d4755
planemo upload for repository https://github.com/usegalaxy-eu/temporary-tools/tree/master/jbrowse2 commit a37bfdfc108501b11c7b2aa15efb1bd16f0c4b66
fubar
parents:
diff
changeset
|
986 familiar with the file you are dealing with is the best way to extract the |
4c201a3d4755
planemo upload for repository https://github.com/usegalaxy-eu/temporary-tools/tree/master/jbrowse2 commit a37bfdfc108501b11c7b2aa15efb1bd16f0c4b66
fubar
parents:
diff
changeset
|
987 information you need. This class provides high level summary details to |
4c201a3d4755
planemo upload for repository https://github.com/usegalaxy-eu/temporary-tools/tree/master/jbrowse2 commit a37bfdfc108501b11c7b2aa15efb1bd16f0c4b66
fubar
parents:
diff
changeset
|
988 help in learning. |
4c201a3d4755
planemo upload for repository https://github.com/usegalaxy-eu/temporary-tools/tree/master/jbrowse2 commit a37bfdfc108501b11c7b2aa15efb1bd16f0c4b66
fubar
parents:
diff
changeset
|
989 """ |
4c201a3d4755
planemo upload for repository https://github.com/usegalaxy-eu/temporary-tools/tree/master/jbrowse2 commit a37bfdfc108501b11c7b2aa15efb1bd16f0c4b66
fubar
parents:
diff
changeset
|
990 |
    def __init__(self):
        self._filter_info = dict(
            gff_id=[0],
            gff_source_type=[1, 2],
            gff_source=[1],
            gff_type=[2],
        )

    def _get_local_params(self, limit_info=None):
        class _LocalParams:
            def __init__(self):
                self.jsonify = False

        params = _LocalParams()
        params.limit_info = limit_info
        params.filter_info = self._filter_info
        return params

    @_file_or_handle
    def available_limits(self, gff_handle):
        """Return dictionary information on possible limits for this file.

        This returns a nested dictionary with the following structure:

        keys -- names of items to filter by
        values -- dictionary with:
            keys -- filter choice
            value -- counts of that filter in this file

        Not a parallelized map-reduce implementation.
        """
        cur_limits = dict()
        for filter_key in self._filter_info.keys():
            cur_limits[filter_key] = collections.defaultdict(int)
        for line in gff_handle:
            # when we hit FASTA sequences, we are done with annotations
            if line.startswith("##FASTA"):
                break
            # ignore empty and comment lines
            if line.strip() and line.strip()[0] != "#":
                parts = [p.strip() for p in line.split("\t")]
                assert len(parts) >= 8, line
                parts = parts[:9]
                for (
                    filter_key,
                    cur_indexes,
                ) in self._filter_info.items():
                    cur_id = tuple([parts[i] for i in cur_indexes])
                    cur_limits[filter_key][cur_id] += 1
        # get rid of the default dicts
        final_dict = dict()
        for key, value_dict in cur_limits.items():
            # key is a filter name such as "gff_id"; every built-in name is
            # longer than one character, so this branch is not taken for them
            if len(key) == 1:
                key = key[0]
            final_dict[key] = dict(value_dict)
        gff_handle.close()
        return final_dict

    @_file_or_handle
    def parent_child_map(self, gff_handle):
        """Provide a mapping of parent to child relationships in the file.

        Returns a dictionary of parent child relationships:

        keys -- tuple of (source, type) for each parent
        values -- sorted list of (source, type) tuples for the children
                  of that parent

        Not a parallelized map-reduce implementation.
        """
        # collect all of the parent and child types mapped to IDs
        parent_sts = dict()
        child_sts = collections.defaultdict(list)
        for line in gff_handle:
            # when we hit FASTA sequences, we are done with annotations
            if line.startswith("##FASTA"):
                break
            if line.strip() and not line.startswith("#"):
                line_type, line_info = _gff_line_map(
                    line, self._get_local_params()
                )[0]
                if line_type == "parent" or (
                    line_type == "child" and line_info["id"]
                ):
                    parent_sts[line_info["id"]] = (
                        line_info["quals"].get("source", [""])[0],
                        line_info["type"],
                    )
                if line_type == "child":
                    for parent_id in line_info["quals"]["Parent"]:
                        child_sts[parent_id].append(
                            (
                                line_info["quals"].get(
                                    "source", [""]
                                )[0],
                                line_info["type"],
                            )
                        )
        # print parent_sts, child_sts
        # generate a dictionary of the unique final type relationships
        pc_map = collections.defaultdict(list)
        for parent_id, parent_type in parent_sts.items():
            for child_type in child_sts[parent_id]:
                pc_map[parent_type].append(child_type)
        pc_final_map = dict()
        for ptype, ctypes in pc_map.items():
            unique_ctypes = list(set(ctypes))
            unique_ctypes.sort()
            pc_final_map[ptype] = unique_ctypes
        return pc_final_map
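
The two summaries built by available_limits and parent_child_map can be illustrated with a small standalone sketch. It does not import this module or call `_gff_line_map`. The names `FILTER_INFO`, `SAMPLE_GFF`, `feature_rows`, `attributes`, `count_limits` and `parent_child_types` are hypothetical, and the attribute parsing is a simplified assumption (GFF3 `key=value` pairs only, no URL unescaping), so results on real files may differ from the module's own.

```python
import collections

# Column indexes counted for each filter name, mirroring _filter_info above.
FILTER_INFO = {
    "gff_id": [0],
    "gff_source_type": [1, 2],
    "gff_source": [1],
    "gff_type": [2],
}

# Tiny made-up GFF3 file used only for this illustration.
SAMPLE_GFF = """##gff-version 3
chr1\tex\tgene\t1\t100\t.\t+\t.\tID=gene1
chr1\tex\tmRNA\t1\t100\t.\t+\t.\tID=mrna1;Parent=gene1
chr1\tex\texon\t1\t50\t.\t+\t.\tParent=mrna1
chr1\tex\texon\t60\t100\t.\t+\t.\tParent=mrna1
##FASTA
>chr1
ACGT
"""


def feature_rows(lines):
    """Yield up to nine stripped columns for each feature line before ##FASTA."""
    for line in lines:
        if line.startswith("##FASTA"):
            break
        if line.strip() and not line.strip().startswith("#"):
            yield [p.strip() for p in line.split("\t")][:9]


def count_limits(lines):
    """Count how often each filter value occurs (keys are tuples of columns)."""
    counts = {key: collections.defaultdict(int) for key in FILTER_INFO}
    for parts in feature_rows(lines):
        for key, indexes in FILTER_INFO.items():
            counts[key][tuple(parts[i] for i in indexes)] += 1
    return {key: dict(value) for key, value in counts.items()}


def attributes(column):
    """Simplified GFF3 attribute parser: 'a=1,2;b=3' -> {'a': ['1', '2'], ...}."""
    quals = collections.defaultdict(list)
    for item in column.split(";"):
        if "=" in item:
            name, value = item.split("=", 1)
            quals[name.strip()].extend(value.split(","))
    return quals


def parent_child_types(lines):
    """Map each parent (source, type) to the sorted child (source, type) tuples."""
    id_types = {}
    child_types = collections.defaultdict(list)
    for parts in feature_rows(lines):
        if len(parts) < 9:
            continue  # no attribute column, so no ID or Parent to record
        quals = attributes(parts[8])
        source_type = (parts[1], parts[2])
        for feature_id in quals.get("ID", []):
            id_types[feature_id] = source_type
        for parent_id in quals.get("Parent", []):
            child_types[parent_id].append(source_type)
    pc_map = collections.defaultdict(set)
    for feature_id, parent_type in id_types.items():
        for child_type in child_types.get(feature_id, []):
            pc_map[parent_type].add(child_type)
    return {ptype: sorted(ctypes) for ptype, ctypes in pc_map.items()}


limits = count_limits(SAMPLE_GFF.splitlines())
pc = parent_child_types(SAMPLE_GFF.splitlines())
print(limits["gff_type"])
print(pc)
```

In this sketch the inner keys are tuples of column values, as in the listing where `cur_id` is built with `tuple(...)`, and lines after `##FASTA` are never read.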