annotate env/lib/python3.9/site-packages/rdflib/compare.py @ 0:4f3585e2f14b draft default tip

"planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
author shellac
date Mon, 22 Mar 2021 18:12:50 +0000
parents
children
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
rev   line source
0
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
1 # -*- coding: utf-8 -*-
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
2 """
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
3 A collection of utilities for canonicalizing and inspecting graphs.
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
4
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
5 Among other things, they solve of the problem of deterministic bnode
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
6 comparisons.
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
7
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
8 Warning: the time to canonicalize bnodes may increase exponentially on
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
9 degenerate larger graphs. Use with care!
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
10
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
11 Example of comparing two graphs::
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
12
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
13 >>> g1 = Graph().parse(format='n3', data='''
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
14 ... @prefix : <http://example.org/ns#> .
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
15 ... <http://example.org> :rel
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
16 ... <http://example.org/same>,
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
17 ... [ :label "Same" ],
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
18 ... <http://example.org/a>,
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
19 ... [ :label "A" ] .
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
20 ... ''')
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
21 >>> g2 = Graph().parse(format='n3', data='''
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
22 ... @prefix : <http://example.org/ns#> .
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
23 ... <http://example.org> :rel
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
24 ... <http://example.org/same>,
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
25 ... [ :label "Same" ],
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
26 ... <http://example.org/b>,
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
27 ... [ :label "B" ] .
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
28 ... ''')
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
29 >>>
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
30 >>> iso1 = to_isomorphic(g1)
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
31 >>> iso2 = to_isomorphic(g2)
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
32
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
33 These are not isomorphic::
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
34
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
35 >>> iso1 == iso2
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
36 False
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
37
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
38 Diff the two graphs::
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
39
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
40 >>> in_both, in_first, in_second = graph_diff(iso1, iso2)
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
41
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
42 Present in both::
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
43
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
44 >>> def dump_nt_sorted(g):
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
45 ... for l in sorted(g.serialize(format='nt').splitlines()):
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
46 ... if l: print(l.decode('ascii'))
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
47
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
48 >>> dump_nt_sorted(in_both) #doctest: +SKIP
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
49 <http://example.org>
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
50 <http://example.org/ns#rel> <http://example.org/same> .
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
51 <http://example.org>
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
52 <http://example.org/ns#rel> _:cbcaabaaba17fecbc304a64f8edee4335e .
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
53 _:cbcaabaaba17fecbc304a64f8edee4335e
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
54 <http://example.org/ns#label> "Same" .
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
55
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
56 Only in first::
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
57
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
58 >>> dump_nt_sorted(in_first) #doctest: +SKIP
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
59 <http://example.org>
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
60 <http://example.org/ns#rel> <http://example.org/a> .
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
61 <http://example.org>
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
62 <http://example.org/ns#rel> _:cb124e4c6da0579f810c0ffe4eff485bd9 .
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
63 _:cb124e4c6da0579f810c0ffe4eff485bd9
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
64 <http://example.org/ns#label> "A" .
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
65
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
66 Only in second::
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
67
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
68 >>> dump_nt_sorted(in_second) #doctest: +SKIP
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
69 <http://example.org>
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
70 <http://example.org/ns#rel> <http://example.org/b> .
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
71 <http://example.org>
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
72 <http://example.org/ns#rel> _:cb558f30e21ddfc05ca53108348338ade8 .
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
73 _:cb558f30e21ddfc05ca53108348338ade8
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
74 <http://example.org/ns#label> "B" .
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
75 """
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
76 from __future__ import absolute_import
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
77 from __future__ import division
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
78 from __future__ import print_function
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
79
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
80
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
81 # TODO:
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
82 # - Doesn't handle quads.
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
83 # - Add warning and/or safety mechanism before working on large graphs?
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
84 # - use this in existing Graph.isomorphic?
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
85
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
86 __all__ = ['IsomorphicGraph', 'to_isomorphic', 'isomorphic',
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
87 'to_canonical_graph', 'graph_diff', 'similar']
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
88
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
89 from rdflib.graph import Graph, ConjunctiveGraph, ReadOnlyGraphAggregate
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
90 from rdflib.term import BNode, Node
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
91 from hashlib import sha256
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
92
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
93 from datetime import datetime
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
94 from collections import defaultdict
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
95
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
96 from six import text_type
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
97
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
98
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
99 def _total_seconds(td):
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
100 result = td.days * 24 * 60 * 60
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
101 result += td.seconds
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
102 result += td.microseconds / 1000000.0
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
103 return result
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
104
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
105
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
106 class _runtime(object):
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
107 def __init__(self, label):
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
108 self.label = label
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
109
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
110 def __call__(self, f):
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
111 if self.label is None:
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
112 self.label = f.__name__ + "_runtime"
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
113
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
114 def wrapped_f(*args, **kwargs):
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
115 start = datetime.now()
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
116 result = f(*args, **kwargs)
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
117 if 'stats' in kwargs and kwargs['stats'] is not None:
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
118 stats = kwargs['stats']
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
119 stats[self.label] = _total_seconds(datetime.now() - start)
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
120 return result
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
121 return wrapped_f
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
122
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
123
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
124 class _call_count(object):
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
125 def __init__(self, label):
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
126 self.label = label
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
127
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
128 def __call__(self, f):
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
129 if self.label is None:
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
130 self.label = f.__name__ + "_runtime"
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
131
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
132 def wrapped_f(*args, **kwargs):
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
133 if 'stats' in kwargs and kwargs['stats'] is not None:
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
134 stats = kwargs['stats']
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
135 if self.label not in stats:
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
136 stats[self.label] = 0
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
137 stats[self.label] += 1
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
138 return f(*args, **kwargs)
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
139 return wrapped_f
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
140
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
141
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
142 class IsomorphicGraph(ConjunctiveGraph):
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
143 """An implementation of the RGDA1 graph digest algorithm.
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
144
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
145 An implementation of RGDA1 (publication below),
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
146 a combination of Sayers & Karp's graph digest algorithm using
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
147 sum and SHA-256 <http://www.hpl.hp.com/techreports/2003/HPL-2003-235R1.pdf>
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
148 and traces <http://pallini.di.uniroma1.it>, an average case
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
149 polynomial time algorithm for graph canonicalization.
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
150
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
151 McCusker, J. P. (2015). WebSig: A Digital Signature Framework for the Web.
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
152 Rensselaer Polytechnic Institute, Troy, NY.
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
153 http://gradworks.umi.com/3727015.pdf
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
154 """
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
155
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
156 def __init__(self, **kwargs):
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
157 super(IsomorphicGraph, self).__init__(**kwargs)
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
158
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
159 def __eq__(self, other):
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
160 """Graph isomorphism testing."""
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
161 if not isinstance(other, IsomorphicGraph):
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
162 return False
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
163 elif len(self) != len(other):
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
164 return False
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
165 return self.internal_hash() == other.internal_hash()
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
166
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
167 def __ne__(self, other):
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
168 """Negative graph isomorphism testing."""
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
169 return not self.__eq__(other)
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
170
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
171 def __hash__(self):
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
172 return super(IsomorphicGraph, self).__hash__()
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
173
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
174 def graph_digest(self, stats=None):
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
175 """Synonym for IsomorphicGraph.internal_hash."""
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
176 return self.internal_hash(stats=stats)
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
177
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
178 def internal_hash(self, stats=None):
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
179 """
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
180 This is defined instead of __hash__ to avoid a circular recursion
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
181 scenario with the Memory store for rdflib which requires a hash lookup
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
182 in order to return a generator of triples.
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
183 """
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
184 return _TripleCanonicalizer(self).to_hash(stats=stats)
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
185
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
186
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
187 class Color:
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
188 def __init__(self, nodes, hashfunc, color=(), hash_cache=None):
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
189 if hash_cache is None:
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
190 hash_cache = {}
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
191 self._hash_cache = hash_cache
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
192 self.color = color
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
193 self.nodes = nodes
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
194 self.hashfunc = hashfunc
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
195 self._hash_color = None
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
196
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
197 def __str__(self):
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
198 nodes, color = self.key()
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
199 return "Color %s (%s nodes)" % (color, nodes)
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
200
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
201 def key(self):
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
202 return (len(self.nodes), self.hash_color())
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
203
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
204 def hash_color(self, color=None):
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
205 if color is None:
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
206 color = self.color
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
207 if color in self._hash_cache:
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
208 return self._hash_cache[color]
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
209
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
210 def stringify(x):
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
211 if isinstance(x, Node):
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
212 return x.n3()
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
213 else:
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
214 return text_type(x)
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
215 if isinstance(color, Node):
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
216 return stringify(color)
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
217 value = 0
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
218 for triple in color:
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
219 value += self.hashfunc(' '.join([stringify(x) for x in triple]))
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
220 val = u"%x" % value
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
221 self._hash_cache[color] = val
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
222 return val
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
223
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
224 def distinguish(self, W, graph):
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
225 colors = {}
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
226 for n in self.nodes:
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
227 new_color = list(self.color)
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
228 for node in W.nodes:
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
229 new_color += [
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
230 (1, p, W.hash_color())
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
231 for s, p, o in graph.triples((n, None, node))]
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
232 new_color += [
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
233 (W.hash_color(), p, 3)
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
234 for s, p, o in graph.triples((node, None, n))]
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
235 new_color = tuple(new_color)
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
236 new_hash_color = self.hash_color(new_color)
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
237
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
238 if new_hash_color not in colors:
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
239 c = Color(
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
240 [], self.hashfunc, new_color,
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
241 hash_cache=self._hash_cache)
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
242 colors[new_hash_color] = c
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
243 colors[new_hash_color].nodes.append(n)
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
244 return colors.values()
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
245
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
246 def discrete(self):
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
247 return len(self.nodes) == 1
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
248
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
249 def copy(self):
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
250 return Color(
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
251 self.nodes[:], self.hashfunc, self.color,
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
252 hash_cache=self._hash_cache)
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
253
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
254
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
255 class _TripleCanonicalizer(object):
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
256
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
257 def __init__(self, graph, hashfunc=sha256):
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
258 self.graph = graph
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
259
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
260 def _hashfunc(s):
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
261 h = hashfunc()
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
262 h.update(text_type(s).encode("utf8"))
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
263 return int(h.hexdigest(), 16)
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
264 self._hash_cache = {}
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
265 self.hashfunc = _hashfunc
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
266
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
267 def _discrete(self, coloring):
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
268 return len([c for c in coloring if not c.discrete()]) == 0
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
269
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
270 def _initial_color(self):
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
271 """Finds an initial color for the graph.
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
272
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
273 Finds an initial color fo the graph by finding all blank nodes and
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
274 non-blank nodes that are adjacent. Nodes that are not adjacent to blank
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
275 nodes are not included, as they are a) already colored (by URI or literal)
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
276 and b) do not factor into the color of any blank node.
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
277 """
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
278 bnodes = set()
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
279 others = set()
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
280 self._neighbors = defaultdict(set)
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
281 for s, p, o in self.graph:
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
282 nodes = set([s, p, o])
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
283 b = set([x for x in nodes if isinstance(x, BNode)])
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
284 if len(b) > 0:
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
285 others |= nodes - b
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
286 bnodes |= b
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
287 if isinstance(s, BNode):
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
288 self._neighbors[s].add(o)
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
289 if isinstance(o, BNode):
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
290 self._neighbors[o].add(s)
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
291 if isinstance(p, BNode):
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
292 self._neighbors[p].add(s)
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
293 self._neighbors[p].add(p)
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
294 if len(bnodes) > 0:
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
295 return [
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
296 Color(list(bnodes), self.hashfunc, hash_cache=self._hash_cache)
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
297 ] + [
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
298 Color([x], self.hashfunc, x, hash_cache=self._hash_cache)
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
299 for x in others
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
300 ]
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
301 else:
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
302 return []
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
303
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
304 def _individuate(self, color, individual):
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
305 new_color = list(color.color)
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
306 new_color.append((len(color.nodes),))
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
307
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
308 color.nodes.remove(individual)
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
309 c = Color([individual], self.hashfunc, tuple(new_color),
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
310 hash_cache=self._hash_cache)
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
311 return c
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
312
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
313 def _get_candidates(self, coloring):
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
314 candidates = [c for c in coloring if not c.discrete()]
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
315 for c in [c for c in coloring if not c.discrete()]:
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
316 for node in c.nodes:
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
317 yield node, c
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
318
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
319 def _refine(self, coloring, sequence):
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
320 sequence = sorted(sequence, key=lambda x: x.key(), reverse=True)
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
321 coloring = coloring[:]
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
322 while len(sequence) > 0 and not self._discrete(coloring):
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
323 W = sequence.pop()
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
324 for c in coloring[:]:
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
325 if len(c.nodes) > 1 or isinstance(c.nodes[0], BNode):
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
326 colors = sorted(c.distinguish(W, self.graph),
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
327 key=lambda x: x.key(),
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
328 reverse=True)
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
329 coloring.remove(c)
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
330 coloring.extend(colors)
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
331 try:
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
332 si = sequence.index(c)
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
333 sequence = sequence[:si] + colors + sequence[si+1:]
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
334 except ValueError:
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
335 sequence = colors[1:] + sequence
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
336 combined_colors = []
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
337 combined_color_map = dict()
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
338 for color in coloring:
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
339 color_hash = color.hash_color()
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
340 # This is a hash collision, and be combined into a single color for individuation.
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
341 if color_hash in combined_color_map:
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
342 combined_color_map[color_hash].nodes.extend(color.nodes)
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
343 else:
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
344 combined_colors.append(color)
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
345 combined_color_map[color_hash] = color
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
346 return combined_colors
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
347
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
348 @_runtime("to_hash_runtime")
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
349 def to_hash(self, stats=None):
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
350 result = 0
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
351 for triple in self.canonical_triples(stats=stats):
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
352 result += self.hashfunc(' '.join([x.n3() for x in triple]))
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
353 if stats is not None:
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
354 stats['graph_digest'] = "%x" % result
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
355 return result
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
356
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
357 def _experimental_path(self, coloring):
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
358 coloring = [c.copy() for c in coloring]
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
359 while not self._discrete(coloring):
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
360 color = [x for x in coloring if not x.discrete()][0]
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
361 node = color.nodes[0]
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
362 new_color = self._individuate(color, node)
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
363 coloring.append(new_color)
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
364 coloring = self._refine(coloring, [new_color])
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
365 return coloring
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
366
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
367 def _create_generator(self, colorings, groupings=None):
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
368 if not groupings:
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
369 groupings = defaultdict(set)
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
370 for group in zip(*colorings):
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
371 g = set([c.nodes[0] for c in group])
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
372 for n in group:
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
373 g |= groupings[n]
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
374 for n in g:
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
375 groupings[n] = g
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
376 return groupings
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
377
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
378 @_call_count("individuations")
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
379 def _traces(self, coloring, stats=None, depth=[0]):
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
380 if stats is not None and 'prunings' not in stats:
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
381 stats['prunings'] = 0
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
382 depth[0] += 1
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
383 candidates = self._get_candidates(coloring)
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
384 best = []
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
385 best_score = None
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
386 best_experimental = None
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
387 best_experimental_score = None
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
388 last_coloring = None
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
389 generator = defaultdict(set)
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
390 visited = set()
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
391 for candidate, color in candidates:
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
392 if candidate in generator:
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
393 v = generator[candidate] & visited
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
394 if len(v) > 0:
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
395 visited.add(candidate)
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
396 continue
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
397 visited.add(candidate)
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
398 coloring_copy = []
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
399 color_copy = None
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
400 for c in coloring:
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
401 c_copy = c.copy()
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
402 coloring_copy.append(c_copy)
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
403 if c == color:
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
404 color_copy = c_copy
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
405 new_color = self._individuate(color_copy, candidate)
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
406 coloring_copy.append(new_color)
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
407 refined_coloring = self._refine(coloring_copy, [new_color])
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
408 color_score = tuple([c.key() for c in refined_coloring])
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
409 experimental = self._experimental_path(coloring_copy)
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
410 experimental_score = set([c.key() for c in experimental])
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
411 if last_coloring:
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
412 generator = self._create_generator(
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
413 [last_coloring, experimental],
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
414 generator)
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
415 last_coloring = experimental
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
416 if best_score is None or best_score < color_score:
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
417 best = [refined_coloring]
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
418 best_score = color_score
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
419 best_experimental = experimental
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
420 best_experimental_score = experimental_score
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
421 elif best_score > color_score:
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
422 # prune this branch.
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
423 if stats is not None:
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
424 stats['prunings'] += 1
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
425 elif experimental_score != best_experimental_score:
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
426 best.append(refined_coloring)
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
427 else:
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
428 # prune this branch.
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
429 if stats is not None:
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
430 stats['prunings'] += 1
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
431 discrete = [x for x in best if self._discrete(x)]
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
432 if len(discrete) == 0:
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
433 best_score = None
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
434 best_depth = None
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
435 for coloring in best:
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
436 d = [depth[0]]
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
437 new_color = self._traces(coloring, stats=stats, depth=d)
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
438 color_score = tuple([c.key() for c in refined_coloring])
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
439 if best_score is None or color_score > best_score:
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
440 discrete = [new_color]
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
441 best_score = color_score
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
442 best_depth = d[0]
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
443 depth[0] = best_depth
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
444 return discrete[0]
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
445
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
446 def canonical_triples(self, stats=None):
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
447 if stats is not None:
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
448 start_canonicalization = datetime.now()
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
449 if stats is not None:
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
450 start_coloring = datetime.now()
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
451 coloring = self._initial_color()
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
452 if stats is not None:
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
453 stats['triple_count'] = len(self.graph)
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
454 stats['adjacent_nodes'] = max(0, len(coloring) - 1)
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
455 coloring = self._refine(coloring, coloring[:])
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
456 if stats is not None:
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
457 stats['initial_coloring_runtime'] = _total_seconds(datetime.now() - start_coloring)
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
458 stats['initial_color_count'] = len(coloring)
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
459
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
460 if not self._discrete(coloring):
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
461 depth = [0]
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
462 coloring = self._traces(coloring, stats=stats, depth=depth)
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
463 if stats is not None:
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
464 stats['tree_depth'] = depth[0]
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
465 elif stats is not None:
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
466 stats['individuations'] = 0
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
467 stats['tree_depth'] = 0
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
468 if stats is not None:
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
469 stats['color_count'] = len(coloring)
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
470
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
471 bnode_labels = dict([(c.nodes[0], c.hash_color()) for c in coloring])
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
472 if stats is not None:
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
473 stats["canonicalize_triples_runtime"] = _total_seconds(datetime.now() - start_coloring)
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
474 for triple in self.graph:
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
475 result = tuple(self._canonicalize_bnodes(triple, bnode_labels))
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
476 yield result
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
477
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
478 def _canonicalize_bnodes(self, triple, labels):
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
479 for term in triple:
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
480 if isinstance(term, BNode):
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
481 yield BNode(value="cb%s" % labels[term])
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
482 else:
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
483 yield term
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
484
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
485
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
486 def to_isomorphic(graph):
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
487 if isinstance(graph, IsomorphicGraph):
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
488 return graph
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
489 result = IsomorphicGraph()
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
490 if hasattr(graph, "identifier"):
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
491 result = IsomorphicGraph(identifier=graph.identifier)
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
492 result += graph
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
493 return result
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
494
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
495
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
496 def isomorphic(graph1, graph2):
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
497 """Compare graph for equality.
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
498
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
499 Uses an algorithm to compute unique hashes which takes bnodes into account.
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
500
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
501 Examples::
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
502
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
503 >>> g1 = Graph().parse(format='n3', data='''
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
504 ... @prefix : <http://example.org/ns#> .
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
505 ... <http://example.org> :rel <http://example.org/a> .
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
506 ... <http://example.org> :rel <http://example.org/b> .
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
507 ... <http://example.org> :rel [ :label "A bnode." ] .
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
508 ... ''')
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
509 >>> g2 = Graph().parse(format='n3', data='''
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
510 ... @prefix ns: <http://example.org/ns#> .
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
511 ... <http://example.org> ns:rel [ ns:label "A bnode." ] .
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
512 ... <http://example.org> ns:rel <http://example.org/b>,
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
513 ... <http://example.org/a> .
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
514 ... ''')
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
515 >>> isomorphic(g1, g2)
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
516 True
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
517
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
518 >>> g3 = Graph().parse(format='n3', data='''
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
519 ... @prefix : <http://example.org/ns#> .
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
520 ... <http://example.org> :rel <http://example.org/a> .
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
521 ... <http://example.org> :rel <http://example.org/b> .
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
522 ... <http://example.org> :rel <http://example.org/c> .
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
523 ... ''')
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
524 >>> isomorphic(g1, g3)
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
525 False
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
526 """
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
527 gd1 = _TripleCanonicalizer(graph1).to_hash()
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
528 gd2 = _TripleCanonicalizer(graph2).to_hash()
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
529 return gd1 == gd2
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
530
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
531
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
532 def to_canonical_graph(g1, stats=None):
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
533 """Creates a canonical, read-only graph.
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
534
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
535 Creates a canonical, read-only graph where all bnode id:s are based on
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
536 deterministical SHA-256 checksums, correlated with the graph contents.
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
537 """
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
538 graph = Graph()
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
539 graph += _TripleCanonicalizer(g1).canonical_triples(stats=stats)
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
540 return ReadOnlyGraphAggregate([graph])
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
541
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
542
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
543 def graph_diff(g1, g2):
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
544 """Returns three sets of triples: "in both", "in first" and "in second"."""
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
545 # bnodes have deterministic values in canonical graphs:
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
546 cg1 = to_canonical_graph(g1)
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
547 cg2 = to_canonical_graph(g2)
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
548 in_both = cg1 * cg2
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
549 in_first = cg1 - cg2
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
550 in_second = cg2 - cg1
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
551 return (in_both, in_first, in_second)
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
552
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
553
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
554 _MOCK_BNODE = BNode()
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
555
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
556
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
557 def similar(g1, g2):
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
558 """Checks if the two graphs are "similar".
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
559
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
560 Checks if the two graphs are "similar", by comparing sorted triples where
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
561 all bnodes have been replaced by a singular mock bnode (the
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
562 ``_MOCK_BNODE``).
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
563
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
564 This is a much cheaper, but less reliable, alternative to the comparison
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
565 algorithm in ``isomorphic``.
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
566 """
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
567 return all(t1 == t2 for (t1, t2) in _squashed_graphs_triples(g1, g2))
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
568
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
569
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
570 def _squashed_graphs_triples(g1, g2):
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
571 for (t1, t2) in zip(sorted(_squash_graph(g1)), sorted(_squash_graph(g2))):
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
572 yield t1, t2
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
573
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
574
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
575 def _squash_graph(graph):
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
576 return (_squash_bnodes(triple) for triple in graph)
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
577
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
578
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
579 def _squash_bnodes(triple):
4f3585e2f14b "planemo upload commit 60cee0fc7c0cda8592644e1aad72851dec82c959"
shellac
parents:
diff changeset
580 return tuple((isinstance(t, BNode) and _MOCK_BNODE) or t for t in triple)