annotate mercurial/pvec.py @ 45390:7d24201b6447

worker: don't expose readinto() on _blockingreader since pickle is picky The `pickle` module expects the input to be buffered and a whole object to be available when `pickle.load()` is called, which is not necessarily true when we send data from workers back to the parent process (i.e., it seems like a bad assumption for the `pickle` module to make). We added a workaround for that in https://phab.mercurial-scm.org/D8076, which made `read()` continue until all the requested bytes have been read. As we found out at work after a lot of investigation (I've spent the last two days on this), the native version of `pickle.load()` has started calling `readinto()` on the input since Python 3.8. That started being called in https://github.com/python/cpython/commit/91f4380cedbae32b49adbea2518014a5624c6523 (and only by the C version of `pickle.load()`)). Before that, it was only `read()` and `readline()` that were called. The problem with that was that `readinto()` on our `_blockingreader` was simply delegating to the underlying, *unbuffered* object. The symptom we saw was that `hg fix` started failing sometimes on Python 3.8 on Mac. It failed very relyable in some cases. I still haven't figured out under what circumstances it fails and I've been unable to reproduce it in test cases (I've tried writing larger amounts of data, using different numbers of workers, and making the formatters sleep). I have, however, been able to reproduce it 3-4 times on Linux, but then it stopped reproducing on the following few hundred attempts. To fix the problem, we can simply remove the implementation of `readinto()`, since the unpickler will then fall back to calling `read()`. The fallback was added a bit later, in https://github.com/python/cpython/commit/b19f7ecfa3adc6ba1544225317b9473649815b38. However, that commit also added checking that what `read()` returns is a `bytes`, so we also need to convert the `bytearray` we use into that. I was able to add a test for that failure at least. Differential Revision: https://phab.mercurial-scm.org/D8928
author Martin von Zweigbergk <martinvonz@google.com>
date Fri, 14 Aug 2020 20:45:49 -0700
parents a89aa2d7b34d
children d4ba4d51f85f
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
rev   line source
16249
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
1 # pvec.py - probabilistic vector clocks for Mercurial
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
2 #
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
3 # Copyright 2012 Matt Mackall <mpm@selenic.com>
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
4 #
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
5 # This software may be used and distributed according to the terms of the
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
6 # GNU General Public License version 2 or any later version.
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
7
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
8 '''
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
9 A "pvec" is a changeset property based on the theory of vector clocks
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
10 that can be compared to discover relatedness without consulting a
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
11 graph. This can be useful for tasks like determining how a
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
12 disconnected patch relates to a repository.
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
13
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
14 Currently a pvec consist of 448 bits, of which 24 are 'depth' and the
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
15 remainder are a bit vector. It is represented as a 70-character base85
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
16 string.
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
17
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
18 Construction:
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
19
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
20 - a root changeset has a depth of 0 and a bit vector based on its hash
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
21 - a normal commit has a changeset where depth is increased by one and
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
22 one bit vector bit is flipped based on its hash
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
23 - a merge changeset pvec is constructed by copying changes from one pvec into
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
24 the other to balance its depth
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
25
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
26 Properties:
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
27
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
28 - for linear changes, difference in depth is always <= hamming distance
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
29 - otherwise, changes are probably divergent
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
30 - when hamming distance is < 200, we can reliably detect when pvecs are near
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
31
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
32 Issues:
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
33
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
34 - hamming distance ceases to work over distances of ~ 200
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
35 - detecting divergence is less accurate when the common ancestor is very close
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
36 to either revision or total distance is high
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
37 - this could probably be improved by modeling the relation between
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
38 delta and hdist
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
39
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
40 Uses:
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
41
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
42 - a patch pvec can be used to locate the nearest available common ancestor for
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
43 resolving conflicts
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
44 - ordering of patches can be established without a DAG
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
45 - two head pvecs can be compared to determine whether push/pull/merge is needed
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
46 and approximately how many changesets are involved
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
47 - can be used to find a heuristic divergence measure between changesets on
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
48 different branches
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
49 '''
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
50
44602
a89aa2d7b34d pvec: drop an unused `from __future__ import division`
Martin von Zweigbergk <martinvonz@google.com>
parents: 43793
diff changeset
51 from __future__ import absolute_import
27501
983e93d88193 pvec: use absolute_import
Gregory Szorc <gregory.szorc@gmail.com>
parents: 24339
diff changeset
52
983e93d88193 pvec: use absolute_import
Gregory Szorc <gregory.szorc@gmail.com>
parents: 24339
diff changeset
53 from .node import nullrev
983e93d88193 pvec: use absolute_import
Gregory Szorc <gregory.szorc@gmail.com>
parents: 24339
diff changeset
54 from . import (
38783
e7aa113b14f7 global: use pycompat.xrange()
Gregory Szorc <gregory.szorc@gmail.com>
parents: 32201
diff changeset
55 pycompat,
27501
983e93d88193 pvec: use absolute_import
Gregory Szorc <gregory.szorc@gmail.com>
parents: 24339
diff changeset
56 util,
983e93d88193 pvec: use absolute_import
Gregory Szorc <gregory.szorc@gmail.com>
parents: 24339
diff changeset
57 )
16249
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
58
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 38783
diff changeset
59 _size = 448 # 70 chars b85-encoded
43465
90aac60b6697 pvec: migrate to modern integer division
Augie Fackler <augie@google.com>
parents: 43463
diff changeset
60 _bytes = _size // 8
16249
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
61 _depthbits = 24
43465
90aac60b6697 pvec: migrate to modern integer division
Augie Fackler <augie@google.com>
parents: 43463
diff changeset
62 _depthbytes = _depthbits // 8
16249
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
63 _vecbytes = _bytes - _depthbytes
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
64 _vecbits = _vecbytes * 8
43465
90aac60b6697 pvec: migrate to modern integer division
Augie Fackler <augie@google.com>
parents: 43463
diff changeset
65 _radius = (_vecbits - 30) // 2 # high probability vectors are related
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 38783
diff changeset
66
16249
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
67
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
68 def _bin(bs):
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
69 '''convert a bytestring to a long'''
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
70 v = 0
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
71 for b in bs:
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
72 v = v * 256 + ord(b)
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
73 return v
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
74
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 38783
diff changeset
75
16249
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
76 def _str(v, l):
43479
8e89b6e1e0cd pvec: add an explicit type hint to help pytype
Augie Fackler <augie@google.com>
parents: 43465
diff changeset
77 # type: (int, int) -> bytes
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
78 bs = b""
38783
e7aa113b14f7 global: use pycompat.xrange()
Gregory Szorc <gregory.szorc@gmail.com>
parents: 32201
diff changeset
79 for p in pycompat.xrange(l):
43463
271af23d01a9 pvec: fix overlooked chr() call
Augie Fackler <augie@google.com>
parents: 43115
diff changeset
80 bs = pycompat.bytechr(v & 255) + bs
16249
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
81 v >>= 8
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
82 return bs
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
83
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 38783
diff changeset
84
16249
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
85 def _split(b):
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
86 '''depth and bitvec'''
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
87 return _bin(b[:_depthbytes]), _bin(b[_depthbytes:])
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
88
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 38783
diff changeset
89
16249
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
90 def _join(depth, bitvec):
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
91 return _str(depth, _depthbytes) + _str(bitvec, _vecbytes)
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
92
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 38783
diff changeset
93
16249
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
94 def _hweight(x):
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
95 c = 0
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
96 while x:
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
97 if x & 1:
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
98 c += 1
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
99 x >>= 1
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
100 return c
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 38783
diff changeset
101
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 38783
diff changeset
102
38783
e7aa113b14f7 global: use pycompat.xrange()
Gregory Szorc <gregory.szorc@gmail.com>
parents: 32201
diff changeset
103 _htab = [_hweight(x) for x in pycompat.xrange(256)]
16249
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
104
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 38783
diff changeset
105
16249
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
106 def _hamming(a, b):
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
107 '''find the hamming distance between two longs'''
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
108 d = a ^ b
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
109 c = 0
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
110 while d:
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 38783
diff changeset
111 c += _htab[d & 0xFF]
16249
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
112 d >>= 8
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
113 return c
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
114
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 38783
diff changeset
115
16249
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
116 def _mergevec(x, y, c):
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
117 # Ideally, this function would be x ^ y ^ ancestor, but finding
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
118 # ancestors is a nuisance. So instead we find the minimal number
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
119 # of changes to balance the depth and hamming distance
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
120
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
121 d1, v1 = x
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
122 d2, v2 = y
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
123 if d1 < d2:
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
124 d1, d2, v1, v2 = d2, d1, v2, v1
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
125
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
126 hdist = _hamming(v1, v2)
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
127 ddist = d1 - d2
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
128 v = v1
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 38783
diff changeset
129 m = v1 ^ v2 # mask of different bits
16249
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
130 i = 1
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
131
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
132 if hdist > ddist:
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
133 # if delta = 10 and hdist = 100, then we need to go up 55 steps
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
134 # to the ancestor and down 45
43465
90aac60b6697 pvec: migrate to modern integer division
Augie Fackler <augie@google.com>
parents: 43463
diff changeset
135 changes = (hdist - ddist + 1) // 2
16249
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
136 else:
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
137 # must make at least one change
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
138 changes = 1
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
139 depth = d1 + changes
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
140
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
141 # copy changes from v2
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
142 if m:
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
143 while changes:
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
144 if m & i:
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
145 v ^= i
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
146 changes -= 1
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
147 i <<= 1
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
148 else:
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
149 v = _flipbit(v, c)
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
150
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
151 return depth, v
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
152
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 38783
diff changeset
153
16249
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
154 def _flipbit(v, node):
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
155 # converting bit strings to longs is slow
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 38783
diff changeset
156 bit = (hash(node) & 0xFFFFFFFF) % _vecbits
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 38783
diff changeset
157 return v ^ (1 << bit)
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 38783
diff changeset
158
16249
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
159
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
160 def ctxpvec(ctx):
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
161 '''construct a pvec for ctx while filling in the cache'''
24339
bcc319d936a3 pvec: replace 'ctx._repo' with 'ctx.repo()'
Matt Harbison <matt_harbison@yahoo.com>
parents: 18918
diff changeset
162 r = ctx.repo()
43115
4aa72cdf616f py3: delete b'' prefix from safehasattr arguments
Martin von Zweigbergk <martinvonz@google.com>
parents: 43077
diff changeset
163 if not util.safehasattr(r, "_pveccache"):
16249
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
164 r._pveccache = {}
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
165 pvc = r._pveccache
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
166 if ctx.rev() not in pvc:
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
167 cl = r.changelog
38783
e7aa113b14f7 global: use pycompat.xrange()
Gregory Szorc <gregory.szorc@gmail.com>
parents: 32201
diff changeset
168 for n in pycompat.xrange(ctx.rev() + 1):
16249
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
169 if n not in pvc:
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
170 node = cl.node(n)
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
171 p1, p2 = cl.parentrevs(n)
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
172 if p1 == nullrev:
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
173 # start with a 'random' vector at root
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
174 pvc[n] = (0, _bin((node * 3)[:_vecbytes]))
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
175 elif p2 == nullrev:
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
176 d, v = pvc[p1]
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
177 pvc[n] = (d + 1, _flipbit(v, node))
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
178 else:
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
179 pvc[n] = _mergevec(pvc[p1], pvc[p2], node)
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
180 bs = _join(*pvc[ctx.rev()])
32201
4462a981e8df base85: proxy through util module
Yuya Nishihara <yuya@tcha.org>
parents: 27501
diff changeset
181 return pvec(util.b85encode(bs))
16249
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
182
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 38783
diff changeset
183
16249
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
184 class pvec(object):
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
185 def __init__(self, hashorctx):
43746
10662ac7849e pvec: fix a `str` type conditional for py3
Matt Harbison <matt_harbison@yahoo.com>
parents: 43115
diff changeset
186 if isinstance(hashorctx, bytes):
16249
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
187 self._bs = hashorctx
32201
4462a981e8df base85: proxy through util module
Yuya Nishihara <yuya@tcha.org>
parents: 27501
diff changeset
188 self._depth, self._vec = _split(util.b85decode(hashorctx))
16249
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
189 else:
18918
5093d2a87ff6 pvec: use the correct name for an identifier
Bryan O'Sullivan <bryano@fb.com>
parents: 17424
diff changeset
190 self._vec = ctxpvec(hashorctx)
16249
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
191
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
192 def __str__(self):
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
193 return self._bs
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
194
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
195 def __eq__(self, b):
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
196 return self._vec == b._vec and self._depth == b._depth
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
197
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
198 def __lt__(self, b):
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
199 delta = b._depth - self._depth
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
200 if delta < 0:
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 38783
diff changeset
201 return False # always correct
16249
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
202 if _hamming(self._vec, b._vec) > delta:
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
203 return False
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
204 return True
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
205
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
206 def __gt__(self, b):
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
207 return b < self
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
208
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
209 def __or__(self, b):
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
210 delta = abs(b._depth - self._depth)
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
211 if _hamming(self._vec, b._vec) <= delta:
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
212 return False
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
213 return True
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
214
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
215 def __sub__(self, b):
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
216 if self | b:
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
217 raise ValueError(b"concurrent pvecs")
16249
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
218 return self._depth - b._depth
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
219
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
220 def distance(self, b):
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
221 d = abs(b._depth - self._depth)
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
222 h = _hamming(self._vec, b._vec)
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
223 return max(d, h)
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
224
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
225 def near(self, b):
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
226 dist = abs(b.depth - self._depth)
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
227 if dist > _radius or _hamming(self._vec, b._vec) > _radius:
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
228 return False