mercurial/pvec.py
author Martin von Zweigbergk <martinvonz@google.com>
Mon, 09 Mar 2015 17:13:22 -0700
changeset 24276 2720f967a7b1
parent 18918 5093d2a87ff6
child 24339 bcc319d936a3
permissions -rw-r--r--
largefiles: replace manifestdict.__contains__, don't extend class We're soon going to add an alternative manifest class (treemanifest). Rather than extending both those classes by largesfiles versions, let's replace the method on the manifest instances.
16249
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
# pvec.py - probabilistic vector clocks for Mercurial
#
# Copyright 2012 Matt Mackall <mpm@selenic.com>
#
# This software may be used and distributed according to the terms of the
# GNU General Public License version 2 or any later version.

'''
A "pvec" is a changeset property based on the theory of vector clocks
that can be compared to discover relatedness without consulting a
graph. This can be useful for tasks like determining how a
disconnected patch relates to a repository.

Currently a pvec consists of 448 bits, of which 24 are 'depth' and the
remainder are a bit vector. It is represented as a 70-character base85
string.

Construction:

- a root changeset has a depth of 0 and a bit vector based on its hash
- a normal commit has a pvec where depth is increased by one and
  one bit vector bit is flipped based on its hash
- a merge changeset pvec is constructed by copying changes from one pvec into
  the other to balance its depth

Properties:

- for linear changes, hamming distance is always <= difference in depth
- otherwise, changes are probably divergent
- when hamming distance is < 200, we can reliably detect when pvecs are near

Issues:

- hamming distance ceases to be a reliable signal over distances of ~ 200
- detecting divergence is less accurate when the common ancestor is very close
  to either revision or total distance is high
- this could probably be improved by modeling the relation between
  delta and hdist

Uses:

- a patch pvec can be used to locate the nearest available common ancestor for
  resolving conflicts
- ordering of patches can be established without a DAG
- two head pvecs can be compared to determine whether push/pull/merge is needed
  and approximately how many changesets are involved
- can be used to find a heuristic divergence measure between changesets on
  different branches
'''

import base85, util
from node import nullrev

_size = 448 # 70 chars b85-encoded
_bytes = _size // 8
_depthbits = 24
_depthbytes = _depthbits // 8
_vecbytes = _bytes - _depthbytes
_vecbits = _vecbytes * 8
# (annotate: rev 17424 e7cfe3587ea4 "fix trivial spelling errors",
#  Mads Kiilerich <mads@kiilerich.com>)
_radius = (_vecbits - 30) // 2 # within this, vectors are very likely related

def _bin(bs):
    '''convert a bytestring to a long'''
    v = 0
    for b in bs:
        v = v * 256 + ord(b)
    return v

def _str(v, l):
    '''convert a long to a big-endian bytestring of length l'''
    bs = ""
    for p in xrange(l):
        bs = chr(v & 255) + bs
        v >>= 8
    return bs

def _split(b):
    '''split a decoded pvec bytestring into (depth, bitvec) longs'''
    return _bin(b[:_depthbytes]), _bin(b[_depthbytes:])

def _join(depth, bitvec):
    '''pack depth and bitvec back into a single bytestring'''
    return _str(depth, _depthbytes) + _str(bitvec, _vecbytes)
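
# A minimal sketch of the packing done by _split and _join, written for
# Python 3 rather than the Python 2 of this module. The names join, split,
# DEPTHBYTES and VECBYTES are hypothetical. base64.b85encode is only a
# stand-in for Mercurial's base85 module (its alphabet is assumed, not
# verified, to match); it is used here just to show the 56-byte / 70-character
# sizes mentioned in the module docstring.

```python
import base64

DEPTHBYTES = 3   # 24 depth bits
VECBYTES = 53    # 448 // 8 - 3 bytes of bit vector

def join(depth, bitvec):
    """Pack depth and bit vector into one big-endian byte string."""
    return depth.to_bytes(DEPTHBYTES, "big") + bitvec.to_bytes(VECBYTES, "big")

def split(bs):
    """Inverse of join: return (depth, bitvec) as ints."""
    return (int.from_bytes(bs[:DEPTHBYTES], "big"),
            int.from_bytes(bs[DEPTHBYTES:], "big"))

packed = join(7, 0x1234)
encoded = base64.b85encode(packed)
roundtrip = split(packed)
```

# Unlike _str, int.to_bytes raises OverflowError when the value does not fit,
# instead of silently dropping the high bits.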

def _hweight(x):
    '''count the set bits in x (its hamming weight)'''
    c = 0
    while x:
        if x & 1:
            c += 1
        x >>= 1
    return c
# hamming weight of every possible byte value
_htab = [_hweight(x) for x in xrange(256)]

def _hamming(a, b):
    '''find the hamming distance between two longs'''
    d = a ^ b
    c = 0
    while d:
        c += _htab[d & 0xff]
        d >>= 8
    return c
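
# A minimal sketch of the same byte-table approach in Python 3, checked
# against a direct count of set bits. popcount_table and hamming_bytewise are
# hypothetical names used only for illustration.

```python
def popcount_table():
    """Build a 256-entry table of bit counts, one per byte value."""
    return [bin(x).count("1") for x in range(256)]

def hamming_bytewise(a, b, table):
    """Hamming distance between two non-negative ints, one byte at a time."""
    d = a ^ b
    c = 0
    while d:
        c += table[d & 0xFF]
        d >>= 8
    return c

table = popcount_table()
x = 0xB00F
y = 0x3109
dist = hamming_bytewise(x, y, table)
# The bytewise result should agree with counting the set bits of x ^ y directly.
direct = bin(x ^ y).count("1")
```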

def _mergevec(x, y, c):
    # Ideally, this function would be x ^ y ^ ancestor, but finding
    # ancestors is a nuisance. So instead we find the minimal number
    # of changes to balance the depth and hamming distance

    d1, v1 = x
    d2, v2 = y
    if d1 < d2:
        # make (d1, v1) the deeper parent
        d1, d2, v1, v2 = d2, d1, v2, v1

    hdist = _hamming(v1, v2)
    ddist = d1 - d2
    v = v1
    m = v1 ^ v2 # mask of different bits
    i = 1

    if hdist > ddist:
        # if delta = 10 and hdist = 100, then we need to go up 55 steps
        # to the ancestor and down 45
        changes = (hdist - ddist + 1) // 2
    else:
        # must make at least one change
        changes = 1
    depth = d1 + changes

    # copy changes from v2
    if m:
        while changes:
            if m & i:
                v ^= i
                changes -= 1
            i <<= 1
    else:
        # identical vectors: flip a bit chosen from the merge node instead
        v = _flipbit(v, c)

    return depth, v
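
# A small worked example of the arithmetic in the comment above (delta = 10,
# hdist = 100), in Python 3. merge_changes is a hypothetical helper that only
# mirrors the rounding used for `changes`; it is not part of this module.

```python
def merge_changes(hdist, ddist):
    """Number of differing bits to copy from the shallower parent."""
    if hdist > ddist:
        return (hdist - ddist + 1) // 2
    return 1  # a merge must make at least one change

hdist, ddist = 100, 10
down = merge_changes(hdist, ddist)   # steps from the ancestor to the shallower parent
up = down + ddist                    # steps from the deeper parent up to the ancestor
total = up + down                    # accounts for every differing bit
```

# With these inputs the split is 55 steps up and 45 down, and the merge pvec's
# depth becomes the deeper parent's depth plus 45.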

def _flipbit(v, node):
    '''flip one bit of v, chosen from the hash of node'''
    # converting bit strings to longs is slow, so use hash() to pick the bit
    bit = (hash(node) & 0xffffffff) % _vecbits
    return v ^ (1<<bit)
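
# A minimal sketch, in Python 3, of why linear history keeps the hamming
# distance within the depth difference: each commit flips exactly one bit, so
# n commits can change at most n bits. The names flipbit, hamming and chain
# are hypothetical. A SHA-1 digest stands in for hash(node) so that the result
# is reproducible; that substitution is an assumption of this sketch, not what
# the module does.

```python
import hashlib

VECBITS = 424  # 448 total bits minus 24 depth bits

def flipbit(v, node):
    """Flip one bit of v, chosen from a digest of node."""
    digest = hashlib.sha1(node).digest()
    bit = int.from_bytes(digest[:4], "big") % VECBITS
    return v ^ (1 << bit)

def hamming(a, b):
    """Hamming distance between two non-negative ints."""
    return bin(a ^ b).count("1")

# Root: depth 0 and a vector derived from a made-up node id, repeated to fill
# the vector as the root case in ctxpvec does.
root_node = hashlib.sha1(b"root").digest()
chain = [(0, int.from_bytes((root_node * 3)[:VECBITS // 8], "big"))]

# Fifty hypothetical commits on a single line of history.
for i in range(1, 51):
    depth, vec = chain[-1]
    chain.append((depth + 1, flipbit(vec, b"commit-%d" % i)))

# Every ancestor/descendant pair satisfies hamming <= depth difference.
linear_ok = all(
    hamming(va, vb) <= db - da
    for i, (da, va) in enumerate(chain)
    for (db, vb) in chain[i:]
)
```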

def ctxpvec(ctx):
    '''construct a pvec for ctx while filling in the cache'''
    r = ctx._repo
    if not util.safehasattr(r, "_pveccache"):
        r._pveccache = {}
    pvc = r._pveccache
    if ctx.rev() not in pvc:
        cl = r.changelog
        for n in xrange(ctx.rev() + 1):
            if n not in pvc:
                node = cl.node(n)
                p1, p2 = cl.parentrevs(n)
                if p1 == nullrev:
                    # start with a 'random' vector at root
                    pvc[n] = (0, _bin((node * 3)[:_vecbytes]))
                elif p2 == nullrev:
                    d, v = pvc[p1]
                    pvc[n] = (d + 1, _flipbit(v, node))
                else:
                    pvc[n] = _mergevec(pvc[p1], pvc[p2], node)
    bs = _join(*pvc[ctx.rev()])
    return pvec(base85.b85encode(bs))

class pvec(object):
    def __init__(self, hashorctx):
        if isinstance(hashorctx, str):
            self._bs = hashorctx
            self._depth, self._vec = _split(base85.b85decode(hashorctx))
        else:
            # (annotate: rev 18918 5093d2a87ff6 "pvec: use the correct name
            #  for an identifier", Bryan O'Sullivan <bryano@fb.com>)
            # ctxpvec() returns a pvec, so copy its fields rather than
            # storing the whole object in _vec
            other = ctxpvec(hashorctx)
            self._bs = other._bs
            self._depth, self._vec = other._depth, other._vec

    def __str__(self):
        return self._bs

    def __eq__(self, b):
        return self._vec == b._vec and self._depth == b._depth

    def __lt__(self, b):
        delta = b._depth - self._depth
        if delta < 0:
            return False # always correct
        if _hamming(self._vec, b._vec) > delta:
            return False
        return True

    def __gt__(self, b):
        return b < self

    def __or__(self, b):
        '''true if the two pvecs look concurrent (hamming distance exceeds
        the depth difference)'''
        delta = abs(b._depth - self._depth)
        if _hamming(self._vec, b._vec) <= delta:
            return False
        return True

    def __sub__(self, b):
        '''signed depth difference; only defined for non-concurrent pvecs'''
        if self | b:
            raise ValueError("concurrent pvecs")
        return self._depth - b._depth

    def distance(self, b):
        '''the larger of the depth difference and the hamming distance'''
        d = abs(b._depth - self._depth)
        h = _hamming(self._vec, b._vec)
        return max(d, h)

    def near(self, b):
        dist = abs(b._depth - self._depth)
        if dist > _radius or _hamming(self._vec, b._vec) > _radius:
            return False