mercurial/pvec.py
author Anton Shestakov <av6@dwimlabs.net>
Fri, 07 Jan 2022 11:53:23 +0300
changeset 48687 f8f2ecdde4b5
parent 46819 d4ba4d51f85f
child 48875 6000f5b25c9b
permissions -rw-r--r--
branchmap: skip obsolete revisions while computing heads It's time to make this part of core Mercurial obsolescence-aware. Not considering obsolete revisions when computing heads is clearly what Mercurial should do. But there are a couple of small issues: - Let's say tip of the repo is obsolete. There are two ways of finding tiprev for branchcache (both are in use): looking at input data for update() and looking at computed heads after update(). Previously, repo tip would be tiprev of the branchcache. With this patch, an obsolete revision can no longer be tiprev. And depending on what way we use for finding tiprev (input data vs computed heads) we'll get a different result. This is relevant when recomputing cache key from cache contents, and may lead to updating cache for obsolete revisions multiple times (not from scratch, because it still would be considered valid for a subset of revisions in the repo). - If all commits on a branch are obsolete, the branchcache will include that branch, but the list of heads will be empty (that's why there's now `if not heads` when recomputing tiprev/tipnode from cache contents). Having an entry for every branch is currently required for notify extension (and test-notify.t to pass), because notify doesn't handle revsets in its subscription config very well and will throw an error if e.g. a branch doesn't exist. - Cloning static HTTP repos may try to stat() a non-existent obsstore file. The issue is that we now care about obsolescence during clone, but statichttpvfs doesn't implement a stat method, so a regular vfs.stat() is used, and it assumes that file is local and calls os.stat(). During a clone, we're trying to stat() .hg/store/obsstore, but in static HTTP case we provide a literal URL to the obsstore file on the remote as if it were a local file path. On windows it actually results in a failure in test-static-http.t. The first issue is going to be addressed in a series dedicated to making sure branchcache is properly and timely written on disk (it wasn't perfect even before this patch, but there aren't enough tests to demonstrate that). The second issue will be addressed in a future patch for notify extension that will make it not raise an exception if a branch doesn't exist. And the third one was partially addressed in the previous patch in this series and will be properly fixed in a future patch when this series is accepted. filteredhash() grows a keyword argument to make sure that branchcache is also invalidated when there are new obsolete revisions in its repo view. This way the on-disk cache format is unchanged and compatible between versions (although it will obviously be recomputed when switching versions before/after this patch and the repo has obsolete revisions). There's one test that uses plain `hg up` without arguments while updated to a pruned commit. To make this test pass, simply return current working directory parent. Later in this series this code will be replaced by what prune command does: updating to the closest non-obsolete ancestor. Test changes: test-branch-change.t: update branch head and cache update message. The head of default listed in hg heads is changed because revision 2 was rewritten as 7, and 1 is the closest ancestor on the same branch, so it's the head of default now. The cache invalidation message appears now because of the cache hash change, since we're now accounting for obsolete revisions. Here's some context: "served.hidden" repo filter means everything is visible (no filtered revisions), so before this series branch2-served.hidden file would not contain any cache hash, only revnum and node. Now it also has a hash when there are obsolete changesets in the repo. The command that the message appears for is changing branch of 5 and 6, which are now obsolete, so the cache hash changes. In general, when cache is simply out-of-date, it can be updated using the old version as a base. But if cache hash differs, then the cache for that particular repo filter is recomputed (at least with the current implementation). This is what happens here. test-obsmarker-template.t: the pull reports 2 heads changed, but after that the repo correctly sees only 1. The new message could be better, but it's still an improvement over the previous one where hg pull suggested merging with an obsolete revision. test-obsolete.t: we can see these revisions in hg log --hidden, but they shouldn't be considered heads even with --hidden. test-rebase-obsolete{,2}.t: there were new heads created previously after making new orphan changesets, but they weren't detected. Now we are properly detecting and reporting them. test-rebase-obsolete4.t: there's only one head now because the other head is pruned and was falsely reported before. test-static-http.t: add obsstore to the list of requested files. This file doesn't exist on the remotes, but clients want it anyway (they get 404). This is fine, because there are other nonexistent files that clients request, like .hg/bookmarks or .hg/cache/tags2-served. Differential Revision: https://phab.mercurial-scm.org/D12097
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
16249
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
     1
# pvec.py - probabilistic vector clocks for Mercurial
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
     2
#
46819
d4ba4d51f85f contributor: change mentions of mpm to olivia
Raphaël Gomès <rgomes@octobus.net>
parents: 44602
diff changeset
     3
# Copyright 2012 Olivia Mackall <olivia@selenic.com>
16249
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
     4
#
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
     5
# This software may be used and distributed according to the terms of the
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
     6
# GNU General Public License version 2 or any later version.
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
     7
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
     8
'''
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
     9
A "pvec" is a changeset property based on the theory of vector clocks
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    10
that can be compared to discover relatedness without consulting a
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    11
graph. This can be useful for tasks like determining how a
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    12
disconnected patch relates to a repository.
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    13
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    14
Currently a pvec consist of 448 bits, of which 24 are 'depth' and the
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    15
remainder are a bit vector. It is represented as a 70-character base85
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    16
string.
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    17
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    18
Construction:
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    19
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    20
- a root changeset has a depth of 0 and a bit vector based on its hash
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    21
- a normal commit has a changeset where depth is increased by one and
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    22
  one bit vector bit is flipped based on its hash
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    23
- a merge changeset pvec is constructed by copying changes from one pvec into
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    24
  the other to balance its depth
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    25
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    26
Properties:
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    27
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    28
- for linear changes, difference in depth is always <= hamming distance
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    29
- otherwise, changes are probably divergent
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    30
- when hamming distance is < 200, we can reliably detect when pvecs are near
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    31
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    32
Issues:
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    33
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    34
- hamming distance ceases to work over distances of ~ 200
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    35
- detecting divergence is less accurate when the common ancestor is very close
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    36
  to either revision or total distance is high
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    37
- this could probably be improved by modeling the relation between
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    38
  delta and hdist
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    39
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    40
Uses:
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    41
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    42
- a patch pvec can be used to locate the nearest available common ancestor for
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    43
  resolving conflicts
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    44
- ordering of patches can be established without a DAG
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    45
- two head pvecs can be compared to determine whether push/pull/merge is needed
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    46
  and approximately how many changesets are involved
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    47
- can be used to find a heuristic divergence measure between changesets on
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    48
  different branches
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    49
'''
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    50
44602
a89aa2d7b34d pvec: drop an unused `from __future__ import division`
Martin von Zweigbergk <martinvonz@google.com>
parents: 43793
diff changeset
    51
from __future__ import absolute_import
27501
983e93d88193 pvec: use absolute_import
Gregory Szorc <gregory.szorc@gmail.com>
parents: 24339
diff changeset
    52
983e93d88193 pvec: use absolute_import
Gregory Szorc <gregory.szorc@gmail.com>
parents: 24339
diff changeset
    53
from .node import nullrev
983e93d88193 pvec: use absolute_import
Gregory Szorc <gregory.szorc@gmail.com>
parents: 24339
diff changeset
    54
from . import (
38783
e7aa113b14f7 global: use pycompat.xrange()
Gregory Szorc <gregory.szorc@gmail.com>
parents: 32201
diff changeset
    55
    pycompat,
27501
983e93d88193 pvec: use absolute_import
Gregory Szorc <gregory.szorc@gmail.com>
parents: 24339
diff changeset
    56
    util,
983e93d88193 pvec: use absolute_import
Gregory Szorc <gregory.szorc@gmail.com>
parents: 24339
diff changeset
    57
)
16249
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    58
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 38783
diff changeset
    59
_size = 448  # 70 chars b85-encoded
43465
90aac60b6697 pvec: migrate to modern integer division
Augie Fackler <augie@google.com>
parents: 43463
diff changeset
    60
_bytes = _size // 8
16249
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    61
_depthbits = 24
43465
90aac60b6697 pvec: migrate to modern integer division
Augie Fackler <augie@google.com>
parents: 43463
diff changeset
    62
_depthbytes = _depthbits // 8
16249
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    63
_vecbytes = _bytes - _depthbytes
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    64
_vecbits = _vecbytes * 8
43465
90aac60b6697 pvec: migrate to modern integer division
Augie Fackler <augie@google.com>
parents: 43463
diff changeset
    65
_radius = (_vecbits - 30) // 2  # high probability vectors are related
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 38783
diff changeset
    66
16249
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    67
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    68
def _bin(bs):
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    69
    '''convert a bytestring to a long'''
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    70
    v = 0
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    71
    for b in bs:
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    72
        v = v * 256 + ord(b)
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    73
    return v
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    74
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 38783
diff changeset
    75
16249
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    76
def _str(v, l):
43479
8e89b6e1e0cd pvec: add an explicit type hint to help pytype
Augie Fackler <augie@google.com>
parents: 43465
diff changeset
    77
    # type: (int, int) -> bytes
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
    78
    bs = b""
38783
e7aa113b14f7 global: use pycompat.xrange()
Gregory Szorc <gregory.szorc@gmail.com>
parents: 32201
diff changeset
    79
    for p in pycompat.xrange(l):
43463
271af23d01a9 pvec: fix overlooked chr() call
Augie Fackler <augie@google.com>
parents: 43115
diff changeset
    80
        bs = pycompat.bytechr(v & 255) + bs
16249
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    81
        v >>= 8
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    82
    return bs
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    83
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 38783
diff changeset
    84
16249
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    85
def _split(b):
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    86
    '''depth and bitvec'''
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    87
    return _bin(b[:_depthbytes]), _bin(b[_depthbytes:])
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    88
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 38783
diff changeset
    89
16249
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    90
def _join(depth, bitvec):
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    91
    return _str(depth, _depthbytes) + _str(bitvec, _vecbytes)
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    92
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 38783
diff changeset
    93
16249
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    94
def _hweight(x):
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    95
    c = 0
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    96
    while x:
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    97
        if x & 1:
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    98
            c += 1
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    99
        x >>= 1
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
   100
    return c
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 38783
diff changeset
   101
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 38783
diff changeset
   102
38783
e7aa113b14f7 global: use pycompat.xrange()
Gregory Szorc <gregory.szorc@gmail.com>
parents: 32201
diff changeset
   103
_htab = [_hweight(x) for x in pycompat.xrange(256)]
16249
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
   104
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 38783
diff changeset
   105
16249
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
   106
def _hamming(a, b):
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
   107
    '''find the hamming distance between two longs'''
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
   108
    d = a ^ b
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
   109
    c = 0
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
   110
    while d:
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 38783
diff changeset
   111
        c += _htab[d & 0xFF]
16249
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
   112
        d >>= 8
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
   113
    return c
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
   114
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 38783
diff changeset
   115
16249
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
   116
def _mergevec(x, y, c):
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
   117
    # Ideally, this function would be x ^ y ^ ancestor, but finding
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
   118
    # ancestors is a nuisance. So instead we find the minimal number
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
   119
    # of changes to balance the depth and hamming distance
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
   120
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
   121
    d1, v1 = x
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
   122
    d2, v2 = y
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
   123
    if d1 < d2:
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
   124
        d1, d2, v1, v2 = d2, d1, v2, v1
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
   125
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
   126
    hdist = _hamming(v1, v2)
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
   127
    ddist = d1 - d2
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
   128
    v = v1
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 38783
diff changeset
   129
    m = v1 ^ v2  # mask of different bits
16249
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
   130
    i = 1
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
   131
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
   132
    if hdist > ddist:
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
   133
        # if delta = 10 and hdist = 100, then we need to go up 55 steps
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
   134
        # to the ancestor and down 45
43465
90aac60b6697 pvec: migrate to modern integer division
Augie Fackler <augie@google.com>
parents: 43463
diff changeset
   135
        changes = (hdist - ddist + 1) // 2
16249
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
   136
    else:
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
   137
        # must make at least one change
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
   138
        changes = 1
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
   139
    depth = d1 + changes
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
   140
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
   141
    # copy changes from v2
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
   142
    if m:
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
   143
        while changes:
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
   144
            if m & i:
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
   145
                v ^= i
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
   146
                changes -= 1
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
   147
            i <<= 1
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
   148
    else:
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
   149
        v = _flipbit(v, c)
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
   150
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
   151
    return depth, v
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
   152
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 38783
diff changeset
   153
16249
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
   154
def _flipbit(v, node):
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
   155
    # converting bit strings to longs is slow
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 38783
diff changeset
   156
    bit = (hash(node) & 0xFFFFFFFF) % _vecbits
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 38783
diff changeset
   157
    return v ^ (1 << bit)
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 38783
diff changeset
   158
16249
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
   159
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
   160
def ctxpvec(ctx):
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
   161
    '''construct a pvec for ctx while filling in the cache'''
24339
bcc319d936a3 pvec: replace 'ctx._repo' with 'ctx.repo()'
Matt Harbison <matt_harbison@yahoo.com>
parents: 18918
diff changeset
   162
    r = ctx.repo()
43115
4aa72cdf616f py3: delete b'' prefix from safehasattr arguments
Martin von Zweigbergk <martinvonz@google.com>
parents: 43077
diff changeset
   163
    if not util.safehasattr(r, "_pveccache"):
16249
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
   164
        r._pveccache = {}
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
   165
    pvc = r._pveccache
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
   166
    if ctx.rev() not in pvc:
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
   167
        cl = r.changelog
38783
e7aa113b14f7 global: use pycompat.xrange()
Gregory Szorc <gregory.szorc@gmail.com>
parents: 32201
diff changeset
   168
        for n in pycompat.xrange(ctx.rev() + 1):
16249
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
   169
            if n not in pvc:
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
   170
                node = cl.node(n)
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
   171
                p1, p2 = cl.parentrevs(n)
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
   172
                if p1 == nullrev:
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
   173
                    # start with a 'random' vector at root
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
   174
                    pvc[n] = (0, _bin((node * 3)[:_vecbytes]))
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
   175
                elif p2 == nullrev:
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
   176
                    d, v = pvc[p1]
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
   177
                    pvc[n] = (d + 1, _flipbit(v, node))
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
   178
                else:
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
   179
                    pvc[n] = _mergevec(pvc[p1], pvc[p2], node)
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
   180
    bs = _join(*pvc[ctx.rev()])
32201
4462a981e8df base85: proxy through util module
Yuya Nishihara <yuya@tcha.org>
parents: 27501
diff changeset
   181
    return pvec(util.b85encode(bs))
16249
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
   182
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 38783
diff changeset
   183
16249
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
   184
class pvec(object):
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
   185
    def __init__(self, hashorctx):
43746
10662ac7849e pvec: fix a `str` type conditional for py3
Matt Harbison <matt_harbison@yahoo.com>
parents: 43115
diff changeset
   186
        if isinstance(hashorctx, bytes):
16249
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
   187
            self._bs = hashorctx
32201
4462a981e8df base85: proxy through util module
Yuya Nishihara <yuya@tcha.org>
parents: 27501
diff changeset
   188
            self._depth, self._vec = _split(util.b85decode(hashorctx))
16249
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
   189
        else:
18918
5093d2a87ff6 pvec: use the correct name for an identifier
Bryan O'Sullivan <bryano@fb.com>
parents: 17424
diff changeset
   190
            self._vec = ctxpvec(hashorctx)
16249
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
   191
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
   192
    def __str__(self):
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
   193
        return self._bs
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
   194
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
   195
    def __eq__(self, b):
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
   196
        return self._vec == b._vec and self._depth == b._depth
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
   197
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
   198
    def __lt__(self, b):
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
   199
        delta = b._depth - self._depth
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
   200
        if delta < 0:
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 38783
diff changeset
   201
            return False  # always correct
16249
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
   202
        if _hamming(self._vec, b._vec) > delta:
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
   203
            return False
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
   204
        return True
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
   205
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
   206
    def __gt__(self, b):
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
   207
        return b < self
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
   208
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
   209
    def __or__(self, b):
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
   210
        delta = abs(b._depth - self._depth)
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
   211
        if _hamming(self._vec, b._vec) <= delta:
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
   212
            return False
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
   213
        return True
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
   214
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
   215
    def __sub__(self, b):
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
   216
        if self | b:
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   217
            raise ValueError(b"concurrent pvecs")
16249
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
   218
        return self._depth - b._depth
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
   219
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
   220
    def distance(self, b):
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
   221
        d = abs(b._depth - self._depth)
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
   222
        h = _hamming(self._vec, b._vec)
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
   223
        return max(d, h)
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
   224
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
   225
    def near(self, b):
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
   226
        dist = abs(b.depth - self._depth)
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
   227
        if dist > _radius or _hamming(self._vec, b._vec) > _radius:
0d175ac527c1 pvec: introduce pvecs
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
   228
            return False