author | Martin von Zweigbergk <martinvonz@google.com> |
Thu, 23 Oct 2014 13:43:20 -0700 | |
changeset 23236 | 7ea2ef4479f2 |
parent 23192 | 73cfaa348650 |
child 23343 | f8a2647fe020 |
permissions | -rw-r--r-- |
14164
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
1 |
# setdiscovery.py - improved discovery of common nodeset for mercurial |
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
2 |
# |
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
3 |
# Copyright 2010 Benoit Boissinot <bboissin@gmail.com> |
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
4 |
# and Peter Arrenbrecht <peter@arrenbrecht.ch> |
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
5 |
# |
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
6 |
# This software may be used and distributed according to the terms of the |
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
7 |
# GNU General Public License version 2 or any later version. |
20656
cdecbc5ab504
setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents:
20034
diff
changeset
|
8 |
""" |
cdecbc5ab504
setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents:
20034
diff
changeset
|
9 |
Algorithm works in the following way. You have two repository: local and |
cdecbc5ab504
setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents:
20034
diff
changeset
|
10 |
remote. They both contains a DAG of changelists. |
cdecbc5ab504
setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents:
20034
diff
changeset
|
11 |
|
cdecbc5ab504
setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents:
20034
diff
changeset
|
12 |
The goal of the discovery protocol is to find one set of node *common*, |
cdecbc5ab504
setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents:
20034
diff
changeset
|
13 |
the set of nodes shared by local and remote. |
cdecbc5ab504
setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents:
20034
diff
changeset
|
14 |
|
cdecbc5ab504
setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents:
20034
diff
changeset
|
15 |
One of the issue with the original protocol was latency, it could |
cdecbc5ab504
setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents:
20034
diff
changeset
|
16 |
potentially require lots of roundtrips to discover that the local repo was a |
cdecbc5ab504
setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents:
20034
diff
changeset
|
17 |
subset of remote (which is a very common case, you usually have few changes |
cdecbc5ab504
setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents:
20034
diff
changeset
|
18 |
compared to upstream, while upstream probably had lots of development). |
cdecbc5ab504
setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents:
20034
diff
changeset
|
19 |
|
cdecbc5ab504
setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents:
20034
diff
changeset
|
20 |
The new protocol only requires one interface for the remote repo: `known()`, |
cdecbc5ab504
setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents:
20034
diff
changeset
|
21 |
which given a set of changelists tells you if they are present in the DAG. |
cdecbc5ab504
setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents:
20034
diff
changeset
|
22 |
|
cdecbc5ab504
setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents:
20034
diff
changeset
|
23 |
The algorithm then works as follow: |
cdecbc5ab504
setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents:
20034
diff
changeset
|
24 |
|
cdecbc5ab504
setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents:
20034
diff
changeset
|
25 |
- We will be using three sets, `common`, `missing`, `unknown`. Originally |
cdecbc5ab504
setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents:
20034
diff
changeset
|
26 |
all nodes are in `unknown`. |
cdecbc5ab504
setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents:
20034
diff
changeset
|
27 |
- Take a sample from `unknown`, call `remote.known(sample)` |
cdecbc5ab504
setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents:
20034
diff
changeset
|
28 |
- For each node that remote knows, move it and all its ancestors to `common` |
cdecbc5ab504
setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents:
20034
diff
changeset
|
29 |
- For each node that remote doesn't know, move it and all its descendants |
cdecbc5ab504
setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents:
20034
diff
changeset
|
30 |
to `missing` |
cdecbc5ab504
setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents:
20034
diff
changeset
|
31 |
- Iterate until `unknown` is empty |
cdecbc5ab504
setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents:
20034
diff
changeset
|
32 |
|
cdecbc5ab504
setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents:
20034
diff
changeset
|
33 |
There are a couple optimizations, first is instead of starting with a random |
cdecbc5ab504
setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents:
20034
diff
changeset
|
34 |
sample of missing, start by sending all heads, in the case where the local |
cdecbc5ab504
setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents:
20034
diff
changeset
|
35 |
repo is a subset, you computed the answer in one round trip. |
cdecbc5ab504
setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents:
20034
diff
changeset
|
36 |
|
cdecbc5ab504
setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents:
20034
diff
changeset
|
37 |
Then you can do something similar to the bisecting strategy used when |
cdecbc5ab504
setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents:
20034
diff
changeset
|
38 |
finding faulty changesets. Instead of random samples, you can try picking |
cdecbc5ab504
setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents:
20034
diff
changeset
|
39 |
nodes that will maximize the number of nodes that will be |
cdecbc5ab504
setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents:
20034
diff
changeset
|
40 |
classified with it (since all ancestors or descendants will be marked as well). |
cdecbc5ab504
setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents:
20034
diff
changeset
|
41 |
""" |
14164
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
42 |
|
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
43 |
from node import nullid |
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
44 |
from i18n import _ |
20034
1e5b38a919dd
cleanup: move stdlib imports to their own import statement
Augie Fackler <raf@durin42.com>
parents:
17426
diff
changeset
|
45 |
import random |
1e5b38a919dd
cleanup: move stdlib imports to their own import statement
Augie Fackler <raf@durin42.com>
parents:
17426
diff
changeset
|
46 |
import util, dagutil |
14164
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
47 |
|
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
48 |
def _updatesample(dag, nodes, sample, always, quicksamplesize=0): |
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
49 |
# if nodes is empty we scan the entire graph |
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
50 |
if nodes: |
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
51 |
heads = dag.headsetofconnecteds(nodes) |
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
52 |
else: |
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
53 |
heads = dag.heads() |
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
54 |
dist = {} |
16834
cafd8a8fb713
util: subclass deque for Python 2.4 backwards compatibility
Bryan O'Sullivan <bryano@fb.com>
parents:
16683
diff
changeset
|
55 |
visit = util.deque(heads) |
14164
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
56 |
seen = set() |
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
57 |
factor = 1 |
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
58 |
while visit: |
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
59 |
curr = visit.popleft() |
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
60 |
if curr in seen: |
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
61 |
continue |
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
62 |
d = dist.setdefault(curr, 1) |
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
63 |
if d > factor: |
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
64 |
factor *= 2 |
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
65 |
if d == factor: |
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
66 |
if curr not in always: # need this check for the early exit below |
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
67 |
sample.add(curr) |
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
68 |
if quicksamplesize and (len(sample) >= quicksamplesize): |
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
69 |
return |
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
70 |
seen.add(curr) |
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
71 |
for p in dag.parents(curr): |
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
72 |
if not nodes or p in nodes: |
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
73 |
dist.setdefault(p, d + 1) |
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
74 |
visit.append(p) |
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
75 |
|
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
76 |
def _setupsample(dag, nodes, size): |
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
77 |
if len(nodes) <= size: |
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
78 |
return set(nodes), None, 0 |
15063
c20688b7c061
setdiscovery: fix hang when #heads>200 (issue2971)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
14981
diff
changeset
|
79 |
always = dag.headsetofconnecteds(nodes) |
14164
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
80 |
desiredlen = size - len(always) |
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
81 |
if desiredlen <= 0: |
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
82 |
# This could be bad if there are very many heads, all unknown to the |
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
83 |
# server. We're counting on long request support here. |
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
84 |
return always, None, desiredlen |
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
85 |
return always, set(), desiredlen |
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
86 |
|
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
87 |
def _takequicksample(dag, nodes, size, initial): |
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
88 |
always, sample, desiredlen = _setupsample(dag, nodes, size) |
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
89 |
if sample is None: |
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
90 |
return always |
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
91 |
if initial: |
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
92 |
fromset = None |
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
93 |
else: |
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
94 |
fromset = nodes |
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
95 |
_updatesample(dag, fromset, sample, always, quicksamplesize=desiredlen) |
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
96 |
sample.update(always) |
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
97 |
return sample |
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
98 |
|
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
99 |
def _takefullsample(dag, nodes, size): |
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
100 |
always, sample, desiredlen = _setupsample(dag, nodes, size) |
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
101 |
if sample is None: |
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
102 |
return always |
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
103 |
# update from heads |
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
104 |
_updatesample(dag, nodes, sample, always) |
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
105 |
# update from roots |
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
106 |
_updatesample(dag.inverse(), nodes, sample, always) |
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
107 |
assert sample |
23083
ee45f5c2ffcc
setdiscovery: extract sample limitation in a `_limitsample` function
Pierre-Yves David <pierre-yves.david@fb.com>
parents:
20656
diff
changeset
|
108 |
sample = _limitsample(sample, desiredlen) |
ee45f5c2ffcc
setdiscovery: extract sample limitation in a `_limitsample` function
Pierre-Yves David <pierre-yves.david@fb.com>
parents:
20656
diff
changeset
|
109 |
if len(sample) < desiredlen: |
14164
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
110 |
more = desiredlen - len(sample) |
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
111 |
sample.update(random.sample(list(nodes - sample - always), more)) |
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
112 |
sample.update(always) |
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
113 |
return sample |
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
114 |
|
23083
ee45f5c2ffcc
setdiscovery: extract sample limitation in a `_limitsample` function
Pierre-Yves David <pierre-yves.david@fb.com>
parents:
20656
diff
changeset
|
115 |
def _limitsample(sample, desiredlen): |
ee45f5c2ffcc
setdiscovery: extract sample limitation in a `_limitsample` function
Pierre-Yves David <pierre-yves.david@fb.com>
parents:
20656
diff
changeset
|
116 |
"""return a random subset of sample of at most desiredlen item""" |
ee45f5c2ffcc
setdiscovery: extract sample limitation in a `_limitsample` function
Pierre-Yves David <pierre-yves.david@fb.com>
parents:
20656
diff
changeset
|
117 |
if len(sample) > desiredlen: |
ee45f5c2ffcc
setdiscovery: extract sample limitation in a `_limitsample` function
Pierre-Yves David <pierre-yves.david@fb.com>
parents:
20656
diff
changeset
|
118 |
sample = set(random.sample(sample, desiredlen)) |
ee45f5c2ffcc
setdiscovery: extract sample limitation in a `_limitsample` function
Pierre-Yves David <pierre-yves.david@fb.com>
parents:
20656
diff
changeset
|
119 |
return sample |
ee45f5c2ffcc
setdiscovery: extract sample limitation in a `_limitsample` function
Pierre-Yves David <pierre-yves.david@fb.com>
parents:
20656
diff
changeset
|
120 |
|
14164
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
121 |
def findcommonheads(ui, local, remote, |
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
122 |
initialsamplesize=100, |
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
123 |
fullsamplesize=200, |
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
124 |
abortwhenunrelated=True): |
14206
2bf60f158ecb
setdiscovery: limit lines to 80 characters
Steven Brown <StevenGBrown@gmail.com>
parents:
14164
diff
changeset
|
125 |
'''Return a tuple (common, anyincoming, remoteheads) used to identify |
2bf60f158ecb
setdiscovery: limit lines to 80 characters
Steven Brown <StevenGBrown@gmail.com>
parents:
14164
diff
changeset
|
126 |
missing nodes from or in remote. |
14164
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
127 |
''' |
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
128 |
roundtrips = 0 |
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
129 |
cl = local.changelog |
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
130 |
dag = dagutil.revlogdag(cl) |
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
131 |
|
14624
f03c82d1f50a
setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
14206
diff
changeset
|
132 |
# early exit if we know all the specified remote heads already |
14164
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
133 |
ui.debug("query 1; heads\n") |
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
134 |
roundtrips += 1 |
14624
f03c82d1f50a
setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
14206
diff
changeset
|
135 |
ownheads = dag.heads() |
23084
3ef893520a85
setdiscovery: limit the size of the initial sample (issue4411)
Pierre-Yves David <pierre-yves.david@fb.com>
parents:
23083
diff
changeset
|
136 |
sample = _limitsample(ownheads, initialsamplesize) |
23192
73cfaa348650
discovery: indices between sample and yesno must match (issue4438)
Mads Kiilerich <madski@unity3d.com>
parents:
23191
diff
changeset
|
137 |
# indices between sample and externalized version must match |
73cfaa348650
discovery: indices between sample and yesno must match (issue4438)
Mads Kiilerich <madski@unity3d.com>
parents:
23191
diff
changeset
|
138 |
sample = list(sample) |
14624
f03c82d1f50a
setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
14206
diff
changeset
|
139 |
if remote.local(): |
f03c82d1f50a
setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
14206
diff
changeset
|
140 |
# stopgap until we have a proper localpeer that supports batch() |
17204
4feb55e6931f
localpeer: return only visible heads and branchmap
Pierre-Yves David <pierre-yves.david@ens-lyon.org>
parents:
17191
diff
changeset
|
141 |
srvheadhashes = remote.heads() |
14624
f03c82d1f50a
setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
14206
diff
changeset
|
142 |
yesno = remote.known(dag.externalizeall(sample)) |
f03c82d1f50a
setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
14206
diff
changeset
|
143 |
elif remote.capable('batch'): |
f03c82d1f50a
setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
14206
diff
changeset
|
144 |
batch = remote.batch() |
f03c82d1f50a
setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
14206
diff
changeset
|
145 |
srvheadhashesref = batch.heads() |
f03c82d1f50a
setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
14206
diff
changeset
|
146 |
yesnoref = batch.known(dag.externalizeall(sample)) |
f03c82d1f50a
setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
14206
diff
changeset
|
147 |
batch.submit() |
f03c82d1f50a
setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
14206
diff
changeset
|
148 |
srvheadhashes = srvheadhashesref.value |
f03c82d1f50a
setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
14206
diff
changeset
|
149 |
yesno = yesnoref.value |
f03c82d1f50a
setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
14206
diff
changeset
|
150 |
else: |
17424
e7cfe3587ea4
fix trivial spelling errors
Mads Kiilerich <mads@kiilerich.com>
parents:
17204
diff
changeset
|
151 |
# compatibility with pre-batch, but post-known remotes during 1.9 |
e7cfe3587ea4
fix trivial spelling errors
Mads Kiilerich <mads@kiilerich.com>
parents:
17204
diff
changeset
|
152 |
# development |
14624
f03c82d1f50a
setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
14206
diff
changeset
|
153 |
srvheadhashes = remote.heads() |
f03c82d1f50a
setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
14206
diff
changeset
|
154 |
sample = [] |
14164
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
155 |
|
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
156 |
if cl.tip() == nullid: |
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
157 |
if srvheadhashes != [nullid]: |
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
158 |
return [nullid], True, srvheadhashes |
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
159 |
return [nullid], False, [] |
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
160 |
|
14206
2bf60f158ecb
setdiscovery: limit lines to 80 characters
Steven Brown <StevenGBrown@gmail.com>
parents:
14164
diff
changeset
|
161 |
# start actual discovery (we note this before the next "if" for |
2bf60f158ecb
setdiscovery: limit lines to 80 characters
Steven Brown <StevenGBrown@gmail.com>
parents:
14164
diff
changeset
|
162 |
# compatibility reasons) |
14164
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
163 |
ui.status(_("searching for changes\n")) |
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
164 |
|
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
165 |
srvheads = dag.internalizeall(srvheadhashes, filterunknown=True) |
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
166 |
if len(srvheads) == len(srvheadhashes): |
14833
308e1b5acc87
discovery: quiet note about heads
Matt Mackall <mpm@selenic.com>
parents:
14624
diff
changeset
|
167 |
ui.debug("all remote heads known locally\n") |
14164
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
168 |
return (srvheadhashes, False, srvheadhashes,) |
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
169 |
|
23191
86c35b7ae300
discovery: limit 'all local heads known remotely' to real 'all' (issue4438)
Mads Kiilerich <madski@unity3d.com>
parents:
23130
diff
changeset
|
170 |
if sample and len(ownheads) <= initialsamplesize and util.all(yesno): |
15497
9bea3aed6ee1
add missing localization markup
Mads Kiilerich <mads@kiilerich.com>
parents:
15063
diff
changeset
|
171 |
ui.note(_("all local heads known remotely\n")) |
14624
f03c82d1f50a
setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
14206
diff
changeset
|
172 |
ownheadhashes = dag.externalizeall(ownheads) |
f03c82d1f50a
setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
14206
diff
changeset
|
173 |
return (ownheadhashes, True, srvheadhashes,) |
f03c82d1f50a
setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
14206
diff
changeset
|
174 |
|
14164
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
175 |
# full blown discovery |
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
176 |
|
16683 | 177 |
# own nodes where I don't know if remote knows them |
178 |
undecided = dag.nodeset() |
|
179 |
# own nodes I know we both know |
|
180 |
common = set() |
|
181 |
# own nodes I know remote lacks |
|
182 |
missing = set() |
|
183 |
||
184 |
# treat remote heads (and maybe own heads) as a first implicit sample |
|
185 |
# response |
|
14164
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
186 |
common.update(dag.ancestorset(srvheads)) |
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
187 |
undecided.difference_update(common) |
14624
f03c82d1f50a
setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
14206
diff
changeset
|
188 |
|
f03c82d1f50a
setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
14206
diff
changeset
|
189 |
full = False |
f03c82d1f50a
setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
14206
diff
changeset
|
190 |
while undecided: |
14164
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
191 |
|
14624
f03c82d1f50a
setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
14206
diff
changeset
|
192 |
if sample: |
f03c82d1f50a
setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
14206
diff
changeset
|
193 |
commoninsample = set(n for i, n in enumerate(sample) if yesno[i]) |
f03c82d1f50a
setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
14206
diff
changeset
|
194 |
common.update(dag.ancestorset(commoninsample, common)) |
14164
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
195 |
|
14624
f03c82d1f50a
setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
14206
diff
changeset
|
196 |
missinginsample = [n for i, n in enumerate(sample) if not yesno[i]] |
f03c82d1f50a
setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
14206
diff
changeset
|
197 |
missing.update(dag.descendantset(missinginsample, missing)) |
14164
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
198 |
|
14624
f03c82d1f50a
setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
14206
diff
changeset
|
199 |
undecided.difference_update(missing) |
f03c82d1f50a
setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
14206
diff
changeset
|
200 |
undecided.difference_update(common) |
14164
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
201 |
|
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
202 |
if not undecided: |
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
203 |
break |
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
204 |
|
14624
f03c82d1f50a
setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
14206
diff
changeset
|
205 |
if full: |
15497
9bea3aed6ee1
add missing localization markup
Mads Kiilerich <mads@kiilerich.com>
parents:
15063
diff
changeset
|
206 |
ui.note(_("sampling from both directions\n")) |
14624
f03c82d1f50a
setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
14206
diff
changeset
|
207 |
sample = _takefullsample(dag, undecided, size=fullsamplesize) |
23130
ced632394371
setdiscovery: limit the size of all sample (issue4411)
Pierre-Yves David <pierre-yves.david@fb.com>
parents:
23084
diff
changeset
|
208 |
targetsize = fullsamplesize |
14624
f03c82d1f50a
setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
14206
diff
changeset
|
209 |
elif common: |
f03c82d1f50a
setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
14206
diff
changeset
|
210 |
# use cheapish initial sample |
f03c82d1f50a
setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
14206
diff
changeset
|
211 |
ui.debug("taking initial sample\n") |
f03c82d1f50a
setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
14206
diff
changeset
|
212 |
sample = _takefullsample(dag, undecided, size=fullsamplesize) |
23130
ced632394371
setdiscovery: limit the size of all sample (issue4411)
Pierre-Yves David <pierre-yves.david@fb.com>
parents:
23084
diff
changeset
|
213 |
targetsize = fullsamplesize |
14624
f03c82d1f50a
setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
14206
diff
changeset
|
214 |
else: |
f03c82d1f50a
setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
14206
diff
changeset
|
215 |
# use even cheaper initial sample |
f03c82d1f50a
setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
14206
diff
changeset
|
216 |
ui.debug("taking quick initial sample\n") |
f03c82d1f50a
setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
14206
diff
changeset
|
217 |
sample = _takequicksample(dag, undecided, size=initialsamplesize, |
f03c82d1f50a
setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
14206
diff
changeset
|
218 |
initial=True) |
23130
ced632394371
setdiscovery: limit the size of all sample (issue4411)
Pierre-Yves David <pierre-yves.david@fb.com>
parents:
23084
diff
changeset
|
219 |
targetsize = initialsamplesize |
ced632394371
setdiscovery: limit the size of all sample (issue4411)
Pierre-Yves David <pierre-yves.david@fb.com>
parents:
23084
diff
changeset
|
220 |
sample = _limitsample(sample, targetsize) |
14164
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
221 |
|
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
222 |
roundtrips += 1 |
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
223 |
ui.progress(_('searching'), roundtrips, unit=_('queries')) |
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
224 |
ui.debug("query %i; still undecided: %i, sample size is: %i\n" |
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
225 |
% (roundtrips, len(undecided), len(sample))) |
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
226 |
# indices between sample and externalized version must match |
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
227 |
sample = list(sample) |
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
228 |
yesno = remote.known(dag.externalizeall(sample)) |
14624
f03c82d1f50a
setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
14206
diff
changeset
|
229 |
full = True |
14164
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
230 |
|
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
231 |
result = dag.headsetofconnecteds(common) |
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
232 |
ui.progress(_('searching'), None) |
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
233 |
ui.debug("%d total queries\n" % roundtrips) |
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
234 |
|
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
235 |
if not result and srvheadhashes != [nullid]: |
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
236 |
if abortwhenunrelated: |
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
237 |
raise util.Abort(_("repository is unrelated")) |
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
238 |
else: |
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
239 |
ui.warn(_("warning: repository is unrelated\n")) |
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
240 |
return (set([nullid]), True, srvheadhashes,) |
cb98fed52495
discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff
changeset
|
241 |
|
14981
192e02680d09
setdiscovery: return anyincoming=False when remote's only head is nullid
Andrew Pritchard <andrewp@fogcreek.com>
parents:
14833
diff
changeset
|
242 |
anyincoming = (srvheadhashes != [nullid]) |
192e02680d09
setdiscovery: return anyincoming=False when remote's only head is nullid
Andrew Pritchard <andrewp@fogcreek.com>
parents:
14833
diff
changeset
|
243 |
return dag.externalizeall(result), anyincoming, srvheadhashes |