annotate mercurial/setdiscovery.py @ 23807:e97e363a7000

setdiscovery: delay sample building calls to gather them in a single place Some of the logic around sample building is duplicated in the sample builders, it would clean up thing to extract it in the top function, but this requires all codes to be in the same place. This changeset mostly exists to make the next one more clear.
author Pierre-Yves David <pierre-yves.david@fb.com>
date Wed, 07 Jan 2015 09:30:06 -0800
parents d6cbbe3baef0
children 07d0f59e0ba7
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
rev   line source
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
1 # setdiscovery.py - improved discovery of common nodeset for mercurial
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
2 #
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
3 # Copyright 2010 Benoit Boissinot <bboissin@gmail.com>
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
4 # and Peter Arrenbrecht <peter@arrenbrecht.ch>
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
5 #
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
6 # This software may be used and distributed according to the terms of the
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
7 # GNU General Public License version 2 or any later version.
20656
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
8 """
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
9 Algorithm works in the following way. You have two repository: local and
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
10 remote. They both contains a DAG of changelists.
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
11
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
12 The goal of the discovery protocol is to find one set of node *common*,
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
13 the set of nodes shared by local and remote.
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
14
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
15 One of the issue with the original protocol was latency, it could
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
16 potentially require lots of roundtrips to discover that the local repo was a
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
17 subset of remote (which is a very common case, you usually have few changes
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
18 compared to upstream, while upstream probably had lots of development).
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
19
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
20 The new protocol only requires one interface for the remote repo: `known()`,
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
21 which given a set of changelists tells you if they are present in the DAG.
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
22
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
23 The algorithm then works as follow:
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
24
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
25 - We will be using three sets, `common`, `missing`, `unknown`. Originally
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
26 all nodes are in `unknown`.
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
27 - Take a sample from `unknown`, call `remote.known(sample)`
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
28 - For each node that remote knows, move it and all its ancestors to `common`
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
29 - For each node that remote doesn't know, move it and all its descendants
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
30 to `missing`
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
31 - Iterate until `unknown` is empty
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
32
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
33 There are a couple optimizations, first is instead of starting with a random
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
34 sample of missing, start by sending all heads, in the case where the local
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
35 repo is a subset, you computed the answer in one round trip.
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
36
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
37 Then you can do something similar to the bisecting strategy used when
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
38 finding faulty changesets. Instead of random samples, you can try picking
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
39 nodes that will maximize the number of nodes that will be
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
40 classified with it (since all ancestors or descendants will be marked as well).
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
41 """
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
42
23343
f8a2647fe020 setdiscovery: avoid a full changelog graph traversal
Siddharth Agarwal <sid0@fb.com>
parents: 23192
diff changeset
43 from node import nullid, nullrev
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
44 from i18n import _
20034
1e5b38a919dd cleanup: move stdlib imports to their own import statement
Augie Fackler <raf@durin42.com>
parents: 17426
diff changeset
45 import random
1e5b38a919dd cleanup: move stdlib imports to their own import statement
Augie Fackler <raf@durin42.com>
parents: 17426
diff changeset
46 import util, dagutil
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
47
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
48 def _updatesample(dag, nodes, sample, always, quicksamplesize=0):
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
49 # if nodes is empty we scan the entire graph
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
50 if nodes:
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
51 heads = dag.headsetofconnecteds(nodes)
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
52 else:
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
53 heads = dag.heads()
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
54 dist = {}
16834
cafd8a8fb713 util: subclass deque for Python 2.4 backwards compatibility
Bryan O'Sullivan <bryano@fb.com>
parents: 16683
diff changeset
55 visit = util.deque(heads)
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
56 seen = set()
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
57 factor = 1
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
58 while visit:
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
59 curr = visit.popleft()
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
60 if curr in seen:
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
61 continue
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
62 d = dist.setdefault(curr, 1)
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
63 if d > factor:
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
64 factor *= 2
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
65 if d == factor:
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
66 if curr not in always: # need this check for the early exit below
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
67 sample.add(curr)
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
68 if quicksamplesize and (len(sample) >= quicksamplesize):
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
69 return
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
70 seen.add(curr)
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
71 for p in dag.parents(curr):
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
72 if not nodes or p in nodes:
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
73 dist.setdefault(p, d + 1)
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
74 visit.append(p)
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
75
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
76 def _setupsample(dag, nodes, size):
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
77 if len(nodes) <= size:
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
78 return set(nodes), None, 0
15063
c20688b7c061 setdiscovery: fix hang when #heads>200 (issue2971)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 14981
diff changeset
79 always = dag.headsetofconnecteds(nodes)
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
80 desiredlen = size - len(always)
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
81 if desiredlen <= 0:
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
82 # This could be bad if there are very many heads, all unknown to the
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
83 # server. We're counting on long request support here.
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
84 return always, None, desiredlen
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
85 return always, set(), desiredlen
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
86
23806
d6cbbe3baef0 setdiscovery: drop unused 'initial' argument for '_takequicksample'
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 23747
diff changeset
87 def _takequicksample(dag, nodes, size):
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
88 always, sample, desiredlen = _setupsample(dag, nodes, size)
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
89 if sample is None:
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
90 return always
23806
d6cbbe3baef0 setdiscovery: drop unused 'initial' argument for '_takequicksample'
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 23747
diff changeset
91 _updatesample(dag, None, sample, always, quicksamplesize=desiredlen)
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
92 sample.update(always)
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
93 return sample
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
94
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
95 def _takefullsample(dag, nodes, size):
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
96 always, sample, desiredlen = _setupsample(dag, nodes, size)
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
97 if sample is None:
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
98 return always
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
99 # update from heads
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
100 _updatesample(dag, nodes, sample, always)
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
101 # update from roots
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
102 _updatesample(dag.inverse(), nodes, sample, always)
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
103 assert sample
23083
ee45f5c2ffcc setdiscovery: extract sample limitation in a `_limitsample` function
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 20656
diff changeset
104 sample = _limitsample(sample, desiredlen)
ee45f5c2ffcc setdiscovery: extract sample limitation in a `_limitsample` function
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 20656
diff changeset
105 if len(sample) < desiredlen:
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
106 more = desiredlen - len(sample)
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
107 sample.update(random.sample(list(nodes - sample - always), more))
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
108 sample.update(always)
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
109 return sample
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
110
23083
ee45f5c2ffcc setdiscovery: extract sample limitation in a `_limitsample` function
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 20656
diff changeset
111 def _limitsample(sample, desiredlen):
ee45f5c2ffcc setdiscovery: extract sample limitation in a `_limitsample` function
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 20656
diff changeset
112 """return a random subset of sample of at most desiredlen item"""
ee45f5c2ffcc setdiscovery: extract sample limitation in a `_limitsample` function
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 20656
diff changeset
113 if len(sample) > desiredlen:
ee45f5c2ffcc setdiscovery: extract sample limitation in a `_limitsample` function
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 20656
diff changeset
114 sample = set(random.sample(sample, desiredlen))
ee45f5c2ffcc setdiscovery: extract sample limitation in a `_limitsample` function
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 20656
diff changeset
115 return sample
ee45f5c2ffcc setdiscovery: extract sample limitation in a `_limitsample` function
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 20656
diff changeset
116
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
117 def findcommonheads(ui, local, remote,
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
118 initialsamplesize=100,
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
119 fullsamplesize=200,
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
120 abortwhenunrelated=True):
14206
2bf60f158ecb setdiscovery: limit lines to 80 characters
Steven Brown <StevenGBrown@gmail.com>
parents: 14164
diff changeset
121 '''Return a tuple (common, anyincoming, remoteheads) used to identify
2bf60f158ecb setdiscovery: limit lines to 80 characters
Steven Brown <StevenGBrown@gmail.com>
parents: 14164
diff changeset
122 missing nodes from or in remote.
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
123 '''
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
124 roundtrips = 0
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
125 cl = local.changelog
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
126 dag = dagutil.revlogdag(cl)
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
127
14624
f03c82d1f50a setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 14206
diff changeset
128 # early exit if we know all the specified remote heads already
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
129 ui.debug("query 1; heads\n")
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
130 roundtrips += 1
14624
f03c82d1f50a setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 14206
diff changeset
131 ownheads = dag.heads()
23084
3ef893520a85 setdiscovery: limit the size of the initial sample (issue4411)
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 23083
diff changeset
132 sample = _limitsample(ownheads, initialsamplesize)
23192
73cfaa348650 discovery: indices between sample and yesno must match (issue4438)
Mads Kiilerich <madski@unity3d.com>
parents: 23191
diff changeset
133 # indices between sample and externalized version must match
73cfaa348650 discovery: indices between sample and yesno must match (issue4438)
Mads Kiilerich <madski@unity3d.com>
parents: 23191
diff changeset
134 sample = list(sample)
14624
f03c82d1f50a setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 14206
diff changeset
135 if remote.local():
f03c82d1f50a setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 14206
diff changeset
136 # stopgap until we have a proper localpeer that supports batch()
17204
4feb55e6931f localpeer: return only visible heads and branchmap
Pierre-Yves David <pierre-yves.david@ens-lyon.org>
parents: 17191
diff changeset
137 srvheadhashes = remote.heads()
14624
f03c82d1f50a setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 14206
diff changeset
138 yesno = remote.known(dag.externalizeall(sample))
f03c82d1f50a setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 14206
diff changeset
139 elif remote.capable('batch'):
f03c82d1f50a setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 14206
diff changeset
140 batch = remote.batch()
f03c82d1f50a setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 14206
diff changeset
141 srvheadhashesref = batch.heads()
f03c82d1f50a setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 14206
diff changeset
142 yesnoref = batch.known(dag.externalizeall(sample))
f03c82d1f50a setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 14206
diff changeset
143 batch.submit()
f03c82d1f50a setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 14206
diff changeset
144 srvheadhashes = srvheadhashesref.value
f03c82d1f50a setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 14206
diff changeset
145 yesno = yesnoref.value
f03c82d1f50a setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 14206
diff changeset
146 else:
17424
e7cfe3587ea4 fix trivial spelling errors
Mads Kiilerich <mads@kiilerich.com>
parents: 17204
diff changeset
147 # compatibility with pre-batch, but post-known remotes during 1.9
e7cfe3587ea4 fix trivial spelling errors
Mads Kiilerich <mads@kiilerich.com>
parents: 17204
diff changeset
148 # development
14624
f03c82d1f50a setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 14206
diff changeset
149 srvheadhashes = remote.heads()
f03c82d1f50a setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 14206
diff changeset
150 sample = []
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
151
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
152 if cl.tip() == nullid:
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
153 if srvheadhashes != [nullid]:
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
154 return [nullid], True, srvheadhashes
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
155 return [nullid], False, []
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
156
14206
2bf60f158ecb setdiscovery: limit lines to 80 characters
Steven Brown <StevenGBrown@gmail.com>
parents: 14164
diff changeset
157 # start actual discovery (we note this before the next "if" for
2bf60f158ecb setdiscovery: limit lines to 80 characters
Steven Brown <StevenGBrown@gmail.com>
parents: 14164
diff changeset
158 # compatibility reasons)
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
159 ui.status(_("searching for changes\n"))
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
160
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
161 srvheads = dag.internalizeall(srvheadhashes, filterunknown=True)
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
162 if len(srvheads) == len(srvheadhashes):
14833
308e1b5acc87 discovery: quiet note about heads
Matt Mackall <mpm@selenic.com>
parents: 14624
diff changeset
163 ui.debug("all remote heads known locally\n")
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
164 return (srvheadhashes, False, srvheadhashes,)
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
165
23191
86c35b7ae300 discovery: limit 'all local heads known remotely' to real 'all' (issue4438)
Mads Kiilerich <madski@unity3d.com>
parents: 23130
diff changeset
166 if sample and len(ownheads) <= initialsamplesize and util.all(yesno):
15497
9bea3aed6ee1 add missing localization markup
Mads Kiilerich <mads@kiilerich.com>
parents: 15063
diff changeset
167 ui.note(_("all local heads known remotely\n"))
14624
f03c82d1f50a setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 14206
diff changeset
168 ownheadhashes = dag.externalizeall(ownheads)
f03c82d1f50a setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 14206
diff changeset
169 return (ownheadhashes, True, srvheadhashes,)
f03c82d1f50a setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 14206
diff changeset
170
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
171 # full blown discovery
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
172
16683
525fdb738975 cleanup: eradicate long lines
Brodie Rao <brodie@sf.io>
parents: 15713
diff changeset
173 # own nodes I know we both know
23343
f8a2647fe020 setdiscovery: avoid a full changelog graph traversal
Siddharth Agarwal <sid0@fb.com>
parents: 23192
diff changeset
174 # treat remote heads (and maybe own heads) as a first implicit sample
f8a2647fe020 setdiscovery: avoid a full changelog graph traversal
Siddharth Agarwal <sid0@fb.com>
parents: 23192
diff changeset
175 # response
f8a2647fe020 setdiscovery: avoid a full changelog graph traversal
Siddharth Agarwal <sid0@fb.com>
parents: 23192
diff changeset
176 common = cl.incrementalmissingrevs(srvheads)
f8a2647fe020 setdiscovery: avoid a full changelog graph traversal
Siddharth Agarwal <sid0@fb.com>
parents: 23192
diff changeset
177 commoninsample = set(n for i, n in enumerate(sample) if yesno[i])
f8a2647fe020 setdiscovery: avoid a full changelog graph traversal
Siddharth Agarwal <sid0@fb.com>
parents: 23192
diff changeset
178 common.addbases(commoninsample)
23746
4ef2f2fa8b8b setdiscovery: drop shadowed 'undecided' assignment
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 23343
diff changeset
179 # own nodes where I don't know if remote knows them
23343
f8a2647fe020 setdiscovery: avoid a full changelog graph traversal
Siddharth Agarwal <sid0@fb.com>
parents: 23192
diff changeset
180 undecided = set(common.missingancestors(ownheads))
16683
525fdb738975 cleanup: eradicate long lines
Brodie Rao <brodie@sf.io>
parents: 15713
diff changeset
181 # own nodes I know remote lacks
525fdb738975 cleanup: eradicate long lines
Brodie Rao <brodie@sf.io>
parents: 15713
diff changeset
182 missing = set()
525fdb738975 cleanup: eradicate long lines
Brodie Rao <brodie@sf.io>
parents: 15713
diff changeset
183
14624
f03c82d1f50a setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 14206
diff changeset
184 full = False
f03c82d1f50a setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 14206
diff changeset
185 while undecided:
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
186
14624
f03c82d1f50a setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 14206
diff changeset
187 if sample:
f03c82d1f50a setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 14206
diff changeset
188 missinginsample = [n for i, n in enumerate(sample) if not yesno[i]]
f03c82d1f50a setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 14206
diff changeset
189 missing.update(dag.descendantset(missinginsample, missing))
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
190
14624
f03c82d1f50a setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 14206
diff changeset
191 undecided.difference_update(missing)
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
192
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
193 if not undecided:
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
194 break
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
195
23747
f82173a90c2c setdiscovery: factorize similar sampling code
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 23746
diff changeset
196 if full or common.hasbases():
f82173a90c2c setdiscovery: factorize similar sampling code
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 23746
diff changeset
197 if full:
f82173a90c2c setdiscovery: factorize similar sampling code
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 23746
diff changeset
198 ui.note(_("sampling from both directions\n"))
f82173a90c2c setdiscovery: factorize similar sampling code
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 23746
diff changeset
199 else:
f82173a90c2c setdiscovery: factorize similar sampling code
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 23746
diff changeset
200 ui.debug("taking initial sample\n")
23807
e97e363a7000 setdiscovery: delay sample building calls to gather them in a single place
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 23806
diff changeset
201 samplefunc = _takefullsample
23130
ced632394371 setdiscovery: limit the size of all sample (issue4411)
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 23084
diff changeset
202 targetsize = fullsamplesize
14624
f03c82d1f50a setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 14206
diff changeset
203 else:
f03c82d1f50a setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 14206
diff changeset
204 # use even cheaper initial sample
f03c82d1f50a setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 14206
diff changeset
205 ui.debug("taking quick initial sample\n")
23807
e97e363a7000 setdiscovery: delay sample building calls to gather them in a single place
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 23806
diff changeset
206 samplefunc = _takequicksample
23130
ced632394371 setdiscovery: limit the size of all sample (issue4411)
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 23084
diff changeset
207 targetsize = initialsamplesize
23807
e97e363a7000 setdiscovery: delay sample building calls to gather them in a single place
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 23806
diff changeset
208 sample = samplefunc(dag, undecided, targetsize)
23130
ced632394371 setdiscovery: limit the size of all sample (issue4411)
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 23084
diff changeset
209 sample = _limitsample(sample, targetsize)
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
210
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
211 roundtrips += 1
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
212 ui.progress(_('searching'), roundtrips, unit=_('queries'))
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
213 ui.debug("query %i; still undecided: %i, sample size is: %i\n"
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
214 % (roundtrips, len(undecided), len(sample)))
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
215 # indices between sample and externalized version must match
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
216 sample = list(sample)
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
217 yesno = remote.known(dag.externalizeall(sample))
14624
f03c82d1f50a setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 14206
diff changeset
218 full = True
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
219
23343
f8a2647fe020 setdiscovery: avoid a full changelog graph traversal
Siddharth Agarwal <sid0@fb.com>
parents: 23192
diff changeset
220 if sample:
f8a2647fe020 setdiscovery: avoid a full changelog graph traversal
Siddharth Agarwal <sid0@fb.com>
parents: 23192
diff changeset
221 commoninsample = set(n for i, n in enumerate(sample) if yesno[i])
f8a2647fe020 setdiscovery: avoid a full changelog graph traversal
Siddharth Agarwal <sid0@fb.com>
parents: 23192
diff changeset
222 common.addbases(commoninsample)
f8a2647fe020 setdiscovery: avoid a full changelog graph traversal
Siddharth Agarwal <sid0@fb.com>
parents: 23192
diff changeset
223 common.removeancestorsfrom(undecided)
f8a2647fe020 setdiscovery: avoid a full changelog graph traversal
Siddharth Agarwal <sid0@fb.com>
parents: 23192
diff changeset
224
f8a2647fe020 setdiscovery: avoid a full changelog graph traversal
Siddharth Agarwal <sid0@fb.com>
parents: 23192
diff changeset
225 # heads(common) == heads(common.bases) since common represents common.bases
f8a2647fe020 setdiscovery: avoid a full changelog graph traversal
Siddharth Agarwal <sid0@fb.com>
parents: 23192
diff changeset
226 # and all its ancestors
f8a2647fe020 setdiscovery: avoid a full changelog graph traversal
Siddharth Agarwal <sid0@fb.com>
parents: 23192
diff changeset
227 result = dag.headsetofconnecteds(common.bases)
f8a2647fe020 setdiscovery: avoid a full changelog graph traversal
Siddharth Agarwal <sid0@fb.com>
parents: 23192
diff changeset
228 # common.bases can include nullrev, but our contract requires us to not
f8a2647fe020 setdiscovery: avoid a full changelog graph traversal
Siddharth Agarwal <sid0@fb.com>
parents: 23192
diff changeset
229 # return any heads in that case, so discard that
f8a2647fe020 setdiscovery: avoid a full changelog graph traversal
Siddharth Agarwal <sid0@fb.com>
parents: 23192
diff changeset
230 result.discard(nullrev)
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
231 ui.progress(_('searching'), None)
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
232 ui.debug("%d total queries\n" % roundtrips)
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
233
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
234 if not result and srvheadhashes != [nullid]:
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
235 if abortwhenunrelated:
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
236 raise util.Abort(_("repository is unrelated"))
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
237 else:
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
238 ui.warn(_("warning: repository is unrelated\n"))
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
239 return (set([nullid]), True, srvheadhashes,)
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
240
14981
192e02680d09 setdiscovery: return anyincoming=False when remote's only head is nullid
Andrew Pritchard <andrewp@fogcreek.com>
parents: 14833
diff changeset
241 anyincoming = (srvheadhashes != [nullid])
192e02680d09 setdiscovery: return anyincoming=False when remote's only head is nullid
Andrew Pritchard <andrewp@fogcreek.com>
parents: 14833
diff changeset
242 return dag.externalizeall(result), anyincoming, srvheadhashes