mercurial/setdiscovery.py
author Pierre-Yves David <pierre-yves.david@octobus.net>
Tue, 21 May 2019 13:08:22 +0200
changeset 42376 dbd0fcca6dfc
parent 42212 6631f3e89b6f
child 42421 5b34972a0094
permissions -rw-r--r--
discovery: slowly increase sampling size Some pathological discovery runs can requires many roundtrip. When this happens things can get very slow. To make the algorithm more resilience again such pathological case. We slowly increase the sample size with each roundtrip (+5%). This will have a negligible impact on "normal" discovery with few roundtrips, but a large positive impact of case with many roundtrips. Asking more question per roundtrip helps to reduce the undecided set faster. Instead of reducing the undecided set a linear speed (in the worst case), we reduce it as a guaranteed (small) exponential rate. The data below show this slow ramp up in sample size: round trip | 1 | 5 | 10 | 20 | 50 | 100 | 130 | sample size | 200 | 254 | 321 | 517 | 2 199 | 25 123 | 108 549 | covered nodes | 200 | 1 357 | 2 821 | 7 031 | 42 658 | 524 530 | 2 276 755 | To be a bit more concrete, lets take a very pathological case as an example. We are doing discovery from a copy of Mozilla-try to a more recent version of mozilla-unified. Mozilla-unified heads are unknown to the mozilla-try repo and there are over 1 million "missing" changesets. (the discovery is "local" to avoid network interference) Without this change, the discovery: - last 1858 seconds (31 minutes), - does 1700 round trip, - asking about 340 000 nodes. With this change, the discovery: - last 218 seconds (3 minutes, 38 seconds a -88% improvement), - does 94 round trip (-94%), - asking about 344 211 nodes (+1%). Of course, this is an extreme case (and 3 minutes is still slow). However this give a good example of how this sample size increase act as a safety net catching any bad situations. We could image a steeper increase than 5%. For example 10% would give the following number: round trip | 1 | 5 | 10 | 20 | 50 | 75 | 100 | sample size | 200 | 321 | 514 | 1 326 | 23 060 | 249 812 | 2 706 594 | covered nodes | 200 | 1 541 | 3 690 | 12 671 | 251 871 | 2 746 254 | 29 770 966 | In parallel, it is useful to understand these pathological cases and improve them. However the current change provides a general purpose safety net to smooth the impact of pathological cases. To avoid issue with older http server, the increase in sample size only occurs if the protocol has not limit on command argument size.
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
     1
# setdiscovery.py - improved discovery of common nodeset for mercurial
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
     2
#
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
     3
# Copyright 2010 Benoit Boissinot <bboissin@gmail.com>
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
     4
# and Peter Arrenbrecht <peter@arrenbrecht.ch>
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
     5
#
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
     6
# This software may be used and distributed according to the terms of the
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
     7
# GNU General Public License version 2 or any later version.
20656
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
     8
"""
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
     9
Algorithm works in the following way. You have two repository: local and
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    10
remote. They both contains a DAG of changelists.
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    11
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    12
The goal of the discovery protocol is to find one set of node *common*,
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    13
the set of nodes shared by local and remote.
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    14
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    15
One of the issue with the original protocol was latency, it could
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    16
potentially require lots of roundtrips to discover that the local repo was a
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    17
subset of remote (which is a very common case, you usually have few changes
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    18
compared to upstream, while upstream probably had lots of development).
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    19
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    20
The new protocol only requires one interface for the remote repo: `known()`,
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    21
which given a set of changelists tells you if they are present in the DAG.
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    22
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    23
The algorithm then works as follow:
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    24
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    25
 - We will be using three sets, `common`, `missing`, `unknown`. Originally
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    26
 all nodes are in `unknown`.
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    27
 - Take a sample from `unknown`, call `remote.known(sample)`
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    28
   - For each node that remote knows, move it and all its ancestors to `common`
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    29
   - For each node that remote doesn't know, move it and all its descendants
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    30
   to `missing`
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    31
 - Iterate until `unknown` is empty
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    32
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    33
There are a couple optimizations, first is instead of starting with a random
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    34
sample of missing, start by sending all heads, in the case where the local
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    35
repo is a subset, you computed the answer in one round trip.
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    36
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    37
Then you can do something similar to the bisecting strategy used when
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    38
finding faulty changesets. Instead of random samples, you can try picking
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    39
nodes that will maximize the number of nodes that will be
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    40
classified with it (since all ancestors or descendants will be marked as well).
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    41
"""
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
    42
25973
fb5664eb8414 setdiscovery: use absolute_import
Gregory Szorc <gregory.szorc@gmail.com>
parents: 25914
diff changeset
    43
from __future__ import absolute_import
fb5664eb8414 setdiscovery: use absolute_import
Gregory Szorc <gregory.szorc@gmail.com>
parents: 25914
diff changeset
    44
25113
0ca8410ea345 util: drop alias for collections.deque
Martin von Zweigbergk <martinvonz@google.com>
parents: 23817
diff changeset
    45
import collections
20034
1e5b38a919dd cleanup: move stdlib imports to their own import statement
Augie Fackler <raf@durin42.com>
parents: 17426
diff changeset
    46
import random
25973
fb5664eb8414 setdiscovery: use absolute_import
Gregory Szorc <gregory.szorc@gmail.com>
parents: 25914
diff changeset
    47
fb5664eb8414 setdiscovery: use absolute_import
Gregory Szorc <gregory.szorc@gmail.com>
parents: 25914
diff changeset
    48
from .i18n import _
fb5664eb8414 setdiscovery: use absolute_import
Gregory Szorc <gregory.szorc@gmail.com>
parents: 25914
diff changeset
    49
from .node import (
fb5664eb8414 setdiscovery: use absolute_import
Gregory Szorc <gregory.szorc@gmail.com>
parents: 25914
diff changeset
    50
    nullid,
fb5664eb8414 setdiscovery: use absolute_import
Gregory Szorc <gregory.szorc@gmail.com>
parents: 25914
diff changeset
    51
    nullrev,
fb5664eb8414 setdiscovery: use absolute_import
Gregory Szorc <gregory.szorc@gmail.com>
parents: 25914
diff changeset
    52
)
fb5664eb8414 setdiscovery: use absolute_import
Gregory Szorc <gregory.szorc@gmail.com>
parents: 25914
diff changeset
    53
from . import (
26587
56b2bcea2529 error: get Abort from 'error' instead of 'util'
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 25973
diff changeset
    54
    error,
32732
43bda143e3b2 discovery: include timing in the debug output
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 32331
diff changeset
    55
    util,
25973
fb5664eb8414 setdiscovery: use absolute_import
Gregory Szorc <gregory.szorc@gmail.com>
parents: 25914
diff changeset
    56
)
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
    57
39207
71d83b315778 setdiscovery: don't use dagutil for parent resolution
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39206
diff changeset
    58
def _updatesample(revs, heads, sample, parentfn, quicksamplesize=0):
23809
9ca2eb881b53 setdiscovery: document the '_updatesample' function
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 23808
diff changeset
    59
    """update an existing sample to match the expected size
9ca2eb881b53 setdiscovery: document the '_updatesample' function
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 23808
diff changeset
    60
39201
2d218db7389b setdiscovery: reflect use of revs instead of nodes
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39199
diff changeset
    61
    The sample is updated with revs exponentially distant from each head of the
2d218db7389b setdiscovery: reflect use of revs instead of nodes
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39199
diff changeset
    62
    <revs> set. (H~1, H~2, H~4, H~8, etc).
23809
9ca2eb881b53 setdiscovery: document the '_updatesample' function
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 23808
diff changeset
    63
9ca2eb881b53 setdiscovery: document the '_updatesample' function
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 23808
diff changeset
    64
    If a target size is specified, the sampling will stop once this size is
39201
2d218db7389b setdiscovery: reflect use of revs instead of nodes
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39199
diff changeset
    65
    reached. Otherwise sampling will happen until roots of the <revs> set are
23809
9ca2eb881b53 setdiscovery: document the '_updatesample' function
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 23808
diff changeset
    66
    reached.
9ca2eb881b53 setdiscovery: document the '_updatesample' function
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 23808
diff changeset
    67
39201
2d218db7389b setdiscovery: reflect use of revs instead of nodes
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39199
diff changeset
    68
    :revs:  set of revs we want to discover (if None, assume the whole dag)
39203
754f389b87f2 setdiscovery: pass heads into _updatesample()
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39202
diff changeset
    69
    :heads: set of DAG head revs
23809
9ca2eb881b53 setdiscovery: document the '_updatesample' function
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 23808
diff changeset
    70
    :sample: a sample to update
39207
71d83b315778 setdiscovery: don't use dagutil for parent resolution
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39206
diff changeset
    71
    :parentfn: a callable to resolve parents for a revision
23809
9ca2eb881b53 setdiscovery: document the '_updatesample' function
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 23808
diff changeset
    72
    :quicksamplesize: optional target size of the sample"""
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
    73
    dist = {}
25113
0ca8410ea345 util: drop alias for collections.deque
Martin von Zweigbergk <martinvonz@google.com>
parents: 23817
diff changeset
    74
    visit = collections.deque(heads)
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
    75
    seen = set()
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
    76
    factor = 1
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
    77
    while visit:
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
    78
        curr = visit.popleft()
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
    79
        if curr in seen:
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
    80
            continue
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
    81
        d = dist.setdefault(curr, 1)
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
    82
        if d > factor:
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
    83
            factor *= 2
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
    84
        if d == factor:
23814
6a5877a73141 setdiscovery: drop the 'always' argument to '_updatesample'
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 23813
diff changeset
    85
            sample.add(curr)
6a5877a73141 setdiscovery: drop the 'always' argument to '_updatesample'
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 23813
diff changeset
    86
            if quicksamplesize and (len(sample) >= quicksamplesize):
6a5877a73141 setdiscovery: drop the 'always' argument to '_updatesample'
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 23813
diff changeset
    87
                return
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
    88
        seen.add(curr)
39207
71d83b315778 setdiscovery: don't use dagutil for parent resolution
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39206
diff changeset
    89
71d83b315778 setdiscovery: don't use dagutil for parent resolution
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39206
diff changeset
    90
        for p in parentfn(curr):
71d83b315778 setdiscovery: don't use dagutil for parent resolution
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39206
diff changeset
    91
            if p != nullrev and (not revs or p in revs):
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
    92
                dist.setdefault(p, d + 1)
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
    93
                visit.append(p)
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
    94
23083
ee45f5c2ffcc setdiscovery: extract sample limitation in a `_limitsample` function
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 20656
diff changeset
    95
def _limitsample(sample, desiredlen):
ee45f5c2ffcc setdiscovery: extract sample limitation in a `_limitsample` function
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 20656
diff changeset
    96
    """return a random subset of sample of at most desiredlen item"""
ee45f5c2ffcc setdiscovery: extract sample limitation in a `_limitsample` function
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 20656
diff changeset
    97
    if len(sample) > desiredlen:
ee45f5c2ffcc setdiscovery: extract sample limitation in a `_limitsample` function
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 20656
diff changeset
    98
        sample = set(random.sample(sample, desiredlen))
ee45f5c2ffcc setdiscovery: extract sample limitation in a `_limitsample` function
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 20656
diff changeset
    99
    return sample
ee45f5c2ffcc setdiscovery: extract sample limitation in a `_limitsample` function
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 20656
diff changeset
   100
41115
3023bc4b3da0 discovery: introduce a partialdiscovery object
Boris Feld <boris.feld@octobus.net>
parents: 41114
diff changeset
   101
class partialdiscovery(object):
3023bc4b3da0 discovery: introduce a partialdiscovery object
Boris Feld <boris.feld@octobus.net>
parents: 41114
diff changeset
   102
    """an object representing ongoing discovery
3023bc4b3da0 discovery: introduce a partialdiscovery object
Boris Feld <boris.feld@octobus.net>
parents: 41114
diff changeset
   103
3023bc4b3da0 discovery: introduce a partialdiscovery object
Boris Feld <boris.feld@octobus.net>
parents: 41114
diff changeset
   104
    Feed with data from the remote repository, this object keep track of the
3023bc4b3da0 discovery: introduce a partialdiscovery object
Boris Feld <boris.feld@octobus.net>
parents: 41114
diff changeset
   105
    current set of changeset in various states:
3023bc4b3da0 discovery: introduce a partialdiscovery object
Boris Feld <boris.feld@octobus.net>
parents: 41114
diff changeset
   106
41172
3dcc96582627 discovery: improve partial discovery documentation
Boris Feld <boris.feld@octobus.net>
parents: 41171
diff changeset
   107
    - common:    revs also known remotely
3dcc96582627 discovery: improve partial discovery documentation
Boris Feld <boris.feld@octobus.net>
parents: 41171
diff changeset
   108
    - undecided: revs we don't have information on yet
3dcc96582627 discovery: improve partial discovery documentation
Boris Feld <boris.feld@octobus.net>
parents: 41171
diff changeset
   109
    - missing:   revs missing remotely
3dcc96582627 discovery: improve partial discovery documentation
Boris Feld <boris.feld@octobus.net>
parents: 41171
diff changeset
   110
    (all tracked revisions are known locally)
41115
3023bc4b3da0 discovery: introduce a partialdiscovery object
Boris Feld <boris.feld@octobus.net>
parents: 41114
diff changeset
   111
    """
3023bc4b3da0 discovery: introduce a partialdiscovery object
Boris Feld <boris.feld@octobus.net>
parents: 41114
diff changeset
   112
41167
870a89c6909d discovery: move undecided set on the partialdiscovery
Boris Feld <boris.feld@octobus.net>
parents: 41162
diff changeset
   113
    def __init__(self, repo, targetheads):
41115
3023bc4b3da0 discovery: introduce a partialdiscovery object
Boris Feld <boris.feld@octobus.net>
parents: 41114
diff changeset
   114
        self._repo = repo
41167
870a89c6909d discovery: move undecided set on the partialdiscovery
Boris Feld <boris.feld@octobus.net>
parents: 41162
diff changeset
   115
        self._targetheads = targetheads
41115
3023bc4b3da0 discovery: introduce a partialdiscovery object
Boris Feld <boris.feld@octobus.net>
parents: 41114
diff changeset
   116
        self._common = repo.changelog.incrementalmissingrevs()
41167
870a89c6909d discovery: move undecided set on the partialdiscovery
Boris Feld <boris.feld@octobus.net>
parents: 41162
diff changeset
   117
        self._undecided = None
41170
96201120cdf5 discovery: move missing tracking inside the partialdiscovery object
Boris Feld <boris.feld@octobus.net>
parents: 41169
diff changeset
   118
        self.missing = set()
41890
5baf06d2bb41 discovery: cache the children mapping used during each discovery
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 41889
diff changeset
   119
        self._childrenmap = None
41115
3023bc4b3da0 discovery: introduce a partialdiscovery object
Boris Feld <boris.feld@octobus.net>
parents: 41114
diff changeset
   120
3023bc4b3da0 discovery: introduce a partialdiscovery object
Boris Feld <boris.feld@octobus.net>
parents: 41114
diff changeset
   121
    def addcommons(self, commons):
42212
6631f3e89b6f setdiscovery: fix a few typos
Joerg Sonnenberger <joerg@bec.de>
parents: 42201
diff changeset
   122
        """register nodes known as common"""
41115
3023bc4b3da0 discovery: introduce a partialdiscovery object
Boris Feld <boris.feld@octobus.net>
parents: 41114
diff changeset
   123
        self._common.addbases(commons)
41303
76873548b051 partialdiscovery: avoid `undecided` related computation sooner than necessary
Boris Feld <boris.feld@octobus.net>
parents: 41280
diff changeset
   124
        if self._undecided is not None:
76873548b051 partialdiscovery: avoid `undecided` related computation sooner than necessary
Boris Feld <boris.feld@octobus.net>
parents: 41280
diff changeset
   125
            self._common.removeancestorsfrom(self._undecided)
41115
3023bc4b3da0 discovery: introduce a partialdiscovery object
Boris Feld <boris.feld@octobus.net>
parents: 41114
diff changeset
   126
41170
96201120cdf5 discovery: move missing tracking inside the partialdiscovery object
Boris Feld <boris.feld@octobus.net>
parents: 41169
diff changeset
   127
    def addmissings(self, missings):
42212
6631f3e89b6f setdiscovery: fix a few typos
Joerg Sonnenberger <joerg@bec.de>
parents: 42201
diff changeset
   128
        """register some nodes as missing"""
41280
f4277a35c42c discovery: compute newly discovered missing in a more efficient way
Boris Feld <boris.feld@octobus.net>
parents: 41245
diff changeset
   129
        newmissing = self._repo.revs('%ld::%ld', missings, self.undecided)
f4277a35c42c discovery: compute newly discovered missing in a more efficient way
Boris Feld <boris.feld@octobus.net>
parents: 41245
diff changeset
   130
        if newmissing:
f4277a35c42c discovery: compute newly discovered missing in a more efficient way
Boris Feld <boris.feld@octobus.net>
parents: 41245
diff changeset
   131
            self.missing.update(newmissing)
f4277a35c42c discovery: compute newly discovered missing in a more efficient way
Boris Feld <boris.feld@octobus.net>
parents: 41245
diff changeset
   132
            self.undecided.difference_update(newmissing)
41170
96201120cdf5 discovery: move missing tracking inside the partialdiscovery object
Boris Feld <boris.feld@octobus.net>
parents: 41169
diff changeset
   133
41171
f46ffd23dae8 discovery: add a simple `addinfo` method
Boris Feld <boris.feld@octobus.net>
parents: 41170
diff changeset
   134
    def addinfo(self, sample):
f46ffd23dae8 discovery: add a simple `addinfo` method
Boris Feld <boris.feld@octobus.net>
parents: 41170
diff changeset
   135
        """consume an iterable of (rev, known) tuples"""
f46ffd23dae8 discovery: add a simple `addinfo` method
Boris Feld <boris.feld@octobus.net>
parents: 41170
diff changeset
   136
        common = set()
f46ffd23dae8 discovery: add a simple `addinfo` method
Boris Feld <boris.feld@octobus.net>
parents: 41170
diff changeset
   137
        missing = set()
f46ffd23dae8 discovery: add a simple `addinfo` method
Boris Feld <boris.feld@octobus.net>
parents: 41170
diff changeset
   138
        for rev, known in sample:
f46ffd23dae8 discovery: add a simple `addinfo` method
Boris Feld <boris.feld@octobus.net>
parents: 41170
diff changeset
   139
            if known:
f46ffd23dae8 discovery: add a simple `addinfo` method
Boris Feld <boris.feld@octobus.net>
parents: 41170
diff changeset
   140
                common.add(rev)
f46ffd23dae8 discovery: add a simple `addinfo` method
Boris Feld <boris.feld@octobus.net>
parents: 41170
diff changeset
   141
            else:
f46ffd23dae8 discovery: add a simple `addinfo` method
Boris Feld <boris.feld@octobus.net>
parents: 41170
diff changeset
   142
                missing.add(rev)
f46ffd23dae8 discovery: add a simple `addinfo` method
Boris Feld <boris.feld@octobus.net>
parents: 41170
diff changeset
   143
        if common:
f46ffd23dae8 discovery: add a simple `addinfo` method
Boris Feld <boris.feld@octobus.net>
parents: 41170
diff changeset
   144
            self.addcommons(common)
f46ffd23dae8 discovery: add a simple `addinfo` method
Boris Feld <boris.feld@octobus.net>
parents: 41170
diff changeset
   145
        if missing:
f46ffd23dae8 discovery: add a simple `addinfo` method
Boris Feld <boris.feld@octobus.net>
parents: 41170
diff changeset
   146
            self.addmissings(missing)
f46ffd23dae8 discovery: add a simple `addinfo` method
Boris Feld <boris.feld@octobus.net>
parents: 41170
diff changeset
   147
41115
3023bc4b3da0 discovery: introduce a partialdiscovery object
Boris Feld <boris.feld@octobus.net>
parents: 41114
diff changeset
   148
    def hasinfo(self):
3023bc4b3da0 discovery: introduce a partialdiscovery object
Boris Feld <boris.feld@octobus.net>
parents: 41114
diff changeset
   149
        """return True is we have any clue about the remote state"""
3023bc4b3da0 discovery: introduce a partialdiscovery object
Boris Feld <boris.feld@octobus.net>
parents: 41114
diff changeset
   150
        return self._common.hasbases()
3023bc4b3da0 discovery: introduce a partialdiscovery object
Boris Feld <boris.feld@octobus.net>
parents: 41114
diff changeset
   151
41169
3ce5b96482c6 discovery: add a `iscomplete` method to the `partialdiscovery` object
Boris Feld <boris.feld@octobus.net>
parents: 41168
diff changeset
   152
    def iscomplete(self):
3ce5b96482c6 discovery: add a `iscomplete` method to the `partialdiscovery` object
Boris Feld <boris.feld@octobus.net>
parents: 41168
diff changeset
   153
        """True if all the necessary data have been gathered"""
3ce5b96482c6 discovery: add a `iscomplete` method to the `partialdiscovery` object
Boris Feld <boris.feld@octobus.net>
parents: 41168
diff changeset
   154
        return self._undecided is not None and not self._undecided
3ce5b96482c6 discovery: add a `iscomplete` method to the `partialdiscovery` object
Boris Feld <boris.feld@octobus.net>
parents: 41168
diff changeset
   155
41167
870a89c6909d discovery: move undecided set on the partialdiscovery
Boris Feld <boris.feld@octobus.net>
parents: 41162
diff changeset
   156
    @property
870a89c6909d discovery: move undecided set on the partialdiscovery
Boris Feld <boris.feld@octobus.net>
parents: 41162
diff changeset
   157
    def undecided(self):
870a89c6909d discovery: move undecided set on the partialdiscovery
Boris Feld <boris.feld@octobus.net>
parents: 41162
diff changeset
   158
        if self._undecided is not None:
870a89c6909d discovery: move undecided set on the partialdiscovery
Boris Feld <boris.feld@octobus.net>
parents: 41162
diff changeset
   159
            return self._undecided
870a89c6909d discovery: move undecided set on the partialdiscovery
Boris Feld <boris.feld@octobus.net>
parents: 41162
diff changeset
   160
        self._undecided = set(self._common.missingancestors(self._targetheads))
870a89c6909d discovery: move undecided set on the partialdiscovery
Boris Feld <boris.feld@octobus.net>
parents: 41162
diff changeset
   161
        return self._undecided
870a89c6909d discovery: move undecided set on the partialdiscovery
Boris Feld <boris.feld@octobus.net>
parents: 41162
diff changeset
   162
42103
362726923ba3 discovery: stop direct use of attribute of partialdiscovery
Georges Racinet <georges.racinet@octobus.net>
parents: 41981
diff changeset
   163
    def stats(self):
362726923ba3 discovery: stop direct use of attribute of partialdiscovery
Georges Racinet <georges.racinet@octobus.net>
parents: 41981
diff changeset
   164
        return {
362726923ba3 discovery: stop direct use of attribute of partialdiscovery
Georges Racinet <georges.racinet@octobus.net>
parents: 41981
diff changeset
   165
            'undecided': len(self.undecided),
362726923ba3 discovery: stop direct use of attribute of partialdiscovery
Georges Racinet <georges.racinet@octobus.net>
parents: 41981
diff changeset
   166
        }
362726923ba3 discovery: stop direct use of attribute of partialdiscovery
Georges Racinet <georges.racinet@octobus.net>
parents: 41981
diff changeset
   167
41116
9815d3337f9b discovery: move common heads computation inside partialdiscovery object
Boris Feld <boris.feld@octobus.net>
parents: 41115
diff changeset
   168
    def commonheads(self):
9815d3337f9b discovery: move common heads computation inside partialdiscovery object
Boris Feld <boris.feld@octobus.net>
parents: 41115
diff changeset
   169
        """the heads of the known common set"""
9815d3337f9b discovery: move common heads computation inside partialdiscovery object
Boris Feld <boris.feld@octobus.net>
parents: 41115
diff changeset
   170
        # heads(common) == heads(common.bases) since common represents
9815d3337f9b discovery: move common heads computation inside partialdiscovery object
Boris Feld <boris.feld@octobus.net>
parents: 41115
diff changeset
   171
        # common.bases and all its ancestors
41245
2a8782cc2e16 discovery: using the new basesheads()
Georges Racinet <georges.racinet@octobus.net>
parents: 41172
diff changeset
   172
        return self._common.basesheads()
41115
3023bc4b3da0 discovery: introduce a partialdiscovery object
Boris Feld <boris.feld@octobus.net>
parents: 41114
diff changeset
   173
41886
e514799e4e07 discovery: use a lower level but faster way to retrieve parents
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 41885
diff changeset
   174
    def _parentsgetter(self):
e514799e4e07 discovery: use a lower level but faster way to retrieve parents
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 41885
diff changeset
   175
        getrev = self._repo.changelog.index.__getitem__
e514799e4e07 discovery: use a lower level but faster way to retrieve parents
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 41885
diff changeset
   176
        def getparents(r):
41981
0d467e4de4ae discovery: fix embarrassing typo in slice definition
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 41894
diff changeset
   177
            return getrev(r)[5:7]
41886
e514799e4e07 discovery: use a lower level but faster way to retrieve parents
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 41885
diff changeset
   178
        return getparents
e514799e4e07 discovery: use a lower level but faster way to retrieve parents
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 41885
diff changeset
   179
41891
a05f0bbefdd9 discovery: explicitly use `undecided` for the children mapping
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 41890
diff changeset
   180
    def _childrengetter(self):
41889
d5e6ae6e8012 discovery: move children computation in its own method
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 41888
diff changeset
   181
41890
5baf06d2bb41 discovery: cache the children mapping used during each discovery
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 41889
diff changeset
   182
        if self._childrenmap is not None:
41894
3ba9ca537f57 discovery: clarify why the caching of children is valid
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 41891
diff changeset
   183
            # During discovery, the `undecided` set keep shrinking.
3ba9ca537f57 discovery: clarify why the caching of children is valid
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 41891
diff changeset
   184
            # Therefore, the map computed for an iteration N will be
3ba9ca537f57 discovery: clarify why the caching of children is valid
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 41891
diff changeset
   185
            # valid for iteration N+1. Instead of computing the same
3ba9ca537f57 discovery: clarify why the caching of children is valid
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 41891
diff changeset
   186
            # data over and over we cached it the first time.
41890
5baf06d2bb41 discovery: cache the children mapping used during each discovery
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 41889
diff changeset
   187
            return self._childrenmap.__getitem__
5baf06d2bb41 discovery: cache the children mapping used during each discovery
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 41889
diff changeset
   188
41889
d5e6ae6e8012 discovery: move children computation in its own method
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 41888
diff changeset
   189
        # _updatesample() essentially does interaction over revisions to look
d5e6ae6e8012 discovery: move children computation in its own method
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 41888
diff changeset
   190
        # up their children. This lookup is expensive and doing it in a loop is
d5e6ae6e8012 discovery: move children computation in its own method
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 41888
diff changeset
   191
        # quadratic. We precompute the children for all relevant revisions and
d5e6ae6e8012 discovery: move children computation in its own method
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 41888
diff changeset
   192
        # make the lookup in _updatesample() a simple dict lookup.
41890
5baf06d2bb41 discovery: cache the children mapping used during each discovery
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 41889
diff changeset
   193
        self._childrenmap = children = {}
41889
d5e6ae6e8012 discovery: move children computation in its own method
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 41888
diff changeset
   194
d5e6ae6e8012 discovery: move children computation in its own method
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 41888
diff changeset
   195
        parentrevs = self._parentsgetter()
41891
a05f0bbefdd9 discovery: explicitly use `undecided` for the children mapping
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 41890
diff changeset
   196
        revs = self.undecided
41889
d5e6ae6e8012 discovery: move children computation in its own method
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 41888
diff changeset
   197
d5e6ae6e8012 discovery: move children computation in its own method
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 41888
diff changeset
   198
        for rev in sorted(revs):
d5e6ae6e8012 discovery: move children computation in its own method
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 41888
diff changeset
   199
            # Always ensure revision has an entry so we don't need to worry
d5e6ae6e8012 discovery: move children computation in its own method
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 41888
diff changeset
   200
            # about missing keys.
d5e6ae6e8012 discovery: move children computation in its own method
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 41888
diff changeset
   201
            children[rev] = []
d5e6ae6e8012 discovery: move children computation in its own method
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 41888
diff changeset
   202
            for prev in parentrevs(rev):
d5e6ae6e8012 discovery: move children computation in its own method
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 41888
diff changeset
   203
                if prev == nullrev:
d5e6ae6e8012 discovery: move children computation in its own method
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 41888
diff changeset
   204
                    continue
d5e6ae6e8012 discovery: move children computation in its own method
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 41888
diff changeset
   205
                c = children.get(prev)
d5e6ae6e8012 discovery: move children computation in its own method
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 41888
diff changeset
   206
                if c is not None:
d5e6ae6e8012 discovery: move children computation in its own method
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 41888
diff changeset
   207
                    c.append(rev)
d5e6ae6e8012 discovery: move children computation in its own method
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 41888
diff changeset
   208
        return children.__getitem__
d5e6ae6e8012 discovery: move children computation in its own method
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 41888
diff changeset
   209
41884
e5ece0f46b40 discovery: moved sampling functions inside discovery object
Georges Racinet <georges.racinet@octobus.net>
parents: 41883
diff changeset
   210
    def takequicksample(self, headrevs, size):
e5ece0f46b40 discovery: moved sampling functions inside discovery object
Georges Racinet <georges.racinet@octobus.net>
parents: 41883
diff changeset
   211
        """takes a quick sample of size <size>
e5ece0f46b40 discovery: moved sampling functions inside discovery object
Georges Racinet <georges.racinet@octobus.net>
parents: 41883
diff changeset
   212
e5ece0f46b40 discovery: moved sampling functions inside discovery object
Georges Racinet <georges.racinet@octobus.net>
parents: 41883
diff changeset
   213
        It is meant for initial sampling and focuses on querying heads and close
e5ece0f46b40 discovery: moved sampling functions inside discovery object
Georges Racinet <georges.racinet@octobus.net>
parents: 41883
diff changeset
   214
        ancestors of heads.
e5ece0f46b40 discovery: moved sampling functions inside discovery object
Georges Racinet <georges.racinet@octobus.net>
parents: 41883
diff changeset
   215
e5ece0f46b40 discovery: moved sampling functions inside discovery object
Georges Racinet <georges.racinet@octobus.net>
parents: 41883
diff changeset
   216
        :headrevs: set of head revisions in local DAG to consider
e5ece0f46b40 discovery: moved sampling functions inside discovery object
Georges Racinet <georges.racinet@octobus.net>
parents: 41883
diff changeset
   217
        :size: the maximum size of the sample"""
e5ece0f46b40 discovery: moved sampling functions inside discovery object
Georges Racinet <georges.racinet@octobus.net>
parents: 41883
diff changeset
   218
        revs = self.undecided
e5ece0f46b40 discovery: moved sampling functions inside discovery object
Georges Racinet <georges.racinet@octobus.net>
parents: 41883
diff changeset
   219
        if len(revs) <= size:
e5ece0f46b40 discovery: moved sampling functions inside discovery object
Georges Racinet <georges.racinet@octobus.net>
parents: 41883
diff changeset
   220
            return list(revs)
e5ece0f46b40 discovery: moved sampling functions inside discovery object
Georges Racinet <georges.racinet@octobus.net>
parents: 41883
diff changeset
   221
        sample = set(self._repo.revs('heads(%ld)', revs))
e5ece0f46b40 discovery: moved sampling functions inside discovery object
Georges Racinet <georges.racinet@octobus.net>
parents: 41883
diff changeset
   222
e5ece0f46b40 discovery: moved sampling functions inside discovery object
Georges Racinet <georges.racinet@octobus.net>
parents: 41883
diff changeset
   223
        if len(sample) >= size:
e5ece0f46b40 discovery: moved sampling functions inside discovery object
Georges Racinet <georges.racinet@octobus.net>
parents: 41883
diff changeset
   224
            return _limitsample(sample, size)
e5ece0f46b40 discovery: moved sampling functions inside discovery object
Georges Racinet <georges.racinet@octobus.net>
parents: 41883
diff changeset
   225
41886
e514799e4e07 discovery: use a lower level but faster way to retrieve parents
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 41885
diff changeset
   226
        _updatesample(None, headrevs, sample, self._parentsgetter(),
41884
e5ece0f46b40 discovery: moved sampling functions inside discovery object
Georges Racinet <georges.racinet@octobus.net>
parents: 41883
diff changeset
   227
                      quicksamplesize=size)
e5ece0f46b40 discovery: moved sampling functions inside discovery object
Georges Racinet <georges.racinet@octobus.net>
parents: 41883
diff changeset
   228
        return sample
e5ece0f46b40 discovery: moved sampling functions inside discovery object
Georges Racinet <georges.racinet@octobus.net>
parents: 41883
diff changeset
   229
e5ece0f46b40 discovery: moved sampling functions inside discovery object
Georges Racinet <georges.racinet@octobus.net>
parents: 41883
diff changeset
   230
    def takefullsample(self, headrevs, size):
e5ece0f46b40 discovery: moved sampling functions inside discovery object
Georges Racinet <georges.racinet@octobus.net>
parents: 41883
diff changeset
   231
        revs = self.undecided
e5ece0f46b40 discovery: moved sampling functions inside discovery object
Georges Racinet <georges.racinet@octobus.net>
parents: 41883
diff changeset
   232
        if len(revs) <= size:
e5ece0f46b40 discovery: moved sampling functions inside discovery object
Georges Racinet <georges.racinet@octobus.net>
parents: 41883
diff changeset
   233
            return list(revs)
e5ece0f46b40 discovery: moved sampling functions inside discovery object
Georges Racinet <georges.racinet@octobus.net>
parents: 41883
diff changeset
   234
        repo = self._repo
e5ece0f46b40 discovery: moved sampling functions inside discovery object
Georges Racinet <georges.racinet@octobus.net>
parents: 41883
diff changeset
   235
        sample = set(repo.revs('heads(%ld)', revs))
41886
e514799e4e07 discovery: use a lower level but faster way to retrieve parents
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 41885
diff changeset
   236
        parentrevs = self._parentsgetter()
41884
e5ece0f46b40 discovery: moved sampling functions inside discovery object
Georges Racinet <georges.racinet@octobus.net>
parents: 41883
diff changeset
   237
e5ece0f46b40 discovery: moved sampling functions inside discovery object
Georges Racinet <georges.racinet@octobus.net>
parents: 41883
diff changeset
   238
        # update from heads
41885
55919b96c02a discovery: avoid computing identical sets of heads twice
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 41884
diff changeset
   239
        revsheads = sample.copy()
41886
e514799e4e07 discovery: use a lower level but faster way to retrieve parents
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 41885
diff changeset
   240
        _updatesample(revs, revsheads, sample, parentrevs)
41884
e5ece0f46b40 discovery: moved sampling functions inside discovery object
Georges Racinet <georges.racinet@octobus.net>
parents: 41883
diff changeset
   241
e5ece0f46b40 discovery: moved sampling functions inside discovery object
Georges Racinet <georges.racinet@octobus.net>
parents: 41883
diff changeset
   242
        # update from roots
e5ece0f46b40 discovery: moved sampling functions inside discovery object
Georges Racinet <georges.racinet@octobus.net>
parents: 41883
diff changeset
   243
        revsroots = set(repo.revs('roots(%ld)', revs))
e5ece0f46b40 discovery: moved sampling functions inside discovery object
Georges Racinet <georges.racinet@octobus.net>
parents: 41883
diff changeset
   244
41891
a05f0bbefdd9 discovery: explicitly use `undecided` for the children mapping
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 41890
diff changeset
   245
        childrenrevs = self._childrengetter()
41884
e5ece0f46b40 discovery: moved sampling functions inside discovery object
Georges Racinet <georges.racinet@octobus.net>
parents: 41883
diff changeset
   246
41889
d5e6ae6e8012 discovery: move children computation in its own method
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 41888
diff changeset
   247
        _updatesample(revs, revsroots, sample, childrenrevs)
41884
e5ece0f46b40 discovery: moved sampling functions inside discovery object
Georges Racinet <georges.racinet@octobus.net>
parents: 41883
diff changeset
   248
        assert sample
e5ece0f46b40 discovery: moved sampling functions inside discovery object
Georges Racinet <georges.racinet@octobus.net>
parents: 41883
diff changeset
   249
        sample = _limitsample(sample, size)
e5ece0f46b40 discovery: moved sampling functions inside discovery object
Georges Racinet <georges.racinet@octobus.net>
parents: 41883
diff changeset
   250
        if len(sample) < size:
e5ece0f46b40 discovery: moved sampling functions inside discovery object
Georges Racinet <georges.racinet@octobus.net>
parents: 41883
diff changeset
   251
            more = size - len(sample)
e5ece0f46b40 discovery: moved sampling functions inside discovery object
Georges Racinet <georges.racinet@octobus.net>
parents: 41883
diff changeset
   252
            sample.update(random.sample(list(revs - sample), more))
e5ece0f46b40 discovery: moved sampling functions inside discovery object
Georges Racinet <georges.racinet@octobus.net>
parents: 41883
diff changeset
   253
        return sample
e5ece0f46b40 discovery: moved sampling functions inside discovery object
Georges Racinet <georges.racinet@octobus.net>
parents: 41883
diff changeset
   254
36738
613954a17a25 setdiscovery: back out changeset 5cfdf6137af8 (issue5809)
Martin von Zweigbergk <martinvonz@google.com>
parents: 35889
diff changeset
   255
def findcommonheads(ui, local, remote,
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   256
                    initialsamplesize=100,
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   257
                    fullsamplesize=200,
35313
f77121b6bf1b setdiscover: allow to ignore part of the local graph
Boris Feld <boris.feld@octobus.net>
parents: 32788
diff changeset
   258
                    abortwhenunrelated=True,
42376
dbd0fcca6dfc discovery: slowly increase sampling size
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 42212
diff changeset
   259
                    ancestorsof=None,
dbd0fcca6dfc discovery: slowly increase sampling size
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 42212
diff changeset
   260
                    samplegrowth=1.05):
14206
2bf60f158ecb setdiscovery: limit lines to 80 characters
Steven Brown <StevenGBrown@gmail.com>
parents: 14164
diff changeset
   261
    '''Return a tuple (common, anyincoming, remoteheads) used to identify
2bf60f158ecb setdiscovery: limit lines to 80 characters
Steven Brown <StevenGBrown@gmail.com>
parents: 14164
diff changeset
   262
    missing nodes from or in remote.
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   263
    '''
32732
43bda143e3b2 discovery: include timing in the debug output
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 32331
diff changeset
   264
    start = util.timer()
43bda143e3b2 discovery: include timing in the debug output
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 32331
diff changeset
   265
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   266
    roundtrips = 0
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   267
    cl = local.changelog
39192
5b32b3c618b2 setdiscovery: don't use dagutil for rev -> node conversions
Gregory Szorc <gregory.szorc@gmail.com>
parents: 38379
diff changeset
   268
    clnode = cl.node
39194
858a12846f4f setdiscovery: don't use dagutil for node -> rev conversion
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39192
diff changeset
   269
    clrev = cl.rev
39192
5b32b3c618b2 setdiscovery: don't use dagutil for rev -> node conversions
Gregory Szorc <gregory.szorc@gmail.com>
parents: 38379
diff changeset
   270
35313
f77121b6bf1b setdiscover: allow to ignore part of the local graph
Boris Feld <boris.feld@octobus.net>
parents: 32788
diff changeset
   271
    if ancestorsof is not None:
39198
860e83cd97de setdiscovery: don't use dagutil to compute heads
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39194
diff changeset
   272
        ownheads = [clrev(n) for n in ancestorsof]
860e83cd97de setdiscovery: don't use dagutil to compute heads
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39194
diff changeset
   273
    else:
860e83cd97de setdiscovery: don't use dagutil to compute heads
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39194
diff changeset
   274
        ownheads = [rev for rev in cl.headrevs() if rev != nullrev]
860e83cd97de setdiscovery: don't use dagutil to compute heads
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39194
diff changeset
   275
14624
f03c82d1f50a setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 14206
diff changeset
   276
    # early exit if we know all the specified remote heads already
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   277
    ui.debug("query 1; heads\n")
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   278
    roundtrips += 1
42201
4f9a89837f07 setdiscovery: stop limiting the number of local head we initially send
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 42103
diff changeset
   279
    # We also ask remote about all the local heads. That set can be arbitrarily
4f9a89837f07 setdiscovery: stop limiting the number of local head we initially send
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 42103
diff changeset
   280
    # large, so we used to limit it size to `initialsamplesize`. We no longer
4f9a89837f07 setdiscovery: stop limiting the number of local head we initially send
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 42103
diff changeset
   281
    # do as it proved counter productive. The skipped heads could lead to a
4f9a89837f07 setdiscovery: stop limiting the number of local head we initially send
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 42103
diff changeset
   282
    # large "undecided" set, slower to be clarified than if we asked the
4f9a89837f07 setdiscovery: stop limiting the number of local head we initially send
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 42103
diff changeset
   283
    # question for all heads right away.
4f9a89837f07 setdiscovery: stop limiting the number of local head we initially send
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 42103
diff changeset
   284
    #
4f9a89837f07 setdiscovery: stop limiting the number of local head we initially send
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 42103
diff changeset
   285
    # We are already fetching all server heads using the `heads` commands,
4f9a89837f07 setdiscovery: stop limiting the number of local head we initially send
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 42103
diff changeset
   286
    # sending a equivalent number of heads the other way should not have a
4f9a89837f07 setdiscovery: stop limiting the number of local head we initially send
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 42103
diff changeset
   287
    # significant impact.  In addition, it is very likely that we are going to
4f9a89837f07 setdiscovery: stop limiting the number of local head we initially send
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 42103
diff changeset
   288
    # have to issue "known" request for an equivalent amount of revisions in
4f9a89837f07 setdiscovery: stop limiting the number of local head we initially send
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 42103
diff changeset
   289
    # order to decide if theses heads are common or missing.
4f9a89837f07 setdiscovery: stop limiting the number of local head we initially send
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 42103
diff changeset
   290
    #
4f9a89837f07 setdiscovery: stop limiting the number of local head we initially send
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 42103
diff changeset
   291
    # find a detailled analysis below.
4f9a89837f07 setdiscovery: stop limiting the number of local head we initially send
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 42103
diff changeset
   292
    #
4f9a89837f07 setdiscovery: stop limiting the number of local head we initially send
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 42103
diff changeset
   293
    # Case A: local and server both has few heads
4f9a89837f07 setdiscovery: stop limiting the number of local head we initially send
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 42103
diff changeset
   294
    #
4f9a89837f07 setdiscovery: stop limiting the number of local head we initially send
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 42103
diff changeset
   295
    #     Ownheads is below initialsamplesize, limit would not have any effect.
4f9a89837f07 setdiscovery: stop limiting the number of local head we initially send
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 42103
diff changeset
   296
    #
4f9a89837f07 setdiscovery: stop limiting the number of local head we initially send
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 42103
diff changeset
   297
    # Case B: local has few heads and server has many
4f9a89837f07 setdiscovery: stop limiting the number of local head we initially send
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 42103
diff changeset
   298
    #
4f9a89837f07 setdiscovery: stop limiting the number of local head we initially send
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 42103
diff changeset
   299
    #     Ownheads is below initialsamplesize, limit would not have any effect.
4f9a89837f07 setdiscovery: stop limiting the number of local head we initially send
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 42103
diff changeset
   300
    #
4f9a89837f07 setdiscovery: stop limiting the number of local head we initially send
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 42103
diff changeset
   301
    # Case C: local and server both has many heads
4f9a89837f07 setdiscovery: stop limiting the number of local head we initially send
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 42103
diff changeset
   302
    #
4f9a89837f07 setdiscovery: stop limiting the number of local head we initially send
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 42103
diff changeset
   303
    #     We now transfert some more data, but not significantly more than is
4f9a89837f07 setdiscovery: stop limiting the number of local head we initially send
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 42103
diff changeset
   304
    #     already transfered to carry the server heads.
4f9a89837f07 setdiscovery: stop limiting the number of local head we initially send
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 42103
diff changeset
   305
    #
4f9a89837f07 setdiscovery: stop limiting the number of local head we initially send
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 42103
diff changeset
   306
    # Case D: local has many heads, server has few
4f9a89837f07 setdiscovery: stop limiting the number of local head we initially send
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 42103
diff changeset
   307
    #
4f9a89837f07 setdiscovery: stop limiting the number of local head we initially send
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 42103
diff changeset
   308
    #   D.1 local heads are mostly known remotely
4f9a89837f07 setdiscovery: stop limiting the number of local head we initially send
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 42103
diff changeset
   309
    #
4f9a89837f07 setdiscovery: stop limiting the number of local head we initially send
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 42103
diff changeset
   310
    #     All the known head will have be part of a `known` request at some
4f9a89837f07 setdiscovery: stop limiting the number of local head we initially send
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 42103
diff changeset
   311
    #     point for the discovery to finish. Sending them all earlier is
4f9a89837f07 setdiscovery: stop limiting the number of local head we initially send
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 42103
diff changeset
   312
    #     actually helping.
4f9a89837f07 setdiscovery: stop limiting the number of local head we initially send
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 42103
diff changeset
   313
    #
4f9a89837f07 setdiscovery: stop limiting the number of local head we initially send
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 42103
diff changeset
   314
    #     (This case is fairly unlikely, it requires the numerous heads to all
4f9a89837f07 setdiscovery: stop limiting the number of local head we initially send
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 42103
diff changeset
   315
    #     be merged server side in only a few heads)
4f9a89837f07 setdiscovery: stop limiting the number of local head we initially send
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 42103
diff changeset
   316
    #
4f9a89837f07 setdiscovery: stop limiting the number of local head we initially send
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 42103
diff changeset
   317
    #   D.2 local heads are mostly missing remotely
4f9a89837f07 setdiscovery: stop limiting the number of local head we initially send
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 42103
diff changeset
   318
    #
4f9a89837f07 setdiscovery: stop limiting the number of local head we initially send
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 42103
diff changeset
   319
    #     To determine that the heads are missing, we'll have to issue `known`
4f9a89837f07 setdiscovery: stop limiting the number of local head we initially send
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 42103
diff changeset
   320
    #     request for them or one of their ancestors. This amount of `known`
4f9a89837f07 setdiscovery: stop limiting the number of local head we initially send
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 42103
diff changeset
   321
    #     request will likely be in the same order of magnitude than the amount
4f9a89837f07 setdiscovery: stop limiting the number of local head we initially send
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 42103
diff changeset
   322
    #     of local heads.
4f9a89837f07 setdiscovery: stop limiting the number of local head we initially send
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 42103
diff changeset
   323
    #
4f9a89837f07 setdiscovery: stop limiting the number of local head we initially send
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 42103
diff changeset
   324
    #     The only case where we can be more efficient using `known` request on
4f9a89837f07 setdiscovery: stop limiting the number of local head we initially send
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 42103
diff changeset
   325
    #     ancestors are case were all the "missing" local heads are based on a
4f9a89837f07 setdiscovery: stop limiting the number of local head we initially send
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 42103
diff changeset
   326
    #     few changeset, also "missing".  This means we would have a "complex"
4f9a89837f07 setdiscovery: stop limiting the number of local head we initially send
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 42103
diff changeset
   327
    #     graph (with many heads) attached to, but very independant to a the
4f9a89837f07 setdiscovery: stop limiting the number of local head we initially send
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 42103
diff changeset
   328
    #     "simple" graph on the server. This is a fairly usual case and have
4f9a89837f07 setdiscovery: stop limiting the number of local head we initially send
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 42103
diff changeset
   329
    #     not been met in the wild so far.
4f9a89837f07 setdiscovery: stop limiting the number of local head we initially send
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 42103
diff changeset
   330
    if remote.limitedarguments:
4f9a89837f07 setdiscovery: stop limiting the number of local head we initially send
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 42103
diff changeset
   331
        sample = _limitsample(ownheads, initialsamplesize)
4f9a89837f07 setdiscovery: stop limiting the number of local head we initially send
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 42103
diff changeset
   332
        # indices between sample and externalized version must match
4f9a89837f07 setdiscovery: stop limiting the number of local head we initially send
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 42103
diff changeset
   333
        sample = list(sample)
4f9a89837f07 setdiscovery: stop limiting the number of local head we initially send
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 42103
diff changeset
   334
    else:
4f9a89837f07 setdiscovery: stop limiting the number of local head we initially send
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 42103
diff changeset
   335
        sample = ownheads
37631
2f626233859b wireproto: implement batching on peer executor interface
Gregory Szorc <gregory.szorc@gmail.com>
parents: 37630
diff changeset
   336
2f626233859b wireproto: implement batching on peer executor interface
Gregory Szorc <gregory.szorc@gmail.com>
parents: 37630
diff changeset
   337
    with remote.commandexecutor() as e:
2f626233859b wireproto: implement batching on peer executor interface
Gregory Szorc <gregory.szorc@gmail.com>
parents: 37630
diff changeset
   338
        fheads = e.callcommand('heads', {})
2f626233859b wireproto: implement batching on peer executor interface
Gregory Szorc <gregory.szorc@gmail.com>
parents: 37630
diff changeset
   339
        fknown = e.callcommand('known', {
39192
5b32b3c618b2 setdiscovery: don't use dagutil for rev -> node conversions
Gregory Szorc <gregory.szorc@gmail.com>
parents: 38379
diff changeset
   340
            'nodes': [clnode(r) for r in sample],
37631
2f626233859b wireproto: implement batching on peer executor interface
Gregory Szorc <gregory.szorc@gmail.com>
parents: 37630
diff changeset
   341
        })
2f626233859b wireproto: implement batching on peer executor interface
Gregory Szorc <gregory.szorc@gmail.com>
parents: 37630
diff changeset
   342
2f626233859b wireproto: implement batching on peer executor interface
Gregory Szorc <gregory.szorc@gmail.com>
parents: 37630
diff changeset
   343
    srvheadhashes, yesno = fheads.result(), fknown.result()
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   344
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   345
    if cl.tip() == nullid:
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   346
        if srvheadhashes != [nullid]:
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   347
            return [nullid], True, srvheadhashes
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   348
        return [nullid], False, []
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   349
14206
2bf60f158ecb setdiscovery: limit lines to 80 characters
Steven Brown <StevenGBrown@gmail.com>
parents: 14164
diff changeset
   350
    # start actual discovery (we note this before the next "if" for
2bf60f158ecb setdiscovery: limit lines to 80 characters
Steven Brown <StevenGBrown@gmail.com>
parents: 14164
diff changeset
   351
    # compatibility reasons)
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   352
    ui.status(_("searching for changes\n"))
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   353
41883
82884bbf8d2b discovery: rename `srvheads` to `knownsrvheads`
Georges Racinet <georges.racinet@octobus.net>
parents: 41303
diff changeset
   354
    knownsrvheads = []  # revnos of remote heads that are known locally
39194
858a12846f4f setdiscovery: don't use dagutil for node -> rev conversion
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39192
diff changeset
   355
    for node in srvheadhashes:
858a12846f4f setdiscovery: don't use dagutil for node -> rev conversion
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39192
diff changeset
   356
        if node == nullid:
858a12846f4f setdiscovery: don't use dagutil for node -> rev conversion
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39192
diff changeset
   357
            continue
858a12846f4f setdiscovery: don't use dagutil for node -> rev conversion
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39192
diff changeset
   358
858a12846f4f setdiscovery: don't use dagutil for node -> rev conversion
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39192
diff changeset
   359
        try:
41883
82884bbf8d2b discovery: rename `srvheads` to `knownsrvheads`
Georges Racinet <georges.racinet@octobus.net>
parents: 41303
diff changeset
   360
            knownsrvheads.append(clrev(node))
39194
858a12846f4f setdiscovery: don't use dagutil for node -> rev conversion
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39192
diff changeset
   361
        # Catches unknown and filtered nodes.
858a12846f4f setdiscovery: don't use dagutil for node -> rev conversion
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39192
diff changeset
   362
        except error.LookupError:
858a12846f4f setdiscovery: don't use dagutil for node -> rev conversion
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39192
diff changeset
   363
            continue
858a12846f4f setdiscovery: don't use dagutil for node -> rev conversion
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39192
diff changeset
   364
41883
82884bbf8d2b discovery: rename `srvheads` to `knownsrvheads`
Georges Racinet <georges.racinet@octobus.net>
parents: 41303
diff changeset
   365
    if len(knownsrvheads) == len(srvheadhashes):
14833
308e1b5acc87 discovery: quiet note about heads
Matt Mackall <mpm@selenic.com>
parents: 14624
diff changeset
   366
        ui.debug("all remote heads known locally\n")
39192
5b32b3c618b2 setdiscovery: don't use dagutil for rev -> node conversions
Gregory Szorc <gregory.szorc@gmail.com>
parents: 38379
diff changeset
   367
        return srvheadhashes, False, srvheadhashes
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   368
36739
bf485b70d0ae setdiscovery: remove initialsamplesize from a condition
Martin von Zweigbergk <martinvonz@google.com>
parents: 36738
diff changeset
   369
    if len(sample) == len(ownheads) and all(yesno):
15497
9bea3aed6ee1 add missing localization markup
Mads Kiilerich <mads@kiilerich.com>
parents: 15063
diff changeset
   370
        ui.note(_("all local heads known remotely\n"))
39192
5b32b3c618b2 setdiscovery: don't use dagutil for rev -> node conversions
Gregory Szorc <gregory.szorc@gmail.com>
parents: 38379
diff changeset
   371
        ownheadhashes = [clnode(r) for r in ownheads]
5b32b3c618b2 setdiscovery: don't use dagutil for rev -> node conversions
Gregory Szorc <gregory.szorc@gmail.com>
parents: 38379
diff changeset
   372
        return ownheadhashes, True, srvheadhashes
14624
f03c82d1f50a setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 14206
diff changeset
   373
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   374
    # full blown discovery
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   375
41167
870a89c6909d discovery: move undecided set on the partialdiscovery
Boris Feld <boris.feld@octobus.net>
parents: 41162
diff changeset
   376
    disco = partialdiscovery(local, ownheads)
23343
f8a2647fe020 setdiscovery: avoid a full changelog graph traversal
Siddharth Agarwal <sid0@fb.com>
parents: 23192
diff changeset
   377
    # treat remote heads (and maybe own heads) as a first implicit sample
f8a2647fe020 setdiscovery: avoid a full changelog graph traversal
Siddharth Agarwal <sid0@fb.com>
parents: 23192
diff changeset
   378
    # response
41883
82884bbf8d2b discovery: rename `srvheads` to `knownsrvheads`
Georges Racinet <georges.racinet@octobus.net>
parents: 41303
diff changeset
   379
    disco.addcommons(knownsrvheads)
41171
f46ffd23dae8 discovery: add a simple `addinfo` method
Boris Feld <boris.feld@octobus.net>
parents: 41170
diff changeset
   380
    disco.addinfo(zip(sample, yesno))
16683
525fdb738975 cleanup: eradicate long lines
Brodie Rao <brodie@sf.io>
parents: 15713
diff changeset
   381
14624
f03c82d1f50a setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 14206
diff changeset
   382
    full = False
38356
9e70690a21ac setdiscovery: use progress helper
Martin von Zweigbergk <martinvonz@google.com>
parents: 37631
diff changeset
   383
    progress = ui.makeprogress(_('searching'), unit=_('queries'))
41169
3ce5b96482c6 discovery: add a `iscomplete` method to the `partialdiscovery` object
Boris Feld <boris.feld@octobus.net>
parents: 41168
diff changeset
   384
    while not disco.iscomplete():
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   385
41115
3023bc4b3da0 discovery: introduce a partialdiscovery object
Boris Feld <boris.feld@octobus.net>
parents: 41114
diff changeset
   386
        if full or disco.hasinfo():
23747
f82173a90c2c setdiscovery: factorize similar sampling code
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 23746
diff changeset
   387
            if full:
f82173a90c2c setdiscovery: factorize similar sampling code
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 23746
diff changeset
   388
                ui.note(_("sampling from both directions\n"))
f82173a90c2c setdiscovery: factorize similar sampling code
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 23746
diff changeset
   389
            else:
f82173a90c2c setdiscovery: factorize similar sampling code
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 23746
diff changeset
   390
                ui.debug("taking initial sample\n")
41884
e5ece0f46b40 discovery: moved sampling functions inside discovery object
Georges Racinet <georges.racinet@octobus.net>
parents: 41883
diff changeset
   391
            samplefunc = disco.takefullsample
23130
ced632394371 setdiscovery: limit the size of all sample (issue4411)
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 23084
diff changeset
   392
            targetsize = fullsamplesize
42376
dbd0fcca6dfc discovery: slowly increase sampling size
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 42212
diff changeset
   393
            if not remote.limitedarguments:
dbd0fcca6dfc discovery: slowly increase sampling size
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 42212
diff changeset
   394
                fullsamplesize = int(fullsamplesize * samplegrowth)
14624
f03c82d1f50a setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 14206
diff changeset
   395
        else:
f03c82d1f50a setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 14206
diff changeset
   396
            # use even cheaper initial sample
f03c82d1f50a setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 14206
diff changeset
   397
            ui.debug("taking quick initial sample\n")
41884
e5ece0f46b40 discovery: moved sampling functions inside discovery object
Georges Racinet <georges.racinet@octobus.net>
parents: 41883
diff changeset
   398
            samplefunc = disco.takequicksample
23130
ced632394371 setdiscovery: limit the size of all sample (issue4411)
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 23084
diff changeset
   399
            targetsize = initialsamplesize
41884
e5ece0f46b40 discovery: moved sampling functions inside discovery object
Georges Racinet <georges.racinet@octobus.net>
parents: 41883
diff changeset
   400
        sample = samplefunc(ownheads, targetsize)
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   401
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   402
        roundtrips += 1
38356
9e70690a21ac setdiscovery: use progress helper
Martin von Zweigbergk <martinvonz@google.com>
parents: 37631
diff changeset
   403
        progress.update(roundtrips)
42103
362726923ba3 discovery: stop direct use of attribute of partialdiscovery
Georges Racinet <georges.racinet@octobus.net>
parents: 41981
diff changeset
   404
        stats = disco.stats()
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   405
        ui.debug("query %i; still undecided: %i, sample size is: %i\n"
42103
362726923ba3 discovery: stop direct use of attribute of partialdiscovery
Georges Racinet <georges.racinet@octobus.net>
parents: 41981
diff changeset
   406
                 % (roundtrips, stats['undecided'], len(sample)))
362726923ba3 discovery: stop direct use of attribute of partialdiscovery
Georges Racinet <georges.racinet@octobus.net>
parents: 41981
diff changeset
   407
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   408
        # indices between sample and externalized version must match
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   409
        sample = list(sample)
37630
e1b32dc4646c wireproto: implement command executor interface for version 1 peers
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36741
diff changeset
   410
e1b32dc4646c wireproto: implement command executor interface for version 1 peers
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36741
diff changeset
   411
        with remote.commandexecutor() as e:
e1b32dc4646c wireproto: implement command executor interface for version 1 peers
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36741
diff changeset
   412
            yesno = e.callcommand('known', {
39192
5b32b3c618b2 setdiscovery: don't use dagutil for rev -> node conversions
Gregory Szorc <gregory.szorc@gmail.com>
parents: 38379
diff changeset
   413
                'nodes': [clnode(r) for r in sample],
37630
e1b32dc4646c wireproto: implement command executor interface for version 1 peers
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36741
diff changeset
   414
            }).result()
e1b32dc4646c wireproto: implement command executor interface for version 1 peers
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36741
diff changeset
   415
14624
f03c82d1f50a setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 14206
diff changeset
   416
        full = True
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   417
41171
f46ffd23dae8 discovery: add a simple `addinfo` method
Boris Feld <boris.feld@octobus.net>
parents: 41170
diff changeset
   418
        disco.addinfo(zip(sample, yesno))
23343
f8a2647fe020 setdiscovery: avoid a full changelog graph traversal
Siddharth Agarwal <sid0@fb.com>
parents: 23192
diff changeset
   419
41116
9815d3337f9b discovery: move common heads computation inside partialdiscovery object
Boris Feld <boris.feld@octobus.net>
parents: 41115
diff changeset
   420
    result = disco.commonheads()
32732
43bda143e3b2 discovery: include timing in the debug output
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 32331
diff changeset
   421
    elapsed = util.timer() - start
38379
ef692614e601 progress: hide update(None) in a new complete() method
Martin von Zweigbergk <martinvonz@google.com>
parents: 38356
diff changeset
   422
    progress.complete()
32732
43bda143e3b2 discovery: include timing in the debug output
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 32331
diff changeset
   423
    ui.debug("%d total queries in %.4fs\n" % (roundtrips, elapsed))
32788
483d47753726 setdiscovery: improves logged message
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 32733
diff changeset
   424
    msg = ('found %d common and %d unknown server heads,'
483d47753726 setdiscovery: improves logged message
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 32733
diff changeset
   425
           ' %d roundtrips in %.4fs\n')
41883
82884bbf8d2b discovery: rename `srvheads` to `knownsrvheads`
Georges Racinet <georges.racinet@octobus.net>
parents: 41303
diff changeset
   426
    missing = set(result) - set(knownsrvheads)
32788
483d47753726 setdiscovery: improves logged message
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 32733
diff changeset
   427
    ui.log('discovery', msg, len(result), len(missing), roundtrips,
32733
28240b75e880 discovery: log discovery result in non-trivial cases
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 32732
diff changeset
   428
           elapsed)
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   429
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   430
    if not result and srvheadhashes != [nullid]:
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   431
        if abortwhenunrelated:
26587
56b2bcea2529 error: get Abort from 'error' instead of 'util'
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 25973
diff changeset
   432
            raise error.Abort(_("repository is unrelated"))
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   433
        else:
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   434
            ui.warn(_("warning: repository is unrelated\n"))
32331
bd872f64a8ba cleanup: use set literals
Martin von Zweigbergk <martinvonz@google.com>
parents: 28437
diff changeset
   435
        return ({nullid}, True, srvheadhashes,)
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   436
14981
192e02680d09 setdiscovery: return anyincoming=False when remote's only head is nullid
Andrew Pritchard <andrewp@fogcreek.com>
parents: 14833
diff changeset
   437
    anyincoming = (srvheadhashes != [nullid])
39192
5b32b3c618b2 setdiscovery: don't use dagutil for rev -> node conversions
Gregory Szorc <gregory.szorc@gmail.com>
parents: 38379
diff changeset
   438
    result = {clnode(r) for r in result}
5b32b3c618b2 setdiscovery: don't use dagutil for rev -> node conversions
Gregory Szorc <gregory.szorc@gmail.com>
parents: 38379
diff changeset
   439
    return result, anyincoming, srvheadhashes