mercurial/setdiscovery.py
author Boris Feld <boris.feld@octobus.net>
Fri, 14 Dec 2018 12:01:15 +0100
changeset 41114 3c85a62d7462
parent 41113 71b0db4fa027
child 41115 3023bc4b3da0
permissions -rw-r--r--
discovery: move handling of sampling special case inside sampling function The handling of cases where the number of revisions to sample is smaller than the sample size can be moved with the sample function themselves. This simplifies main logic, preparing a coming refactoring.
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
     1
# setdiscovery.py - improved discovery of common nodeset for mercurial
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
     2
#
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
     3
# Copyright 2010 Benoit Boissinot <bboissin@gmail.com>
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
     4
# and Peter Arrenbrecht <peter@arrenbrecht.ch>
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
     5
#
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
     6
# This software may be used and distributed according to the terms of the
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
     7
# GNU General Public License version 2 or any later version.
20656
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
     8
"""
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
     9
Algorithm works in the following way. You have two repository: local and
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    10
remote. They both contains a DAG of changelists.
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    11
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    12
The goal of the discovery protocol is to find one set of node *common*,
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    13
the set of nodes shared by local and remote.
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    14
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    15
One of the issue with the original protocol was latency, it could
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    16
potentially require lots of roundtrips to discover that the local repo was a
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    17
subset of remote (which is a very common case, you usually have few changes
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    18
compared to upstream, while upstream probably had lots of development).
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    19
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    20
The new protocol only requires one interface for the remote repo: `known()`,
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    21
which given a set of changelists tells you if they are present in the DAG.
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    22
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    23
The algorithm then works as follow:
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    24
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    25
 - We will be using three sets, `common`, `missing`, `unknown`. Originally
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    26
 all nodes are in `unknown`.
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    27
 - Take a sample from `unknown`, call `remote.known(sample)`
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    28
   - For each node that remote knows, move it and all its ancestors to `common`
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    29
   - For each node that remote doesn't know, move it and all its descendants
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    30
   to `missing`
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    31
 - Iterate until `unknown` is empty
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    32
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    33
There are a couple optimizations, first is instead of starting with a random
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    34
sample of missing, start by sending all heads, in the case where the local
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    35
repo is a subset, you computed the answer in one round trip.
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    36
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    37
Then you can do something similar to the bisecting strategy used when
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    38
finding faulty changesets. Instead of random samples, you can try picking
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    39
nodes that will maximize the number of nodes that will be
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    40
classified with it (since all ancestors or descendants will be marked as well).
cdecbc5ab504 setdiscovery: document algorithms used
Olle Lundberg <geek@nerd.sh>
parents: 20034
diff changeset
    41
"""
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
    42
25973
fb5664eb8414 setdiscovery: use absolute_import
Gregory Szorc <gregory.szorc@gmail.com>
parents: 25914
diff changeset
    43
from __future__ import absolute_import
fb5664eb8414 setdiscovery: use absolute_import
Gregory Szorc <gregory.szorc@gmail.com>
parents: 25914
diff changeset
    44
25113
0ca8410ea345 util: drop alias for collections.deque
Martin von Zweigbergk <martinvonz@google.com>
parents: 23817
diff changeset
    45
import collections
20034
1e5b38a919dd cleanup: move stdlib imports to their own import statement
Augie Fackler <raf@durin42.com>
parents: 17426
diff changeset
    46
import random
25973
fb5664eb8414 setdiscovery: use absolute_import
Gregory Szorc <gregory.szorc@gmail.com>
parents: 25914
diff changeset
    47
fb5664eb8414 setdiscovery: use absolute_import
Gregory Szorc <gregory.szorc@gmail.com>
parents: 25914
diff changeset
    48
from .i18n import _
fb5664eb8414 setdiscovery: use absolute_import
Gregory Szorc <gregory.szorc@gmail.com>
parents: 25914
diff changeset
    49
from .node import (
fb5664eb8414 setdiscovery: use absolute_import
Gregory Szorc <gregory.szorc@gmail.com>
parents: 25914
diff changeset
    50
    nullid,
fb5664eb8414 setdiscovery: use absolute_import
Gregory Szorc <gregory.szorc@gmail.com>
parents: 25914
diff changeset
    51
    nullrev,
fb5664eb8414 setdiscovery: use absolute_import
Gregory Szorc <gregory.szorc@gmail.com>
parents: 25914
diff changeset
    52
)
fb5664eb8414 setdiscovery: use absolute_import
Gregory Szorc <gregory.szorc@gmail.com>
parents: 25914
diff changeset
    53
from . import (
26587
56b2bcea2529 error: get Abort from 'error' instead of 'util'
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 25973
diff changeset
    54
    error,
32732
43bda143e3b2 discovery: include timing in the debug output
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 32331
diff changeset
    55
    util,
25973
fb5664eb8414 setdiscovery: use absolute_import
Gregory Szorc <gregory.szorc@gmail.com>
parents: 25914
diff changeset
    56
)
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
    57
39207
71d83b315778 setdiscovery: don't use dagutil for parent resolution
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39206
diff changeset
    58
def _updatesample(revs, heads, sample, parentfn, quicksamplesize=0):
23809
9ca2eb881b53 setdiscovery: document the '_updatesample' function
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 23808
diff changeset
    59
    """update an existing sample to match the expected size
9ca2eb881b53 setdiscovery: document the '_updatesample' function
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 23808
diff changeset
    60
39201
2d218db7389b setdiscovery: reflect use of revs instead of nodes
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39199
diff changeset
    61
    The sample is updated with revs exponentially distant from each head of the
2d218db7389b setdiscovery: reflect use of revs instead of nodes
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39199
diff changeset
    62
    <revs> set. (H~1, H~2, H~4, H~8, etc).
23809
9ca2eb881b53 setdiscovery: document the '_updatesample' function
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 23808
diff changeset
    63
9ca2eb881b53 setdiscovery: document the '_updatesample' function
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 23808
diff changeset
    64
    If a target size is specified, the sampling will stop once this size is
39201
2d218db7389b setdiscovery: reflect use of revs instead of nodes
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39199
diff changeset
    65
    reached. Otherwise sampling will happen until roots of the <revs> set are
23809
9ca2eb881b53 setdiscovery: document the '_updatesample' function
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 23808
diff changeset
    66
    reached.
9ca2eb881b53 setdiscovery: document the '_updatesample' function
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 23808
diff changeset
    67
39201
2d218db7389b setdiscovery: reflect use of revs instead of nodes
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39199
diff changeset
    68
    :revs:  set of revs we want to discover (if None, assume the whole dag)
39203
754f389b87f2 setdiscovery: pass heads into _updatesample()
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39202
diff changeset
    69
    :heads: set of DAG head revs
23809
9ca2eb881b53 setdiscovery: document the '_updatesample' function
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 23808
diff changeset
    70
    :sample: a sample to update
39207
71d83b315778 setdiscovery: don't use dagutil for parent resolution
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39206
diff changeset
    71
    :parentfn: a callable to resolve parents for a revision
23809
9ca2eb881b53 setdiscovery: document the '_updatesample' function
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 23808
diff changeset
    72
    :quicksamplesize: optional target size of the sample"""
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
    73
    dist = {}
25113
0ca8410ea345 util: drop alias for collections.deque
Martin von Zweigbergk <martinvonz@google.com>
parents: 23817
diff changeset
    74
    visit = collections.deque(heads)
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
    75
    seen = set()
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
    76
    factor = 1
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
    77
    while visit:
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
    78
        curr = visit.popleft()
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
    79
        if curr in seen:
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
    80
            continue
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
    81
        d = dist.setdefault(curr, 1)
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
    82
        if d > factor:
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
    83
            factor *= 2
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
    84
        if d == factor:
23814
6a5877a73141 setdiscovery: drop the 'always' argument to '_updatesample'
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 23813
diff changeset
    85
            sample.add(curr)
6a5877a73141 setdiscovery: drop the 'always' argument to '_updatesample'
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 23813
diff changeset
    86
            if quicksamplesize and (len(sample) >= quicksamplesize):
6a5877a73141 setdiscovery: drop the 'always' argument to '_updatesample'
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 23813
diff changeset
    87
                return
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
    88
        seen.add(curr)
39207
71d83b315778 setdiscovery: don't use dagutil for parent resolution
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39206
diff changeset
    89
71d83b315778 setdiscovery: don't use dagutil for parent resolution
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39206
diff changeset
    90
        for p in parentfn(curr):
71d83b315778 setdiscovery: don't use dagutil for parent resolution
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39206
diff changeset
    91
            if p != nullrev and (not revs or p in revs):
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
    92
                dist.setdefault(p, d + 1)
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
    93
                visit.append(p)
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
    94
39207
71d83b315778 setdiscovery: don't use dagutil for parent resolution
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39206
diff changeset
    95
def _takequicksample(repo, headrevs, revs, size):
23816
34d4b58580d1 setdiscovery: document '_takequicksample'
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 23815
diff changeset
    96
    """takes a quick sample of size <size>
34d4b58580d1 setdiscovery: document '_takequicksample'
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 23815
diff changeset
    97
34d4b58580d1 setdiscovery: document '_takequicksample'
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 23815
diff changeset
    98
    It is meant for initial sampling and focuses on querying heads and close
34d4b58580d1 setdiscovery: document '_takequicksample'
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 23815
diff changeset
    99
    ancestors of heads.
34d4b58580d1 setdiscovery: document '_takequicksample'
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 23815
diff changeset
   100
34d4b58580d1 setdiscovery: document '_takequicksample'
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 23815
diff changeset
   101
    :dag: a dag object
39204
abce899c985f setdiscovery: pass head revisions into sample functions
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39203
diff changeset
   102
    :headrevs: set of head revisions in local DAG to consider
39201
2d218db7389b setdiscovery: reflect use of revs instead of nodes
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39199
diff changeset
   103
    :revs: set of revs to discover
23816
34d4b58580d1 setdiscovery: document '_takequicksample'
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 23815
diff changeset
   104
    :size: the maximum size of the sample"""
41114
3c85a62d7462 discovery: move handling of sampling special case inside sampling function
Boris Feld <boris.feld@octobus.net>
parents: 41113
diff changeset
   105
    if len(revs) <= size:
3c85a62d7462 discovery: move handling of sampling special case inside sampling function
Boris Feld <boris.feld@octobus.net>
parents: 41113
diff changeset
   106
        return list(revs)
39202
140992750187 setdiscovery: use a revset for finding DAG heads in a subset
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39201
diff changeset
   107
    sample = set(repo.revs('heads(%ld)', revs))
140992750187 setdiscovery: use a revset for finding DAG heads in a subset
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39201
diff changeset
   108
36741
59802fa590db setdiscovery: avoid a Yoda condition
Martin von Zweigbergk <martinvonz@google.com>
parents: 36740
diff changeset
   109
    if len(sample) >= size:
23815
31e75a362d44 setdiscovery: drop '_setupsample' usage in '_takequicksample'
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 23814
diff changeset
   110
        return _limitsample(sample, size)
39203
754f389b87f2 setdiscovery: pass heads into _updatesample()
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39202
diff changeset
   111
39207
71d83b315778 setdiscovery: don't use dagutil for parent resolution
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39206
diff changeset
   112
    _updatesample(None, headrevs, sample, repo.changelog.parentrevs,
71d83b315778 setdiscovery: don't use dagutil for parent resolution
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39206
diff changeset
   113
                  quicksamplesize=size)
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   114
    return sample
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   115
39207
71d83b315778 setdiscovery: don't use dagutil for parent resolution
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39206
diff changeset
   116
def _takefullsample(repo, headrevs, revs, size):
41114
3c85a62d7462 discovery: move handling of sampling special case inside sampling function
Boris Feld <boris.feld@octobus.net>
parents: 41113
diff changeset
   117
    if len(revs) <= size:
3c85a62d7462 discovery: move handling of sampling special case inside sampling function
Boris Feld <boris.feld@octobus.net>
parents: 41113
diff changeset
   118
        return list(revs)
39202
140992750187 setdiscovery: use a revset for finding DAG heads in a subset
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39201
diff changeset
   119
    sample = set(repo.revs('heads(%ld)', revs))
140992750187 setdiscovery: use a revset for finding DAG heads in a subset
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39201
diff changeset
   120
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   121
    # update from heads
39206
56279660d264 setdiscovery: use revsets for computing a subset's heads and roots
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39205
diff changeset
   122
    revsheads = set(repo.revs('heads(%ld)', revs))
39207
71d83b315778 setdiscovery: don't use dagutil for parent resolution
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39206
diff changeset
   123
    _updatesample(revs, revsheads, sample, repo.changelog.parentrevs)
39211
274acf379dbb setdiscovery: precompute children revisions to avoid quadratic lookup
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39209
diff changeset
   124
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   125
    # update from roots
39206
56279660d264 setdiscovery: use revsets for computing a subset's heads and roots
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39205
diff changeset
   126
    revsroots = set(repo.revs('roots(%ld)', revs))
39207
71d83b315778 setdiscovery: don't use dagutil for parent resolution
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39206
diff changeset
   127
39211
274acf379dbb setdiscovery: precompute children revisions to avoid quadratic lookup
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39209
diff changeset
   128
    # _updatesample() essentially does interaction over revisions to look up
274acf379dbb setdiscovery: precompute children revisions to avoid quadratic lookup
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39209
diff changeset
   129
    # their children. This lookup is expensive and doing it in a loop is
274acf379dbb setdiscovery: precompute children revisions to avoid quadratic lookup
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39209
diff changeset
   130
    # quadratic. We precompute the children for all relevant revisions and
274acf379dbb setdiscovery: precompute children revisions to avoid quadratic lookup
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39209
diff changeset
   131
    # make the lookup in _updatesample() a simple dict lookup.
274acf379dbb setdiscovery: precompute children revisions to avoid quadratic lookup
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39209
diff changeset
   132
    #
274acf379dbb setdiscovery: precompute children revisions to avoid quadratic lookup
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39209
diff changeset
   133
    # Because this function can be called multiple times during discovery, we
274acf379dbb setdiscovery: precompute children revisions to avoid quadratic lookup
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39209
diff changeset
   134
    # may still perform redundant work and there is room to optimize this by
274acf379dbb setdiscovery: precompute children revisions to avoid quadratic lookup
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39209
diff changeset
   135
    # keeping a persistent cache of children across invocations.
274acf379dbb setdiscovery: precompute children revisions to avoid quadratic lookup
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39209
diff changeset
   136
    children = {}
39207
71d83b315778 setdiscovery: don't use dagutil for parent resolution
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39206
diff changeset
   137
39211
274acf379dbb setdiscovery: precompute children revisions to avoid quadratic lookup
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39209
diff changeset
   138
    parentrevs = repo.changelog.parentrevs
274acf379dbb setdiscovery: precompute children revisions to avoid quadratic lookup
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39209
diff changeset
   139
    for rev in repo.changelog.revs(start=min(revsroots)):
274acf379dbb setdiscovery: precompute children revisions to avoid quadratic lookup
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39209
diff changeset
   140
        # Always ensure revision has an entry so we don't need to worry about
274acf379dbb setdiscovery: precompute children revisions to avoid quadratic lookup
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39209
diff changeset
   141
        # missing keys.
274acf379dbb setdiscovery: precompute children revisions to avoid quadratic lookup
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39209
diff changeset
   142
        children.setdefault(rev, [])
274acf379dbb setdiscovery: precompute children revisions to avoid quadratic lookup
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39209
diff changeset
   143
274acf379dbb setdiscovery: precompute children revisions to avoid quadratic lookup
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39209
diff changeset
   144
        for prev in parentrevs(rev):
274acf379dbb setdiscovery: precompute children revisions to avoid quadratic lookup
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39209
diff changeset
   145
            if prev == nullrev:
274acf379dbb setdiscovery: precompute children revisions to avoid quadratic lookup
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39209
diff changeset
   146
                continue
274acf379dbb setdiscovery: precompute children revisions to avoid quadratic lookup
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39209
diff changeset
   147
274acf379dbb setdiscovery: precompute children revisions to avoid quadratic lookup
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39209
diff changeset
   148
            children.setdefault(prev, []).append(rev)
274acf379dbb setdiscovery: precompute children revisions to avoid quadratic lookup
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39209
diff changeset
   149
274acf379dbb setdiscovery: precompute children revisions to avoid quadratic lookup
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39209
diff changeset
   150
    _updatesample(revs, revsroots, sample, children.__getitem__)
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   151
    assert sample
23810
b681d3a2bf04 setdiscovery: randomly pick between heads and sample when taking full sample
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 23809
diff changeset
   152
    sample = _limitsample(sample, size)
41113
71b0db4fa027 discovery: minor fix to some conditionals
Boris Feld <boris.feld@octobus.net>
parents: 39211
diff changeset
   153
    if len(sample) <= size:
23810
b681d3a2bf04 setdiscovery: randomly pick between heads and sample when taking full sample
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 23809
diff changeset
   154
        more = size - len(sample)
39201
2d218db7389b setdiscovery: reflect use of revs instead of nodes
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39199
diff changeset
   155
        sample.update(random.sample(list(revs - sample), more))
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   156
    return sample
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   157
23083
ee45f5c2ffcc setdiscovery: extract sample limitation in a `_limitsample` function
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 20656
diff changeset
   158
def _limitsample(sample, desiredlen):
ee45f5c2ffcc setdiscovery: extract sample limitation in a `_limitsample` function
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 20656
diff changeset
   159
    """return a random subset of sample of at most desiredlen item"""
ee45f5c2ffcc setdiscovery: extract sample limitation in a `_limitsample` function
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 20656
diff changeset
   160
    if len(sample) > desiredlen:
ee45f5c2ffcc setdiscovery: extract sample limitation in a `_limitsample` function
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 20656
diff changeset
   161
        sample = set(random.sample(sample, desiredlen))
ee45f5c2ffcc setdiscovery: extract sample limitation in a `_limitsample` function
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 20656
diff changeset
   162
    return sample
ee45f5c2ffcc setdiscovery: extract sample limitation in a `_limitsample` function
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 20656
diff changeset
   163
36738
613954a17a25 setdiscovery: back out changeset 5cfdf6137af8 (issue5809)
Martin von Zweigbergk <martinvonz@google.com>
parents: 35889
diff changeset
   164
def findcommonheads(ui, local, remote,
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   165
                    initialsamplesize=100,
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   166
                    fullsamplesize=200,
35313
f77121b6bf1b setdiscover: allow to ignore part of the local graph
Boris Feld <boris.feld@octobus.net>
parents: 32788
diff changeset
   167
                    abortwhenunrelated=True,
f77121b6bf1b setdiscover: allow to ignore part of the local graph
Boris Feld <boris.feld@octobus.net>
parents: 32788
diff changeset
   168
                    ancestorsof=None):
14206
2bf60f158ecb setdiscovery: limit lines to 80 characters
Steven Brown <StevenGBrown@gmail.com>
parents: 14164
diff changeset
   169
    '''Return a tuple (common, anyincoming, remoteheads) used to identify
2bf60f158ecb setdiscovery: limit lines to 80 characters
Steven Brown <StevenGBrown@gmail.com>
parents: 14164
diff changeset
   170
    missing nodes from or in remote.
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   171
    '''
32732
43bda143e3b2 discovery: include timing in the debug output
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 32331
diff changeset
   172
    start = util.timer()
43bda143e3b2 discovery: include timing in the debug output
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 32331
diff changeset
   173
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   174
    roundtrips = 0
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   175
    cl = local.changelog
39192
5b32b3c618b2 setdiscovery: don't use dagutil for rev -> node conversions
Gregory Szorc <gregory.szorc@gmail.com>
parents: 38379
diff changeset
   176
    clnode = cl.node
39194
858a12846f4f setdiscovery: don't use dagutil for node -> rev conversion
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39192
diff changeset
   177
    clrev = cl.rev
39192
5b32b3c618b2 setdiscovery: don't use dagutil for rev -> node conversions
Gregory Szorc <gregory.szorc@gmail.com>
parents: 38379
diff changeset
   178
35313
f77121b6bf1b setdiscover: allow to ignore part of the local graph
Boris Feld <boris.feld@octobus.net>
parents: 32788
diff changeset
   179
    if ancestorsof is not None:
39198
860e83cd97de setdiscovery: don't use dagutil to compute heads
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39194
diff changeset
   180
        ownheads = [clrev(n) for n in ancestorsof]
860e83cd97de setdiscovery: don't use dagutil to compute heads
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39194
diff changeset
   181
    else:
860e83cd97de setdiscovery: don't use dagutil to compute heads
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39194
diff changeset
   182
        ownheads = [rev for rev in cl.headrevs() if rev != nullrev]
860e83cd97de setdiscovery: don't use dagutil to compute heads
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39194
diff changeset
   183
14624
f03c82d1f50a setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 14206
diff changeset
   184
    # early exit if we know all the specified remote heads already
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   185
    ui.debug("query 1; heads\n")
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   186
    roundtrips += 1
23084
3ef893520a85 setdiscovery: limit the size of the initial sample (issue4411)
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 23083
diff changeset
   187
    sample = _limitsample(ownheads, initialsamplesize)
23192
73cfaa348650 discovery: indices between sample and yesno must match (issue4438)
Mads Kiilerich <madski@unity3d.com>
parents: 23191
diff changeset
   188
    # indices between sample and externalized version must match
73cfaa348650 discovery: indices between sample and yesno must match (issue4438)
Mads Kiilerich <madski@unity3d.com>
parents: 23191
diff changeset
   189
    sample = list(sample)
37631
2f626233859b wireproto: implement batching on peer executor interface
Gregory Szorc <gregory.szorc@gmail.com>
parents: 37630
diff changeset
   190
2f626233859b wireproto: implement batching on peer executor interface
Gregory Szorc <gregory.szorc@gmail.com>
parents: 37630
diff changeset
   191
    with remote.commandexecutor() as e:
2f626233859b wireproto: implement batching on peer executor interface
Gregory Szorc <gregory.szorc@gmail.com>
parents: 37630
diff changeset
   192
        fheads = e.callcommand('heads', {})
2f626233859b wireproto: implement batching on peer executor interface
Gregory Szorc <gregory.szorc@gmail.com>
parents: 37630
diff changeset
   193
        fknown = e.callcommand('known', {
39192
5b32b3c618b2 setdiscovery: don't use dagutil for rev -> node conversions
Gregory Szorc <gregory.szorc@gmail.com>
parents: 38379
diff changeset
   194
            'nodes': [clnode(r) for r in sample],
37631
2f626233859b wireproto: implement batching on peer executor interface
Gregory Szorc <gregory.szorc@gmail.com>
parents: 37630
diff changeset
   195
        })
2f626233859b wireproto: implement batching on peer executor interface
Gregory Szorc <gregory.szorc@gmail.com>
parents: 37630
diff changeset
   196
2f626233859b wireproto: implement batching on peer executor interface
Gregory Szorc <gregory.szorc@gmail.com>
parents: 37630
diff changeset
   197
    srvheadhashes, yesno = fheads.result(), fknown.result()
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   198
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   199
    if cl.tip() == nullid:
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   200
        if srvheadhashes != [nullid]:
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   201
            return [nullid], True, srvheadhashes
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   202
        return [nullid], False, []
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   203
14206
2bf60f158ecb setdiscovery: limit lines to 80 characters
Steven Brown <StevenGBrown@gmail.com>
parents: 14164
diff changeset
   204
    # start actual discovery (we note this before the next "if" for
2bf60f158ecb setdiscovery: limit lines to 80 characters
Steven Brown <StevenGBrown@gmail.com>
parents: 14164
diff changeset
   205
    # compatibility reasons)
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   206
    ui.status(_("searching for changes\n"))
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   207
39194
858a12846f4f setdiscovery: don't use dagutil for node -> rev conversion
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39192
diff changeset
   208
    srvheads = []
858a12846f4f setdiscovery: don't use dagutil for node -> rev conversion
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39192
diff changeset
   209
    for node in srvheadhashes:
858a12846f4f setdiscovery: don't use dagutil for node -> rev conversion
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39192
diff changeset
   210
        if node == nullid:
858a12846f4f setdiscovery: don't use dagutil for node -> rev conversion
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39192
diff changeset
   211
            continue
858a12846f4f setdiscovery: don't use dagutil for node -> rev conversion
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39192
diff changeset
   212
858a12846f4f setdiscovery: don't use dagutil for node -> rev conversion
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39192
diff changeset
   213
        try:
858a12846f4f setdiscovery: don't use dagutil for node -> rev conversion
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39192
diff changeset
   214
            srvheads.append(clrev(node))
858a12846f4f setdiscovery: don't use dagutil for node -> rev conversion
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39192
diff changeset
   215
        # Catches unknown and filtered nodes.
858a12846f4f setdiscovery: don't use dagutil for node -> rev conversion
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39192
diff changeset
   216
        except error.LookupError:
858a12846f4f setdiscovery: don't use dagutil for node -> rev conversion
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39192
diff changeset
   217
            continue
858a12846f4f setdiscovery: don't use dagutil for node -> rev conversion
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39192
diff changeset
   218
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   219
    if len(srvheads) == len(srvheadhashes):
14833
308e1b5acc87 discovery: quiet note about heads
Matt Mackall <mpm@selenic.com>
parents: 14624
diff changeset
   220
        ui.debug("all remote heads known locally\n")
39192
5b32b3c618b2 setdiscovery: don't use dagutil for rev -> node conversions
Gregory Szorc <gregory.szorc@gmail.com>
parents: 38379
diff changeset
   221
        return srvheadhashes, False, srvheadhashes
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   222
36739
bf485b70d0ae setdiscovery: remove initialsamplesize from a condition
Martin von Zweigbergk <martinvonz@google.com>
parents: 36738
diff changeset
   223
    if len(sample) == len(ownheads) and all(yesno):
15497
9bea3aed6ee1 add missing localization markup
Mads Kiilerich <mads@kiilerich.com>
parents: 15063
diff changeset
   224
        ui.note(_("all local heads known remotely\n"))
39192
5b32b3c618b2 setdiscovery: don't use dagutil for rev -> node conversions
Gregory Szorc <gregory.szorc@gmail.com>
parents: 38379
diff changeset
   225
        ownheadhashes = [clnode(r) for r in ownheads]
5b32b3c618b2 setdiscovery: don't use dagutil for rev -> node conversions
Gregory Szorc <gregory.szorc@gmail.com>
parents: 38379
diff changeset
   226
        return ownheadhashes, True, srvheadhashes
14624
f03c82d1f50a setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 14206
diff changeset
   227
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   228
    # full blown discovery
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   229
16683
525fdb738975 cleanup: eradicate long lines
Brodie Rao <brodie@sf.io>
parents: 15713
diff changeset
   230
    # own nodes I know we both know
23343
f8a2647fe020 setdiscovery: avoid a full changelog graph traversal
Siddharth Agarwal <sid0@fb.com>
parents: 23192
diff changeset
   231
    # treat remote heads (and maybe own heads) as a first implicit sample
f8a2647fe020 setdiscovery: avoid a full changelog graph traversal
Siddharth Agarwal <sid0@fb.com>
parents: 23192
diff changeset
   232
    # response
f8a2647fe020 setdiscovery: avoid a full changelog graph traversal
Siddharth Agarwal <sid0@fb.com>
parents: 23192
diff changeset
   233
    common = cl.incrementalmissingrevs(srvheads)
f8a2647fe020 setdiscovery: avoid a full changelog graph traversal
Siddharth Agarwal <sid0@fb.com>
parents: 23192
diff changeset
   234
    commoninsample = set(n for i, n in enumerate(sample) if yesno[i])
f8a2647fe020 setdiscovery: avoid a full changelog graph traversal
Siddharth Agarwal <sid0@fb.com>
parents: 23192
diff changeset
   235
    common.addbases(commoninsample)
23746
4ef2f2fa8b8b setdiscovery: drop shadowed 'undecided' assignment
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 23343
diff changeset
   236
    # own nodes where I don't know if remote knows them
23343
f8a2647fe020 setdiscovery: avoid a full changelog graph traversal
Siddharth Agarwal <sid0@fb.com>
parents: 23192
diff changeset
   237
    undecided = set(common.missingancestors(ownheads))
16683
525fdb738975 cleanup: eradicate long lines
Brodie Rao <brodie@sf.io>
parents: 15713
diff changeset
   238
    # own nodes I know remote lacks
525fdb738975 cleanup: eradicate long lines
Brodie Rao <brodie@sf.io>
parents: 15713
diff changeset
   239
    missing = set()
525fdb738975 cleanup: eradicate long lines
Brodie Rao <brodie@sf.io>
parents: 15713
diff changeset
   240
14624
f03c82d1f50a setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 14206
diff changeset
   241
    full = False
38356
9e70690a21ac setdiscovery: use progress helper
Martin von Zweigbergk <martinvonz@google.com>
parents: 37631
diff changeset
   242
    progress = ui.makeprogress(_('searching'), unit=_('queries'))
14624
f03c82d1f50a setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 14206
diff changeset
   243
    while undecided:
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   244
14624
f03c82d1f50a setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 14206
diff changeset
   245
        if sample:
f03c82d1f50a setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 14206
diff changeset
   246
            missinginsample = [n for i, n in enumerate(sample) if not yesno[i]]
39199
484c9fe570a7 setdiscovery: use a revset instead of dagutil.descendantset()
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39198
diff changeset
   247
484c9fe570a7 setdiscovery: use a revset instead of dagutil.descendantset()
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39198
diff changeset
   248
            if missing:
484c9fe570a7 setdiscovery: use a revset instead of dagutil.descendantset()
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39198
diff changeset
   249
                missing.update(local.revs('descendants(%ld) - descendants(%ld)',
484c9fe570a7 setdiscovery: use a revset instead of dagutil.descendantset()
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39198
diff changeset
   250
                                          missinginsample, missing))
484c9fe570a7 setdiscovery: use a revset instead of dagutil.descendantset()
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39198
diff changeset
   251
            else:
484c9fe570a7 setdiscovery: use a revset instead of dagutil.descendantset()
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39198
diff changeset
   252
                missing.update(local.revs('descendants(%ld)', missinginsample))
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   253
14624
f03c82d1f50a setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 14206
diff changeset
   254
            undecided.difference_update(missing)
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   255
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   256
        if not undecided:
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   257
            break
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   258
23747
f82173a90c2c setdiscovery: factorize similar sampling code
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 23746
diff changeset
   259
        if full or common.hasbases():
f82173a90c2c setdiscovery: factorize similar sampling code
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 23746
diff changeset
   260
            if full:
f82173a90c2c setdiscovery: factorize similar sampling code
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 23746
diff changeset
   261
                ui.note(_("sampling from both directions\n"))
f82173a90c2c setdiscovery: factorize similar sampling code
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 23746
diff changeset
   262
            else:
f82173a90c2c setdiscovery: factorize similar sampling code
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 23746
diff changeset
   263
                ui.debug("taking initial sample\n")
23807
e97e363a7000 setdiscovery: delay sample building calls to gather them in a single place
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 23806
diff changeset
   264
            samplefunc = _takefullsample
23130
ced632394371 setdiscovery: limit the size of all sample (issue4411)
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 23084
diff changeset
   265
            targetsize = fullsamplesize
14624
f03c82d1f50a setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 14206
diff changeset
   266
        else:
f03c82d1f50a setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 14206
diff changeset
   267
            # use even cheaper initial sample
f03c82d1f50a setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 14206
diff changeset
   268
            ui.debug("taking quick initial sample\n")
23807
e97e363a7000 setdiscovery: delay sample building calls to gather them in a single place
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 23806
diff changeset
   269
            samplefunc = _takequicksample
23130
ced632394371 setdiscovery: limit the size of all sample (issue4411)
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 23084
diff changeset
   270
            targetsize = initialsamplesize
41114
3c85a62d7462 discovery: move handling of sampling special case inside sampling function
Boris Feld <boris.feld@octobus.net>
parents: 41113
diff changeset
   271
        sample = samplefunc(local, ownheads, undecided, targetsize)
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   272
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   273
        roundtrips += 1
38356
9e70690a21ac setdiscovery: use progress helper
Martin von Zweigbergk <martinvonz@google.com>
parents: 37631
diff changeset
   274
        progress.update(roundtrips)
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   275
        ui.debug("query %i; still undecided: %i, sample size is: %i\n"
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   276
                 % (roundtrips, len(undecided), len(sample)))
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   277
        # indices between sample and externalized version must match
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   278
        sample = list(sample)
37630
e1b32dc4646c wireproto: implement command executor interface for version 1 peers
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36741
diff changeset
   279
e1b32dc4646c wireproto: implement command executor interface for version 1 peers
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36741
diff changeset
   280
        with remote.commandexecutor() as e:
e1b32dc4646c wireproto: implement command executor interface for version 1 peers
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36741
diff changeset
   281
            yesno = e.callcommand('known', {
39192
5b32b3c618b2 setdiscovery: don't use dagutil for rev -> node conversions
Gregory Szorc <gregory.szorc@gmail.com>
parents: 38379
diff changeset
   282
                'nodes': [clnode(r) for r in sample],
37630
e1b32dc4646c wireproto: implement command executor interface for version 1 peers
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36741
diff changeset
   283
            }).result()
e1b32dc4646c wireproto: implement command executor interface for version 1 peers
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36741
diff changeset
   284
14624
f03c82d1f50a setdiscovery: batch heads and known(ownheads)
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents: 14206
diff changeset
   285
        full = True
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   286
23343
f8a2647fe020 setdiscovery: avoid a full changelog graph traversal
Siddharth Agarwal <sid0@fb.com>
parents: 23192
diff changeset
   287
        if sample:
f8a2647fe020 setdiscovery: avoid a full changelog graph traversal
Siddharth Agarwal <sid0@fb.com>
parents: 23192
diff changeset
   288
            commoninsample = set(n for i, n in enumerate(sample) if yesno[i])
f8a2647fe020 setdiscovery: avoid a full changelog graph traversal
Siddharth Agarwal <sid0@fb.com>
parents: 23192
diff changeset
   289
            common.addbases(commoninsample)
f8a2647fe020 setdiscovery: avoid a full changelog graph traversal
Siddharth Agarwal <sid0@fb.com>
parents: 23192
diff changeset
   290
            common.removeancestorsfrom(undecided)
f8a2647fe020 setdiscovery: avoid a full changelog graph traversal
Siddharth Agarwal <sid0@fb.com>
parents: 23192
diff changeset
   291
f8a2647fe020 setdiscovery: avoid a full changelog graph traversal
Siddharth Agarwal <sid0@fb.com>
parents: 23192
diff changeset
   292
    # heads(common) == heads(common.bases) since common represents common.bases
f8a2647fe020 setdiscovery: avoid a full changelog graph traversal
Siddharth Agarwal <sid0@fb.com>
parents: 23192
diff changeset
   293
    # and all its ancestors
39209
fec01c69b0f0 setdiscovery: use revset for resolving DAG heads in a subset
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39207
diff changeset
   294
    # The presence of nullrev will confuse heads(). So filter it out.
fec01c69b0f0 setdiscovery: use revset for resolving DAG heads in a subset
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39207
diff changeset
   295
    result = set(local.revs('heads(%ld)', common.bases - {nullrev}))
32732
43bda143e3b2 discovery: include timing in the debug output
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 32331
diff changeset
   296
    elapsed = util.timer() - start
38379
ef692614e601 progress: hide update(None) in a new complete() method
Martin von Zweigbergk <martinvonz@google.com>
parents: 38356
diff changeset
   297
    progress.complete()
32732
43bda143e3b2 discovery: include timing in the debug output
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 32331
diff changeset
   298
    ui.debug("%d total queries in %.4fs\n" % (roundtrips, elapsed))
32788
483d47753726 setdiscovery: improves logged message
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 32733
diff changeset
   299
    msg = ('found %d common and %d unknown server heads,'
483d47753726 setdiscovery: improves logged message
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 32733
diff changeset
   300
           ' %d roundtrips in %.4fs\n')
483d47753726 setdiscovery: improves logged message
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 32733
diff changeset
   301
    missing = set(result) - set(srvheads)
483d47753726 setdiscovery: improves logged message
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 32733
diff changeset
   302
    ui.log('discovery', msg, len(result), len(missing), roundtrips,
32733
28240b75e880 discovery: log discovery result in non-trivial cases
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 32732
diff changeset
   303
           elapsed)
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   304
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   305
    if not result and srvheadhashes != [nullid]:
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   306
        if abortwhenunrelated:
26587
56b2bcea2529 error: get Abort from 'error' instead of 'util'
Pierre-Yves David <pierre-yves.david@fb.com>
parents: 25973
diff changeset
   307
            raise error.Abort(_("repository is unrelated"))
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   308
        else:
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   309
            ui.warn(_("warning: repository is unrelated\n"))
32331
bd872f64a8ba cleanup: use set literals
Martin von Zweigbergk <martinvonz@google.com>
parents: 28437
diff changeset
   310
        return ({nullid}, True, srvheadhashes,)
14164
cb98fed52495 discovery: add new set-based discovery
Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
parents:
diff changeset
   311
14981
192e02680d09 setdiscovery: return anyincoming=False when remote's only head is nullid
Andrew Pritchard <andrewp@fogcreek.com>
parents: 14833
diff changeset
   312
    anyincoming = (srvheadhashes != [nullid])
39192
5b32b3c618b2 setdiscovery: don't use dagutil for rev -> node conversions
Gregory Szorc <gregory.szorc@gmail.com>
parents: 38379
diff changeset
   313
    result = {clnode(r) for r in result}
5b32b3c618b2 setdiscovery: don't use dagutil for rev -> node conversions
Gregory Szorc <gregory.szorc@gmail.com>
parents: 38379
diff changeset
   314
    return result, anyincoming, srvheadhashes