hg
author Georges Racinet <georges.racinet@octobus.net>
Wed, 20 Feb 2019 09:04:54 +0100
changeset 42761 4d20b1fe8a72
parent 39608 5e78c100a215
child 43073 5c9c71cde1c9
permissions -rwxr-xr-x
rust-discovery: using from Python code

As previously done in other topics, the Rust version is used if it's been built.

The version fully in Rust of the partialdiscovery class has the performance advantage over the Python version (actually using the Rust MissingAncestor) if the undecided set is big enough. Otherwise no sampling occurs, and the discovery is reasonably fast anyway.

Note: it's hard to predict the size of the initial undecided set; it can depend on the kind of topological changes between the local and remote graphs. The point of the Rust version is to make the bad cases acceptable.

More specifically, the performance advantages are:

- faster sampling, especially takefullsample()
- much faster addmissings() in almost all cases (see commit message in grandparent of the present changeset)
- no conversion cost of the undecided set at the interface between Rust and Python

== Measurements with big undecided sets

For an extreme example, discovery between mozilla-try and mozilla-unified (over one million undecided revisions, same case as in dbd0fcca6dfc), we get roughly 2.5x to 3x better performance:

Growing sample size (5% starting with 200): time goes down from 210 to 72 seconds.
Constant sample size of 200: time goes down from 1853 to 659 seconds.

With a sample size computed from the number of roots and heads of the undecided set (`respectsize` is `False`), here are the perfdiscovery results:

Before
! wall 9.358729 comb 9.360000 user 9.310000 sys 0.050000 (median of 50)

After
! wall 3.793819 comb 3.790000 user 3.750000 sys 0.040000 (median of 50)

In that latter case, the sample sizes are routinely in the hundreds of thousands of revisions. While still faster, the Rust iteration in addmissings has less of an advantage than with smaller sample sizes, but one sees addcommons becoming faster, probably a consequence of not having to copy big sets back and forth.

This example is not a goal in itself, but it showcases several different areas in which the process can become slow, due to different factors, and how this full Rust version can help.

== Measurements with small undecided sets

In cases where the undecided set is small enough that no sampling occurs, the Rust version has a disadvantage at init if `targetheads` is really big (some time is lost in the translation to Rust data structures), and that is compensated by the faster `addmissings()`.

On a private repository with over one million commits, we still get a minor improvement of 6.8%:

Before
! wall 0.593585 comb 0.590000 user 0.550000 sys 0.040000 (median of 50)

After
! wall 0.553035 comb 0.550000 user 0.520000 sys 0.030000 (median of 50)

What's interesting in that case is the first addinfo(), at 180ms for Rust and 233ms for Python+C, mostly due to add_missings and the children cache computation being done in less than 0.2ms on the Rust side vs over 40ms on the Python side.

The worst case we have on hand is with mozilla-try, prepared with discovery-helper.sh for 10 heads and depth 10, where time goes up 2.2% on the median. In this case `targetheads` is really huge, with 165842 server heads.

Before
! wall 0.823884 comb 0.810000 user 0.790000 sys 0.020000 (median of 50)

After
! wall 0.842607 comb 0.840000 user 0.800000 sys 0.040000 (median of 50)

If that were considered a problem, more adjustments could be made, which are premature at this stage: cooking special variants of methods of the inner MissingAncestors object, or retrieving local heads directly from Rust to avoid the cost of conversion. Effort would probably be better spent at this point improving the surroundings if needed.

Here's another data point with a smaller repository, pypy, where performance is almost identical:

Before
! wall 0.015121 comb 0.030000 user 0.020000 sys 0.010000 (median of 186)

After
! wall 0.015009 comb 0.010000 user 0.010000 sys 0.000000 (median of 184)

Differential Revision: https://phab.mercurial-scm.org/D6430
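The "used if it's been built" behaviour described above can be illustrated with a generic optional-import fallback. This is only a minimal sketch, not the code from this changeset: the module name `hypothetical_rust_discovery`, the helper names, and the stand-in class `PyPartialDiscovery` are all assumptions made for illustration.

```python
import importlib


class PyPartialDiscovery(object):
    """Stand-in for a pure-Python implementation (hypothetical)."""

    def __init__(self, targetheads):
        self.targetheads = set(targetheads)


def import_optional(modname):
    """Return the named module, or None if it is not built/installed."""
    try:
        return importlib.import_module(modname)
    except ImportError:
        return None


def pick_partialdiscovery(rust_modname="hypothetical_rust_discovery"):
    """Prefer the compiled class when available, else the Python one."""
    rustmod = import_optional(rust_modname)
    if rustmod is not None and hasattr(rustmod, "PartialDiscovery"):
        return rustmod.PartialDiscovery
    return PyPartialDiscovery


# The hypothetical compiled module does not exist here, so the
# pure-Python stand-in is selected.
cls = pick_partialdiscovery()
disco = cls(targetheads=[1, 2, 3])
```

Callers only ever see the selected class, so the rest of the code does not need to know which implementation is in use.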
0
9117c6561b0b Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
#!/usr/bin/env python
#
1698
ad4a2eefe4d7 Update copyright notice
Matt Mackall <mpm@selenic.com>
parents: 515
# mercurial - scalable distributed SCM
0
9117c6561b0b Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
#
4635
63b9d2deed48 Updated copyright notices and add "and others" to "hg version"
Thomas Arendsen Hein <thomas@intevation.de>
parents: 3877
# Copyright 2005-2007 Matt Mackall <mpm@selenic.com>
0
9117c6561b0b Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
#
8225
46293a0c7e9f updated license to be explicit about GPL version 2
Martin Geisler <mg@lazybytes.net>
parents: 7672
# This software may be used and distributed according to the terms of the
10263
25e572394f5c Update license to GPLv2+
Matt Mackall <mpm@selenic.com>
parents: 8225
# GNU General Public License version 2 or any later version.
33914
1900381b6a6e hg: update top-level script to use modern import conventions
Augie Fackler <raf@durin42.com>
parents: 32462
from __future__ import absolute_import
0
9117c6561b0b Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:

12661
10da5a1f25dd setup/hg: always load Mercurial from where it was installed.
Dan Villiom Podlaski Christiansen <danchr@gmail.com>
parents: 10263
import os
import sys

21812
73e4a02e6d23 hg: add support for HGUNICODEPEDANTRY environment variable
Augie Fackler <raf@durin42.com>
parents: 14233
if os.environ.get('HGUNICODEPEDANTRY', False):
29172
2ea9c9aa6e60 hg: limit HGUNICODEPEDANTRY to py2
timeless <timeless@mozdev.org>
parents: 21812
    try:
        reload(sys)
        sys.setdefaultencoding("undefined")
    except NameError:
        pass
21812
73e4a02e6d23 hg: add support for HGUNICODEPEDANTRY environment variable
Augie Fackler <raf@durin42.com>
parents: 14233

12661
10da5a1f25dd setup/hg: always load Mercurial from where it was installed.
Dan Villiom Podlaski Christiansen <danchr@gmail.com>
parents: 10263
libdir = '@LIBDIR@'

if libdir != '@' 'LIBDIR' '@':
    if not os.path.isabs(libdir):
12805
cae1c187abd4 setup/hg: handle hg being a symlink when appending relative libdir to sys.path
L. David Baron <dbaron@dbaron.org>
parents: 12661
        libdir = os.path.join(os.path.dirname(os.path.realpath(__file__)),
                              libdir)
12661
10da5a1f25dd setup/hg: always load Mercurial from where it was installed.
Dan Villiom Podlaski Christiansen <danchr@gmail.com>
parents: 10263
        libdir = os.path.abspath(libdir)
    sys.path.insert(0, libdir)

39608
5e78c100a215 hg: wrap the highest layer in the `hg` script possible in trace event
Augie Fackler <augie@google.com>
parents: 34533
from hgdemandimport import tracing
with tracing.log('hg script'):
    # enable importing on demand to reduce startup time
    try:
        if sys.version_info[0] < 3 or sys.version_info >= (3, 6):
            import hgdemandimport; hgdemandimport.enable()
    except ImportError:
        sys.stderr.write("abort: couldn't find mercurial libraries in [%s]\n" %
                         ' '.join(sys.path))
        sys.stderr.write("(check your install and PYTHONPATH)\n")
        sys.exit(-1)
5197
55860a45bbf2 Enable demandimport only in scripts, not in importable modules (issue605)
Thomas Arendsen Hein <thomas@intevation.de>
parents: 5178

39608
5e78c100a215 hg: wrap the highest layer in the `hg` script possible in trace event
Augie Fackler <augie@google.com>
parents: 34533
    from mercurial import dispatch
    dispatch.run()
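The `libdir` handling in the script relies on a small trick: the comparison literal is written as `'@' 'LIBDIR' '@'`, which Python concatenates to `'@LIBDIR@'` at compile time, while the source text itself never contains the contiguous placeholder. The sketch below illustrates why that matters, assuming an install step that performs a plain textual substitution of the placeholder; `install_script` and `resolve_libdir` are hypothetical helpers written for this illustration, not part of Mercurial.

```python
import os

# Adjacent string literals are concatenated by the parser.
PLACEHOLDER = '@' 'LIBDIR' '@'


def install_script(source, libdir):
    """Mimic an installer doing a textual placeholder substitution
    (assumed behaviour, for illustration only)."""
    return source.replace('@LIBDIR@', repr(libdir)[1:-1])


def resolve_libdir(libdir, script_path):
    """Return an absolute library directory, or None when the
    placeholder was never substituted (e.g. running from a checkout)."""
    if libdir == PLACEHOLDER:
        return None
    if not os.path.isabs(libdir):
        # Resolve relative to the real location of the script, so a
        # symlinked script still finds its libraries.
        base = os.path.dirname(os.path.realpath(script_path))
        libdir = os.path.abspath(os.path.join(base, libdir))
    return libdir


source = "libdir = '@LIBDIR@'\nif libdir != '@' 'LIBDIR' '@':\n"
installed = install_script(source, 'lib/python')
# Only the assignment is rewritten; the split literal survives.
print(installed)
```

Because the sentinel on the comparison line is split into three literals, the substitution cannot touch it, so the installed script can still detect whether the placeholder was replaced.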