contrib/bdiff-torture.py
author Pierre-Yves David <pierre-yves.david@octobus.net>
Sat, 12 Nov 2022 00:18:41 +0100
changeset 49672 e92de86cf4f8
parent 48875 6000f5b25c9b
permissions -rw-r--r--
emitrevision: consider ancestors revision to emit as available base This should make more delta base valid. This notably affects: * case where we skipped some parent with empty delta to directly delta against an ancestors * case where an intermediate snapshots is stored. This change means we could sent largish intermediate snapshots over the wire. However this is actually a sub goal here. Sending snapshots over the wire means the client have a high odd of simply storing the pre-computed delta instead of doing a lengthy process that will… end up doing the same intermediate snapshot. In addition the overall size of snapshot (or any level) is "only" some or the overall delta size. (0.17% for my mercurial clone, 20% for my clone of Mozilla try). So Sending them other the wire is unlikely to change large impact on the bandwidth used. If we decide that minimising the bandwidth is an explicit goal, we should introduce new logic to filter-out snapshot as delta. The current code has no notion explicite of snapshot so far, they just tended to fall into the wobbly filtering options. In some cases, this patch can yield large improvement to the bundling time: ### data-env-vars.name = mozilla-try-2019-02-18-zstd-sparse-revlog # benchmark.name = perf-bundle # benchmark.variants.revs = last-100000 before: 68.787066 seconds after: 47.552677 seconds (-30.87%) That translate to large improvement to the pull time : ### data-env-vars.name = mozilla-try-2019-02-18-zstd-sparse-revlog # benchmark.name = pull # benchmark.variants.issue6528 = disabled # benchmark.variants.revs = last-100000 before: 142.186625 seconds after: 75.897745 seconds (-46.62%) No significant negative impact have been observed.
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
29012
4bd67ae7d75a bdiff: fix latent normalization bug
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
     1
# Randomized torture test generation for bdiff
4bd67ae7d75a bdiff: fix latent normalization bug
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
     2
29209
eccfd6500636 py3: make contrib/bdiff-torture.py conform to our import style
Yuya Nishihara <yuya@tcha.org>
parents: 29012
diff changeset
     3
import random
eccfd6500636 py3: make contrib/bdiff-torture.py conform to our import style
Yuya Nishihara <yuya@tcha.org>
parents: 29012
diff changeset
     4
import sys
eccfd6500636 py3: make contrib/bdiff-torture.py conform to our import style
Yuya Nishihara <yuya@tcha.org>
parents: 29012
diff changeset
     5
29012
4bd67ae7d75a bdiff: fix latent normalization bug
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
     6
from mercurial import (
32202
ded48ad55146 bdiff: proxy through mdiff module
Yuya Nishihara <yuya@tcha.org>
parents: 29209
diff changeset
     7
    mdiff,
43009
e05c141511dd contrib: use pycompat.xrange in bdiff-torture.py
Gregory Szorc <gregory.szorc@gmail.com>
parents: 42857
diff changeset
     8
    pycompat,
29012
4bd67ae7d75a bdiff: fix latent normalization bug
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
     9
)
4bd67ae7d75a bdiff: fix latent normalization bug
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    10
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 43009
diff changeset
    11
29012
4bd67ae7d75a bdiff: fix latent normalization bug
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    12
def reducetest(a, b):
4bd67ae7d75a bdiff: fix latent normalization bug
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    13
    tries = 0
4bd67ae7d75a bdiff: fix latent normalization bug
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    14
    reductions = 0
4bd67ae7d75a bdiff: fix latent normalization bug
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    15
    print("reducing...")
4bd67ae7d75a bdiff: fix latent normalization bug
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    16
    while tries < 1000:
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 43009
diff changeset
    17
        a2 = (
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 43009
diff changeset
    18
            "\n".join(l for l in a.splitlines() if random.randint(0, 100) > 0)
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 43009
diff changeset
    19
            + "\n"
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 43009
diff changeset
    20
        )
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 43009
diff changeset
    21
        b2 = (
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 43009
diff changeset
    22
            "\n".join(l for l in b.splitlines() if random.randint(0, 100) > 0)
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 43009
diff changeset
    23
            + "\n"
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 43009
diff changeset
    24
        )
29012
4bd67ae7d75a bdiff: fix latent normalization bug
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    25
        if a2 == a and b2 == b:
4bd67ae7d75a bdiff: fix latent normalization bug
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    26
            continue
4bd67ae7d75a bdiff: fix latent normalization bug
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    27
        if a2 == b2:
4bd67ae7d75a bdiff: fix latent normalization bug
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    28
            continue
4bd67ae7d75a bdiff: fix latent normalization bug
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    29
        tries += 1
4bd67ae7d75a bdiff: fix latent normalization bug
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    30
4bd67ae7d75a bdiff: fix latent normalization bug
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    31
        try:
4bd67ae7d75a bdiff: fix latent normalization bug
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    32
            test1(a, b)
41365
876494fd967d cleanup: delete lots of unused local variables
Martin von Zweigbergk <martinvonz@google.com>
parents: 32203
diff changeset
    33
        except Exception:
29012
4bd67ae7d75a bdiff: fix latent normalization bug
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    34
            reductions += 1
4bd67ae7d75a bdiff: fix latent normalization bug
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    35
            tries = 0
4bd67ae7d75a bdiff: fix latent normalization bug
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    36
            a = a2
4bd67ae7d75a bdiff: fix latent normalization bug
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    37
            b = b2
4bd67ae7d75a bdiff: fix latent normalization bug
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    38
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 43009
diff changeset
    39
    print("reduced:", reductions, len(a) + len(b), repr(a), repr(b))
29012
4bd67ae7d75a bdiff: fix latent normalization bug
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    40
    try:
4bd67ae7d75a bdiff: fix latent normalization bug
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    41
        test1(a, b)
4bd67ae7d75a bdiff: fix latent normalization bug
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    42
    except Exception as inst:
4bd67ae7d75a bdiff: fix latent normalization bug
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    43
        print("failed:", inst)
4bd67ae7d75a bdiff: fix latent normalization bug
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    44
4bd67ae7d75a bdiff: fix latent normalization bug
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    45
    sys.exit(0)
4bd67ae7d75a bdiff: fix latent normalization bug
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    46
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 43009
diff changeset
    47
29012
4bd67ae7d75a bdiff: fix latent normalization bug
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    48
def test1(a, b):
32202
ded48ad55146 bdiff: proxy through mdiff module
Yuya Nishihara <yuya@tcha.org>
parents: 29209
diff changeset
    49
    d = mdiff.textdiff(a, b)
29012
4bd67ae7d75a bdiff: fix latent normalization bug
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    50
    if not d:
4bd67ae7d75a bdiff: fix latent normalization bug
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    51
        raise ValueError("empty")
32203
0c73634d0570 mpatch: proxy through mdiff module
Yuya Nishihara <yuya@tcha.org>
parents: 32202
diff changeset
    52
    c = mdiff.patches(a, [d])
29012
4bd67ae7d75a bdiff: fix latent normalization bug
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    53
    if c != b:
4bd67ae7d75a bdiff: fix latent normalization bug
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    54
        raise ValueError("bad")
4bd67ae7d75a bdiff: fix latent normalization bug
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    55
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 43009
diff changeset
    56
29012
4bd67ae7d75a bdiff: fix latent normalization bug
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    57
def testwrap(a, b):
4bd67ae7d75a bdiff: fix latent normalization bug
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    58
    try:
4bd67ae7d75a bdiff: fix latent normalization bug
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    59
        test1(a, b)
4bd67ae7d75a bdiff: fix latent normalization bug
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    60
        return
4bd67ae7d75a bdiff: fix latent normalization bug
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    61
    except Exception as inst:
42857
3316e59b0105 bdiff-torture: fix pyflakes warning reporting undefined name 'inst'
Pulkit Goyal <pulkit@yandex-team.ru>
parents: 41365
diff changeset
    62
        print("exception:", inst)
29012
4bd67ae7d75a bdiff: fix latent normalization bug
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    63
    reducetest(a, b)
4bd67ae7d75a bdiff: fix latent normalization bug
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    64
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 43009
diff changeset
    65
29012
4bd67ae7d75a bdiff: fix latent normalization bug
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    66
def test(a, b):
4bd67ae7d75a bdiff: fix latent normalization bug
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    67
    testwrap(a, b)
4bd67ae7d75a bdiff: fix latent normalization bug
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    68
    testwrap(b, a)
4bd67ae7d75a bdiff: fix latent normalization bug
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    69
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 43009
diff changeset
    70
29012
4bd67ae7d75a bdiff: fix latent normalization bug
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    71
def rndtest(size, noise):
4bd67ae7d75a bdiff: fix latent normalization bug
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    72
    a = []
4bd67ae7d75a bdiff: fix latent normalization bug
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    73
    src = "                aaaaaaaabbbbccd"
43009
e05c141511dd contrib: use pycompat.xrange in bdiff-torture.py
Gregory Szorc <gregory.szorc@gmail.com>
parents: 42857
diff changeset
    74
    for x in pycompat.xrange(size):
29012
4bd67ae7d75a bdiff: fix latent normalization bug
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    75
        a.append(src[random.randint(0, len(src) - 1)])
4bd67ae7d75a bdiff: fix latent normalization bug
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    76
4bd67ae7d75a bdiff: fix latent normalization bug
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    77
    while True:
4bd67ae7d75a bdiff: fix latent normalization bug
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    78
        b = [c for c in a if random.randint(0, 99) > noise]
4bd67ae7d75a bdiff: fix latent normalization bug
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    79
        b2 = []
4bd67ae7d75a bdiff: fix latent normalization bug
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    80
        for c in b:
4bd67ae7d75a bdiff: fix latent normalization bug
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    81
            b2.append(c)
4bd67ae7d75a bdiff: fix latent normalization bug
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    82
            while random.randint(0, 99) < noise:
4bd67ae7d75a bdiff: fix latent normalization bug
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    83
                b2.append(src[random.randint(0, len(src) - 1)])
4bd67ae7d75a bdiff: fix latent normalization bug
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    84
        if b2 != a:
4bd67ae7d75a bdiff: fix latent normalization bug
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    85
            break
4bd67ae7d75a bdiff: fix latent normalization bug
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    86
4bd67ae7d75a bdiff: fix latent normalization bug
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    87
    a = "\n".join(a) + "\n"
4bd67ae7d75a bdiff: fix latent normalization bug
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    88
    b = "\n".join(b2) + "\n"
4bd67ae7d75a bdiff: fix latent normalization bug
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    89
4bd67ae7d75a bdiff: fix latent normalization bug
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    90
    test(a, b)
4bd67ae7d75a bdiff: fix latent normalization bug
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    91
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 43009
diff changeset
    92
29012
4bd67ae7d75a bdiff: fix latent normalization bug
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    93
maxvol = 10000
4bd67ae7d75a bdiff: fix latent normalization bug
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    94
startsize = 2
4bd67ae7d75a bdiff: fix latent normalization bug
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    95
while True:
4bd67ae7d75a bdiff: fix latent normalization bug
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    96
    size = startsize
4bd67ae7d75a bdiff: fix latent normalization bug
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    97
    count = 0
4bd67ae7d75a bdiff: fix latent normalization bug
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    98
    while size < maxvol:
4bd67ae7d75a bdiff: fix latent normalization bug
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    99
        print(size)
4bd67ae7d75a bdiff: fix latent normalization bug
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
   100
        volume = 0
4bd67ae7d75a bdiff: fix latent normalization bug
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
   101
        while volume < maxvol:
4bd67ae7d75a bdiff: fix latent normalization bug
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
   102
            rndtest(size, 2)
4bd67ae7d75a bdiff: fix latent normalization bug
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
   103
            volume += size
4bd67ae7d75a bdiff: fix latent normalization bug
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
   104
            count += 2
4bd67ae7d75a bdiff: fix latent normalization bug
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
   105
        size *= 2
4bd67ae7d75a bdiff: fix latent normalization bug
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
   106
    maxvol *= 4
4bd67ae7d75a bdiff: fix latent normalization bug
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
   107
    startsize *= 4