tests/test-pathencode.py
author Paul Morelle <paul.morelle@octobus.net>
Tue, 05 Jun 2018 08:19:35 +0200
changeset 38718 f8762ea73e0d
parent 37880 1b230e19d044
child 43076 2372284d9457
permissions -rw-r--r--
sparse-revlog: implement algorithm to write sparse delta chains (issue5480)

The classic behavior of revlog._isgooddeltainfo is to consider the span size
of the whole delta chain, and limit it to 4 * textlen.
Once sparse-revlog writing is allowed (and enforced with a requirement),
revlog._isgooddeltainfo considers the span of the largest chunk as the
distance used in the verification, instead of using the span of the whole
delta chain.

In order to compute the span of the largest chunk, we need to slice into
chunks a chain with the new revision at the top of the revlog, and take the
maximal span of these chunks. The sparse read density is a parameter to the
slicing, as it will stop when the global read density reaches this threshold.
For instance, a density of 50% means that 2 of 4 read bytes are actually used
for the reconstruction of the revision (the others are part of other chains).

This allows a new revision to be potentially stored with a diff against
another revision anywhere in the history, instead of forcing it in the last
4 * textlen. The result is a much better compression on repositories that
have many concurrent branches. Here is a comparison between using deltas from
current upstream (aggressive-merge-deltas on by default) and deltas from a
sparse-revlog.

Comparison of `.hg/store/` size:

    mercurial (6.74% merges):
        before:     46,831,873 bytes
        after:      46,795,992 bytes (no relevant change)
    pypy (8.30% merges):
        before:    333,524,651 bytes
        after:     308,417,511 bytes -8%
    netbeans (34.21% merges):
        before:  1,141,847,554 bytes
        after:   1,131,093,161 bytes -1%
    mozilla-central (4.84% merges):
        before:  2,344,248,850 bytes
        after:   2,328,459,258 bytes -1%
    large-private-repo-A (19.73% merges):
        before: 41,510,550,163 bytes
        after:   8,121,763,428 bytes -80%
    large-private-repo-B (23.77% merges):
        before: 58,702,221,709 bytes
        after:   8,351,588,828 bytes -76%

Comparison of `00manifest.d` size:

    mercurial (6.74% merges):
        before:      6,143,044 bytes
        after:       6,107,163 bytes
    pypy (8.30% merges):
        before:     52,941,780 bytes
        after:      27,834,082 bytes -48%
    netbeans (34.21% merges):
        before:    130,088,982 bytes
        after:     119,337,636 bytes -10%
    mozilla-central (4.84% merges):
        before:    215,096,339 bytes
        after:     199,496,863 bytes -8%
    large-private-repo-A (19.73% merges):
        before: 33,725,285,081 bytes
        after:     390,302,545 bytes -99%
    large-private-repo-B (23.77% merges):
        before: 49,457,701,645 bytes
        after:   1,366,752,187 bytes -97%

The better delta chains provide a performance boost in relevant repositories:

    pypy, bundling 1000 revisions:
        before: 1.670s
        after:  1.149s -31%

Unbundling got a bit slower, probably because the sparse algorithm is still
pure Python.

    pypy, unbundling 1000 revisions:
        before: 4.062s
        after:  4.507s +10%

Performance of bundle/unbundle in repositories with few concurrent branches
(e.g. mercurial) is unaffected.

No significant differences have been noticed when timing `hg push` and
`hg pull` locally. More stable timings are being gathered.

As with aggressive-merge-deltas, better deltas come with longer delta chains.
Longer chains have a performance impact. For example, the length of the chain
needed to get the manifest of pypy's tip moves from 82 items to 1929 items.
This moves the restore time from 3.88ms to 11.3ms.

Delta chain length is an independent issue that affects repositories even
without this change. It will be dealt with independently.

No significant differences have been observed on repositories where
`sparse-revlog` has little effect (mercurial, unity, netbeans). On pypy,
small differences have been observed on some operations affected by delta
chain building and retrieval.

    pypy, perfmanifest:
        before: 0.006162s
        after:  0.017899s +190%
    pypy, commit:
        before: 0.382
        after:  0.376 -1%
    pypy, status:
        before: 0.157
        after:  0.168 +7%

More comprehensive and stable timing comparisons are in progress.
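The slicing idea described in the commit message can be illustrated with a small, self-contained sketch. This is not Mercurial's implementation: the function names (`read_density`, `slice_by_density`, `largest_chunk_span`) and the representation of a chain as `(start, end)` byte ranges are assumptions made for illustration only. The sketch cuts a chain at its largest gaps until the fraction of read bytes that are actually used reaches the target density, then reports the span of the largest resulting chunk.

```python
def read_density(ranges):
    """Fraction of the bytes read that belong to the delta chain.

    ``ranges`` is a sorted list of (start, end) byte offsets, one per
    revision in the chain; a read covers first start to last end.
    """
    if not ranges:
        return 1.0
    used = sum(end - start for start, end in ranges)
    read = ranges[-1][1] - ranges[0][0]
    return used / float(read) if read else 1.0


def slice_by_density(ranges, target_density=0.5):
    """Cut a non-empty chain at its largest gaps until the density is met."""
    ranges = sorted(ranges)
    used = sum(end - start for start, end in ranges)
    read = ranges[-1][1] - ranges[0][0]
    # (gap size, index of the range just before the gap), largest gap first
    gaps = sorted(
        ((ranges[i + 1][0] - ranges[i][1], i) for i in range(len(ranges) - 1)),
        reverse=True,
    )
    cuts = set()
    for gapsize, idx in gaps:
        if gapsize <= 0 or used >= target_density * read:
            break
        read -= gapsize  # bytes in a skipped gap are no longer read
        cuts.add(idx)
    chunks, current = [], []
    for i, r in enumerate(ranges):
        current.append(r)
        if i in cuts:
            chunks.append(current)
            current = []
    if current:
        chunks.append(current)
    return chunks


def largest_chunk_span(chunks):
    """Distance that this sketch would compare against 4 * textlen."""
    return max(chunk[-1][1] - chunk[0][0] for chunk in chunks)


# Hypothetical chain: one old revision far from two recent, adjacent ones.
chain = [(0, 10), (100, 110), (115, 125)]
chunks = slice_by_density(chain, target_density=0.5)
span = largest_chunk_span(chunks)
```

With the whole chain read in one go, only 30 of 125 bytes are useful; after cutting at the large gap, the span used for the check is that of the largest chunk rather than of the whole chain.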
# This is a randomized test that generates different pathnames every
# time it is invoked, and tests the encoding of those pathnames.
#
# It uses a simple probabilistic model to generate valid pathnames
# that have proven likely to expose bugs and divergent behavior in
# different encoding implementations.

from __future__ import absolute_import, print_function

import binascii
import collections
import itertools
import math
import os
import random
import sys
import time
from mercurial import (
    pycompat,
    store,
)

try:
    xrange
except NameError:
    xrange = range

validchars = set(map(pycompat.bytechr, range(0, 256)))
alphanum = range(ord('A'), ord('Z'))

for c in (b'\0', b'/'):
    validchars.remove(c)

winreserved = (b'aux con prn nul'.split() +
               [b'com%d' % i for i in xrange(1, 10)] +
               [b'lpt%d' % i for i in xrange(1, 10)])

def casecombinations(names):
    '''Build all case-diddled combinations of names.'''

    combos = set()

    for r in names:
        for i in xrange(len(r) + 1):
            for c in itertools.combinations(xrange(len(r)), i):
                d = r
                for j in c:
                    d = b''.join((d[:j], d[j:j + 1].upper(), d[j + 1:]))
                combos.add(d)
    return sorted(combos)

def buildprobtable(fp, cmd='hg manifest tip'):
    '''Construct and print a table of probabilities for path name
    components.  The numbers are percentages.'''

    counts = collections.defaultdict(lambda: 0)
    for line in os.popen(cmd).read().splitlines():
        if line[-2:] in ('.i', '.d'):
            line = line[:-2]
        if line.startswith('data/'):
            line = line[5:]
        for c in line:
            counts[c] += 1
    for c in '\r/\n':
        counts.pop(c, None)
    # dict.values() works on both Python 2 and 3 (itervalues() is 2-only)
    t = sum(counts.values()) / 100.0
    fp.write('probtable = (')
    for i, (k, v) in enumerate(sorted(counts.items(), key=lambda x: x[1],
                                      reverse=True)):
        if (i % 5) == 0:
            fp.write('\n    ')
        vt = v / t
        if vt < 0.0005:
            break
        fp.write('(%r, %.03f), ' % (k, vt))
    fp.write('\n    )\n')

# A table of character frequencies (as percentages), gleaned by
# looking at filelog names from a real-world, very large repo.

probtable = (
    (b't', 9.828), (b'e', 9.042), (b's', 8.011), (b'a', 6.801), (b'i', 6.618),
    (b'g', 5.053), (b'r', 5.030), (b'o', 4.887), (b'p', 4.363), (b'n', 4.258),
    (b'l', 3.830), (b'h', 3.693), (b'_', 3.659), (b'.', 3.377), (b'm', 3.194),
    (b'u', 2.364), (b'd', 2.296), (b'c', 2.163), (b'b', 1.739), (b'f', 1.625),
    (b'6', 0.666), (b'j', 0.610), (b'y', 0.554), (b'x', 0.487), (b'w', 0.477),
    (b'k', 0.476), (b'v', 0.473), (b'3', 0.336), (b'1', 0.335), (b'2', 0.326),
    (b'4', 0.310), (b'5', 0.305), (b'9', 0.302), (b'8', 0.300), (b'7', 0.299),
    (b'q', 0.298), (b'0', 0.250), (b'z', 0.223), (b'-', 0.118), (b'C', 0.095),
    (b'T', 0.087), (b'F', 0.085), (b'B', 0.077), (b'S', 0.076), (b'P', 0.076),
    (b'L', 0.059), (b'A', 0.058), (b'N', 0.051), (b'D', 0.049), (b'M', 0.046),
    (b'E', 0.039), (b'I', 0.035), (b'R', 0.035), (b'G', 0.028), (b'U', 0.026),
    (b'W', 0.025), (b'O', 0.017), (b'V', 0.015), (b'H', 0.013), (b'Q', 0.011),
    (b'J', 0.007), (b'K', 0.005), (b'+', 0.004), (b'X', 0.003), (b'Y', 0.001),
    )

for c, _ in probtable:
    validchars.remove(c)
validchars = list(validchars)

def pickfrom(rng, table):
    c = 0
    r = rng.random() * sum(i[1] for i in table)
    for i, p in table:
        c += p
        if c >= r:
            return i

reservedcombos = casecombinations(winreserved)

# The first component of a name following a slash.

firsttable = (
    (lambda rng: pickfrom(rng, probtable), 90),
    (lambda rng: rng.choice(validchars), 5),
    (lambda rng: rng.choice(reservedcombos), 5),
    )

# Components of a name following the first.

resttable = firsttable[:-1]

# Special suffixes.

internalsuffixcombos = casecombinations(b'.hg .i .d'.split())

# The last component of a path, before a slash or at the end of a name.

lasttable = resttable + (
    (lambda rng: b'', 95),
    (lambda rng: rng.choice(internalsuffixcombos), 5),
    )

def makepart(rng, k):
    '''Construct a part of a pathname, without slashes.'''

    p = pickfrom(rng, firsttable)(rng)
    l = len(p)
    ps = [p]
    maxl = rng.randint(1, k)
    while l < maxl:
        p = pickfrom(rng, resttable)(rng)
        l += len(p)
        ps.append(p)
    ps.append(pickfrom(rng, lasttable)(rng))
    return b''.join(ps)

def makepath(rng, j, k):
    '''Construct a complete pathname.'''

    return (b'data/' + b'/'.join(makepart(rng, k) for _ in xrange(j)) +
            rng.choice([b'.d', b'.i']))

def genpath(rng, count):
    '''Generate random pathnames with gradually increasing lengths.'''

    mink, maxk = 1, 4096
    def steps():
        for i in xrange(count):
            yield mink + int(round(math.sqrt((maxk - mink) * float(i) / count)))
    for k in steps():
        x = rng.randint(1, k)
        y = rng.randint(1, k)
        yield makepath(rng, x, y)

736f1c09f321 tests: add a randomized test for pathencode
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
   166
def runtests(rng, seed, count):
736f1c09f321 tests: add a randomized test for pathencode
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
   167
    nerrs = 0
736f1c09f321 tests: add a randomized test for pathencode
Bryan O'Sullivan <bryano@fb.com>
parents:
diff changeset
   168
    for p in genpath(rng, count):
18435
8c019d2fd7c0 store: switch to C-based hashed path encoding
Bryan O'Sullivan <bryano@fb.com>
parents: 18110
diff changeset
   169
        h = store._pathencode(p)    # uses C implementation, if available
18094
8ceabb34f1cb test-pathencode: compare current pathencoding implementations
Adrian Buehlmann <adrian@cadifra.com>
parents: 17947
diff changeset
   170
        r = store._hybridencode(p, True) # reference implementation in Python
8ceabb34f1cb test-pathencode: compare current pathencoding implementations
Adrian Buehlmann <adrian@cadifra.com>
parents: 17947
diff changeset
   171
        if h != r:
8ceabb34f1cb test-pathencode: compare current pathencoding implementations
Adrian Buehlmann <adrian@cadifra.com>
parents: 17947
diff changeset
   172
            if nerrs == 0:
28918
72f683260f31 tests: make test-pathencode use print_function
Pulkit Goyal <7895pulkit@gmail.com>
parents: 26849
diff changeset
   173
                print('seed:', hex(seed)[:-1], file=sys.stderr)
72f683260f31 tests: make test-pathencode use print_function
Pulkit Goyal <7895pulkit@gmail.com>
parents: 26849
diff changeset
   174
            print("\np: '%s'" % p.encode("string_escape"), file=sys.stderr)
72f683260f31 tests: make test-pathencode use print_function
Pulkit Goyal <7895pulkit@gmail.com>
parents: 26849
diff changeset
   175
            print("h: '%s'" % h.encode("string_escape"), file=sys.stderr)
72f683260f31 tests: make test-pathencode use print_function
Pulkit Goyal <7895pulkit@gmail.com>
parents: 26849
diff changeset
   176
            print("r: '%s'" % r.encode("string_escape"), file=sys.stderr)
18094
8ceabb34f1cb test-pathencode: compare current pathencoding implementations
Adrian Buehlmann <adrian@cadifra.com>
parents: 17947
diff changeset
   177
            nerrs += 1
17934
736f1c09f321 tests: add a randomized test for pathencode
Bryan O'Sullivan <bryano@fb.com>
parents:
   178
    return nerrs
736f1c09f321 tests: add a randomized test for pathencode
Bryan O'Sullivan <bryano@fb.com>
parents:
   179
736f1c09f321 tests: add a randomized test for pathencode
Bryan O'Sullivan <bryano@fb.com>
parents:
   180
def main():
736f1c09f321 tests: add a randomized test for pathencode
Bryan O'Sullivan <bryano@fb.com>
parents:
   181
    import getopt
736f1c09f321 tests: add a randomized test for pathencode
Bryan O'Sullivan <bryano@fb.com>
parents:
   182
736f1c09f321 tests: add a randomized test for pathencode
Bryan O'Sullivan <bryano@fb.com>
parents:
   183
    # Empirically observed to take about a second to run
736f1c09f321 tests: add a randomized test for pathencode
Bryan O'Sullivan <bryano@fb.com>
parents:
   184
    count = 100
736f1c09f321 tests: add a randomized test for pathencode
Bryan O'Sullivan <bryano@fb.com>
parents:
   185
    seed = None
736f1c09f321 tests: add a randomized test for pathencode
Bryan O'Sullivan <bryano@fb.com>
parents:
   186
    opts, args = getopt.getopt(sys.argv[1:], 'c:s:',
736f1c09f321 tests: add a randomized test for pathencode
Bryan O'Sullivan <bryano@fb.com>
parents:
   187
                               ['build', 'count=', 'seed='])
736f1c09f321 tests: add a randomized test for pathencode
Bryan O'Sullivan <bryano@fb.com>
parents:
   188
    for o, a in opts:
736f1c09f321 tests: add a randomized test for pathencode
Bryan O'Sullivan <bryano@fb.com>
parents:
   189
        if o in ('-c', '--count'):
736f1c09f321 tests: add a randomized test for pathencode
Bryan O'Sullivan <bryano@fb.com>
parents:
   190
            count = int(a)
736f1c09f321 tests: add a randomized test for pathencode
Bryan O'Sullivan <bryano@fb.com>
parents:
   191
        elif o in ('-s', '--seed'):
34225
d43340bec0f5 tests: use int() instead of long() in test-pathencode.py
Augie Fackler <raf@durin42.com>
parents: 34224
   192
            seed = int(a, base=0) # accepts base 10 or 16 strings
17934
736f1c09f321 tests: add a randomized test for pathencode
Bryan O'Sullivan <bryano@fb.com>
parents:
   193
        elif o == '--build':
736f1c09f321 tests: add a randomized test for pathencode
Bryan O'Sullivan <bryano@fb.com>
parents:
   194
            buildprobtable(sys.stdout,
736f1c09f321 tests: add a randomized test for pathencode
Bryan O'Sullivan <bryano@fb.com>
parents:
   195
                           'find .hg/store/data -type f && '
736f1c09f321 tests: add a randomized test for pathencode
Bryan O'Sullivan <bryano@fb.com>
parents:
   196
                           'cat .hg/store/fncache 2>/dev/null')
736f1c09f321 tests: add a randomized test for pathencode
Bryan O'Sullivan <bryano@fb.com>
parents:
   197
            sys.exit(0)
736f1c09f321 tests: add a randomized test for pathencode
Bryan O'Sullivan <bryano@fb.com>
parents:
   198
736f1c09f321 tests: add a randomized test for pathencode
Bryan O'Sullivan <bryano@fb.com>
parents:
   199
    if seed is None:
736f1c09f321 tests: add a randomized test for pathencode
Bryan O'Sullivan <bryano@fb.com>
parents:
   200
        try:
34225
d43340bec0f5 tests: use int() instead of long() in test-pathencode.py
Augie Fackler <raf@durin42.com>
parents: 34224
   201
            seed = int(binascii.hexlify(os.urandom(16)), 16)
17934
736f1c09f321 tests: add a randomized test for pathencode
Bryan O'Sullivan <bryano@fb.com>
parents:
   202
        except (AttributeError, NotImplementedError):
34225
d43340bec0f5 tests: use int() instead of long() in test-pathencode.py
Augie Fackler <raf@durin42.com>
parents: 34224
   203
            seed = int(time.time() * 1000)
17934
736f1c09f321 tests: add a randomized test for pathencode
Bryan O'Sullivan <bryano@fb.com>
parents:
   204
736f1c09f321 tests: add a randomized test for pathencode
Bryan O'Sullivan <bryano@fb.com>
parents:
   205
    rng = random.Random(seed)
736f1c09f321 tests: add a randomized test for pathencode
Bryan O'Sullivan <bryano@fb.com>
parents:
   206
    if runtests(rng, seed, count):
736f1c09f321 tests: add a randomized test for pathencode
Bryan O'Sullivan <bryano@fb.com>
parents:
   207
        sys.exit(1)
736f1c09f321 tests: add a randomized test for pathencode
Bryan O'Sullivan <bryano@fb.com>
parents:
   208
17947
f945caa5e963 test-pathencode: more aggressively check for python < 2.6
Bryan O'Sullivan <bryano@fb.com>
parents: 17935
   209
if __name__ == '__main__':
17934
736f1c09f321 tests: add a randomized test for pathencode
Bryan O'Sullivan <bryano@fb.com>
parents:
   210
    main()
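
The script prints its seed on the first mismatch so that a failing run can be replayed with -s/--seed. The sketch below illustrates why that works; it is not part of the test file. It assumes only the standard library, and the helper names parse_seed and sample_bytes are hypothetical, introduced for illustration. It shows that int() with base 0 accepts both decimal and 0x-prefixed strings, and that two random.Random instances built from the same seed yield the same sequence.

```python
import random


def parse_seed(text):
    """Parse a seed given as a decimal or 0x-prefixed hex string."""
    # base=0 makes int() infer the base from the literal's prefix.
    return int(text, 0)


def sample_bytes(seed, n=8):
    """Draw n pseudo-random byte values from a generator seeded with seed."""
    rng = random.Random(seed)
    return [rng.randrange(256) for _ in range(n)]


if __name__ == '__main__':
    # The same seed, written in hex and in decimal, replays the same draws.
    first = sample_bytes(parse_seed('0x1f'))
    second = sample_bytes(parse_seed('31'))
    print(first == second)
```

Because the generated paths depend only on the seed and the count, passing the reported seed back should regenerate the same inputs, provided the probability table and the Python version are unchanged.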