annotate tests/revlog-formatv0.py @ 51681:522b4d729e89

mmap: populate the mapping by default

Without pre-population, accessing all data through a mmap can result in many page faults, reducing performance significantly. If the mmap is prepopulated, the performance can no longer get slower than a full read. (See benchmark numbers below.)

In some cases where very little data is read, prepopulating can be overkill and slower than populating on access (through page faults). So the behavior can be controlled when the caller can determine the best behavior in advance. (See benchmark numbers below.)

In addition, testing with populating in a secondary thread yields great results, combining the best of each approach. This might be implemented in later changesets.

In all cases, using mmap has a great effect on memory usage when many processes run in parallel on the same machine.

### Benchmarks

# What did I run

A couple of months back I ran a large benchmark campaign to assess the impact of various approaches for using mmap with the revlog (and other files). It highlighted a few benchmarks that capture the impact of the changes well. So to validate this change I checked the following:

- log command displaying various revisions (reads the changelog index)
- log command displaying the patch of listed revisions (reads the changelog index, the manifest index and a few file indexes)
- unbundling a few revisions (reads and writes the changelog, manifest and a few file indexes, and walks the graph to update some caches)
- pushing a few revisions (reads and writes the changelog, manifest and a few file indexes, walks the graph to update some caches, performs various accesses locally and remotely during discovery)

Benchmarks were run using the default module policy (c+py) and the rust one. No significant differences were found between the two implementations, so we present results using the default policy (unless otherwise specified).

I ran them on a few repositories:

- mercurial: a "public changeset only" copy of mercurial from 2018-08-01 using zstd compression and sparse-revlog
- pypy: a copy of pypy from 2018-08-01 using zstd compression and sparse-revlog
- netbeans: a copy of netbeans from 2018-08-01 using zstd compression and sparse-revlog
- mozilla-try: a copy of mozilla-try from 2019-02-18 using zstd compression and sparse-revlog
- mozilla-try persistent-nodemap: same as the above but with a persistent nodemap. Used for the log --patch benchmark only.

# Results

For the smaller repositories (mercurial, pypy), the impact of mmap is almost imperceptible, other costs dominating the operation. The impact of prepopulating is indiscernible in the benchmarks we ran.

For larger repositories the benchmarks support the explanation given above:

On netbeans, the log can be about 1% faster without pre-population (for a difference < 100ms) but unbundle becomes a bit slower, even when small.

### data-env-vars.name = netbeans-2018-08-01-zstd-sparse-revlog
# benchmark.name = hg.command.unbundle
# benchmark.variants.issue6528 = disabled
# benchmark.variants.reuse-external-delta-parent = yes
# benchmark.variants.revs = any-1-extra-rev
# benchmark.variants.source = unbundle
# benchmark.variants.verbosity = quiet
with-populate: 0.240157
no-populate:   0.265087 (+10.38%, +0.02)
# benchmark.variants.revs = any-100-extra-rev
with-populate: 1.459518
no-populate:   1.481290 (+1.49%, +0.02)

## benchmark.name = hg.command.push
# benchmark.variants.explicit-rev = none
# benchmark.variants.issue6528 = disabled
# benchmark.variants.protocol = ssh
# benchmark.variants.reuse-external-delta-parent = yes
# benchmark.variants.revs = any-1-extra-rev
with-populate: 0.771919
no-populate:   0.792025 (+2.60%, +0.02)
# benchmark.variants.revs = any-100-extra-rev
with-populate: 1.459518
no-populate:   1.481290 (+1.49%, +0.02)

For mozilla-try, the "slow down" from pre-populate for small `hg log` is more visible, but still small in absolute time.
(Using the rust values, so that the persistent nodemap is relevant.)

### data-env-vars.name = mozilla-try-2019-02-18-ds2-pnm
# benchmark.name = hg.command.log
# bin-env-vars.hg.flavor = rust
# benchmark.variants.patch = yes
# benchmark.variants.limit-rev = 1
with-populate: 0.237813
no-populate:   0.229452 (-3.52%, -0.01)
# benchmark.variants.limit-rev = 10
# benchmark.variants.patch = yes
with-populate: 1.213578
no-populate:   1.205189

### data-env-vars.name = mozilla-try-2019-02-18-zstd-sparse-revlog
# benchmark.variants.limit-rev = 1000
# benchmark.variants.patch = no
# benchmark.variants.rev = tip
with-populate: 0.198607
no-populate:   0.195038 (-1.80%, -0.00)

However, pre-populating provides a significant boost on more complex operations like unbundle or push:

### data-env-vars.name = mozilla-try-2019-02-18-zstd-sparse-revlog
# benchmark.name = hg.command.push
# benchmark.variants.explicit-rev = none
# benchmark.variants.issue6528 = disabled
# benchmark.variants.protocol = ssh
# benchmark.variants.reuse-external-delta-parent = yes
# benchmark.variants.revs = any-1-extra-rev
with-populate: 4.798632
no-populate:   4.953295 (+3.22%, +0.15)
# benchmark.variants.revs = any-100-extra-rev
with-populate: 4.903618
no-populate:   5.014963 (+2.27%, +0.11)

## benchmark.name = hg.command.unbundle
# benchmark.variants.revs = any-1-extra-rev
with-populate: 1.423411
no-populate:   1.585365 (+11.38%, +0.16)
# benchmark.variants.revs = any-100-extra-rev
with-populate: 1.537909
no-populate:   1.688489 (+9.79%, +0.15)
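
The trade-off described in this changeset description can be illustrated with a small sketch. It is not Mercurial's implementation: the helper name `map_readonly` is hypothetical, and it assumes that pre-population is requested through the Linux `MAP_POPULATE` flag (exposed by Python 3.10+ on Linux), which the description itself does not name. On platforms without that flag the sketch silently falls back to a lazily faulted mapping. Unix only.

```python
import mmap
import os
import tempfile


def map_readonly(path, populate=True):
    """Map `path` read-only, optionally asking the kernel to pre-fault it.

    Hypothetical helper for illustration; not part of Mercurial.
    """
    flags = mmap.MAP_PRIVATE
    if populate:
        # Assumption: MAP_POPULATE is the pre-population mechanism. It is
        # Linux-only, so fall back to 0 (populate on access) elsewhere.
        flags |= getattr(mmap, "MAP_POPULATE", 0)
    with open(path, "rb") as fp:
        # Length 0 maps the whole file; the mapping stays valid after
        # the file object is closed.
        return mmap.mmap(fp.fileno(), 0, flags=flags, prot=mmap.PROT_READ)


if __name__ == "__main__":
    with tempfile.NamedTemporaryFile(delete=False) as tmp:
        tmp.write(b"revlog index bytes" * 1024)
    try:
        eager = map_readonly(tmp.name, populate=True)   # pages faulted up front
        lazy = map_readonly(tmp.name, populate=False)   # pages faulted on access
        print(eager[:18] == lazy[:18])
        eager.close()
        lazy.close()
    finally:
        os.unlink(tmp.name)
```

A caller that expects to touch only a few pages (such as a small `hg log`) would pass `populate=False`; one that expects to read most of the file would keep the default.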
author Pierre-Yves David <pierre-yves.david@octobus.net>
date Thu, 11 Apr 2024 00:02:07 +0200
parents 6000f5b25c9b
children
rev   line source
45830
c102b704edb5 global: use python3 in shebangs
Gregory Szorc <gregory.szorc@gmail.com>
parents: 43076
 1 #!/usr/bin/env python3
12170
581066a319e5 verify: fix "missing revlog!" errors for revlog format v0 and add test
Thomas Arendsen Hein <thomas@intevation.de>
parents:
 2 # Copyright 2010 Intevation GmbH
 3 # Author(s):
 4 # Thomas Arendsen Hein <thomas@intevation.de>
 5 #
 6 # This software may be used and distributed according to the terms of the
 7 # GNU General Public License version 2 or any later version.
 8
 9 """Create a Mercurial repository in revlog format 0
10
11 changeset:   0:a1ef0b125355
12 tag:         tip
13 user:        user
14 date:        Thu Jan 01 00:00:00 1970 +0000
15 files:       empty
16 description:
17 empty file
18 """
19
36565
9805c906aaad tests: port helper script revlog-formatv0.py to python 3
Augie Fackler <augie@google.com>
parents: 35570
20 import binascii
28945
05982f7ab231 py3: use absolute_import in revlog-formatv0.py
Robert Stanca <robert.stanca7@gmail.com>
parents: 12170
21 import os
22 import sys
12170
23
24 files = [
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 36565
25     (
26         b'formatv0/.hg/00changelog.i',
27         b'000000000000004400000000000000000000000000000000000000'
28         b'000000000000000000000000000000000000000000000000000000'
29         b'0000a1ef0b125355d27765928be600cfe85784284ab3',
30     ),
31     (
32         b'formatv0/.hg/00changelog.d',
33         b'756163613935613961356635353036303562366138343738336237'
34         b'61623536363738616436356635380a757365720a3020300a656d70'
35         b'74790a0a656d7074792066696c65',
36     ),
37     (
38         b'formatv0/.hg/00manifest.i',
39         b'000000000000003000000000000000000000000000000000000000'
40         b'000000000000000000000000000000000000000000000000000000'
41         b'0000aca95a9a5f550605b6a84783b7ab56678ad65f58',
42     ),
43     (
44         b'formatv0/.hg/00manifest.d',
45         b'75656d707479006238306465356431333837353835343163356630'
46         b'35323635616431343461623966613836643164620a',
47     ),
48     (
49         b'formatv0/.hg/data/empty.i',
50         b'000000000000000000000000000000000000000000000000000000'
51         b'000000000000000000000000000000000000000000000000000000'
52         b'0000b80de5d138758541c5f05265ad144ab9fa86d1db',
53     ),
54     (b'formatv0/.hg/data/empty.d', b''),
12170
55 ]
56
43076
57
12170
58 def makedirs(name):
59     """recursive directory creation"""
60     parent = os.path.dirname(name)
61     if parent:
62         makedirs(parent)
63     os.mkdir(name)
64
43076
65
12170
66 makedirs(os.path.join(*'formatv0/.hg/data'.split('/')))
67
68 for name, data in files:
69     f = open(name, 'wb')
36565
70     f.write(binascii.unhexlify(data))
12170
71     f.close()
72
73 sys.exit(0)
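
The hex payloads in `files` are opaque as written, so a short decoding sketch may help when reading them. It assumes, without confirmation from this page, that a revlog v0 index entry is laid out as four big-endian 32-bit integers (data offset, data length, base revision, link revision) followed by three 20-byte hashes (first parent, second parent, node id). The names `INDEX_V0` and `decode_entry` are hypothetical; the two hex strings are copied from the changelog entries of the script.

```python
import binascii
import struct

# Assumed v0 index entry layout (not stated on this page):
# offset, length, base rev, link rev, then p1, p2 and node (20 bytes each).
INDEX_V0 = struct.Struct(">4l20s20s20s")

# Copied from the 00changelog.i and 00changelog.d entries of `files`.
CHANGELOG_INDEX_HEX = (
    b'000000000000004400000000000000000000000000000000000000'
    b'000000000000000000000000000000000000000000000000000000'
    b'0000a1ef0b125355d27765928be600cfe85784284ab3'
)
CHANGELOG_DATA_HEX = (
    b'756163613935613961356635353036303562366138343738336237'
    b'61623536363738616436356635380a757365720a3020300a656d70'
    b'74790a0a656d7074792066696c65'
)


def decode_entry(hex_payload):
    """Unpack one index entry under the layout assumed above."""
    raw = binascii.unhexlify(hex_payload)
    offset, length, base, link, p1, p2, node = INDEX_V0.unpack(raw)
    return {
        "offset": offset,
        "length": length,
        "base": base,
        "link": link,
        "p1": binascii.hexlify(p1).decode("ascii"),
        "p2": binascii.hexlify(p2).decode("ascii"),
        "node": binascii.hexlify(node).decode("ascii"),
    }


entry = decode_entry(CHANGELOG_INDEX_HEX)
data = binascii.unhexlify(CHANGELOG_DATA_HEX)

# The node id should start with the short hash quoted in the docstring,
# and the recorded length should match the size of the data payload.
print(entry["node"])
print(entry["length"], len(data))
```

Under that assumed layout, the decoded node id begins with `a1ef0b125355`, the changeset hash given in the script's docstring, and the recorded length equals the size of the `00changelog.d` payload.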