view tests/test-convert-bzr.t @ 51681:522b4d729e89
mmap: populate the mapping by default
Without pre-population, accessing all data through a mmap can result in many
page faults, reducing performance significantly. If the mmap is prepopulated, the
performance can no longer get slower than a full read.
(See benchmark numbers below)
In some cases where very little data is read, prepopulating can be overkill and
slower than populating on access (through page faults). So that behavior can be
controlled when the caller can pre-determine the best behavior.
(See benchmark numbers below)
In addition, testing with populating in a secondary thread yields great results,
combining the best of each approach. This might be implemented in later
changesets.
In all cases, using mmap has a great effect on memory usage when many processes
run in parallel on the same machine.
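As an illustration of the trade-off described above, here is a minimal sketch of a read-only mapping helper with a caller-controlled populate switch. The name `mmap_read` and its `populate` parameter are hypothetical and chosen for this example; this is not necessarily how the changeset implements it. It assumes a Unix platform, and relies on `mmap.MAP_POPULATE`, which Python only exposes on Linux from 3.10 onward, so the sketch silently falls back to populate-on-access elsewhere.

```python
import mmap
import os

# MAP_POPULATE is only exposed on Linux with Python 3.10+. Falling back to 0
# means "no extra flag", i.e. pages are populated lazily on access.
_MAP_POPULATE = getattr(mmap, "MAP_POPULATE", 0)


def mmap_read(path, populate=True):
    """Return a read-only view of `path` (hypothetical helper).

    With populate=True the kernel is asked to pre-fault the whole mapping,
    trading a larger up-front cost for fewer page faults later.
    """
    with open(path, "rb") as fp:
        size = os.fstat(fp.fileno()).st_size
        if size == 0:
            # An empty file cannot be mapped; return an empty buffer instead.
            return b""
        flags = mmap.MAP_PRIVATE
        if populate:
            flags |= _MAP_POPULATE
        # The mmap object duplicates the file descriptor, so closing `fp`
        # when leaving the `with` block does not invalidate the mapping.
        return mmap.mmap(fp.fileno(), size, flags=flags, prot=mmap.PROT_READ)
```

A caller that only needs a handful of bytes (for example a small `hg log`) could pass `populate=False`, while a caller that expects to walk most of the file would keep the default.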
### Benchmarks
# What did I run
A couple of months back I ran a large benchmark campaign to assess the impact of
various approaches for using mmap with the revlog (and other files). It
highlighted a few benchmarks that capture the impact of the changes well. So to
validate this change I checked the following:
- log command displaying various revisions
(read the changelog index)
- log command displaying the patch of listed revisions
(read the changelog index, the manifest index and a few file indexes)
- unbundling a few revisions
(read and write the changelog, manifest and a few file indexes, and walk the
graph to update some caches)
- pushing a few revisions
(read and write the changelog, manifest and a few file indexes, walk the graph
to update some caches, and perform various accesses locally and remotely during
discovery)
Benchmarks were run using the default module policy (c+py) and the rust one. No
significant difference was found between the two implementations, so we will
present results using the default policy (unless otherwise specified).
I ran them on a few repositories:
- mercurial: a "public changeset only" copy of mercurial from 2018-08-01 using
zstd compression and sparse-revlog
- pypy: a copy of pypy from 2018-08-01 using zstd compression and sparse-revlog
- netbeans: a copy of netbeans from 2018-08-01 using zstd compression and
sparse-revlog
- mozilla-try: a copy of mozilla-try from 2019-02-18 using zstd compression and
sparse-revlog
- mozilla-try persistent-nodemap: Same as the above but with a persistent
nodemap. Used for the log --patch benchmark only
# Results
For the smaller repositories (mercurial, pypy), the impact of mmap is almost
imperceptible, other costs dominating the operation. The impact of prepopulating
is indiscernible in the benchmarks we ran.
For larger repositories the benchmarks support the explanation given above:
On netbeans, the log can be about 1% faster without prepopulation (for a
difference < 100ms) but unbundle becomes a bit slower, even when small.
### data-env-vars.name = netbeans-2018-08-01-zstd-sparse-revlog
# benchmark.name = hg.command.unbundle
# benchmark.variants.issue6528 = disabled
# benchmark.variants.reuse-external-delta-parent = yes
# benchmark.variants.revs = any-1-extra-rev
# benchmark.variants.source = unbundle
# benchmark.variants.verbosity = quiet
with-populate: 0.240157
no-populate: 0.265087 (+10.38%, +0.02)
# benchmark.variants.revs = any-100-extra-rev
with-populate: 1.459518
no-populate: 1.481290 (+1.49%, +0.02)
## benchmark.name = hg.command.push
# benchmark.variants.explicit-rev = none
# benchmark.variants.issue6528 = disabled
# benchmark.variants.protocol = ssh
# benchmark.variants.reuse-external-delta-parent = yes
# benchmark.variants.revs = any-1-extra-rev
with-populate: 0.771919
no-populate: 0.792025 (+2.60%, +0.02)
# benchmark.variants.revs = any-100-extra-rev
with-populate: 1.459518
no-populate: 1.481290 (+1.49%, +0.02)
For mozilla-try, the "slow down" from pre-populate for small `hg log` is more
visible, but still small in absolute time. (Using the rust values, for the
persistent nodemap to be relevant.)
### data-env-vars.name = mozilla-try-2019-02-18-ds2-pnm
# benchmark.name = hg.command.log
# bin-env-vars.hg.flavor = rust
# benchmark.variants.patch = yes
# benchmark.variants.limit-rev = 1
with-populate: 0.237813
no-populate: 0.229452 (-3.52%, -0.01)
# benchmark.variants.limit-rev = 10
# benchmark.variants.patch = yes
with-populate: 1.213578
no-populate: 1.205189
### data-env-vars.name = mozilla-try-2019-02-18-zstd-sparse-revlog
# benchmark.variants.limit-rev = 1000
# benchmark.variants.patch = no
# benchmark.variants.rev = tip
with-populate: 0.198607
no-populate: 0.195038 (-1.80%, -0.00)
However, pre-populating provides a significant boost on more complex operations
like unbundle or push:
### data-env-vars.name = mozilla-try-2019-02-18-zstd-sparse-revlog
# benchmark.name = hg.command.push
# benchmark.variants.explicit-rev = none
# benchmark.variants.issue6528 = disabled
# benchmark.variants.protocol = ssh
# benchmark.variants.reuse-external-delta-parent = yes
# benchmark.variants.revs = any-1-extra-rev
with-populate: 4.798632
no-populate: 4.953295 (+3.22%, +0.15)
# benchmark.variants.revs = any-100-extra-rev
with-populate: 4.903618
no-populate: 5.014963 (+2.27%, +0.11)
## benchmark.name = hg.command.unbundle
# benchmark.variants.revs = any-1-extra-rev
with-populate: 1.423411
no-populate: 1.585365 (+11.38%, +0.16)
# benchmark.variants.revs = any-100-extra-rev
with-populate: 1.537909
no-populate: 1.688489 (+9.79%, +0.15)
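The parenthesized figures in the listings above appear to be the relative and absolute difference of `no-populate` against `with-populate`. The sketch below reproduces them under that assumption; the `compare` function is hypothetical and is not the benchmark tool's own code.

```python
def compare(with_populate, no_populate):
    """Format the difference of no_populate relative to with_populate.

    Assumes the baseline is the with-populate timing (in seconds), which is
    consistent with the figures shown in the listings.
    """
    delta = no_populate - with_populate
    return f"{delta / with_populate:+.2%}, {delta:+.2f}"


# netbeans, unbundle, any-1-extra-rev
print(compare(0.240157, 0.265087))  # +10.38%, +0.02
# mozilla-try, log --patch, limit-rev = 1
print(compare(0.237813, 0.229452))  # -3.52%, -0.01
```

Read this way, a positive percentage means the run without prepopulation was slower than the run with it.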
author | Pierre-Yves David <pierre-yves.david@octobus.net> |
---|---|
date | Thu, 11 Apr 2024 00:02:07 +0200 |
parents | 26127236b229 |
children | fe08a0bfa9fd |
#require bzr

  $ . "$TESTDIR/bzr-definitions"

create and rename on the same file in the same step

  $ mkdir test-createandrename
  $ cd test-createandrename
  $ brz init -q source

test empty repo conversion (issue3233)

  $ hg convert source source-hg
  initializing destination source-hg repository
  scanning source...
  sorting...
  converting...

back to the rename stuff

  $ cd source
  $ echo a > a
  $ echo c > c
  $ echo e > e
  $ brz add -q a c e
  $ brz commit -q -m 'Initial add: a, c, e'
  $ brz mv a b
  a => b
  $ brz mv c d
  c => d
  $ brz mv e f
  e => f
  $ echo a2 >> a
  $ mkdir e
  $ brz add -q a e
  $ brz commit -q -m 'rename a into b, create a, rename c into d'
  $ cd ..
  $ hg convert source source-hg
  scanning source...
  sorting...
  converting...
  1 Initial add: a, c, e
  0 rename a into b, create a, rename c into d
  $ glog -R source-hg
  o  1@source "rename a into b, create a, rename c into d" files+: [b d f], files-: [c e], files: [a]
  |
  o  0@source "Initial add: a, c, e" files+: [a c e], files-: [], files: []

manifest

  $ hg manifest -R source-hg -r tip
  a
  b
  d
  f

test --rev option

  $ hg convert -r 1 source source-1-hg
  initializing destination source-1-hg repository
  scanning source...
  sorting...
  converting...
  0 Initial add: a, c, e
  $ glog -R source-1-hg
  o  0@source "Initial add: a, c, e" files+: [a c e], files-: [], files: []

test with filemap

  $ cat > filemap <<EOF
  > exclude a
  > EOF
  $ hg convert --filemap filemap source source-filemap-hg
  initializing destination source-filemap-hg repository
  scanning source...
  sorting...
  converting...
  1 Initial add: a, c, e
  0 rename a into b, create a, rename c into d
  $ hg -R source-filemap-hg manifest -r tip
  b
  d
  f

convert from lightweight checkout

  $ brz checkout --lightweight source source-light
  $ hg convert -s bzr source-light source-light-hg
  initializing destination source-light-hg repository
  warning: lightweight checkouts may cause conversion failures, try with a regular branch instead.
  $TESTTMP/test-createandrename/source-light does not look like a Bazaar repository
  abort: source-light: missing or unsupported repository
  [255]

extract timestamps that look just like hg's {date|isodate}:
yyyy-mm-dd HH:MM zzzz (no seconds!)
compare timestamps

  $ cd source
  $ brz log | \
  > sed '/timestamp/!d;s/.\{15\}\([0-9: -]\{16\}\):.. \(.[0-9]\{4\}\)/\1 \2/' \
  > > ../bzr-timestamps
  $ cd ..
  $ hg -R source-hg log --template "{date|isodate}\n" > hg-timestamps
  $ cmp bzr-timestamps hg-timestamps || diff -u bzr-timestamps hg-timestamps
  $ cd ..

merge

  $ mkdir test-merge
  $ cd test-merge
  $ cat > helper.py <<EOF
  > import sys
  > from breezy import workingtree
  > import breezy.bzr.bzrdir
  > wt = workingtree.WorkingTree.open('.')
  > 
  > message, stamp = sys.argv[1:]
  > wt.commit(message, timestamp=int(stamp))
  > EOF
  $ brz init -q source
  $ cd source
  $ echo content > a
  $ echo content2 > b
  $ brz add -q a b
  $ brz commit -q -m 'Initial add'
  $ cd ..
  $ brz branch -q source source-improve
  $ cd source
  $ echo more >> a
  $ "$PYTHON" ../helper.py 'Editing a' 100
  $ cd ../source-improve
  $ echo content3 >> b
  $ "$PYTHON" ../helper.py 'Editing b' 200
  $ cd ../source
  $ brz merge -q ../source-improve
  $ brz commit -q -m 'Merged improve branch'
  $ cd ..
  $ hg convert --datesort source source-hg
  initializing destination source-hg repository
  scanning source...
  sorting...
  converting...
  3 Initial add
  2 Editing a
  1 Editing b
  0 Merged improve branch
  $ glog -R source-hg
  o    3@source "Merged improve branch" files+: [], files-: [], files: []
  |\
  | o  2@source-improve "Editing b" files+: [], files-: [], files: [b]
  | |
  o |  1@source "Editing a" files+: [], files-: [], files: [a]
  |/
  o  0@source "Initial add" files+: [a b], files-: [], files: []

  $ cd ..

#if symlink execbit

symlinks and executable files

  $ mkdir test-symlinks
  $ cd test-symlinks
  $ brz init -q source
  $ cd source
  $ touch program
  $ chmod +x program
  $ ln -s program altname
  $ mkdir d
  $ echo a > d/a
  $ ln -s a syma
  $ brz add -q altname program syma d/a
  $ brz commit -q -m 'Initial setup'
  $ touch newprog
  $ chmod +x newprog
  $ rm altname
  $ ln -s newprog altname
  $ chmod -x program
  $ brz add -q newprog
  $ brz commit -q -m 'Symlink changed, x bits changed'
  $ cd ..
  $ hg convert source source-hg
  initializing destination source-hg repository
  scanning source...
  sorting...
  converting...
  1 Initial setup
  0 Symlink changed, x bits changed
  $ manifest source-hg 0
  % manifest of 0
  644 @ altname
  644   d/a
  755 * program
  644 @ syma
  $ manifest source-hg tip
  % manifest of tip
  644 @ altname
  644   d/a
  755 * newprog
  644   program
  644 @ syma

test the symlinks can be recreated

  $ cd source-hg
  $ hg up
  5 files updated, 0 files merged, 0 files removed, 0 files unresolved
  $ hg cat syma; echo
  a
  $ cd ../..

#endif

Multiple branches

  $ brz init-repo -q --no-trees repo
  $ brz init -q repo/trunk
  $ brz co repo/trunk repo-trunk
  $ cd repo-trunk
  $ echo a > a
  $ brz add -q a
  $ brz ci -qm adda
  $ brz tag trunk-tag
  Created tag trunk-tag.
  $ brz switch -b branch
  Tree is up to date at revision 1.
  Switched to branch*repo/branch/ (glob)
  $ echo b > b
  $ brz add -q b
  $ brz ci -qm addb
  $ brz tag branch-tag
  Created tag branch-tag.
  $ brz switch --force ../repo/trunk
  Updated to revision 1.
  Switched to branch*/repo/trunk/ (glob)
  $ echo a >> a
  $ brz ci -qm changea
  $ cd ..
  $ hg convert --datesort repo repo-bzr
  initializing destination repo-bzr repository
  scanning source...
  sorting...
  converting...
  2 adda
  1 addb
  0 changea
  updating tags
  $ (cd repo-bzr; glog)
  o  3@default "update tags" files+: [.hgtags], files-: [], files: []
  |
  o  2@default "changea" files+: [], files-: [], files: [a]
  |
  | o  1@branch "addb" files+: [b], files-: [], files: []
  |/
  o  0@default "adda" files+: [a], files-: [], files: []

Test tags (converted identifiers are not stable because bzr ones are
not and get incorporated in extra fields).

  $ hg -R repo-bzr tags
  tip                                3:* (glob)
  branch-tag                         1:* (glob)
  trunk-tag                          0:* (glob)

Nested repositories (issue3254)

  $ brz init-repo -q --no-trees repo/inner
  $ brz init -q repo/inner/trunk
  $ brz co repo/inner/trunk inner-trunk
  $ cd inner-trunk
  $ echo b > b
  $ brz add -q b
  $ brz ci -qm addb
  $ cd ..
  $ hg convert --datesort repo noinner-bzr
  initializing destination noinner-bzr repository
  scanning source...
  sorting...
  converting...
  2 adda
  1 addb
  0 changea
  updating tags