view tests/test-convert-cvs-detectmerge.t @ 51681:522b4d729e89
mmap: populate the mapping by default
Without pre-population, accessing all data through a mmap can result in many
page faults, reducing performance significantly. If the mmap is pre-populated,
the performance can no longer get slower than a full read.
(See benchmark numbers below)
In some cases where very little data is read, pre-populating can be overkill and
slower than populating on access (through page faults). So that behavior can be
controlled when the caller can pre-determine the best behavior.
(See benchmark numbers below)
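The trade-off above can be illustrated with a minimal sketch. It is not the code from this changeset: `mmap_read` and its `populate` parameter are hypothetical names, and the sketch assumes a Unix platform. `mmap.MAP_POPULATE` is only exposed on Linux with Python 3.10 or later, so the sketch falls back to populating on access when the flag is missing.

```python
import mmap
import os


def mmap_read(path, populate=True):
    """Map `path` read-only; optionally ask the kernel to pre-fault all pages.

    Hypothetical helper for illustration only (Unix-only: uses flags/prot).
    """
    flags = mmap.MAP_PRIVATE
    if populate:
        # MAP_POPULATE is missing on some platforms and older Pythons;
        # in that case pages are simply faulted in on first access.
        flags |= getattr(mmap, "MAP_POPULATE", 0)
    with open(path, "rb") as fp:
        size = os.fstat(fp.fileno()).st_size
        if size == 0:
            # Empty files cannot be mapped; return empty bytes instead.
            return b""
        # The mapping stays valid after the file object is closed.
        return mmap.mmap(fp.fileno(), size, flags=flags, prot=mmap.PROT_READ)
```

A caller that knows it will read only a handful of bytes could pass `populate=False`, while a caller that expects to walk most of the file keeps the default.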
In addition, testing with populating in a secondary thread yields great results,
combining the best of each approach. This might be implemented in later
changesets.
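As a rough illustration of the secondary-thread idea, the sketch below touches one byte per page from a background thread so that the main thread can start reading immediately. This is an assumption about how such an approach could look, not the tested implementation; `populate_in_background` is a hypothetical name.

```python
import mmap
import threading


def populate_in_background(mapping, page_size=mmap.PAGESIZE):
    """Fault in every page of `mapping` from a daemon thread (hypothetical)."""

    def touch():
        try:
            # Reading one byte per page is enough to trigger the page fault.
            for offset in range(0, len(mapping), page_size):
                mapping[offset]
        except ValueError:
            # The mapping was closed while we were still walking it.
            pass

    thread = threading.Thread(target=touch, daemon=True)
    thread.start()
    return thread
```

The caller keeps using the mapping as usual; pages not yet touched by the thread are still faulted in on access.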
In all cases, using mmap has a great effect on memory usage when many processes
run in parallel on the same machine.
### Benchmarks
# What did I run
A couple of months back I ran a large benchmark campaign to assess the impact of
various approaches for using mmap with the revlog (and other files). It
highlighted a few benchmarks that capture the impact of the changes well. So to
validate this change I checked the following:
- log command displaying various revisions
(read the changelog index)
- log command displaying the patch of listed revisions
(read the changelog index, the manifest index and a few file indexes)
- unbundling a few revisions
(read and write changelog, manifest and a few file indexes, and walk the graph
to update some caches)
- pushing a few revisions
(read and write changelog, manifest and a few file indexes, walk the graph to
update some caches, perform various accesses locally and remotely during
discovery)
Benchmarks were run using the default module policy (c+py) and the rust one. No
significant difference was found between the two implementations, so we will
present results using the default policy (unless otherwise specified).
I ran them on a few repositories:
- mercurial: a "public changeset only" copy of mercurial from 2018-08-01 using
zstd compression and sparse-revlog
- pypy: a copy of pypy from 2018-08-01 using zstd compression and sparse-revlog
- netbeans: a copy of netbeans from 2018-08-01 using zstd compression and
sparse-revlog
- mozilla-try: a copy of mozilla-try from 2019-02-18 using zstd compression and
sparse-revlog
- mozilla-try persistent-nodemap: Same as the above but with a persistent
nodemap. Used for the log --patch benchmark only
# Results
For the smaller repositories (mercurial, pypy), the impact of mmap is almost
imperceptible, with other costs dominating the operation. The impact of
pre-populating is indiscernible in the benchmarks we ran.
For larger repositories the benchmarks support the explanation given above:
On netbeans, the log can be about 1% faster without pre-population (for a
difference < 100ms) but unbundle becomes a bit slower, even for small bundles.
### data-env-vars.name = netbeans-2018-08-01-zstd-sparse-revlog
# benchmark.name = hg.command.unbundle
# benchmark.variants.issue6528 = disabled
# benchmark.variants.reuse-external-delta-parent = yes
# benchmark.variants.revs = any-1-extra-rev
# benchmark.variants.source = unbundle
# benchmark.variants.verbosity = quiet
with-populate: 0.240157
no-populate: 0.265087 (+10.38%, +0.02)
# benchmark.variants.revs = any-100-extra-rev
with-populate: 1.459518
no-populate: 1.481290 (+1.49%, +0.02)
## benchmark.name = hg.command.push
# benchmark.variants.explicit-rev = none
# benchmark.variants.issue6528 = disabled
# benchmark.variants.protocol = ssh
# benchmark.variants.reuse-external-delta-parent = yes
# benchmark.variants.revs = any-1-extra-rev
with-populate: 0.771919
no-populate: 0.792025 (+2.60%, +0.02)
# benchmark.variants.revs = any-100-extra-rev
with-populate: 1.459518
no-populate: 1.481290 (+1.49%, +0.02)
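The parenthesized figures can be reproduced from the two timings, assuming the first is the relative change of `no-populate` against `with-populate` and the second is the absolute difference (presumably in seconds). The helper name `compare` is hypothetical.

```python
def compare(with_populate, no_populate):
    """Format the relative and absolute difference like the output above."""
    delta = no_populate - with_populate
    return f"{delta / with_populate * 100:+.2f}%, {delta:+.2f}"


# netbeans, unbundle, any-1-extra-rev
print(compare(0.240157, 0.265087))  # +10.38%, +0.02
```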
For mozilla-try, the "slow down" from pre-populating for small `hg log` is more
visible, but still small in absolute time. (The rust values are used here so
that the persistent nodemap is relevant.)
### data-env-vars.name = mozilla-try-2019-02-18-ds2-pnm
# benchmark.name = hg.command.log
# bin-env-vars.hg.flavor = rust
# benchmark.variants.patch = yes
# benchmark.variants.limit-rev = 1
with-populate: 0.237813
no-populate: 0.229452 (-3.52%, -0.01)
# benchmark.variants.limit-rev = 10
# benchmark.variants.patch = yes
with-populate: 1.213578
no-populate: 1.205189
### data-env-vars.name = mozilla-try-2019-02-18-zstd-sparse-revlog
# benchmark.variants.limit-rev = 1000
# benchmark.variants.patch = no
# benchmark.variants.rev = tip
with-populate: 0.198607
no-populate: 0.195038 (-1.80%, -0.00)
However, pre-populating provides a significant boost on more complex operations
like unbundle or push:
### data-env-vars.name = mozilla-try-2019-02-18-zstd-sparse-revlog
# benchmark.name = hg.command.push
# benchmark.variants.explicit-rev = none
# benchmark.variants.issue6528 = disabled
# benchmark.variants.protocol = ssh
# benchmark.variants.reuse-external-delta-parent = yes
# benchmark.variants.revs = any-1-extra-rev
with-populate: 4.798632
no-populate: 4.953295 (+3.22%, +0.15)
# benchmark.variants.revs = any-100-extra-rev
with-populate: 4.903618
no-populate: 5.014963 (+2.27%, +0.11)
## benchmark.name = hg.command.unbundle
# benchmark.variants.revs = any-1-extra-rev
with-populate: 1.423411
no-populate: 1.585365 (+11.38%, +0.16)
# benchmark.variants.revs = any-100-extra-rev
with-populate: 1.537909
no-populate: 1.688489 (+9.79%, +0.15)
author | Pierre-Yves David <pierre-yves.david@octobus.net> |
---|---|
date | Thu, 11 Apr 2024 00:02:07 +0200 |
parents | e5e5ee2b60e4 |
children |
#require cvs no-root

Test config convert.cvsps.mergefrom config setting.
(Should test similar mergeto feature, but I don't understand it yet.)
Requires builtin cvsps.

  $ CVSROOT=`pwd`/cvsrepo
  $ export CVSROOT

  $ cvscall()
  > {
  >     cvs -f "$@"
  > }

output of 'cvs ci' varies unpredictably, so just discard it
XXX copied from test-convert-cvs-synthetic

  $ cvsci()
  > {
  >     sleep 1
  >     cvs -f ci "$@" > /dev/null
  > }

XXX copied from test-convert-cvs-synthetic

  $ cat <<EOF >> $HGRCPATH
  > [extensions]
  > convert =
  > [convert]
  > cvsps.cache = 0
  > cvsps.mergefrom = \[MERGE from (\S+)\]
  > EOF

create cvs repository with one project

  $ cvscall -q -d "$CVSROOT" init
  $ mkdir cvsrepo/proj

populate cvs repository

  $ cvscall -Q co proj
  $ cd proj
  $ touch file1
  $ cvscall -Q add file1
  $ cvsci -m"add file1 on trunk"
  cvs commit: Examining .

create two release branches

  $ cvscall -q tag -b v1_0
  T file1
  $ cvscall -q tag -b v1_1
  T file1

modify file1 on branch v1_0

  $ cvscall -Q update -rv1_0
  $ sleep 1
  $ echo "change" >> file1
  $ cvsci -m"add text"
  cvs commit: Examining .

make unrelated change on v1_1

  $ cvscall -Q update -rv1_1
  $ touch unrelated
  $ cvscall -Q add unrelated
  $ cvsci -m"unrelated change"
  cvs commit: Examining .

merge file1 to v1_1

  $ cvscall -Q update -jv1_0
  RCS file: $TESTTMP/cvsrepo/proj/file1,v
  retrieving revision 1.1
  retrieving revision 1.1.2.1
  Merging differences between 1.1 and 1.1.2.1 into file1
  $ cvsci -m"add text [MERGE from v1_0]"
  cvs commit: Examining .

merge change to trunk

  $ cvscall -Q update -A
  $ cvscall -Q update -jv1_1
  RCS file: $TESTTMP/cvsrepo/proj/file1,v
  retrieving revision 1.1
  retrieving revision 1.1.4.1
  Merging differences between 1.1 and 1.1.4.1 into file1
  $ cvsci -m"add text [MERGE from v1_1]"
  cvs commit: Examining .

non-merged change on trunk

  $ echo "foo" > file2
  $ cvscall -Q add file2
  $ cvsci -m"add file2 on trunk" file2

this will create rev 1.3
change on trunk to backport

  $ echo "backport me" >> file1
  $ cvsci -m"add other text" file1
  $ cvscall log file1
  RCS file: $TESTTMP/cvsrepo/proj/file1,v
  Working file: file1
  head: 1.3
  branch:
  locks: strict
  access list:
  symbolic names:
    v1_1: 1.1.0.4
    v1_0: 1.1.0.2
  keyword substitution: kv
  total revisions: 5; selected revisions: 5
  description:
  ----------------------------
  revision 1.3
  date: * (glob)
  add other text
  ----------------------------
  revision 1.2
  date: * (glob)
  add text [MERGE from v1_1]
  ----------------------------
  revision 1.1
  date: * (glob)
  branches: 1.1.2; 1.1.4;
  add file1 on trunk
  ----------------------------
  revision 1.1.4.1
  date: * (glob)
  add text [MERGE from v1_0]
  ----------------------------
  revision 1.1.2.1
  date: * (glob)
  add text
  =============================================================================

XXX how many ways are there to spell "trunk" with CVS?
backport trunk change to v1_1

  $ cvscall -Q update -rv1_1
  $ cvscall -Q update -j1.2 -j1.3 file1
  RCS file: $TESTTMP/cvsrepo/proj/file1,v
  retrieving revision 1.2
  retrieving revision 1.3
  Merging differences between 1.2 and 1.3 into file1
  $ cvsci -m"add other text [MERGE from HEAD]" file1

fix bug on v1_1, merge to trunk with error

  $ cvscall -Q update -rv1_1
  $ echo "merge forward" >> file1
  $ cvscall -Q tag unmerged
  $ cvsci -m"fix file1"
  cvs commit: Examining .
  $ cvscall -Q update -A
  $ cvscall -Q update -junmerged -jv1_1
  RCS file: $TESTTMP/cvsrepo/proj/file1,v
  retrieving revision 1.1.4.2
  retrieving revision 1.1.4.3
  Merging differences between 1.1.4.2 and 1.1.4.3 into file1

note the typo in the commit log message

  $ cvsci -m"fix file1 [MERGE from v1-1]"
  cvs commit: Examining .
  $ cvs -Q tag -d unmerged

convert to hg

  $ cd ..
  $ hg convert proj proj.hg
  initializing destination proj.hg repository
  connecting to $TESTTMP/cvsrepo
  scanning source...
  collecting CVS rlog
  12 log entries
  creating changesets
  warning: CVS commit message references non-existent branch 'v1-1':
  fix file1 [MERGE from v1-1]
  10 changeset entries
  sorting...
  converting...
  9 add file1 on trunk
  8 unrelated change
  7 add text
  6 add text [MERGE from v1_0]
  5 add text [MERGE from v1_1]
  4 add file2 on trunk
  3 add other text
  2 add other text [MERGE from HEAD]
  1 fix file1
  0 fix file1 [MERGE from v1-1]

complete log

  $ template="{rev}: '{branches}' {desc}\n"
  $ hg -R proj.hg log --template="$template"
  9: '' fix file1 [MERGE from v1-1]
  8: 'v1_1' fix file1
  7: 'v1_1' add other text [MERGE from HEAD]
  6: '' add other text
  5: '' add file2 on trunk
  4: '' add text [MERGE from v1_1]
  3: 'v1_1' add text [MERGE from v1_0]
  2: 'v1_0' add text
  1: 'v1_1' unrelated change
  0: '' add file1 on trunk

graphical log

  $ hg -R proj.hg log -G --template="$template"
  o  9: '' fix file1 [MERGE from v1-1]
  |
  | o  8: 'v1_1' fix file1
  | |
  | o  7: 'v1_1' add other text [MERGE from HEAD]
  |/|
  o |  6: '' add other text
  | |
  o |  5: '' add file2 on trunk
  | |
  o |  4: '' add text [MERGE from v1_1]
  |\|
  | o    3: 'v1_1' add text [MERGE from v1_0]
  | |\
  +---o  2: 'v1_0' add text
  | |
  | o  1: 'v1_1' unrelated change
  |/
  o  0: '' add file1 on trunk