view tests/test-convert-hg-sink.t @ 49658:523cacdfd324
delta-find: set the default candidate chunk size to 10
I ran performance and storage tests on repositories of various sizes and shapes
for the following values of the config: 5, 10, 20, 50, 100, no-chunking.
The performance tests do not show any statistically significant impact on computation
times for large pushes and pulls.
For searching for an individual delta, this can provide a significant
performance improvement with a minor degradation of space-quality on the
result (see the data at the end of this commit message).
For overall store size, the change:
- does not have any impact on many small repositories,
- has an observable but negligible impact on most larger repositories,
- causes a small increase in size (1%) in the narrower version for one private
  repository we use for testing.
We will try to get more numbers on a larger version of that repository to
make sure nothing pathological happens.
We pick "10" as the limit as "5" seems a bit more risky.
There is room to improve the current code, by using more aggressive filtering
and better (i.e. any) sorting of the candidates. However this is already a large
improvement for pathological cases, with little impact in the common
situations.
The initial motivation for this change is to fix performance of delta
computation for a file where the previous code ended up testing 20 000 possible
candidate-bases in one go, which is… slow. This affected about ½ of the file
revisions leading to atrocious performance, especially during some push/pull
operations.
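The effect of chunking can be sketched as follows. This is a simplified illustration of the idea, not Mercurial's delta-finding code: the names `chunked`, `find_base`, `delta_size` and `is_good` are hypothetical, and the acceptance rule (stop at the first chunk that contains a good-enough candidate) is an assumption made for the example.

```python
from itertools import islice


def chunked(candidates, size):
    """Yield successive lists of at most `size` candidates."""
    it = iter(candidates)
    while True:
        chunk = list(islice(it, size))
        if not chunk:
            return
        yield chunk


def find_base(candidates, delta_size, is_good, chunk_size=10):
    """Return (best_candidate, number_of_candidates_tried).

    Candidates are tested one chunk at a time and the search stops at the
    first chunk that contains an acceptable candidate (assumed rule).
    `chunk_size=None` tests everything in one go, i.e. no chunking.
    """
    candidates = list(candidates)
    if chunk_size is None:
        chunk_size = max(len(candidates), 1)
    tried = 0
    for chunk in chunked(candidates, chunk_size):
        tried += len(chunk)
        best = min(chunk, key=delta_size)
        if is_good(delta_size(best)):
            return best, tried
    return None, tried


# Hypothetical cost model: revision 42 is the ideal base, and any delta
# of size 5 or less is considered good enough.
cost = lambda rev: abs(rev - 42)
good_enough = lambda size: size <= 5

print(find_base(range(20000), cost, good_enough, chunk_size=10))
print(find_base(range(20000), cost, good_enough, chunk_size=None))
```

With this toy cost model, the chunked search stops after a few chunks and accepts a slightly larger delta, while the unchunked search tests all 20 000 candidates to find the optimum. That is the trade-off described above: much less work per search, for a minor loss in delta quality.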
Details about individual delta finding timing:
----------------------------------------------
The vast majority of benchmark cases are unchanged; only the three below differ. The first
two do not see any impact on the final delta. The last one sees a change in
delta-size that is negligible compared to the full text size.
### data-env-vars.name = mozilla-try-2019-02-18-zstd-sparse-revlog
# benchmark.name = perf-delta-find
# benchmark.variants.rev = manifest-snapshot-many-tries-a (revision 756096)
∞: 5.844783
5: 4.473523 (-23.46%)
10: 4.970053 (-14.97%)
20: 5.770386 (-1.27%)
50: 5.821358
100: 5.834887
MANIFESTLOG: rev=756096: (no limit)
delta-base=301840
search-rounds=6
try-count=60
delta-type=snapshot
snap-depth=7
delta-size=179
MANIFESTLOG: rev=756096: (limit = 10)
delta-base=301840
search-rounds=9
try-count=51
delta-type=snapshot
snap-depth=7
delta-size=179
### data-env-vars.name = mozilla-try-2019-02-18-zstd-sparse-revlog
# benchmark.name = perf-delta-find
# benchmark.variants.rev = manifest-snapshot-many-tries-d (revision 754060)
∞: 5.017663
5: 3.655931 (-27.14%)
10: 4.095436 (-18.38%)
20: 4.828949 (-3.76%)
50: 4.987574
100: 4.994889
MANIFESTLOG: rev=754060: (no limit)
delta-base=301840
search-rounds=5
try-count=53
delta-type=snapshot
snap-depth=7
delta-size=179
MANIFESTLOG: rev=754060: (limit = 10)
delta-base=301840
search-rounds=8
try-count=45
delta-type=snapshot
snap-depth=7
delta-size=179
### data-env-vars.name = mozilla-try-2019-02-18-zstd-sparse-revlog
# benchmark.name = perf-delta-find
# bin-env-vars.hg.flavor = rust
# benchmark.variants.rev = manifest-snapshot-many-tries-e (revision 693368)
∞: 4.869282
5: 2.039732 (-58.11%)
10: 2.413537 (-50.43%)
20: 4.449639 (-8.62%)
50: 4.865863
100: 4.882649
MANIFESTLOG: rev=693368: (no limit)
delta-base=693336
search-rounds=6
try-count=53
delta-type=snapshot
snap-depth=6
full-text-size=131065
delta-size=199
MANIFESTLOG: rev=693368: (limit = 10)
delta-base=278023
search-rounds=5
try-count=21
delta-type=snapshot
snap-depth=4
full-text-size=131065
delta-size=278
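The percentages quoted next to each timing are relative to the unlimited (∞) run. As a quick check, the arithmetic can be reproduced with the small sketch below; the helper name `relative_change` is an illustration only, and the numbers are copied from the first and third benchmarks above.

```python
def relative_change(baseline, value):
    """Return the change from `baseline` to `value`, in percent."""
    return (value - baseline) / baseline * 100.0


# Timings for manifest-snapshot-many-tries-a (revision 756096).
no_limit = 5.844783
timings = {5: 4.473523, 10: 4.970053, 20: 5.770386}
for chunk_size, timing in timings.items():
    change = relative_change(no_limit, timing)
    print(f"{chunk_size:>3}: {timing:.6f} ({change:+.2f}%)")

# Third benchmark (revision 693368): the delta grows from 199 to 278 bytes,
# which is compared here to the full text size of 131065 bytes.
extra_bytes = 278 - 199
print(f"delta grew by {extra_bytes} bytes, "
      f"{extra_bytes / 131065:.4%} of the full text")
```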
Raw data for store size (in bytes) for various chunk size values below:
-----------------------------------------------------------------------
440 134 384 5 pypy/.hg/store/
440 134 384 10 pypy/.hg/store/
440 134 384 20 pypy/.hg/store/
440 134 384 50 pypy/.hg/store/
440 134 384 100 pypy/.hg/store/
440 134 384 ... pypy/.hg/store/
666 987 471 5 netbsd-xsrc-2022-11-15/.hg/store/
666 987 471 10 netbsd-xsrc-2022-11-15/.hg/store/
666 987 471 20 netbsd-xsrc-2022-11-15/.hg/store/
666 987 471 50 netbsd-xsrc-2022-11-15/.hg/store/
666 987 471 100 netbsd-xsrc-2022-11-15/.hg/store/
666 987 471 ... netbsd-xsrc-2022-11-15/.hg/store/
852 844 884 5 netbsd-pkgsrc-2022-11-15/.hg/store/
852 844 884 10 netbsd-pkgsrc-2022-11-15/.hg/store/
852 844 884 20 netbsd-pkgsrc-2022-11-15/.hg/store/
852 844 884 50 netbsd-pkgsrc-2022-11-15/.hg/store/
852 844 884 100 netbsd-pkgsrc-2022-11-15/.hg/store/
852 844 884 ... netbsd-pkgsrc-2022-11-15/.hg/store/
1 504 227 981 5 netbeans-2018-08-01-sparse-zstd/.hg/store/
1 504 227 871 10 netbeans-2018-08-01-sparse-zstd/.hg/store/
1 504 227 813 20 netbeans-2018-08-01-sparse-zstd/.hg/store/
1 504 227 813 50 netbeans-2018-08-01-sparse-zstd/.hg/store/
1 504 227 813 100 netbeans-2018-08-01-sparse-zstd/.hg/store/
1 504 227 813 ... netbeans-2018-08-01-sparse-zstd/.hg/store/
3 875 801 068 5 netbsd-src-2022-11-15/.hg/store/
3 875 696 767 10 netbsd-src-2022-11-15/.hg/store/
3 875 696 757 20 netbsd-src-2022-11-15/.hg/store/
3 875 696 653 50 netbsd-src-2022-11-15/.hg/store/
3 875 696 653 100 netbsd-src-2022-11-15/.hg/store/
3 875 696 653 ... netbsd-src-2022-11-15/.hg/store/
4 531 441 314 5 mozilla-central/.hg/store/
4 531 435 157 10 mozilla-central/.hg/store/
4 531 432 045 20 mozilla-central/.hg/store/
4 531 429 119 50 mozilla-central/.hg/store/
4 531 429 119 100 mozilla-central/.hg/store/
4 531 429 119 ... mozilla-central/.hg/store/
4 875 861 390 5 mozilla-unified/.hg/store/
4 875 855 155 10 mozilla-unified/.hg/store/
4 875 852 027 20 mozilla-unified/.hg/store/
4 875 848 851 50 mozilla-unified/.hg/store/
4 875 848 851 100 mozilla-unified/.hg/store/
4 875 848 851 ... mozilla-unified/.hg/store/
11 498 764 601 5 mozilla-try/.hg/store/
11 497 968 858 10 mozilla-try/.hg/store/
11 497 958 730 20 mozilla-try/.hg/store/
11 497 927 156 50 mozilla-try/.hg/store/
11 497 925 963 100 mozilla-try/.hg/store/
11 497 923 428 ... mozilla-try/.hg/store/
10 047 914 031 5 private-repo
9 969 132 101 10 private-repo
9 944 745 015 20 private-repo
9 939 756 703 50 private-repo
9 939 833 016 100 private-repo
9 939 822 035 ... private-repo
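To read the table as relative overhead, each row can be compared with the last row of its group. The sketch below does this for the private repository. It assumes that the `...` row is the no-chunking baseline (the table does not say so explicitly); `parse_row` and `overhead_percent` are hypothetical helper names.

```python
def parse_row(line):
    """Split '9 969 132 101 10 private-repo' into (size, chunk_label, repo)."""
    *digit_groups, chunk_label, repo = line.split()
    return int("".join(digit_groups)), chunk_label, repo


def overhead_percent(rows):
    """Map each chunk label to its size change versus the '...' row, in percent.

    Assumption: the '...' row is the no-chunking baseline.
    """
    sizes = {label: size for size, label, _repo in map(parse_row, rows)}
    baseline = sizes["..."]
    return {
        label: (size - baseline) / baseline * 100.0
        for label, size in sizes.items()
        if label != "..."
    }


rows = [
    "10 047 914 031 5 private-repo",
    "9 969 132 101 10 private-repo",
    "9 944 745 015 20 private-repo",
    "9 939 756 703 50 private-repo",
    "9 939 833 016 100 private-repo",
    "9 939 822 035 ... private-repo",
]
for label, pct in overhead_percent(rows).items():
    print(f"chunk size {label:>3}: {pct:+.4f}%")
```

Under that assumption, a chunk size of 5 costs a little over 1% on this repository, and a chunk size of 10 costs about 0.3%.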
author | Pierre-Yves David <pierre-yves.david@octobus.net> |
---|---|
date | Wed, 23 Nov 2022 19:08:27 +0100 |
parents | c0e1ea0c4cee |
children |
  $ cat >> $HGRCPATH <<EOF
  > [extensions]
  > convert=
  > [convert]
  > hg.saverev=False
  > EOF
  $ hg init orig
  $ cd orig
  $ echo foo > foo
  $ echo bar > bar
  $ hg ci -qAm 'add foo and bar'
  $ hg rm foo
  $ hg ci -m 'remove foo'
  $ mkdir foo
  $ echo file > foo/file
  $ hg ci -qAm 'add foo/file'
  $ hg tag some-tag
  $ hg tag -l local-tag
  $ echo '1234567890123456789012345678901234567890 missing_tag' >> .hgtags
  $ hg ci -m 'add a missing tag'
  $ hg log
  changeset:   4:3fb95ee23a66
  tag:         tip
  user:        test
  date:        Thu Jan 01 00:00:00 1970 +0000
  summary:     add a missing tag
  
  changeset:   3:593cbf6fb2b4
  tag:         local-tag
  user:        test
  date:        Thu Jan 01 00:00:00 1970 +0000
  summary:     Added tag some-tag for changeset ad681a868e44
  
  changeset:   2:ad681a868e44
  tag:         some-tag
  user:        test
  date:        Thu Jan 01 00:00:00 1970 +0000
  summary:     add foo/file
  
  changeset:   1:cbba8ecc03b7
  user:        test
  date:        Thu Jan 01 00:00:00 1970 +0000
  summary:     remove foo
  
  changeset:   0:327daa9251fa
  user:        test
  date:        Thu Jan 01 00:00:00 1970 +0000
  summary:     add foo and bar
  
  $ hg phase --public -r tip
  $ cd ..
  $ hg convert orig new 2>&1 | grep -v 'subversion python bindings could not be loaded'
  initializing destination new repository
  scanning source...
  sorting...
  converting...
  4 add foo and bar
  3 remove foo
  2 add foo/file
  1 Added tag some-tag for changeset ad681a868e44
  0 add a missing tag
  missing tag entry: "1234567890123456789012345678901234567890 missing_tag"
  $ cd new
  $ hg log -G --template '{rev} {node|short} ({phase}) "{desc}"\n'
  o  4 3fb95ee23a66 (public) "add a missing tag"
  |
  o  3 593cbf6fb2b4 (public) "Added tag some-tag for changeset ad681a868e44"
  |
  o  2 ad681a868e44 (public) "add foo/file"
  |
  o  1 cbba8ecc03b7 (public) "remove foo"
  |
  o  0 327daa9251fa (public) "add foo and bar"
  
  $ hg out ../orig
  comparing with ../orig
  searching for changes
  no changes found
  [1]

dirstate should be empty:

  $ hg debugstate
  $ hg parents -q
  $ hg up -C
  3 files updated, 0 files merged, 0 files removed, 0 files unresolved
  $ hg copy bar baz

put something in the dirstate:

  $ hg debugstate > debugstate
  $ grep baz debugstate
  a   0         -1 unset               baz
  copy: bar -> baz

add a new revision in the original repo

  $ cd ../orig
  $ echo baz > baz
  $ hg ci -qAm 'add baz'
  $ cd ..
  $ hg convert orig new 2>&1 | grep -v 'subversion python bindings could not be loaded'
  scanning source...
  sorting...
  converting...
  0 add baz
  $ cd new
  $ hg out ../orig
  comparing with ../orig
  searching for changes
  no changes found
  [1]

dirstate should be the same (no output below):

  $ hg debugstate > new-debugstate
  $ diff debugstate new-debugstate

no copies

  $ hg up -C
  1 files updated, 0 files merged, 0 files removed, 0 files unresolved
  $ hg debugrename baz
  baz not renamed
  $ cd ..

test tag rewriting

  $ cat > filemap <<EOF
  > exclude foo
  > EOF
  $ hg convert --filemap filemap orig new-filemap 2>&1 | grep -v 'subversion python bindings could not be loaded'
  initializing destination new-filemap repository
  scanning source...
  sorting...
  converting...
  5 add foo and bar
  4 remove foo
  3 add foo/file
  2 Added tag some-tag for changeset ad681a868e44
  1 add a missing tag
  missing tag entry: "1234567890123456789012345678901234567890 missing_tag"
  0 add baz
  $ cd new-filemap
  $ hg tags
  tip                                3:7bb553f2c68a
  some-tag                           0:ba8636729451
  $ cd ..

Test cases for hg-hg roundtrip

Helper

  $ glog()
  > {
  >     hg log -G --template '{rev} {node|short} ({phase}) "{desc}" files: {files}\n' $*
  > }

Create a tricky source repo

  $ hg init source
  $ cd source

  $ echo 0 > 0
  $ hg ci -Aqm '0: add 0'
  $ echo a > a
  $ mkdir dir
  $ echo b > dir/b
  $ hg ci -qAm '1: add a and dir/b'
  $ echo c > dir/c
  $ hg ci -qAm '2: add dir/c'
  $ hg copy a e
  $ echo b >> b
  $ hg ci -qAm '3: copy a to e, change b'
  $ hg up -qr -3
  $ echo a >> a
  $ hg ci -qAm '4: change a'
  $ hg merge
  merging a and e to e
  2 files updated, 1 files merged, 0 files removed, 0 files unresolved
  (branch merge, don't forget to commit)
  $ hg copy b dir/d
  $ hg ci -qAm '5: merge 2 and 3, copy b to dir/d'
  $ echo a >> a
  $ hg ci -qAm '6: change a'

  $ hg mani
  0
  a
  b
  dir/b
  dir/c
  dir/d
  e
  $ hg phase --public -r tip
  $ glog
  @  6 0613c8e59a3d (public) "6: change a" files: a
  |
  o    5 717e9b37cdb7 (public) "5: merge 2 and 3, copy b to dir/d" files: dir/d e
  |\
  | o  4 86a55cb968d5 (public) "4: change a" files: a
  | |
  o |  3 0e6e235919dd (public) "3: copy a to e, change b" files: b e
  | |
  o |  2 0394b0d5e4f7 (public) "2: add dir/c" files: dir/c
  |/
  o  1 333546584845 (public) "1: add a and dir/b" files: a dir/b
  |
  o  0 d1a24e2ebd23 (public) "0: add 0" files: 0
  
  $ cd ..

Convert excluding rev 0 and dir/ (and thus rev2):

  $ cat << EOF > filemap
  > exclude dir
  > EOF

  $ hg convert --filemap filemap source dest --config convert.hg.revs=1::
  initializing destination dest repository
  scanning source...
  sorting...
  converting...
  5 1: add a and dir/b
  4 2: add dir/c
  3 3: copy a to e, change b
  2 4: change a
  1 5: merge 2 and 3, copy b to dir/d
  0 6: change a

Verify that conversion skipped rev 2:

  $ glog -R dest
  o  4 78814e84a217 (draft) "6: change a" files: a
  |
  o    3 f7cff662c5e5 (draft) "5: merge 2 and 3, copy b to dir/d" files: e
  |\
  | o  2 ab40a95b0072 (draft) "4: change a" files: a
  | |
  o |  1 bd51f17597bf (draft) "3: copy a to e, change b" files: b e
  |/
  o  0 a4a1dae0fe35 (draft) "1: add a and dir/b" files: 0 a
  

Verify mapping correct in both directions:

  $ cat source/.hg/shamap
  a4a1dae0fe3514cefd9b8541b7abbc8f44f946d5 333546584845f70c4cfecb992341aaef0e708166
  bd51f17597bf32268e68a560b206898c3960cda2 0e6e235919dd8e9285ba8eb5adf703af9ad99378
  ab40a95b00725307e79c2fd271000aa8af9759f4 86a55cb968d51770cba2a1630d6cc637b574580a
  f7cff662c5e581e6f3f1a85ffdd2bcb35825f6ba 717e9b37cdb7eb9917ca8e30aa3f986e6d5b177d
  78814e84a217894517c2de392b903ed05e6871a4 0613c8e59a3ddb9789072ef52f1ed13496489bb4
  $ cat dest/.hg/shamap
  333546584845f70c4cfecb992341aaef0e708166 a4a1dae0fe3514cefd9b8541b7abbc8f44f946d5
  0394b0d5e4f761ced559fd0bbdc6afc16cb3f7d1 a4a1dae0fe3514cefd9b8541b7abbc8f44f946d5
  0e6e235919dd8e9285ba8eb5adf703af9ad99378 bd51f17597bf32268e68a560b206898c3960cda2
  86a55cb968d51770cba2a1630d6cc637b574580a ab40a95b00725307e79c2fd271000aa8af9759f4
  717e9b37cdb7eb9917ca8e30aa3f986e6d5b177d f7cff662c5e581e6f3f1a85ffdd2bcb35825f6ba
  0613c8e59a3ddb9789072ef52f1ed13496489bb4 78814e84a217894517c2de392b903ed05e6871a4

Verify meta data converted correctly:

  $ hg -R dest log -r 1 --debug -p --git
  changeset:   1:bd51f17597bf32268e68a560b206898c3960cda2
  phase:       draft
  parent:      0:a4a1dae0fe3514cefd9b8541b7abbc8f44f946d5
  parent:      -1:0000000000000000000000000000000000000000
  manifest:    1:040c72ed9b101773c24ac314776bfc846943781f
  user:        test
  date:        Thu Jan 01 00:00:00 1970 +0000
  files+:      b e
  extra:       branch=default
  description:
  3: copy a to e, change b
  
  
  diff --git a/b b/b
  new file mode 100644
  --- /dev/null
  +++ b/b
  @@ -0,0 +1,1 @@
  +b
  diff --git a/a b/e
  copy from a
  copy to e

Verify files included and excluded correctly:

  $ hg -R dest manifest -r tip
  0
  a
  b
  e

Make changes in dest and convert back:

  $ hg -R dest up -q
  $ echo dest > dest/dest
  $ hg -R dest ci -Aqm 'change in dest'
  $ hg -R dest tip
  changeset:   5:a2e0e3cc6d1d
  tag:         tip
  user:        test
  date:        Thu Jan 01 00:00:00 1970 +0000
  summary:     change in dest
  

(converting merges back after using a filemap will probably cause chaos so we exclude merges.)

  $ hg convert dest source --config convert.hg.revs='!merge()'
  scanning source...
  sorting...
  converting...
  0 change in dest

Verify the conversion back:

  $ hg -R source log --debug -r tip
  changeset:   7:e6d364a69ff1248b2099e603b0c145504cade6f0
  tag:         tip
  phase:       draft
  parent:      6:0613c8e59a3ddb9789072ef52f1ed13496489bb4
  parent:      -1:0000000000000000000000000000000000000000
  manifest:    7:aa3e9542f3b76d4f1f1b2e9c7ce9dbb48b6a95ec
  user:        test
  date:        Thu Jan 01 00:00:00 1970 +0000
  files+:      dest
  extra:       branch=default
  description:
  change in dest
  
  

Files that had been excluded are still present:

  $ hg -R source manifest -r tip
  0
  a
  b
  dest
  dir/b
  dir/c
  dir/d
  e

More source changes

  $ cd source
  $ echo 1 >> a
  $ hg ci -m '8: source first branch'
  created new head
  $ hg up -qr -2
  $ echo 2 >> a
  $ hg ci -m '9: source second branch'
  $ hg merge -q --tool internal:local
  $ hg ci -m '10: source merge'
  $ echo >> a
  $ hg ci -m '11: source change'

  $ hg mani
  0
  a
  b
  dest
  dir/b
  dir/c
  dir/d
  e

  $ glog -r 6:
  @  11 0c8927d1f7f4 (draft) "11: source change" files: a
  |
  o    10 9ccb7ee8d261 (draft) "10: source merge" files: a
  |\
  | o  9 f131b1518dba (draft) "9: source second branch" files: a
  | |
  o |  8 669cf0e74b50 (draft) "8: source first branch" files: a
  | |
  | o  7 e6d364a69ff1 (draft) "change in dest" files: dest
  |/
  o  6 0613c8e59a3d (public) "6: change a" files: a
  |
  ~
  $ cd ..

  $ hg convert --filemap filemap source dest --config convert.hg.revs=3:
  scanning source...
  sorting...
  converting...
  3 8: source first branch
  2 9: source second branch
  1 10: source merge
  0 11: source change

  $ glog -R dest
  o  9 8432d597b263 (draft) "11: source change" files: a
  |
  o    8 632ffacdcd6f (draft) "10: source merge" files: a
  |\
  | o  7 049cfee90ee6 (draft) "9: source second branch" files: a
  | |
  o |  6 9b6845e036e5 (draft) "8: source first branch" files: a
  | |
  | @  5 a2e0e3cc6d1d (draft) "change in dest" files: dest
  |/
  o  4 78814e84a217 (draft) "6: change a" files: a
  |
  o    3 f7cff662c5e5 (draft) "5: merge 2 and 3, copy b to dir/d" files: e
  |\
  | o  2 ab40a95b0072 (draft) "4: change a" files: a
  | |
  o |  1 bd51f17597bf (draft) "3: copy a to e, change b" files: b e
  |/
  o  0 a4a1dae0fe35 (draft) "1: add a and dir/b" files: 0 a
  
  $ cd ..

Two way tests

  $ hg init 0
  $ echo f > 0/f
  $ echo a > 0/a-only
  $ echo b > 0/b-only
  $ hg -R 0 ci -Aqm0

  $ cat << EOF > filemap-a
  > exclude b-only
  > EOF
  $ cat << EOF > filemap-b
  > exclude a-only
  > EOF
  $ hg convert --filemap filemap-a 0 a
  initializing destination a repository
  scanning source...
  sorting...
  converting...
  0 0
  $ hg -R a up -q
  $ echo a > a/f
  $ hg -R a ci -ma

  $ hg convert --filemap filemap-b 0 b
  initializing destination b repository
  scanning source...
  sorting...
  converting...
  0 0
  $ hg -R b up -q
  $ echo b > b/f
  $ hg -R b ci -mb

  $ tail 0/.hg/shamap
  86f3f774ffb682bffb5dc3c1d3b3da637cb9a0d6 8a028c7c77f6c7bd6d63bc3f02ca9f779eabf16a
  dd9f218eb91fb857f2a62fe023e1d64a4e7812fe 8a028c7c77f6c7bd6d63bc3f02ca9f779eabf16a
  $ tail a/.hg/shamap
  8a028c7c77f6c7bd6d63bc3f02ca9f779eabf16a 86f3f774ffb682bffb5dc3c1d3b3da637cb9a0d6
  $ tail b/.hg/shamap
  8a028c7c77f6c7bd6d63bc3f02ca9f779eabf16a dd9f218eb91fb857f2a62fe023e1d64a4e7812fe

  $ hg convert a 0
  scanning source...
  sorting...
  converting...
  0 a

  $ hg convert b 0
  scanning source...
  sorting...
  converting...
  0 b

  $ hg -R 0 log -G
  o  changeset:   2:637fbbbe96b6
  |  tag:         tip
  |  parent:      0:8a028c7c77f6
  |  user:        test
  |  date:        Thu Jan 01 00:00:00 1970 +0000
  |  summary:     b
  |
  | o  changeset:   1:ec7b9c96e692
  |/   user:        test
  |    date:        Thu Jan 01 00:00:00 1970 +0000
  |    summary:     a
  |
  @  changeset:   0:8a028c7c77f6
     user:        test
     date:        Thu Jan 01 00:00:00 1970 +0000
     summary:     0
  
  $ hg convert --filemap filemap-b 0 a --config convert.hg.revs=1::
  scanning source...
  sorting...
  converting...

  $ hg -R 0 up -r1
  1 files updated, 0 files merged, 0 files removed, 0 files unresolved
  $ echo f >> 0/f
  $ hg -R 0 ci -mx

  $ hg convert --filemap filemap-b 0 a --config convert.hg.revs=1::
  scanning source...
  sorting...
  converting...
  0 x

  $ hg -R a log -G -T '{rev} {desc|firstline} ({files})\n'
  o  2 x (f)
  |
  @  1 a (f)
  |
  o  0 0 (a-only f)
  
  $ hg -R a mani -r tip
  a-only
  f

An additional round, demonstrating that unchanged files don't get converted

  $ echo f >> 0/f
  $ echo f >> 0/a-only
  $ hg -R 0 ci -m "extra f+a-only change"

  $ hg convert --filemap filemap-b 0 a --config convert.hg.revs=1::
  scanning source...
  sorting...
  converting...
  0 extra f+a-only change

  $ hg -R a log -G -T '{rev} {desc|firstline} ({files})\n'
  o  3 extra f+a-only change (f)
  |
  o  2 x (f)
  |
  @  1 a (f)
  |
  o  0 0 (a-only f)
  

Conversion after rollback

  $ hg -R a rollback -f
  repository tip rolled back to revision 2 (undo convert)

  $ hg convert --filemap filemap-b 0 a --config convert.hg.revs=1::
  scanning source...
  sorting...
  converting...
  0 extra f+a-only change

  $ hg -R a log -G -T '{rev} {desc|firstline} ({files})\n'
  o  3 extra f+a-only change (f)
  |
  o  2 x (f)
  |
  @  1 a (f)
  |
  o  0 0 (a-only f)
  

Convert with --full adds and removes files that didn't change

  $ echo f >> 0/f
  $ hg -R 0 ci -m "f"
  $ hg convert --filemap filemap-b --full 0 a --config convert.hg.revs=1::
  scanning source...
  sorting...
  converting...
  0 f
  $ hg -R a status --change tip
  M f
  A b-only
  R a-only

Recorded {files} list does not get confused about flags on merge commits

#if execbit
  $ cd ..
  $ hg init merge-flags-orig
  $ cd merge-flags-orig
  $ echo 0 > 0
  $ hg ci -Aqm 'add 0'
  $ echo a > a
  $ chmod +x a
  $ hg ci -qAm 'add executable file'
  $ hg co -q 0
  $ echo b > b
  $ hg ci -qAm 'add file'
  $ hg merge -q
  $ hg ci -m 'merge'
  $ hg log -G -T '{rev} {desc}\n'
  @    3 merge
  |\
  | o  2 add file
  | |
  o |  1 add executable file
  |/
  o  0 add 0
  

# No files changed
  $ hg log -r 3 -T '{files}\n'
  

  $ cd ..
  $ hg convert merge-flags-orig merge-flags-new -q
  $ cd merge-flags-new
  $ hg log -G -T '{rev} {desc}\n'
  o    3 merge
  |\
  | o  2 add file
  | |
  o |  1 add executable file
  |/
  o  0 add 0
  

# Still no files
  $ hg log -r 3 -T '{files}\n'
  

#endif