annotate tests/test-convert-bzr-merges.t @ 30442:41a8106789ca

util: implement zstd compression engine

Now that zstd is vendored and being built (in some configurations), we can implement a compression engine for zstd!

The zstd engine is a little different from existing engines. Because it may not always be present, we have to defer loading the module in case importing it fails. We facilitate this via a cached property that holds a reference to the module or None. The "available" method is implemented to reflect reality.

The zstd engine declares its ability to handle bundles using the "zstd" human name and the "ZS" internal name. The latter was chosen because internal names are 2 characters (only by convention, I think) and "ZS" seems reasonable.

The engine, like others, supports specifying the compression level. However, no consumers of this API pass in that argument yet. I have plans to change that, so stay tuned.

Since all we need to do to support bundle generation with a new compression engine is implement and register the compression engine, bundle generation with zstd "just works!" Tests demonstrating this have been added.

How does performance of zstd for bundle generation compare? On the mozilla-unified repo, `hg bundle --all -t <engine>-v2` yields the following on my i7-6700K on Linux:

  engine        CPU time    bundle size     vs orig size   throughput
  none            97.0s     4,054,405,584      100.0%      41.8 MB/s
  bzip2 (l=9)    393.6s       975,343,098       24.0%      10.3 MB/s
  gzip (l=6)     184.0s     1,140,533,074       28.1%      22.0 MB/s
  zstd (l=1)     108.2s     1,119,434,718       27.6%      37.5 MB/s
  zstd (l=2)     111.3s     1,078,328,002       26.6%      36.4 MB/s
  zstd (l=3)     113.7s     1,011,823,727       25.0%      35.7 MB/s
  zstd (l=4)     116.0s     1,008,965,888       24.9%      35.0 MB/s
  zstd (l=5)     121.0s       977,203,148       24.1%      33.5 MB/s
  zstd (l=6)     131.7s       927,360,198       22.9%      30.8 MB/s
  zstd (l=7)     139.0s       912,808,505       22.5%      29.2 MB/s
  zstd (l=12)    198.1s       854,527,714       21.1%      20.5 MB/s
  zstd (l=18)    681.6s       789,750,690       19.5%       5.9 MB/s

On compression, zstd for bundle generation delivers:

* better compression than gzip with significantly less CPU utilization
* better compression ratios than bzip2 while still being significantly faster than gzip
* ability to aggressively tune compression level to achieve significantly smaller bundles

That last point is important. With clone bundles, a server can pre-generate a bundle file, upload it to a static file server, and redirect clients to transparently download it during clone. The server could choose to produce a zstd bundle with the highest compression settings possible. This would take a very long time, an order of magnitude longer than a typical zstd bundle generation, but the result would be hundreds of megabytes smaller! For the clone volume we do at Mozilla, this could translate to petabytes of bandwidth savings per year and faster clones (due to smaller transfer size).

I don't have detailed numbers to report on decompression. However, zstd decompression is fast: >1 GB/s output throughput on this machine, even through the Python bindings. And it can do that regardless of the compression level of the input. By the time you have enough data to worry about overhead of decompression, you have plenty of other things to worry about performance-wise.

zstd wins all around. I can't wait to implement support for it on the wire protocol and in revlogs.
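
The deferred-import pattern described in the changeset message (a cached property holding the module or None, plus an "available" method) can be sketched roughly as follows. This is an illustration only, not the code from the changeset: the class name `LazyEngine`, its constructor argument, and the `bundletype` method name are hypothetical, and only the "zstd" and "ZS" names come from the message itself.

```python
import importlib


class LazyEngine:
    """Hypothetical compression-engine wrapper (not Mercurial's real class)."""

    # Sentinel meaning "import not attempted yet"; None means "import failed".
    _UNSET = object()

    def __init__(self, modulename="zstd"):
        # modulename is an assumption for illustration; the optional module
        # may be missing in some builds.
        self._modulename = modulename
        self._cached = self._UNSET

    @property
    def module(self):
        # Import at most once and remember the outcome, success or failure.
        if self._cached is self._UNSET:
            try:
                self._cached = importlib.import_module(self._modulename)
            except ImportError:
                self._cached = None
        return self._cached

    def available(self):
        # Reflect reality: usable only if the module actually imported.
        return self.module is not None

    def bundletype(self):
        # (human-readable name, 2-character internal name) from the message.
        return ("zstd", "ZS")
```

A caller would check `available()` before offering the engine, so a build without the optional module degrades to the other engines instead of failing at import time.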
author Gregory Szorc <gregory.szorc@gmail.com>
date Fri, 11 Nov 2016 01:10:07 -0800
parents 89872688893f
children 30a027c0e327
rev   line source
26066
89872688893f tests: move '#require bzr' into .t files
Gregory Szorc <gregory.szorc@gmail.com>
parents: 16913
diff changeset
1 #require bzr
2
12516
90efbd1a2a56 tests: unify test-convert-bzr-merges
Matt Mackall <mpm@selenic.com>
parents: 8084
diff changeset
3 N.B. bzr 1.13 has a bug that breaks this test. If you see this
4 test fail, check your bzr version. Upgrading to bzr 1.13.1
5 should fix it.
7053
209ef5f3534c convert: add bzr source
Marek Kubica <marek@xivilization.net>
parents:
diff changeset
6
12516
90efbd1a2a56 tests: unify test-convert-bzr-merges
Matt Mackall <mpm@selenic.com>
parents: 8084
diff changeset
7 $ . "$TESTDIR/bzr-definitions"
8084
5b3fee9c1f4d Add comment about this test failing under bzr 1.13 due to a bug in bzr.
Greg Ward <greg-hg@gerg.ca>
parents: 7604
diff changeset
8
12516
90efbd1a2a56 tests: unify test-convert-bzr-merges
Matt Mackall <mpm@selenic.com>
parents: 8084
diff changeset
9 test multiple merges at once
7053
209ef5f3534c convert: add bzr source
Marek Kubica <marek@xivilization.net>
parents:
diff changeset
10
12516
90efbd1a2a56 tests: unify test-convert-bzr-merges
Matt Mackall <mpm@selenic.com>
parents: 8084
diff changeset
11 $ mkdir test-multimerge
12 $ cd test-multimerge
13 $ bzr init -q source
14 $ cd source
15 $ echo content > file
16 $ bzr add -q file
17 $ bzr commit -q -m 'Initial add'
18 $ cd ..
19 $ bzr branch -q source source-branch1
20 $ cd source-branch1
21 $ echo morecontent >> file
22 $ echo evenmorecontent > file-branch1
23 $ bzr add -q file-branch1
24 $ bzr commit -q -m 'Added branch1 file'
25 $ cd ../source
26 $ sleep 1
27 $ echo content > file-parent
28 $ bzr add -q file-parent
29 $ bzr commit -q -m 'Added parent file'
30 $ cd ..
31 $ bzr branch -q source source-branch2
32 $ cd source-branch2
33 $ echo somecontent > file-branch2
34 $ bzr add -q file-branch2
35 $ bzr commit -q -m 'Added brach2 file'
36 $ sleep 1
37 $ cd ../source
38 $ bzr merge -q ../source-branch1
39 $ bzr merge -q --force ../source-branch2
40 $ bzr commit -q -m 'Merged branches'
41 $ cd ..
42 $ hg convert --datesort source source-hg
43 initializing destination source-hg repository
44 scanning source...
45 sorting...
46 converting...
47 4 Initial add
48 3 Added branch1 file
49 2 Added parent file
50 1 Added brach2 file
51 0 Merged branches
52 $ glog -R source-hg
16060
f84dda152a55 convert/bzr: convert all branches (issue3229) (BC)
Patrick Mezard <pmezard@gmail.com>
parents: 12516
diff changeset
53 o 5@source "(octopus merge fixup)" files:
12516
90efbd1a2a56 tests: unify test-convert-bzr-merges
Matt Mackall <mpm@selenic.com>
parents: 8084
diff changeset
54 |\
16060
f84dda152a55 convert/bzr: convert all branches (issue3229) (BC)
Patrick Mezard <pmezard@gmail.com>
parents: 12516
diff changeset
55 | o 4@source "Merged branches" files: file-branch2
12516
90efbd1a2a56 tests: unify test-convert-bzr-merges
Matt Mackall <mpm@selenic.com>
parents: 8084
diff changeset
56 | |\
16060
f84dda152a55 convert/bzr: convert all branches (issue3229) (BC)
Patrick Mezard <pmezard@gmail.com>
parents: 12516
diff changeset
57 o---+ 3@source-branch2 "Added brach2 file" files: file-branch2
12516
90efbd1a2a56 tests: unify test-convert-bzr-merges
Matt Mackall <mpm@selenic.com>
parents: 8084
diff changeset
58 / /
16060
f84dda152a55 convert/bzr: convert all branches (issue3229) (BC)
Patrick Mezard <pmezard@gmail.com>
parents: 12516
diff changeset
59 | o 2@source "Added parent file" files: file-parent
12516
90efbd1a2a56 tests: unify test-convert-bzr-merges
Matt Mackall <mpm@selenic.com>
parents: 8084
diff changeset
60 | |
16060
f84dda152a55 convert/bzr: convert all branches (issue3229) (BC)
Patrick Mezard <pmezard@gmail.com>
parents: 12516
diff changeset
61 o | 1@source-branch1 "Added branch1 file" files: file file-branch1
12516
90efbd1a2a56 tests: unify test-convert-bzr-merges
Matt Mackall <mpm@selenic.com>
parents: 8084
diff changeset
62 |/
16060
f84dda152a55 convert/bzr: convert all branches (issue3229) (BC)
Patrick Mezard <pmezard@gmail.com>
parents: 12516
diff changeset
63 o 0@source "Initial add" files: file
12516
90efbd1a2a56 tests: unify test-convert-bzr-merges
Matt Mackall <mpm@selenic.com>
parents: 8084
diff changeset
64
65 $ manifest source-hg tip
66 % manifest of tip
67 644 file
68 644 file-branch1
69 644 file-branch2
70 644 file-parent
16913
f2719b387380 tests: add missing trailing 'cd ..'
Mads Kiilerich <mads@kiilerich.com>
parents: 16060
diff changeset
71
72 $ cd ..