annotate contrib/bdiff-torture.py @ 30443:2e484bdea8c4
zstd: vendor zstd 1.1.1

zstd is a new compression format and it is awesome, yielding
higher compression ratios and significantly faster compression
and decompression operations compared to zlib (our current
compression engine of choice) across the board.

We want zstd to be a first-class citizen in Mercurial and to eventually
be the preferred compression format for various operations.

This patch starts the formal process of supporting zstd by vendoring
a copy of zstd. Why do we need to vendor zstd? Good question.

First, zstd is relatively new and not widely available yet. If we
didn't vendor zstd or distribute it with Mercurial, most users likely
wouldn't have zstd installed or even available to install. What good
is a feature if you can't use it? Vendoring and distributing the zstd
sources gives us the highest likelihood that zstd will be available to
Mercurial installs.

Second, the Python bindings to zstd (which will be vendored in a
separate changeset) make use of zstd APIs that are only available
via static linking. One reason they are only available via static
linking is that they are unstable and could change at any time.
While it might be possible for the Python bindings to attempt to
talk to different versions of the zstd C library, the safest thing to
do is link against a specific, known-working version of zstd. This
is why the Python zstd bindings themselves vendor zstd and why we
must as well. This also explains why the added files are in a
"python-zstandard" directory.

The added files are from the 1.1.1 release of zstd (Git commit
4c0b44f8ced84c4c8edfa07b564d31e4fa3e8885 from
https://github.com/facebook/zstd) and are added without modifications.

Not all files from the zstd "distribution" have been added. Notably
missing are files to support interacting with "legacy," pre-1.0
versions of zstd. The decision of which files to include is made by
the upstream python-zstandard project (which I'm the author of). The
files in this commit are a snapshot of the files from the 0.5.0
release of that project, Git commit
e637c1b214d5f869cf8116c550dcae23ec13b677 from
https://github.com/indygreg/python-zstandard.

author | Gregory Szorc <gregory.szorc@gmail.com> |
---|---|
date | Thu, 10 Nov 2016 21:45:29 -0800 |
parents | eccfd6500636 |
children | ded48ad55146 |
rev | changeset | description | author | parents |
---|---|---|---|---|
29012 | 4bd67ae7d75a | bdiff: fix latent normalization bug | Matt Mackall <mpm@selenic.com> | |
29209 | eccfd6500636 | py3: make contrib/bdiff-torture.py conform to our import style | Yuya Nishihara <yuya@tcha.org> | 29012 |

rev | line source |
---|---|
4bd67ae7d75a | 1 # Randomized torture test generation for bdiff |
4bd67ae7d75a | 2 |
4bd67ae7d75a | 3 from __future__ import absolute_import, print_function |
eccfd6500636 | 4 import random |
eccfd6500636 | 5 import sys |
eccfd6500636 | 6 |
4bd67ae7d75a | 7 from mercurial import ( |
4bd67ae7d75a | 8     bdiff, |
4bd67ae7d75a | 9     mpatch, |
4bd67ae7d75a | 10 ) |
4bd67ae7d75a | 11 |
4bd67ae7d75a | 12 def reducetest(a, b): |
4bd67ae7d75a | 13     tries = 0 |
4bd67ae7d75a | 14     reductions = 0 |
4bd67ae7d75a | 15     print("reducing...") |
4bd67ae7d75a | 16     while tries < 1000: |
4bd67ae7d75a | 17         a2 = "\n".join(l for l in a.splitlines() |
4bd67ae7d75a | 18                        if random.randint(0, 100) > 0) + "\n" |
4bd67ae7d75a | 19         b2 = "\n".join(l for l in b.splitlines() |
4bd67ae7d75a | 20                        if random.randint(0, 100) > 0) + "\n" |
4bd67ae7d75a | 21         if a2 == a and b2 == b: |
4bd67ae7d75a | 22             continue |
4bd67ae7d75a | 23         if a2 == b2: |
4bd67ae7d75a | 24             continue |
4bd67ae7d75a | 25         tries += 1 |
4bd67ae7d75a | 26 |
4bd67ae7d75a | 27         try: |
4bd67ae7d75a | 28             test1(a, b) |
4bd67ae7d75a | 29         except Exception as inst: |
4bd67ae7d75a | 30             reductions += 1 |
4bd67ae7d75a | 31             tries = 0 |
4bd67ae7d75a | 32             a = a2 |
4bd67ae7d75a | 33             b = b2 |
4bd67ae7d75a | 34 |
4bd67ae7d75a | 35     print("reduced:", reductions, len(a) + len(b), |
4bd67ae7d75a | 36           repr(a), repr(b)) |
4bd67ae7d75a | 37     try: |
4bd67ae7d75a | 38         test1(a, b) |
4bd67ae7d75a | 39     except Exception as inst: |
4bd67ae7d75a | 40         print("failed:", inst) |
4bd67ae7d75a | 41 |
4bd67ae7d75a | 42     sys.exit(0) |
4bd67ae7d75a | 43 |
4bd67ae7d75a | 44 def test1(a, b): |
4bd67ae7d75a | 45     d = bdiff.bdiff(a, b) |
4bd67ae7d75a | 46     if not d: |
4bd67ae7d75a | 47         raise ValueError("empty") |
4bd67ae7d75a | 48     c = mpatch.patches(a, [d]) |
4bd67ae7d75a | 49     if c != b: |
4bd67ae7d75a | 50         raise ValueError("bad") |
4bd67ae7d75a | 51 |
4bd67ae7d75a | 52 def testwrap(a, b): |
4bd67ae7d75a | 53     try: |
4bd67ae7d75a | 54         test1(a, b) |
4bd67ae7d75a | 55         return |
4bd67ae7d75a | 56     except Exception as inst: |
4bd67ae7d75a | 57         pass |
4bd67ae7d75a | 58     print("exception:", inst) |
4bd67ae7d75a | 59     reducetest(a, b) |
4bd67ae7d75a | 60 |
4bd67ae7d75a | 61 def test(a, b): |
4bd67ae7d75a | 62     testwrap(a, b) |
4bd67ae7d75a | 63     testwrap(b, a) |
4bd67ae7d75a | 64 |
4bd67ae7d75a | 65 def rndtest(size, noise): |
4bd67ae7d75a | 66     a = [] |
4bd67ae7d75a | 67     src = " aaaaaaaabbbbccd" |
4bd67ae7d75a | 68     for x in xrange(size): |
4bd67ae7d75a | 69         a.append(src[random.randint(0, len(src) - 1)]) |
4bd67ae7d75a | 70 |
4bd67ae7d75a | 71     while True: |
4bd67ae7d75a | 72         b = [c for c in a if random.randint(0, 99) > noise] |
4bd67ae7d75a | 73         b2 = [] |
4bd67ae7d75a | 74         for c in b: |
4bd67ae7d75a | 75             b2.append(c) |
4bd67ae7d75a | 76             while random.randint(0, 99) < noise: |
4bd67ae7d75a | 77                 b2.append(src[random.randint(0, len(src) - 1)]) |
4bd67ae7d75a | 78         if b2 != a: |
4bd67ae7d75a | 79             break |
4bd67ae7d75a | 80 |
4bd67ae7d75a | 81     a = "\n".join(a) + "\n" |
4bd67ae7d75a | 82     b = "\n".join(b2) + "\n" |
4bd67ae7d75a | 83 |
4bd67ae7d75a | 84     test(a, b) |
4bd67ae7d75a | 85 |
4bd67ae7d75a | 86 maxvol = 10000 |
4bd67ae7d75a | 87 startsize = 2 |
4bd67ae7d75a | 88 while True: |
4bd67ae7d75a | 89     size = startsize |
4bd67ae7d75a | 90     count = 0 |
4bd67ae7d75a | 91     while size < maxvol: |
4bd67ae7d75a | 92         print(size) |
4bd67ae7d75a | 93         volume = 0 |
4bd67ae7d75a | 94         while volume < maxvol: |
4bd67ae7d75a | 95             rndtest(size, 2) |
4bd67ae7d75a | 96             volume += size |
4bd67ae7d75a | 97             count += 2 |
4bd67ae7d75a | 98         size *= 2 |
4bd67ae7d75a | 99     maxvol *= 4 |
4bd67ae7d75a | 100     startsize *= 4 |
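
The annotated script checks one property: applying the delta computed from `a` to `b` back onto `a` must reproduce `b`. When that fails, it randomly drops lines to shrink the failing input. The following is a minimal, standalone Python 3 sketch of the same idea, not the script itself. It assumes `mercurial.bdiff` and `mercurial.mpatch` are unavailable and swaps in hypothetical stand-ins (`make_delta`, `apply_delta`) built on the standard library's `difflib`. It works on lists of lines rather than newline-joined strings and uses `range` where the listing uses the Python 2 `xrange`. Unlike `reducetest` in the listing, which calls `test1(a, b)` on the unreduced inputs, the sketch tests the reduced candidates before accepting them. That reflects an assumption about the intended behavior. The names `roundtrip_ok`, `random_pair`, and `reduce_failure` are also illustrative.

```python
import difflib
import random


def make_delta(a, b):
    """Stand-in for a binary diff: (start, end, replacement) hunks over a."""
    matcher = difflib.SequenceMatcher(None, a, b, autojunk=False)
    return [(i1, i2, b[j1:j2])
            for tag, i1, i2, j1, j2 in matcher.get_opcodes()
            if tag != "equal"]


def apply_delta(a, delta):
    """Stand-in for patch application: replay the hunks against a."""
    out, pos = [], 0
    for start, end, replacement in delta:
        out.extend(a[pos:start])  # copy the unchanged run before the hunk
        out.extend(replacement)   # emit the replacement lines
        pos = end                 # skip the lines the hunk replaced
    out.extend(a[pos:])
    return out


def roundtrip_ok(a, b, diff=make_delta, patch=apply_delta):
    """The property under test: patching a with diff(a, b) yields b."""
    return patch(a, diff(a, b)) == b


def random_pair(size, noise, rng):
    """Random line list plus a noisy copy, in the spirit of rndtest.

    size must be at least 1, otherwise no differing copy can be produced.
    """
    src = " aaaaaaaabbbbccd"
    a = [rng.choice(src) for _ in range(size)]
    while True:
        b = []
        for c in a:
            if rng.randint(0, 99) <= noise:
                continue                       # drop this line
            b.append(c)
            while rng.randint(0, 99) < noise:  # occasionally insert a line
                b.append(rng.choice(src))
        if b != a:
            return a, b


def reduce_failure(a, b, fails, rng, max_tries=1000):
    """Randomly drop lines, keeping a candidate only if it still fails."""
    tries = 0
    while tries < max_tries:
        tries += 1  # counted up front so the loop always terminates
        a2 = [line for line in a if rng.randint(0, 100) > 0]
        b2 = [line for line in b if rng.randint(0, 100) > 0]
        if (a2 == a and b2 == b) or a2 == b2:
            continue  # nothing was dropped, or the inputs became identical
        if fails(a2, b2):
            a, b, tries = a2, b2, 0  # accept the smaller pair and restart
    return a, b


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so a failure can be replayed
    for size in (2, 8, 32, 128):
        for _ in range(200):
            a, b = random_pair(size, 2, rng)
            assert roundtrip_ok(a, b) and roundtrip_ok(b, a)
    print("round-trip held for all sampled pairs")
```

To exercise a real diff implementation, pass it through the `diff` and `patch` parameters of `roundtrip_ok`. A failing pair can then be handed to `reduce_failure` with `fails=lambda x, y: not roundtrip_ok(x, y, diff, patch)`.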