annotate tests/test-bdiff.py @ 30435:b86a448a2965

zstd: vendor python-zstandard 0.5.0 As the commit message for the previous changeset says, we wish for zstd to be a 1st class citizen in Mercurial. To make that happen, we need to enable Python to talk to the zstd C API. And that requires bindings. This commit vendors a copy of existing Python bindings. Why do we need to vendor? As the commit message of the previous commit says, relying on systems in the wild to have the bindings or zstd present is a losing proposition. By distributing the zstd and bindings with Mercurial, we significantly increase our chances that zstd will work. Since zstd will deliver a better end-user experience by achieving better performance, this benefits our users. Another reason is that the Python bindings still aren't stable and the API is somewhat fluid. While Mercurial could be coded to target multiple versions of the Python bindings, it is safer to bundle an explicit, known working version. The added Python bindings are mostly a fully-featured interface to the zstd C API. They allow one-shot operations, streaming, reading and writing from objects implements the file object protocol, dictionary compression, control over low-level compression parameters, and more. The Python bindings work on Python 2.6, 2.7, and 3.3+ and have been tested on Linux and Windows. There are CFFI bindings, but they are lacking compared to the C extension. Upstream work will be needed before we can support zstd with PyPy. But it will be possible. The files added in this commit come from Git commit e637c1b214d5f869cf8116c550dcae23ec13b677 from https://github.com/indygreg/python-zstandard and are added without modifications. Some files from the upstream repository have been omitted, namely files related to continuous integration. In the spirit of full disclosure, I'm the maintainer of the "python-zstandard" project and have authored 100% of the code added in this commit. Unfortunately, the Python bindings have not been formally code reviewed by anyone. While I've tested much of the code thoroughly (I even have tests that fuzz APIs), there's a good chance there are bugs, memory leaks, not well thought out APIs, etc. If someone wants to review the code and send feedback to the GitHub project, it would be greatly appreciated. Despite my involvement with both projects, my opinions of code style differ from Mercurial's. The code in this commit introduces numerous code style violations in Mercurial's linters. So, the code is excluded from most lints. However, some violations I agree with. These have been added to the known violations ignore list for now.
author Gregory Szorc <gregory.szorc@gmail.com>
date Thu, 10 Nov 2016 22:15:58 -0800
parents 96f2f50d923f
children 1b393a93a7df
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
rev   line source
28734
4e51f9d1683e py3: use print_function in test-bdiff.py
Robert Stanca <robert.stanca7@gmail.com>
parents: 28733
diff changeset
1 from __future__ import absolute_import, print_function
8656
284fda4cd093 removed unused imports
Martin Geisler <mg@lazybytes.net>
parents: 8449
diff changeset
2 import struct
28733
2e54aaa65afc py3: use absolute_import in test-bdiff.py
Robert Stanca <robert.stanca7@gmail.com>
parents: 15530
diff changeset
3 from mercurial import (
2e54aaa65afc py3: use absolute_import in test-bdiff.py
Robert Stanca <robert.stanca7@gmail.com>
parents: 15530
diff changeset
4 bdiff,
2e54aaa65afc py3: use absolute_import in test-bdiff.py
Robert Stanca <robert.stanca7@gmail.com>
parents: 15530
diff changeset
5 mpatch,
2e54aaa65afc py3: use absolute_import in test-bdiff.py
Robert Stanca <robert.stanca7@gmail.com>
parents: 15530
diff changeset
6 )
400
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
7
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
8 def test1(a, b):
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
9 d = bdiff.bdiff(a, b)
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
10 c = a
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
11 if d:
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
12 c = mpatch.patches(a, [d])
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
13 if c != b:
30427
ede7bc45bf0a tests: make test-bdiff.py easier to maintain
Mads Kiilerich <madski@unity3d.com>
parents: 29013
diff changeset
14 print("bad diff+patch result from\n %r to\n %r:" % (a, b))
ede7bc45bf0a tests: make test-bdiff.py easier to maintain
Mads Kiilerich <madski@unity3d.com>
parents: 29013
diff changeset
15 print("bdiff: %r" % d)
ede7bc45bf0a tests: make test-bdiff.py easier to maintain
Mads Kiilerich <madski@unity3d.com>
parents: 29013
diff changeset
16 print("patched: %r" % c[:200])
400
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
17
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
18 def test(a, b):
30427
ede7bc45bf0a tests: make test-bdiff.py easier to maintain
Mads Kiilerich <madski@unity3d.com>
parents: 29013
diff changeset
19 print("test", repr(a), repr(b))
400
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
20 test1(a, b)
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
21 test1(b, a)
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
22
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
23 test("a\nc\n\n\n\n", "a\nb\n\n\n")
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
24 test("a\nb\nc\n", "a\nc\n")
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
25 test("", "")
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
26 test("a\nb\nc", "a\nb\nc")
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
27 test("a\nb\nc\nd\n", "a\nd\n")
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
28 test("a\nb\nc\nd\n", "a\nc\ne\n")
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
29 test("a\nb\nc\n", "a\nc\n")
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
30 test("a\n", "c\na\nb\n")
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
31 test("a\n", "")
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
32 test("a\n", "b\nc\n")
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
33 test("a\n", "c\na\n")
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
34 test("", "adjfkjdjksdhfksj")
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
35 test("", "ab")
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
36 test("", "abc")
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
37 test("a", "a")
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
38 test("ab", "ab")
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
39 test("abc", "abc")
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
40 test("a\n", "a\n")
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
41 test("a\nb", "a\nb")
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
42
7104
9514cbb6e4f6 bdiff: normalize the diff (issue1295)
Benoit Boissinot <benoit.boissinot@ens-lyon.org>
parents: 814
diff changeset
43 #issue1295
9514cbb6e4f6 bdiff: normalize the diff (issue1295)
Benoit Boissinot <benoit.boissinot@ens-lyon.org>
parents: 814
diff changeset
44 def showdiff(a, b):
30427
ede7bc45bf0a tests: make test-bdiff.py easier to maintain
Mads Kiilerich <madski@unity3d.com>
parents: 29013
diff changeset
45 print('showdiff(\n %r,\n %r):' % (a, b))
7104
9514cbb6e4f6 bdiff: normalize the diff (issue1295)
Benoit Boissinot <benoit.boissinot@ens-lyon.org>
parents: 814
diff changeset
46 bin = bdiff.bdiff(a, b)
9514cbb6e4f6 bdiff: normalize the diff (issue1295)
Benoit Boissinot <benoit.boissinot@ens-lyon.org>
parents: 814
diff changeset
47 pos = 0
30427
ede7bc45bf0a tests: make test-bdiff.py easier to maintain
Mads Kiilerich <madski@unity3d.com>
parents: 29013
diff changeset
48 q = 0
7104
9514cbb6e4f6 bdiff: normalize the diff (issue1295)
Benoit Boissinot <benoit.boissinot@ens-lyon.org>
parents: 814
diff changeset
49 while pos < len(bin):
9514cbb6e4f6 bdiff: normalize the diff (issue1295)
Benoit Boissinot <benoit.boissinot@ens-lyon.org>
parents: 814
diff changeset
50 p1, p2, l = struct.unpack(">lll", bin[pos:pos + 12])
9514cbb6e4f6 bdiff: normalize the diff (issue1295)
Benoit Boissinot <benoit.boissinot@ens-lyon.org>
parents: 814
diff changeset
51 pos += 12
30427
ede7bc45bf0a tests: make test-bdiff.py easier to maintain
Mads Kiilerich <madski@unity3d.com>
parents: 29013
diff changeset
52 if p1:
ede7bc45bf0a tests: make test-bdiff.py easier to maintain
Mads Kiilerich <madski@unity3d.com>
parents: 29013
diff changeset
53 print('', repr(a[q:p1]))
ede7bc45bf0a tests: make test-bdiff.py easier to maintain
Mads Kiilerich <madski@unity3d.com>
parents: 29013
diff changeset
54 print('', p1, p2, repr(a[p1:p2]), '->', repr(bin[pos:pos + l]))
7104
9514cbb6e4f6 bdiff: normalize the diff (issue1295)
Benoit Boissinot <benoit.boissinot@ens-lyon.org>
parents: 814
diff changeset
55 pos += l
30427
ede7bc45bf0a tests: make test-bdiff.py easier to maintain
Mads Kiilerich <madski@unity3d.com>
parents: 29013
diff changeset
56 q = p2
ede7bc45bf0a tests: make test-bdiff.py easier to maintain
Mads Kiilerich <madski@unity3d.com>
parents: 29013
diff changeset
57 if q < len(a):
ede7bc45bf0a tests: make test-bdiff.py easier to maintain
Mads Kiilerich <madski@unity3d.com>
parents: 29013
diff changeset
58 print('', repr(a[q:]))
ede7bc45bf0a tests: make test-bdiff.py easier to maintain
Mads Kiilerich <madski@unity3d.com>
parents: 29013
diff changeset
59
7104
9514cbb6e4f6 bdiff: normalize the diff (issue1295)
Benoit Boissinot <benoit.boissinot@ens-lyon.org>
parents: 814
diff changeset
60 showdiff("x\n\nx\n\nx\n\nx\n\nz\n", "x\n\nx\n\ny\n\nx\n\nx\n\nz\n")
9514cbb6e4f6 bdiff: normalize the diff (issue1295)
Benoit Boissinot <benoit.boissinot@ens-lyon.org>
parents: 814
diff changeset
61 showdiff("x\n\nx\n\nx\n\nx\n\nz\n", "x\n\nx\n\ny\n\nx\n\ny\n\nx\n\nz\n")
29013
9a8363d23419 bdiff: deal better with duplicate lines
Matt Mackall <mpm@selenic.com>
parents: 28734
diff changeset
62 # we should pick up abbbc. rather than bc.de as the longest match
9a8363d23419 bdiff: deal better with duplicate lines
Matt Mackall <mpm@selenic.com>
parents: 28734
diff changeset
63 showdiff("a\nb\nb\nb\nc\n.\nd\ne\n.\nf\n",
9a8363d23419 bdiff: deal better with duplicate lines
Matt Mackall <mpm@selenic.com>
parents: 28734
diff changeset
64 "a\nb\nb\na\nb\nb\nb\nc\n.\nb\nc\n.\nd\ne\nf\n")
7104
9514cbb6e4f6 bdiff: normalize the diff (issue1295)
Benoit Boissinot <benoit.boissinot@ens-lyon.org>
parents: 814
diff changeset
65
28734
4e51f9d1683e py3: use print_function in test-bdiff.py
Robert Stanca <robert.stanca7@gmail.com>
parents: 28733
diff changeset
66 print("done")
15530
eeac5e179243 mdiff: replace wscleanup() regexps with C loops
Patrick Mezard <pmezard@gmail.com>
parents: 12865
diff changeset
67
eeac5e179243 mdiff: replace wscleanup() regexps with C loops
Patrick Mezard <pmezard@gmail.com>
parents: 12865
diff changeset
68 def testfixws(a, b, allws):
eeac5e179243 mdiff: replace wscleanup() regexps with C loops
Patrick Mezard <pmezard@gmail.com>
parents: 12865
diff changeset
69 c = bdiff.fixws(a, allws)
eeac5e179243 mdiff: replace wscleanup() regexps with C loops
Patrick Mezard <pmezard@gmail.com>
parents: 12865
diff changeset
70 if c != b:
28734
4e51f9d1683e py3: use print_function in test-bdiff.py
Robert Stanca <robert.stanca7@gmail.com>
parents: 28733
diff changeset
71 print("*** fixws", repr(a), repr(b), allws)
4e51f9d1683e py3: use print_function in test-bdiff.py
Robert Stanca <robert.stanca7@gmail.com>
parents: 28733
diff changeset
72 print("got:")
4e51f9d1683e py3: use print_function in test-bdiff.py
Robert Stanca <robert.stanca7@gmail.com>
parents: 28733
diff changeset
73 print(repr(c))
15530
eeac5e179243 mdiff: replace wscleanup() regexps with C loops
Patrick Mezard <pmezard@gmail.com>
parents: 12865
diff changeset
74
eeac5e179243 mdiff: replace wscleanup() regexps with C loops
Patrick Mezard <pmezard@gmail.com>
parents: 12865
diff changeset
75 testfixws(" \ta\r b\t\n", "ab\n", 1)
eeac5e179243 mdiff: replace wscleanup() regexps with C loops
Patrick Mezard <pmezard@gmail.com>
parents: 12865
diff changeset
76 testfixws(" \ta\r b\t\n", " a b\n", 0)
eeac5e179243 mdiff: replace wscleanup() regexps with C loops
Patrick Mezard <pmezard@gmail.com>
parents: 12865
diff changeset
77 testfixws("", "", 1)
eeac5e179243 mdiff: replace wscleanup() regexps with C loops
Patrick Mezard <pmezard@gmail.com>
parents: 12865
diff changeset
78 testfixws("", "", 0)
eeac5e179243 mdiff: replace wscleanup() regexps with C loops
Patrick Mezard <pmezard@gmail.com>
parents: 12865
diff changeset
79
28734
4e51f9d1683e py3: use print_function in test-bdiff.py
Robert Stanca <robert.stanca7@gmail.com>
parents: 28733
diff changeset
80 print("done")
30428
3743e5dbb824 tests: explore some bdiff cases
Mads Kiilerich <madski@unity3d.com>
parents: 30427
diff changeset
81
30431
8c0c75aa3ff4 bdiff: give slight preference to longest matches in the middle of the B side
Mads Kiilerich <madski@unity3d.com>
parents: 30429
diff changeset
82 print("Nice diff for a trivial change:")
30428
3743e5dbb824 tests: explore some bdiff cases
Mads Kiilerich <madski@unity3d.com>
parents: 30427
diff changeset
83 showdiff(
3743e5dbb824 tests: explore some bdiff cases
Mads Kiilerich <madski@unity3d.com>
parents: 30427
diff changeset
84 ''.join('<%s\n-\n' % i for i in range(5)),
3743e5dbb824 tests: explore some bdiff cases
Mads Kiilerich <madski@unity3d.com>
parents: 30427
diff changeset
85 ''.join('>%s\n-\n' % i for i in range(5)))
3743e5dbb824 tests: explore some bdiff cases
Mads Kiilerich <madski@unity3d.com>
parents: 30427
diff changeset
86
30432
3633403888ae bdiff: give slight preference to appending lines
Mads Kiilerich <madski@unity3d.com>
parents: 30431
diff changeset
87 print("Diff 1 to 3 lines - preference for appending:")
30428
3743e5dbb824 tests: explore some bdiff cases
Mads Kiilerich <madski@unity3d.com>
parents: 30427
diff changeset
88 showdiff('a\n', 'a\n' * 3)
30432
3633403888ae bdiff: give slight preference to appending lines
Mads Kiilerich <madski@unity3d.com>
parents: 30431
diff changeset
89 print("Diff 1 to 5 lines - preference for appending:")
30428
3743e5dbb824 tests: explore some bdiff cases
Mads Kiilerich <madski@unity3d.com>
parents: 30427
diff changeset
90 showdiff('a\n', 'a\n' * 5)
30433
96f2f50d923f bdiff: give slight preference to removing trailing lines
Mads Kiilerich <madski@unity3d.com>
parents: 30432
diff changeset
91 print("Diff 3 to 1 lines - preference for removing trailing lines:")
30428
3743e5dbb824 tests: explore some bdiff cases
Mads Kiilerich <madski@unity3d.com>
parents: 30427
diff changeset
92 showdiff('a\n' * 3, 'a\n')
30433
96f2f50d923f bdiff: give slight preference to removing trailing lines
Mads Kiilerich <madski@unity3d.com>
parents: 30432
diff changeset
93 print("Diff 5 to 1 lines - preference for removing trailing lines:")
30428
3743e5dbb824 tests: explore some bdiff cases
Mads Kiilerich <madski@unity3d.com>
parents: 30427
diff changeset
94 showdiff('a\n' * 5, 'a\n')