annotate mercurial/pure/base85.py @ 51723:9367571fea21

cext: correct the argument handling of `b85encode()`

The type stub indicated that this argument is `Optional`, which implies None is allowed. I don't see in the documentation where that's the case for `i`[1], and trying it in `hg debugshell` resulted in the method failing with a TypeError. I guess it was typed as an `int` argument because the `p` format unit wasn't added until Python 3.3[2].

In any event, 2 clients in core (`pvec` and `obsolete`) call this with no argument supplied, and `mdiff` calls it with True. So I guess we've avoided the None arg case, and when no arg is supplied, it defaults to the 0 initialization of the `pad` variable in C.

Since the `p` format unit accepts both `int` and None, as well as `bool`, I'm not bothering to bump the module version- this code is more permissive than it was, in addition to being more correct.

Interestingly, when I first imported the `cext` and `pure` methods in the same manner as the previous commit, it dropped the `Optional` part of the argument type when generating `util.pyi`. No idea why.

[1] https://docs.python.org/3/c-api/arg.html#numbers
[2] https://docs.python.org/3/c-api/arg.html#other-objects
author Matt Harbison <matt_harbison@yahoo.com>
date Sat, 20 Jul 2024 01:55:09 -0400
parents 6000f5b25c9b
children f4733654f144
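
The description above turns on how the `pad` argument is interpreted. In the pure-Python source annotated below, `pad` is only tested for truthiness (`if pad:`), so omitting it or passing `False`, `0`, or `None` all take the trimming path. A minimal standalone sketch of that behavior follows; `b85encode_sketch` and `_ALPHABET` are hypothetical names for illustration, not part of Mercurial, and the sketch drops the `pycompat` dependency.

```python
import struct

# Same 85-character alphabet as the annotated source, as a plain bytes object.
_ALPHABET = (
    b"0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdef"
    b"ghijklmnopqrstuvwxyz!#$%&()*+-;<=>?@^_`{|}~"
)


def b85encode_sketch(text, pad=False):
    """Hypothetical stand-in for the pure encoder; `pad` is only truth-tested."""
    length = len(text)
    remainder = length % 4
    if remainder:
        # Pad the input with NUL bytes up to a multiple of 4.
        text += b"\0" * (4 - remainder)
    words = struct.unpack(">%dL" % (len(text) // 4), text)

    out = bytearray()
    for word in words:
        # Each 32-bit big-endian word becomes five base-85 digits.
        digits = []
        for _ in range(5):
            word, digit = divmod(word, 85)
            digits.append(_ALPHABET[digit])
        out.extend(reversed(digits))

    if pad:
        return bytes(out)
    # Trim the characters that only encode the NUL padding.
    olen = length // 4 * 5 + (remainder + 1 if remainder else 0)
    return bytes(out[:olen])


data = b"hg"
unpadded = b85encode_sketch(data)
# Every falsy value selects the same (trimmed) result.
assert b85encode_sketch(data, False) == unpadded
assert b85encode_sketch(data, 0) == unpadded
assert b85encode_sketch(data, None) == unpadded
# A truthy value keeps the full 5-character group.
padded = b85encode_sketch(data, True)
assert len(unpadded) == 3 and len(padded) == 5
assert padded.startswith(unpadded)
```

This only illustrates the Python-level semantics; how the C extension parses the argument is described in the commit message, not reproduced here.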
  rev  line  source
 7701     1  # base85.py: pure python base85 codec
 7701     2  #
 7701     3  # Copyright (C) 2009 Brendan Cully <brendan@kublai.com>
 7701     4  #
 8225     5  # This software may be used and distributed according to the terms of the
10263     6  # GNU General Public License version 2 or any later version.
 7701     7
27334     8
 7701     9  import struct
 7701    10
35944    11  from .. import pycompat
35944    12
43076    13  _b85chars = pycompat.bytestr(
43077    14      b"0123456789ABCDEFGHIJKLMNOPQRSTUVWXYZabcdef"
43077    15      b"ghijklmnopqrstuvwxyz!#$%&()*+-;<=>?@^_`{|}~"
43076    16  )
 7835    17  _b85chars2 = [(a + b) for a in _b85chars for b in _b85chars]
 7701    18  _b85dec = {}
 7701    19
43076    20
 7701    21  def _mkb85dec():
 8632    22      for i, c in enumerate(_b85chars):
 8632    23          _b85dec[c] = i
 7701    24
43076    25
51723    26  def b85encode(text: bytes, pad: bool = False) -> bytes:
 7701    27      """encode text in base85 format"""
 7701    28      l = len(text)
 7701    29      r = l % 4
 7701    30      if r:
43077    31          text += b'\0' * (4 - r)
 7701    32      longs = len(text) >> 2
43077    33      words = struct.unpack(b'>%dL' % longs, text)
 7701    34
43077    35      out = b''.join(
43076    36          _b85chars[(word // 52200625) % 85]
43076    37          + _b85chars2[(word // 7225) % 7225]
43076    38          + _b85chars2[word % 7225]
43076    39          for word in words
43076    40      )
 7701    41
 7701    42      if pad:
 7701    43          return out
 7701    44
 7701    45      # Trim padding
 7701    46      olen = l % 4
 7701    47      if olen:
 7701    48          olen += 1
 9029    49      olen += l // 4 * 5
 7701    50      return out[:olen]
 7701    51
43076    52
51723    53  def b85decode(text: bytes) -> bytes:
 7701    54      """decode base85-encoded text"""
 7701    55      if not _b85dec:
 7701    56          _mkb85dec()
 7701    57
 7701    58      l = len(text)
 7701    59      out = []
 7701    60      for i in range(0, len(text), 5):
43076    61          chunk = text[i : i + 5]
36191    62          chunk = pycompat.bytestr(chunk)
 7701    63          acc = 0
 8632    64          for j, c in enumerate(chunk):
 7701    65              try:
 8632    66                  acc = acc * 85 + _b85dec[c]
 7701    67              except KeyError:
43076    68                  raise ValueError(
43666    69                      'bad base85 character at position %d' % (i + j)
43076    70                  )
 7701    71          if acc > 4294967295:
43667    72              raise ValueError('Base85 overflow in hunk starting at byte %d' % i)
 7701    73          out.append(acc)
 7701    74
 7701    75      # Pad final chunk if necessary
 7701    76      cl = l % 5
 7701    77      if cl:
 7701    78          acc *= 85 ** (5 - cl)
 7701    79          if cl > 1:
43076    80              acc += 0xFFFFFF >> (cl - 2) * 8
 7701    81          out[-1] = acc
 7701    82
43077    83      out = struct.pack(b'>%dL' % (len(out)), *out)
 7701    84      if cl:
43076    85          out = out[: -(5 - cl)]
 7701    86
 7701    87      return out

  rev  changeset     description / author / parents
 7701  4bdead043d8d  Pure python base85 fallback / Brendan Cully <brendan@kublai.com> / parents:
 7835  2505e9f84153  Optimization of pure.base85.b85encode / Mads Kiilerich <mads@kiilerich.com> / parents: 7701
 8225  46293a0c7e9f  updated license to be explicit about GPL version 2 / Martin Geisler <mg@lazybytes.net> / parents: 7881
 8632  9e055cfdd620  replace "i in range(len(xs))" with "i, x in enumerate(xs)" / Martin Geisler <mg@lazybytes.net> / parents: 8225
 9029  0001e49f1c11  compat: use // for integer division / Alejandro Santos <alejolp@alejolp.com> / parents: 8632
10263  25e572394f5c  Update license to GPLv2+ / Matt Mackall <mpm@selenic.com> / parents: 9029
27334  9007f697e8ef  base85: use absolute_import / Gregory Szorc <gregory.szorc@gmail.com> / parents: 16598
35944  01b4d88ccb24  py3: use pycompat.bytestr to convert _b85chars to bytes / Pulkit Goyal <7895pulkit@gmail.com> / parents: 27334
36191  80301c90a2dc  py3: converts bytes to pycompat.bytestr to get bytechrs while enumerating / Pulkit Goyal <7895pulkit@gmail.com> / parents: 35944
43076  2372284d9457  formatting: blacken the codebase / Augie Fackler <augie@google.com> / parents: 36191
43077  687b865b95ad  formatting: byteify all mercurial/ and hgext/ string literals / Augie Fackler <augie@google.com> / parents: 43076
43666  4394687b298b  pure: use string for exception in the pure version of base85 / Pierre-Yves David <pierre-yves.david@octobus.net> / parents: 43077
43667  4cd911040ba5  pure: use string for another exception in the pure version of base85 / Pierre-Yves David <pierre-yves.david@octobus.net> / parents: 43666
51723  9367571fea21  cext: correct the argument handling of `b85encode()` / Matt Harbison <matt_harbison@yahoo.com> / parents: 48875
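
The length arithmetic in the "Trim padding" step of `b85encode` (source lines 45-50) can be checked with a short sketch. It uses Python's standard `base64.b85encode` and `base64.b85decode` as a stand-in, since they use the same 85-character alphabet as source lines 14-15; treating them as byte-for-byte equivalent to Mercurial's codec is an assumption of this sketch, and `encoded_length` is a hypothetical helper, not a Mercurial function.

```python
import base64


def encoded_length(n, pad=False):
    """Expected base85 output length for n input bytes (hypothetical helper)."""
    full, rest = divmod(n, 4)
    if pad:
        # Every started 4-byte word yields a full 5-character group.
        return (full + (1 if rest else 0)) * 5
    # A partial word of `rest` bytes keeps only rest + 1 characters,
    # mirroring `olen = l % 4; if olen: olen += 1; olen += l // 4 * 5`.
    return full * 5 + (rest + 1 if rest else 0)


for n in range(0, 13):
    data = bytes(range(n))
    trimmed = base64.b85encode(data)
    padded = base64.b85encode(data, pad=True)
    assert len(trimmed) == encoded_length(n)
    assert len(padded) == encoded_length(n, pad=True)
    # Trimmed output decodes back to the input exactly.
    assert base64.b85decode(trimmed) == data
    # Padded output decodes to the input plus the NUL bytes added before encoding.
    assert base64.b85decode(padded) == data + b"\0" * (-n % 4)
```

The last assertion shows why callers that pass a true `pad` need to know the original length: the decoder cannot tell trailing NUL padding apart from real data.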