Wed, 14 Dec 2022 01:51:33 -0500 typing: add type hints to pycompat.bytestr
Matt Harbison <matt_harbison@yahoo.com> [Wed, 14 Dec 2022 01:51:33 -0500] rev 49898
typing: add type hints to pycompat.bytestr The problem with leaving pytype to its own devices here was that for functions that returned a bytestr, pytype inferred `Union[bytes, int]`. It now accepts that it can be treated as plain bytes. I wasn't able to figure out the arg type for `__getitem__`- `SupportsIndex` (which PyCharm indicated is how the superclass function is typed) got flagged: File "/mnt/c/Users/Matt/hg/mercurial/pycompat.py", line 236, in __getitem__: unsupported operand type(s) for item retrieval: bytestr and SupportsIndex [unsupported-operands] Function __getitem__ on bytestr expects int But some caller got flagged when I marked it as `int`. There's some minor spillover problems elsewhere- pytype doesn't seem to recognize that `bytes.startswith()` can optionally take a 3rd and 4th arg, so those few places have the warning disabled. It also flags where the tar API is being abused, but that would be a tricky refactor (and would require typing extensions until py3.7 is dropped), so disable those too.
Wed, 14 Dec 2022 01:38:52 -0500 pycompat: explicitly prefix builtin attr usage with `builtins.`
Matt Harbison <matt_harbison@yahoo.com> [Wed, 14 Dec 2022 01:38:52 -0500] rev 49897
pycompat: explicitly prefix builtin attr usage with `builtins.` It doesn't seem like this would fix any bug, because the wrapped functions that take bytes instead of str are defined after these calls. But PyCharm was flagging the second and third uses, saying "Type 'str' doesn't have expected attribute 'decode'". It wasn't flagging the first, but I changed it for consistency.
Wed, 14 Dec 2022 01:32:03 -0500 typing: add type hints to global variables in mercurial/pycompat.py
Matt Harbison <matt_harbison@yahoo.com> [Wed, 14 Dec 2022 01:32:03 -0500] rev 49896
typing: add type hints to global variables in mercurial/pycompat.py The way `osaltsep` and `sysexecutable` were defined, pytype determined them to be `Union[bytes, str]`. This was a problem because that cascaded to all of the callers, and also because it couldn't be annotated as bytes on the initial assignment. Therefore, we use a ternary operator. The documentation says that `sys.executable` can either be None or an empty string if the value couldn't be determined. We opt for an empty string here because there are places that blindly pass it to `os.path.xxx()` functions, which crash if given None. Other places test `if pycompat.sysexecutable`, so empty string works for both.
Tue, 13 Dec 2022 16:48:47 -0500 windows: drop an unused method
Matt Harbison <matt_harbison@yahoo.com> [Tue, 13 Dec 2022 16:48:47 -0500] rev 49895
windows: drop an unused method The only caller was removed in 563eb25e079b.
Mon, 12 Dec 2022 14:10:12 -0500 typing: add type hints to the prompt methods in mercurial/ui.py
Matt Harbison <matt_harbison@yahoo.com> [Mon, 12 Dec 2022 14:10:12 -0500] rev 49894
typing: add type hints to the prompt methods in mercurial/ui.py The @overloads allow for the callers that pass a non-None `default` to not have to worry about handling a None return to appease pytype.
Mon, 12 Dec 2022 14:17:05 -0500 ui: split the `default` arg out of **kwargs for the internal prompt method
Matt Harbison <matt_harbison@yahoo.com> [Mon, 12 Dec 2022 14:17:05 -0500] rev 49893
ui: split the `default` arg out of **kwargs for the internal prompt method This arg was required anyway, based on how it was accessed. Having it separate allows it to be typed though, and this will simplify things for the callers- if a non-None `default` is passed, the return can never be None. That can be expressed with `@overload` when the arg can be typed, but that's not possible when it is rolled up in **kwargs. The default value is simply copied from the public `prompt()` above it.
Sun, 11 Dec 2022 00:10:56 -0500 typing: add trivial type hints to mercurial/ui.py
Matt Harbison <matt_harbison@yahoo.com> [Sun, 11 Dec 2022 00:10:56 -0500] rev 49892
typing: add trivial type hints to mercurial/ui.py There's not really a pattern here; it's mostly obvious return types and in a few cases, obvious parameter types. Some other "obvious" functions are left out because of quirks in how the return value for the various config() functions are inferred cause pytype to complain.
Sat, 10 Dec 2022 14:57:42 -0500 doc: don't pass str to ui methods in check-seclevel.py
Matt Harbison <matt_harbison@yahoo.com> [Sat, 10 Dec 2022 14:57:42 -0500] rev 49891
doc: don't pass str to ui methods in check-seclevel.py
Sat, 10 Dec 2022 14:44:46 -0500 typing: add type hints related to message output in mercurial/ui.py
Matt Harbison <matt_harbison@yahoo.com> [Sat, 10 Dec 2022 14:44:46 -0500] rev 49890
typing: add type hints related to message output in mercurial/ui.py This will shake loose some bytes vs str issues in the doc checker.
Sat, 10 Dec 2022 00:22:13 -0500 typing: add type hints related to progress bars in mercurial/ui.py
Matt Harbison <matt_harbison@yahoo.com> [Sat, 10 Dec 2022 00:22:13 -0500] rev 49889
typing: add type hints related to progress bars in mercurial/ui.py Pretty low hanging fruit while trying to deal with other more complicated parts of this module.
Fri, 25 Nov 2022 18:39:47 -0500 pytype: stop excluding mercurial/ui.py
Matt Harbison <matt_harbison@yahoo.com> [Fri, 25 Nov 2022 18:39:47 -0500] rev 49888
pytype: stop excluding mercurial/ui.py ui.extractchoices() is perhaps making assumptions that it shouldn't about the pattern always matching, but presumably we have test coverage for that. PyCharm flags the updated classes with a warning "Class xxx must implement all abstract methods", and suggests adding `abc.ABC` to the superclasses. I'm not sure why, unless it doesn't recognize the `__getattr__()` delegation pattern. Additionally, we can't unconditionally subclass `typing.BinaryIO` because that defeats the `__getattr__` delegation to the wrapped object at runtime. Instead, it has to only subclass during the type checking phase[1]. In any event, this fixes: File "/mnt/c/Users/Matt/hg/mercurial/ui.py", line 1518, in _runpager: Function subprocess.Popen.__new__ was called with the wrong arguments [wrong-arg-types] Expected: (cls, args, bufsize, executable, stdin, stdout: Optional[Union[IO, int]] = ..., ...) Actually passed: (cls, args, bufsize, stdin, stdout: Union[mercurial.utils.procutil.WriteAllWrapper, mercurial.windows.winstdout], ...) File "/mnt/c/Users/Matt/hg/mercurial/ui.py", line 1798, in extractchoices: No attribute 'group' on None [attribute-error] In Optional[Match[bytes]] File "/mnt/c/Users/Matt/hg/mercurial/ui.py", line 1799, in extractchoices: No attribute 'group' on None [attribute-error] In Optional[Match[bytes]] [1] https://stackoverflow.com/q/71365594
Wed, 07 Dec 2022 20:12:23 +0100 bundle: emit full snapshot as is, without doing a redelta
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 07 Dec 2022 20:12:23 +0100] rev 49887
bundle: emit full snapshot as is, without doing a redelta With the new `forced` delta-reused policy, it become important to be able to send full snapshot where full snapshot are needed. Otherwise, the fallback delta will simply be used on the client sideā€¦ creating monstrous delta chain, since revision that are meant as a reset of delta-chain chain becoming too complex are simply adding a new full delta-tree on the leaf of another one. In the `non-forced` cases, client process full snapshot from the bundle differently from deltas, so client will still try to convert the full snapshot into a delta if possible. So this will no lead to pathological storage explosion. I have considered making this configurable, but the impact seems limited enough that it does not seems to be worth it. Especially with the current sparse-revlog format that use "delta-tree" with multiple level snapshots, full snapshot are much less frequent and not that different from other intermediate snapshot that we are already sending over the wire anyway. CPU wise, this will help the bundling side a little as it will not need to reconstruct revisions and compute deltas. The unbundling side might save a tiny amount of CPU as it won't need to reconstruct the delta-base to reconstruct the revision full text. This only slightly visible in some of the benchmarks. And have no real impact on most of them. ### data-env-vars.name = pypy-2018-08-01-zstd-sparse-revlog # benchmark.name = perf-bundle # benchmark.variants.revs = last-40000 before: 11.467186 seconds just-emit-full: 11.190576 seconds (-2.41%) with-pull-force: 11.041091 seconds (-3.72%) # benchmark.name = perf-unbundle # benchmark.variants.revs = last-40000 before: 16.744862 just-emit-full:: 16.561036 seconds (-1.10%) with-pull-force: 16.389344 seconds (-2.12%) # benchmark.name = pull # benchmark.variants.revs = last-40000 before: 26.870569 just-emit-full: 26.391188 seconds (-1.78%) with-pull-force: 25.633184 seconds (-4.60%) Space wise (so network-wise) the impact is fairly small. When taking compression into account. Below are tests the size of `hg bundle --all` for a handful of benchmark repositories (with bzip, zstd compression and without it) This show a small increase in the bundle size, but nothing really significant except maybe for mozilla-try (+12%) that nobody really pulls large chunk of anyway. Mozilla-try is also the repository that benefit the most for not having to recompute deltas client size. ### mercurial: bzip-before: 26 406 342 bytes bzip-after: 26 691 543 bytes +1.08% zstd-before: 27 918 645 bytes zstd-after: 28 075 896 bytes +0.56% none-before: 98 675 601 bytes none-after: 100 411 237 bytes +1.76% ### pypy bzip-before: 201 295 752 bytes bzip-after: 209 780 282 bytes +4.21% zstd-before: 202 974 795 bytes zstd-after: 205 165 780 bytes +1.08% none-before: 871 070 261 bytes none-after: 993 595 057 bytes +14.07% ### netbeans bzip-before: 601 314 330 bytes bzip-after: 614 246 241 bytes +2.15% zstd-before: 604 745 136 bytes zstd-after: 615 497 705 bytes +1.78% none-before: 3 338 238 571 bytes none-after: 3 439 422 535 bytes +3.03% ### mozilla-central bzip-before: 1 493 006 921 bytes bzip-after: 1 549 650 570 bytes +3.79% zstd-before: 1 481 910 102 bytes zstd-after: 1 513 052 415 bytes +2.10% none-before: 6 535 929 910 bytes none-after: 7 010 191 342 bytes +7.26% ### mozilla-try bzip-before: 6 583 425 999 bytes bzip-after: 7 423 536 928 bytes +12.76% zstd-before: 6 021 009 212 bytes zstd-after: 6 674 922 420 bytes +10.86% none-before: 22 954 739 558 bytes none-after: 26 013 854 771 bytes +13.32%
Tue, 06 Dec 2022 12:10:31 +0100 bundle: when forcing acceptance of incoming delta also accept snapshot
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 06 Dec 2022 12:10:31 +0100] rev 49886
bundle: when forcing acceptance of incoming delta also accept snapshot Snapshot where never considered reusable and the unbundling side always tried to find a delta from them. In the `forced` mode this is counter-productive because it will either connect two delta-tree that should not be connected or it will spend potentially a lot of time because creating a full snapshot anyway. So in this mode, we accept the full snapshot as is. This changeset is benchmarked with its children so please do not split them apart when landing.
Wed, 07 Dec 2022 20:05:19 +0100 delta-find: properly report full snapshot used from cache as such
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 07 Dec 2022 20:05:19 +0100] rev 49885
delta-find: properly report full snapshot used from cache as such The number of tries and the delta base is reported differently so we missed there detection initially.
Wed, 07 Dec 2022 22:40:54 +0100 test-acl: glob the payload size again
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 07 Dec 2022 22:40:54 +0100] rev 49884
test-acl: glob the payload size again This size of bundle-2 payload are irrelevant for this test and only appears in its output because other pieces of the debug output are important. We glob it these number before they get in our way again.
(0) -30000 -10000 -3000 -1000 -300 -100 -15 +15 +100 +300 +1000 tip