Sat, 28 Sep 2024 19:12:18 -0400 interfaces: introduce and use a protocol class for the `bdiff` module
Matt Harbison <matt_harbison@yahoo.com> [Sat, 28 Sep 2024 19:12:18 -0400] rev 51933
interfaces: introduce and use a protocol class for the `bdiff` module This is allowed by PEP 544[1], and we basically follow the example there. The class here is copied from `mercurial.pure.bdiff`, and the implementation removed. There are several modules that have a few different implementations, and the implementation chosen is controlled by `HGMODULEPOLICY`. The module is loaded via `mercurial/policy.py`, and has been inferred by pytype as `Any` up to this point. Therefore it and PyCharm were blind to all functions on the module, and their signatures. Also, having multiple instances of the same module allows their signatures to get out of sync. Introducing a protocol class allows the loaded module that is stored in a variable to be given type info, which cascades through the various places it is used. This change alters 11 *.pyi files, for example. In theory, this would also allow us to ensure the various implementations of the same module are kept in alignment- simply import the module in a test module, attempt to pass it to a function that uses the corresponding protocol as an argument, and run pytype on it. In practice, this doesn't work (yet). PyCharm (erroneously) flags imported modules being passed where a protocol class is used[2]. Pytype has problems the other way- it fails to detect when a module that doesn't adhere to the protocol is passed to a protocol argument. The good news is that mypy properly detects this case. The bad news is that mypy spews a bunch of other errors when importing even simple modules, like the various `bdiff` modules. Therefore I'm punting on the tests for now because the type info around a loaded module in PyCharm is a clear win by itself. [1] https://peps.python.org/pep-0544/#modules-as-implementations-of-protocols [2] https://youtrack.jetbrains.com/issue/PY-58679/Support-modules-implementing-protocols
Sat, 28 Sep 2024 19:11:39 -0400 mdiff: tweak calls into `bdiff.fixws` to match its type hints
Matt Harbison <matt_harbison@yahoo.com> [Sat, 28 Sep 2024 19:11:39 -0400] rev 51932
mdiff: tweak calls into `bdiff.fixws` to match its type hints It turns out that protocol classes can be used for modules too, which is great because all of the dynamically loaded modules (and their attributes) are currently inferred as `Any`. See the next commit for details. A protocol class for the `bdiff` module detected this (trivial) mismatch, so correct it first. The various implementations of this method are typed as taking a `bool`. The `cext` implementation parses its arguments with `PyArg_ParseTuple(args, "Sb:fixws", &s, &allws)`, which wants an `int`. But experimenting in `hg debugshell` under py38, passing `True` or `False` to `cext.fixws()` also works. We can change the implementation to use "p" (which was introduced in py33) instead of "b", but that's beyond the scope of this.
Tue, 01 Oct 2024 15:04:06 -0400 util: minor copy editing of the documentation for `mmapread()`
Matt Harbison <matt_harbison@yahoo.com> [Tue, 01 Oct 2024 15:04:06 -0400] rev 51931
util: minor copy editing of the documentation for `mmapread()`
Tue, 01 Oct 2024 15:00:39 -0400 util: make `mmapread()` work on Windows again
Matt Harbison <matt_harbison@yahoo.com> [Tue, 01 Oct 2024 15:00:39 -0400] rev 51930
util: make `mmapread()` work on Windows again 522b4d729e89 started referencing `mmap.MAP_PRIVATE`, but that's not available on Windows, so `hg version` worked, but `make local` did not. That commit also started calling the constructor with the fine-grained `flags` and `prot` args, but those aren't available on Windows either[1] (though the backing C code doesn't seem conditionalized to disallow usage of them). I assume the change away from from the `access` arg was to provide the same options, plus `MAP_POPULATE`. Looking at the source code[2], they're not quite the same- `ACCESS_READ` is equivalent to `flags = MAP_SHARED` and `prot = PROT_READ`. `MAP_PRIVATE` is only used with `ACCESS_COPY`, which allows read and write. Therefore, we can't quite get the same baseline flags on Windows, but this was the status quo ante and `MAP_POPULATE` is a Linux thing, so presumably it works. I realize that typically the OS differences are abstracted into the platform modules, but I'm leaving it here so that it is obvious what the differences are between the platforms. [1] https://docs.python.org/3/library/mmap.html#mmap.mmap [2] https://github.com/python/cpython/blob/5e0abb47886bc665eefdcc19fde985f803e49d4c/Modules/mmapmodule.c#L1539
Fri, 27 Sep 2024 12:30:37 -0400 typing: add type annotations to the dirstate classes
Matt Harbison <matt_harbison@yahoo.com> [Fri, 27 Sep 2024 12:30:37 -0400] rev 51929
typing: add type annotations to the dirstate classes The basic procedure here was to use `merge-pyi` to merge the `git/dirstate.pyi` file in (after renaming the interface class to match), cleaning up the import statement mess, and then repeating the procedure for `mercurial/dirstate.pyi`. Surprisingly, git's dirstate had more hints inferred in its *.pyi file. After that, it was a manual examination of each method in the interface, and how they were implemented in the core and git classes to verify what was inferred by pytype, and fill in the missing gaps. Since this involved jumping around between three different files, I applied the same type info to all three at the same time. Complex types I rolled up into type aliases in the interface module, and used that as needed. That way if it changes, there's one place to edit. There are some hints still missing, and some documentation that doesn't match the signatures. They should all be marked with TODOs. There are also a bunch of methods on the core class that aren't on the Protocol class that seem like maybe they should be (like `set_tracked()`). There are even more methods missing from the git class. But that's a project for another time.
Fri, 27 Sep 2024 12:10:25 -0400 interfaces: change a couple of dirstate fields to `@property`
Matt Harbison <matt_harbison@yahoo.com> [Fri, 27 Sep 2024 12:10:25 -0400] rev 51928
interfaces: change a couple of dirstate fields to `@property` As I was adding type hints here and to the concrete classes, PyCharm flagged the property in the core class as not being compatible with the base class's version.
Fri, 27 Sep 2024 12:05:48 -0400 git: make `dirstate.parents()` return a list like the core class
Matt Harbison <matt_harbison@yahoo.com> [Fri, 27 Sep 2024 12:05:48 -0400] rev 51927
git: make `dirstate.parents()` return a list like the core class The core class returned a list, so that's how I type annotated it, and this got flagged. I suppose we could annotate it as a `Sequence[bytes]`, but it's a trivial difference.
Fri, 27 Sep 2024 11:57:42 -0400 typing: add type hints for the overloads of `matchmod.readpatternfile()`
Matt Harbison <matt_harbison@yahoo.com> [Fri, 27 Sep 2024 11:57:42 -0400] rev 51926
typing: add type hints for the overloads of `matchmod.readpatternfile()` The return type is conditional on an argument passed, and it very much confused both pytype and PyCharm inside `dirstate._ignorefileandline()` after adding type hints for the return value there.
Thu, 26 Sep 2024 18:52:46 -0400 dirstate: subclass the new dirstate Protocol class
Matt Harbison <matt_harbison@yahoo.com> [Thu, 26 Sep 2024 18:52:46 -0400] rev 51925
dirstate: subclass the new dirstate Protocol class Behold the chaos that ensues. We'll use the generated *.pyi files to apply type annotations to the interface, and see how much agrees with the documentation. Since the CamelCase name was used to try to work around pytype issues with zope interfaces and is a new innovation this cycle (see c1d7ac70980b), drop the CamelCase name. I think the Protocol classes *should* be CamelCase, but that can be done later in one pass. For now, the CamelCase alias is extra noise in the *.pyi files.
Thu, 26 Sep 2024 18:51:03 -0400 git: correct some signature mismatches between dirstate and the Protocol class
Matt Harbison <matt_harbison@yahoo.com> [Thu, 26 Sep 2024 18:51:03 -0400] rev 51924
git: correct some signature mismatches between dirstate and the Protocol class These were flagged by PyCharm when subclassing the Protocol class. Note that both `is_changing_xxx` were only flagged when the Protocol class used a plain field, as mentioned in the previous commit. After converting those attrs in the Protocol class to @property to match the regular dirstate class, it stopped flagging these. But I don't think that makes sense- `@property` should look like an attribute to the outside world, not a callable.
Thu, 26 Sep 2024 18:15:36 -0400 interfaces: convert the zope `Attribute` attrs to regular fields
Matt Harbison <matt_harbison@yahoo.com> [Thu, 26 Sep 2024 18:15:36 -0400] rev 51923
interfaces: convert the zope `Attribute` attrs to regular fields At this point, we should have a useful protocol class. The file syntax requires the type to be supplied for any fields that are declared, but we'll leave the complex ones partially unspecified for now, for simplicity. (Also, the things documented as `Callable` are really as future type annotating worked showed- roll with it for now, but they're marked as TODO for fixing later.) All of the fields and all of the attrs will need type annotations, or the type rules say they are considered to be `Any`. That can be done in a separate pass, possibly applying the `dirstate.pyi` file generated from the concrete class. The first cut of this turned the `interfaceutil.Attribute` fields into plain fields, and thus the types on them. PyCharm flagged a few things as having incompatible signatures when the concrete dirstate class subclassed this, when the concrete class has them declared as `@property`. So they've been changed to `@property` here in those cases. The remaining fields that are decorated in the concrete class have comments noting the differences. We'll see if they need to be changed going forward, but leave them for now. We'll be in trouble if the `@util.propertycache` is needed, because we can't import that module here at runtime, due to circular imports.
Thu, 26 Sep 2024 18:09:33 -0400 interfaces: add the missing `self` arg to the dirstate Protocol class
Matt Harbison <matt_harbison@yahoo.com> [Thu, 26 Sep 2024 18:09:33 -0400] rev 51922
interfaces: add the missing `self` arg to the dirstate Protocol class This clears all of the errors that PyCharm has been flagging in this file, since the zope interface was declared here.
Thu, 26 Sep 2024 18:04:31 -0400 interfaces: convert the dirstate zope interface to a Protocol class
Matt Harbison <matt_harbison@yahoo.com> [Thu, 26 Sep 2024 18:04:31 -0400] rev 51921
interfaces: convert the dirstate zope interface to a Protocol class This is a small trial run for converting the repository interfaces enmasse, in the same series of steps. I'm not sure that this current code is valid (it has zope attribute fields, and it's missing all of the `self` args on its functions, but that was the previous state of things, and made PyCharm really unhappy). But it will be easier to review the repository interface changes if this change is separate from adding `self` and dropping the zope attributes all over. Having an empty constructor in a protocol is weird. I'm not sure if these args should be converted to fields that all subclasses would have, and comments around existing attributes say some should be going away. Comment it out for now so that it's not in the way, but also not forgotten.
Thu, 26 Sep 2024 17:47:39 -0400 tests: disable `test-check-interfaces.py` while converting to protocols
Matt Harbison <matt_harbison@yahoo.com> [Thu, 26 Sep 2024 17:47:39 -0400] rev 51920
tests: disable `test-check-interfaces.py` while converting to protocols The goal is to convert everything, so get it all out of the way. The interfaces don't get that much maintenance that this needs to be tested right now.
Wed, 02 Oct 2024 14:51:56 +0100 tests: always access the mercurial repo through `helpers-testrepo.sh`
Arseniy Alekseyev <aalekseyev@janestreet.com> [Wed, 02 Oct 2024 14:51:56 +0100] rev 51919
tests: always access the mercurial repo through `helpers-testrepo.sh` In some contexts the mercurial repo needs to be accessed through system hg. That's what `helpers-testrepo.sh` enforces, but some tests incorrectly use the mercurial repo without going through that script. This patch fixes those tests.
Wed, 02 Oct 2024 14:49:07 +0100 tests: in helpers-testrepo.sh switch from shell aliases to functions
Arseniy Alekseyev <aalekseyev@janestreet.com> [Wed, 02 Oct 2024 14:49:07 +0100] rev 51918
tests: in helpers-testrepo.sh switch from shell aliases to functions The reason is that I want this script to work in non-interactive shells too. (will be used in the next commit)
Fri, 27 Sep 2024 17:25:15 +0100 rust: fix the deprecation warning in NaiveDateTime::from_timestamp
Arseniy Alekseyev <aalekseyev@janestreet.com> [Fri, 27 Sep 2024 17:25:15 +0100] rev 51917
rust: fix the deprecation warning in NaiveDateTime::from_timestamp This warning appears between chrono 0.4.34 and 0.4.38, so isn't affecting the current lock file, but it would come when we upgraded the version.
Fri, 27 Sep 2024 15:53:56 +0200 run-tests: ensure that --no-rust do not use rust stable
Pierre-Yves David <pierre-yves.david@octobus.net> [Fri, 27 Sep 2024 15:53:56 +0200] rev 51916
run-tests: ensure that --no-rust do not use rust Since having a valid value in HGWITHRUSTEXT is enough to trigger the use of rust, we need to unset it before install to be sure.
Sat, 20 Jul 2024 03:04:48 +0200 revlogutils: teach issue6528 filtering about grandparents
Joerg Sonnenberger <joerg@bec.de> [Sat, 20 Jul 2024 03:04:48 +0200] rev 51915
revlogutils: teach issue6528 filtering about grandparents During dynamic filtering, we should assume that the current repository is correct. Therefore the parents of the delta base can tell us if that parent has metadata without having to build the whole text.
Sat, 20 Jul 2024 00:43:08 +0200 revlogutils: remember known metadata parents for issue6528
Joerg Sonnenberger <joerg@bec.de> [Sat, 20 Jul 2024 00:43:08 +0200] rev 51914
revlogutils: remember known metadata parents for issue6528 In the cases where the parent revs tell us for sure that the parent has metadata, remember this fact to avoid content recomputations later.
Sat, 20 Jul 2024 00:44:59 +0200 revlogutils: for issue6528 fix, pre-cache nullrev as metadata-free
Joerg Sonnenberger <joerg@bec.de> [Sat, 20 Jul 2024 00:44:59 +0200] rev 51913
revlogutils: for issue6528 fix, pre-cache nullrev as metadata-free
Sat, 20 Jul 2024 00:59:50 +0200 revlogutils: for issue6528 fix, cache results for null changes
Joerg Sonnenberger <joerg@bec.de> [Sat, 20 Jul 2024 00:59:50 +0200] rev 51912
revlogutils: for issue6528 fix, cache results for null changes
Sat, 20 Jul 2024 00:41:37 +0200 revlogutils: fix _chunk() reference
Joerg Sonnenberger <joerg@bec.de> [Sat, 20 Jul 2024 00:41:37 +0200] rev 51911
revlogutils: fix _chunk() reference _chunk is only found in the inner revlog object and not directly exposed outside.
Mon, 02 Sep 2024 22:14:38 +0200 rev-branch-cache: reenable memory mapping of the revision data
Pierre-Yves David <pierre-yves.david@octobus.net> [Mon, 02 Sep 2024 22:14:38 +0200] rev 51910
rev-branch-cache: reenable memory mapping of the revision data Now that we are no longer truncating it, we can mmap it again. This provide a sizeable speedup on repository with a very large amount of revision for example for a mozilla-try clone with 5 793 383 revisions, this provide a speedup of 5ms - 10ms. Since they happens within the "critical" locked path during push. These miliseconds are important. In addition, the v3 branchmap format is use the rev-branch-cache more than the v2 branchmap cache so this will be important. On smaller repository we consistently see an improvement of one or two percents, but the gain in absolute time is usually < 10 ms. #### benchmark.name = hg.command.unbundle # benchmark.variants.issue6528 = disabled # benchmark.variants.reuse-external-delta-parent = yes # benchmark.variants.revs = any-1-extra-rev # benchmark.variants.source = unbundle # benchmark.variants.verbosity = quiet ### data-env-vars.name = mozilla-try-2024-03-26-zstd-sparse-revlog ## bin-env-vars.hg.flavor = default e51161b12c7e: 3.527923 ebdcfe85b070: 3.468178 (-1.69%, -0.06) ## bin-env-vars.hg.flavor = rust e51161b12c7e: 3.580158 ebdcfe85b070: 3.480564 (-2.78%, -0.10) ### data-env-vars.name = mozilla-try-2024-03-26-ds2-pnm ## bin-env-vars.hg.flavor = rust e51161b12c7e: 3.527923 ebdcfe85b070: 3.468178 (-1.69%, -0.06)
(0) -30000 -10000 -3000 -1000 -300 -100 -50 -24 +24 +50 +100 +300 tip