Matt Harbison <matt_harbison@yahoo.com> [Sat, 28 Sep 2024 19:12:18 -0400] rev 51933
interfaces: introduce and use a protocol class for the `bdiff` module
This is allowed by PEP 544[1], and we basically follow the example there. The
class here is copied from `mercurial.pure.bdiff`, and the implementation
removed.
There are several modules that have a few different implementations, and the
implementation chosen is controlled by `HGMODULEPOLICY`. The module is loaded
via `mercurial/policy.py`, and has been inferred by pytype as `Any` up to this
point. Therefore it and PyCharm were blind to all functions on the module, and
their signatures. Also, having multiple instances of the same module allows
their signatures to get out of sync.
Introducing a protocol class allows the loaded module that is stored in a
variable to be given type info, which cascades through the various places it is
used. This change alters 11 *.pyi files, for example. In theory, this would
also allow us to ensure the various implementations of the same module are kept
in alignment- simply import the module in a test module, attempt to pass it to a
function that uses the corresponding protocol as an argument, and run pytype on
it.
In practice, this doesn't work (yet). PyCharm (erroneously) flags imported
modules being passed where a protocol class is used[2]. Pytype has problems the
other way- it fails to detect when a module that doesn't adhere to the protocol
is passed to a protocol argument. The good news is that mypy properly detects
this case. The bad news is that mypy spews a bunch of other errors when
importing even simple modules, like the various `bdiff` modules. Therefore I'm
punting on the tests for now because the type info around a loaded module in
PyCharm is a clear win by itself.
[1] https://peps.python.org/pep-0544/#modules-as-implementations-of-protocols
[2] https://youtrack.jetbrains.com/issue/PY-58679/Support-modules-implementing-protocols
Matt Harbison <matt_harbison@yahoo.com> [Sat, 28 Sep 2024 19:11:39 -0400] rev 51932
mdiff: tweak calls into `bdiff.fixws` to match its type hints
It turns out that protocol classes can be used for modules too, which is great
because all of the dynamically loaded modules (and their attributes) are
currently inferred as `Any`. See the next commit for details.
A protocol class for the `bdiff` module detected this (trivial) mismatch, so
correct it first. The various implementations of this method are typed as
taking a `bool`. The `cext` implementation parses its arguments with
`PyArg_ParseTuple(args, "Sb:fixws", &s, &allws)`, which wants an `int`. But
experimenting in `hg debugshell` under py38, passing `True` or `False` to
`cext.fixws()` also works. We can change the implementation to use "p" (which
was introduced in py33) instead of "b", but that's beyond the scope of this.
Matt Harbison <matt_harbison@yahoo.com> [Tue, 01 Oct 2024 15:04:06 -0400] rev 51931
util: minor copy editing of the documentation for `mmapread()`
Matt Harbison <matt_harbison@yahoo.com> [Tue, 01 Oct 2024 15:00:39 -0400] rev 51930
util: make `mmapread()` work on Windows again
522b4d729e89 started referencing `mmap.MAP_PRIVATE`, but that's not available on
Windows, so `hg version` worked, but `make local` did not. That commit also
started calling the constructor with the fine-grained `flags` and `prot` args,
but those aren't available on Windows either[1] (though the backing C code
doesn't seem conditionalized to disallow usage of them).
I assume the change away from from the `access` arg was to provide the same
options, plus `MAP_POPULATE`. Looking at the source code[2], they're not quite
the same- `ACCESS_READ` is equivalent to `flags = MAP_SHARED` and `prot = PROT_READ`.
`MAP_PRIVATE` is only used with `ACCESS_COPY`, which allows read and write.
Therefore, we can't quite get the same baseline flags on Windows, but this was
the status quo ante and `MAP_POPULATE` is a Linux thing, so presumably it works.
I realize that typically the OS differences are abstracted into the platform
modules, but I'm leaving it here so that it is obvious what the differences are
between the platforms.
[1] https://docs.python.org/3/library/mmap.html#mmap.mmap
[2] https://github.com/python/cpython/blob/
5e0abb47886bc665eefdcc19fde985f803e49d4c/Modules/mmapmodule.c#L1539
Matt Harbison <matt_harbison@yahoo.com> [Fri, 27 Sep 2024 12:30:37 -0400] rev 51929
typing: add type annotations to the dirstate classes
The basic procedure here was to use `merge-pyi` to merge the `git/dirstate.pyi`
file in (after renaming the interface class to match), cleaning up the import
statement mess, and then repeating the procedure for `mercurial/dirstate.pyi`.
Surprisingly, git's dirstate had more hints inferred in its *.pyi file.
After that, it was a manual examination of each method in the interface, and how
they were implemented in the core and git classes to verify what was inferred by
pytype, and fill in the missing gaps. Since this involved jumping around
between three different files, I applied the same type info to all three at the
same time. Complex types I rolled up into type aliases in the interface module,
and used that as needed. That way if it changes, there's one place to edit.
There are some hints still missing, and some documentation that doesn't match
the signatures. They should all be marked with TODOs. There are also a bunch
of methods on the core class that aren't on the Protocol class that seem like
maybe they should be (like `set_tracked()`). There are even more methods
missing from the git class. But that's a project for another time.
Matt Harbison <matt_harbison@yahoo.com> [Fri, 27 Sep 2024 12:10:25 -0400] rev 51928
interfaces: change a couple of dirstate fields to `@property`
As I was adding type hints here and to the concrete classes, PyCharm flagged the
property in the core class as not being compatible with the base class's
version.
Matt Harbison <matt_harbison@yahoo.com> [Fri, 27 Sep 2024 12:05:48 -0400] rev 51927
git: make `dirstate.parents()` return a list like the core class
The core class returned a list, so that's how I type annotated it, and this got
flagged. I suppose we could annotate it as a `Sequence[bytes]`, but it's a
trivial difference.
Matt Harbison <matt_harbison@yahoo.com> [Fri, 27 Sep 2024 11:57:42 -0400] rev 51926
typing: add type hints for the overloads of `matchmod.readpatternfile()`
The return type is conditional on an argument passed, and it very much confused
both pytype and PyCharm inside `dirstate._ignorefileandline()` after adding
type hints for the return value there.
Matt Harbison <matt_harbison@yahoo.com> [Thu, 26 Sep 2024 18:52:46 -0400] rev 51925
dirstate: subclass the new dirstate Protocol class
Behold the chaos that ensues. We'll use the generated *.pyi files to apply type
annotations to the interface, and see how much agrees with the documentation.
Since the CamelCase name was used to try to work around pytype issues with zope
interfaces and is a new innovation this cycle (see
c1d7ac70980b), drop the
CamelCase name. I think the Protocol classes *should* be CamelCase, but that
can be done later in one pass. For now, the CamelCase alias is extra noise in
the *.pyi files.
Matt Harbison <matt_harbison@yahoo.com> [Thu, 26 Sep 2024 18:51:03 -0400] rev 51924
git: correct some signature mismatches between dirstate and the Protocol class
These were flagged by PyCharm when subclassing the Protocol class. Note that
both `is_changing_xxx` were only flagged when the Protocol class used a plain
field, as mentioned in the previous commit. After converting those attrs in the
Protocol class to @property to match the regular dirstate class, it stopped
flagging these. But I don't think that makes sense- `@property` should look
like an attribute to the outside world, not a callable.
Matt Harbison <matt_harbison@yahoo.com> [Thu, 26 Sep 2024 18:15:36 -0400] rev 51923
interfaces: convert the zope `Attribute` attrs to regular fields
At this point, we should have a useful protocol class.
The file syntax requires the type to be supplied for any fields that are
declared, but we'll leave the complex ones partially unspecified for now, for
simplicity. (Also, the things documented as `Callable` are really as future
type annotating worked showed- roll with it for now, but they're marked as TODO
for fixing later.) All of the fields and all of the attrs will need type
annotations, or the type rules say they are considered to be `Any`. That can be
done in a separate pass, possibly applying the `dirstate.pyi` file generated
from the concrete class.
The first cut of this turned the `interfaceutil.Attribute` fields into plain
fields, and thus the types on them. PyCharm flagged a few things as having
incompatible signatures when the concrete dirstate class subclassed this, when
the concrete class has them declared as `@property`. So they've been changed to
`@property` here in those cases. The remaining fields that are decorated in the
concrete class have comments noting the differences. We'll see if they need to
be changed going forward, but leave them for now. We'll be in trouble if the
`@util.propertycache` is needed, because we can't import that module here at
runtime, due to circular imports.
Matt Harbison <matt_harbison@yahoo.com> [Thu, 26 Sep 2024 18:09:33 -0400] rev 51922
interfaces: add the missing `self` arg to the dirstate Protocol class
This clears all of the errors that PyCharm has been flagging in this file, since
the zope interface was declared here.
Matt Harbison <matt_harbison@yahoo.com> [Thu, 26 Sep 2024 18:04:31 -0400] rev 51921
interfaces: convert the dirstate zope interface to a Protocol class
This is a small trial run for converting the repository interfaces enmasse, in
the same series of steps. I'm not sure that this current code is valid (it has
zope attribute fields, and it's missing all of the `self` args on its functions,
but that was the previous state of things, and made PyCharm really unhappy).
But it will be easier to review the repository interface changes if this change
is separate from adding `self` and dropping the zope attributes all over.
Having an empty constructor in a protocol is weird. I'm not sure if these args
should be converted to fields that all subclasses would have, and comments
around existing attributes say some should be going away. Comment it out for
now so that it's not in the way, but also not forgotten.
Matt Harbison <matt_harbison@yahoo.com> [Thu, 26 Sep 2024 17:47:39 -0400] rev 51920
tests: disable `test-check-interfaces.py` while converting to protocols
The goal is to convert everything, so get it all out of the way. The interfaces
don't get that much maintenance that this needs to be tested right now.
Arseniy Alekseyev <aalekseyev@janestreet.com> [Wed, 02 Oct 2024 14:51:56 +0100] rev 51919
tests: always access the mercurial repo through `helpers-testrepo.sh`
In some contexts the mercurial repo needs to be accessed through system hg.
That's what `helpers-testrepo.sh` enforces, but some tests incorrectly
use the mercurial repo without going through that script.
This patch fixes those tests.
Arseniy Alekseyev <aalekseyev@janestreet.com> [Wed, 02 Oct 2024 14:49:07 +0100] rev 51918
tests: in helpers-testrepo.sh switch from shell aliases to functions
The reason is that I want this script to work in non-interactive shells too.
(will be used in the next commit)
Arseniy Alekseyev <aalekseyev@janestreet.com> [Fri, 27 Sep 2024 17:25:15 +0100] rev 51917
rust: fix the deprecation warning in NaiveDateTime::from_timestamp
This warning appears between chrono 0.4.34 and 0.4.38, so
isn't affecting the current lock file, but it would come
when we upgraded the version.
Pierre-Yves David <pierre-yves.david@octobus.net> [Fri, 27 Sep 2024 15:53:56 +0200] rev 51916
run-tests: ensure that --no-rust do not use rust
Since having a valid value in HGWITHRUSTEXT is enough to trigger the use of
rust, we need to unset it before install to be sure.
Joerg Sonnenberger <joerg@bec.de> [Sat, 20 Jul 2024 03:04:48 +0200] rev 51915
revlogutils: teach
issue6528 filtering about grandparents
During dynamic filtering, we should assume that the current repository
is correct. Therefore the parents of the delta base can tell us if that
parent has metadata without having to build the whole text.
Joerg Sonnenberger <joerg@bec.de> [Sat, 20 Jul 2024 00:43:08 +0200] rev 51914
revlogutils: remember known metadata parents for
issue6528
In the cases where the parent revs tell us for sure that the parent has
metadata, remember this fact to avoid content recomputations later.
Joerg Sonnenberger <joerg@bec.de> [Sat, 20 Jul 2024 00:44:59 +0200] rev 51913
revlogutils: for
issue6528 fix, pre-cache nullrev as metadata-free
Joerg Sonnenberger <joerg@bec.de> [Sat, 20 Jul 2024 00:59:50 +0200] rev 51912
revlogutils: for
issue6528 fix, cache results for null changes
Joerg Sonnenberger <joerg@bec.de> [Sat, 20 Jul 2024 00:41:37 +0200] rev 51911
revlogutils: fix _chunk() reference
_chunk is only found in the inner revlog object and not directly exposed
outside.
Pierre-Yves David <pierre-yves.david@octobus.net> [Mon, 02 Sep 2024 22:14:38 +0200] rev 51910
rev-branch-cache: reenable memory mapping of the revision data
Now that we are no longer truncating it, we can mmap it again.
This provide a sizeable speedup on repository with a very large amount of
revision for example for a mozilla-try clone with 5 793 383 revisions, this
provide a speedup of 5ms - 10ms. Since they happens within the "critical" locked
path during push. These miliseconds are important.
In addition, the v3 branchmap format is use the rev-branch-cache more than the
v2 branchmap cache so this will be important.
On smaller repository we consistently see an improvement of one or two percents,
but the gain in absolute time is usually < 10 ms.
#### benchmark.name = hg.command.unbundle
# benchmark.variants.
issue6528 = disabled
# benchmark.variants.reuse-external-delta-parent = yes
# benchmark.variants.revs = any-1-extra-rev
# benchmark.variants.source = unbundle
# benchmark.variants.verbosity = quiet
### data-env-vars.name = mozilla-try-2024-03-26-zstd-sparse-revlog
## bin-env-vars.hg.flavor = default
e51161b12c7e: 3.527923
ebdcfe85b070: 3.468178 (-1.69%, -0.06)
## bin-env-vars.hg.flavor = rust
e51161b12c7e: 3.580158
ebdcfe85b070: 3.480564 (-2.78%, -0.10)
### data-env-vars.name = mozilla-try-2024-03-26-ds2-pnm
## bin-env-vars.hg.flavor = rust
e51161b12c7e: 3.527923
ebdcfe85b070: 3.468178 (-1.69%, -0.06)