Raphaël Gomès <rgomes@octobus.net> [Wed, 31 Jul 2024 13:35:54 +0200] rev 52178
rust-revlog: don't create an in-memory nodemap for filelogs from Python
Explanations inline.
Benchmarks from this change affect positively the only repo that showed this
being a problem:
```
### data-env-vars.name = mozilla-try-2024-03-26-ds2-pnm
# benchmark.name = hg.command.cat
# bin-env-vars.hg.flavor = rust
# bin-env-vars.hg.py-re2-module = default
# benchmark.variants.files = all-root
# benchmark.variants.output = plain
# benchmark.variants.rev = tip
default: 62.848869 ~~~~~
before-this-patch: 58.113051 (-7.54%, -4.74)
this-patch: 57.407533 (-8.66%, -5.44)
### data-env-vars.name = mozilla-try-2024-03-26-ds2-pnm
# benchmark.name = hg.command.log
# bin-env-vars.hg.flavor = rust
# bin-env-vars.hg.py-re2-module = default
# benchmark.variants.limit-rev = 10
# benchmark.variants.patch = yes
# benchmark.variants.rev = none
default: 3.173532 ~~~~~
before-this-patch: 3.543591 (+11.66%, +0.37)
this-patch: 3.297235 (+3.90%, +0.12)
```
Raphaël Gomès <rgomes@octobus.net> [Wed, 31 Jul 2024 15:02:55 +0200] rev 52177
rust-revlog: move non-persistent-nodemap rev lookup to the index
It only uses index features and does not need to be on the revlog. A later
patch will make use of this function from a different context.
Raphaël Gomès <rgomes@octobus.net> [Mon, 29 Jul 2024 20:39:34 +0200] rev 52176
revlog: add glue to use a pure-Rust VFS
This will save us a lot of calling back into Python, which is always
horribly expensive.
We are now faster in all benchmarked cases except for `log --patch`
specifically on mozilla-try. Fixing this will happen in a later patch.
```
### data-env-vars.name = mercurial-devel-2024-03-22-ds2-pnm
# benchmark.name = hg.command.cat
# bin-env-vars.hg.flavor = rust
# bin-env-vars.hg.py-re2-module = default
# benchmark.variants.files = all-root
# benchmark.variants.output = plain
# benchmark.variants.rev = tip
e679697a6ca4: 1.760765 ~~~~~
5559d7e63ec3: 1.555513 (-11.66%, -0.21)
### data-env-vars.name = mozilla-try-2024-03-26-ds2-pnm
# benchmark.name = hg.command.cat
# bin-env-vars.hg.flavor = rust
# bin-env-vars.hg.py-re2-module = default
# benchmark.variants.files = all-root
# benchmark.variants.output = plain
# benchmark.variants.rev = tip
e679697a6ca4: 62.848869 ~~~~~
5559d7e63ec3: 58.113051 (-7.54%, -4.74)
### data-env-vars.name = mozilla-try-2024-03-26-ds2-pnm
# benchmark.name = hg.command.log
# bin-env-vars.hg.flavor = rust
# bin-env-vars.hg.py-re2-module = default
# benchmark.variants.limit-rev = 10
# benchmark.variants.patch = yes
# benchmark.variants.rev = none
e679697a6ca4: 3.173532 ~~~~~
5559d7e63ec3: 3.543591 (+11.66%, +0.37)
### data-env-vars.name = mozilla-try-2024-03-26-ds2-pnm
# benchmark.name = hg.command.log
# bin-env-vars.hg.flavor = rust
# bin-env-vars.hg.py-re2-module = default
# benchmark.variants.limit-rev = 1000
# benchmark.variants.patch = no
# benchmark.variants.rev = none
e679697a6ca4: 1.214698 ~~~~~
5559d7e63ec3: 1.192478 (-1.83%, -0.02)
### data-env-vars.name = mozilla-unified-2024-03-22-ds2-pnm
# benchmark.name = hg.command.cat
# bin-env-vars.hg.flavor = rust
# bin-env-vars.hg.py-re2-module = default
# benchmark.variants.files = all-root
# benchmark.variants.output = plain
# benchmark.variants.rev = tip
e679697a6ca4: 56.205474 ~~~~~
5559d7e63ec3: 51.520074 (-8.34%, -4.69)
### data-env-vars.name = mozilla-unified-2024-03-22-ds2-pnm
# benchmark.name = hg.command.log
# bin-env-vars.hg.flavor = rust
# bin-env-vars.hg.py-re2-module = default
# benchmark.variants.limit-rev = 10
# benchmark.variants.patch = yes
# benchmark.variants.rev = none
e679697a6ca4: 2.105419 ~~~~~
5559d7e63ec3: 2.051849 (-2.54%, -0.05)
### data-env-vars.name = mozilla-unified-2024-03-22-ds2-pnm
# benchmark.name = hg.command.log
# bin-env-vars.hg.flavor = rust
# bin-env-vars.hg.py-re2-module = default
# benchmark.variants.limit-rev = 1000
# benchmark.variants.patch = no
# benchmark.variants.rev = none
e679697a6ca4: 0.309960 ~~~~~
5559d7e63ec3: 0.299035 (-3.52%, -0.01)
### data-env-vars.name = tryton-public-2024-03-22-ds2-pnm
# benchmark.name = hg.command.cat
# bin-env-vars.hg.flavor = rust
# bin-env-vars.hg.py-re2-module = default
# benchmark.variants.files = all-root
# benchmark.variants.output = plain
# benchmark.variants.rev = tip
e679697a6ca4: 1.849832 ~~~~~
5559d7e63ec3: 1.805076 (-2.42%, -0.04)
### data-env-vars.name = tryton-public-2024-03-22-ds2-pnm
# benchmark.name = hg.command.log
# bin-env-vars.hg.flavor = rust
# bin-env-vars.hg.py-re2-module = default
# benchmark.variants.limit-rev = 10
# benchmark.variants.patch = yes
# benchmark.variants.rev = none
e679697a6ca4: 0.289521 ~~~~~
5559d7e63ec3: 0.279889 (-3.33%, -0.01)
### data-env-vars.name = tryton-public-2024-03-22-ds2-pnm
# benchmark.name = hg.command.log
# bin-env-vars.hg.flavor = rust
# bin-env-vars.hg.py-re2-module = default
# benchmark.variants.limit-rev = 1000
# benchmark.variants.patch = no
# benchmark.variants.rev = none
e679697a6ca4: 0.332270 ~~~~~
5559d7e63ec3: 0.323324 (-2.69%, -0.01)
```
Raphaël Gomès <rgomes@octobus.net> [Mon, 29 Jul 2024 20:35:44 +0200] rev 52175
fncache: add attribute to check whether we're using dotencode
This will make it easy to know if we can use the Rust implementation that
doesn't support older forms of encoding.
Raphaël Gomès <rgomes@octobus.net> [Mon, 29 Jul 2024 20:34:38 +0200] rev 52174
fncachestore: add typing information
This helps with autocomplete.
Raphaël Gomès <rgomes@octobus.net> [Mon, 29 Jul 2024 20:34:06 +0200] rev 52173
fncache: refactor load check into a property
This makes the intent more obvious new callers less prone to error.
Raphaël Gomès <rgomes@octobus.net> [Mon, 29 Jul 2024 20:49:07 +0200] rev 52172
hg-core: add FnCacheVFS
This will allow us to only call back into Python to add items to the fncache,
which should save us a lot of FFI overhead.
This is also of course a stepping stone for more pure Rust work.
Raphaël Gomès <rgomes@octobus.net> [Mon, 29 Jul 2024 20:47:43 +0200] rev 52171
hg-core: add a complete VFS
This will be used from Python in a later change.
More changes are needed in hg-core and rhg to properly clean up the APIs
of the old VFS implementation but it can be done when the dust settles
and we start adding more functionality to the pure Rust VFS.
Raphaël Gomès <rgomes@octobus.net> [Mon, 29 Jul 2024 20:28:42 +0200] rev 52170
hg-core: add fncache module
For now it's only a super simple trait. It will be used for calling back into
Python soon, and later will be fleshed out into a full fncache.
Raphaël Gomès <rgomes@octobus.net> [Thu, 26 Sep 2024 13:55:26 +0200] rev 52169
rust: populate mmap by default if available
See
522b4d729e89edc76544fa549ed36de4aea0b7fb for more details.
Background population to follow in a later patch.
Raphaël Gomès <rgomes@octobus.net> [Wed, 19 Jun 2024 18:20:22 +0200] rev 52168
rust-changelog: switch away from deprecated APIs for datetime use
This was caught by clippy, nothing was changed aside from some light API
changes.
Raphaël Gomès <rgomes@octobus.net> [Wed, 19 Jun 2024 19:10:49 +0200] rev 52167
revlog: add the glue to use the Rust `InnerRevlog` from Python
The performance of this has been looked at for quite some time, and some
workflows are actually quite a bit faster than with the Python + C code.
However, we are still (up to 20%) slower in some crucial places like cloning
certain repos, log, cat, which makes this an incomplete rewrite. This is
mostly due to the high amount of overhead in Python <-> Rust FFI, especially
around the VFS code. A future patch series will rewrite the VFS code in
pure Rust, which should hopefully get us up to par with current perfomance,
if not better in all important cases.
This is a "save state" of sorts, as this is a ton of code, and I don't want
to pile up even more things in a single review.
Continuing to try to match the current performance will take an extremely
long time, if it's not impossible, without the aforementioned VFS work.
Raphaël Gomès <rgomes@octobus.net> [Wed, 19 Jun 2024 17:03:13 +0200] rev 52166
changelog: also set the general delta config flag in the data config
This duplication is dubious, but it's a decision to be made at a later date,
this is the fix.
Raphaël Gomès <rgomes@octobus.net> [Mon, 29 Jul 2024 15:03:52 +0200] rev 52165
rust-index: use `IndexEntry::offset` to compute read segments
This only matters for inline revlogs where the impact is debatable, but
this is what the C index does.
Raphaël Gomès <rgomes@octobus.net> [Thu, 10 Oct 2024 10:34:51 +0200] rev 52164
rust-revlog: add a Rust-only `InnerRevlog`
This mirrors the Python `InnerRevlog` and will be used in a future patch
to replace said Python implementation. This allows us to start doing more
things in pure Rust, in particular reading and writing operations.
A lot of changes have to be introduced all at once, it wouldn't be very
useful to separate this patch IMO since all of them are either interlocked
or only useful with the rest.
Raphaël Gomès <rgomes@octobus.net> [Thu, 10 Oct 2024 10:38:35 +0200] rev 52163
rust-index: fix the computation of data start
This was falling into place instead of being correct, we clean up the logic
by differenciating the on-disk offset and the actual start of the data
more cleanly.
Raphaël Gomès <rgomes@octobus.net> [Thu, 10 Oct 2024 10:38:10 +0200] rev 52162
rust-index: return an error on a bad index header
This is more idiomatic and allows us to better handle the problem later.
Raphaël Gomès <rgomes@octobus.net> [Thu, 17 Oct 2024 15:22:38 +0200] rev 52161
rust-vfs: add a TODO to remember a decision taken about naming
Explanations inline.
Raphaël Gomès <rgomes@octobus.net> [Wed, 25 Sep 2024 18:24:15 +0200] rev 52160
rust-revlog: introduce an `options` module
This helps group all the relevant revlog options code and makes the `mod.rs`
more readable.
Raphaël Gomès <rgomes@octobus.net> [Wed, 25 Sep 2024 18:10:03 +0200] rev 52159
rust-revlog: add file IO helpers
This will be useful for the upcoming `InnerRevlog`.
Raphaël Gomès <rgomes@octobus.net> [Wed, 25 Sep 2024 16:42:21 +0200] rev 52158
rust-revlog: add compression helpers
This will be used in the upcoming `InnerRevlog` when reading/writing data.
Matt Harbison <matt_harbison@yahoo.com> [Thu, 31 Oct 2024 17:24:18 -0400] rev 52157
hgweb: skip logging ConnectionAbortedError
Not stacktracing on `ConnectionResetError` was added in
6bbb12cba5a8 (though it
was spelled differently for py2 support), but for some reason Windows
occasionally triggers a `ConnectionAbortedError` here across various *.t files
(notably `test-archive.t` and `test-lfs-serve-access.t`, but there are others).
The payload that fails to send seems to be the html that describes the error to
the client, so I suspect some code is seeing the error status code and closing
the connection before the server gets to write this html. So don't log it, for
test stability- nothing we can do anyway.
FWIW, the CPython implementation of wsgihander specifically ignores these two
errors, plus `BrokenPipeError`, with a comment that "we expect the client to
close the connection abruptly from time to time"[1]. The `BrokenPipeError` is
swallowed a level up in `do_write()`, and avoids writing the response following
this stacktrace. I'm puzzled why a response is being written after these
connection errors are detected- the CPython code referenced doesn't, and the
connection is now broken at this point. Perhaps these errors should both be
handled with the `BrokenPipeError` after the freeze.
(The refactoring away from py2 compat may not be desireable in the freeze, but
this is much easier to read, and obviously correct given the referenced CPython
code.)
I suspect this is what
6bceecb28806 was attempting to fix, but it wasn't
specific about the sporadic errors it was seeing.
[1] https://github.com/python/cpython/blob/
b2eaa75b176e07730215d76d8dce4d63fb493391/Lib/wsgiref/handlers.py#L139
Matt Harbison <matt_harbison@yahoo.com> [Fri, 25 Oct 2024 17:15:53 -0400] rev 52156
ci: add a runner for Windows 10
This is currently only manually invoked, and allows for failure because we only
have a single runner that takes over 2h for a full run, and there are a handful
of flakey tests, plus 3 known failing tests.
The system being used here is running MSYS, Python, Visual Studio, etc, as
installed by `install-windows-dependencies.ps1`. This script installs
everything to a specific directory instead of using the defaults, so we adjust
the MinGW shell path to compensate. Additionally, the script doesn't install
the launcher `py.exe`. It is possible to adjust the script to install it, but
it's an option to an existing python install (instead of a standalone installer),
and I've had the whole python install fail and rollback when requested to install
the launcher if it detects a newer one is already installed. In short, it is
a point of failure for a feature we don't (yet?) need.
Unlike other systems where the intepreter name includes the version, everything
here is `python.exe`, so they can't all exist on `PATH` and let the script
choose the desired one. (The `py.exe` launcher would accomplish, using the
registry instead of `PATH`, but that wouldn't allow for venv installs.) Because
of this, switch to the absolute path of the python interpreter to be used (in
this case a venv created from the py39 install, which is old, but what both
pyoxidizer and TortoiseHg currently use).
The `RUNTEST_ARGS` hardcodes `-j8` because this system has 4 cores, and
therefore runs 4 parallel tests by default. However on Windows, using more
parallel tests than cores results in better performance for whatever reason. I
don't have an optimal value yet (ideally the runner itself can make the
adjustment on Windows), but this results in saving ~15m on a full run that
otherwise takes ~2.5h. I'm also not concerned about how it would affect other
Windows machines, because we don't have any at this point, and I have no idea
when we can get more.
As far as system setup goes, the CI is run by a dedicated user that lacks admin
rights. The install script was run by an admin user, and then the standard user
was configured to use it. If I set this up again, I'd probably give the
dedicated user admin rights to run the install script, and reset to standard
user rights when done. The python intepreter failed in weird ways when run by
the standard user until it was manually reinstalled by the standard user:
Fatal Python error: init_fs_encoding: failed to get the Python codec of the
filesystem encoding
Additionally, changing the environment through the Windows UI prompts to
escalate to an admin user, and then setting the user level environment variables
like `TEMP` and `PATH` (to try to avoid exceeding the 260 character path limit)
didn't actually change the user's environment. (Likely it changed the admin
user's environment, but I didn't confirm that.) I ended up having to use the
registry editor for the standard user to make those changes.
Matt Harbison <matt_harbison@yahoo.com> [Fri, 11 Oct 2024 15:04:13 -0400] rev 52155
tests: disable a section of `test-hgrc.t` that may hit a zeroconf bug
This effectively re-disables the same test as
cce9e7d2fb92, but unconditionally
because it's not a pyoxidizer-specific problem (see below and
997c9b2069d1).
I can run the test locally fine, with the same venv as CI is using, and have had
multiple CI runs that don't hit this. But one failed with this:
--- /private/tmp/mercurial-ci/tests/test-hgrc.t
+++ /private/tmp/mercurial-ci/tests/test-hgrc.t.err
@@ -305,5 +305,17 @@
[255]
$ HGRCSKIPREPO=1 hg paths --config extensions.zeroconf=
+ Traceback (most recent call last):
+ File "/private/tmp/hgtests.7idf706t/install/lib/python/hgext/zeroconf/Zeroconf.py", line 966, in run
+ self.readers[sock].handle_read()
+ File "/private/tmp/hgtests.7idf706t/install/lib/python/hgext/zeroconf/Zeroconf.py", line 1020, in handle_read
+ msg = DNSIncoming(data)
+ File "/private/tmp/hgtests.7idf706t/install/lib/python/hgext/zeroconf/Zeroconf.py", line 537, in __init__
+ self.readOthers()
+ File "/private/tmp/hgtests.7idf706t/install/lib/python/hgext/zeroconf/Zeroconf.py", line 650, in readOthers
+ self.readCharacterString(),
+ File "/private/tmp/hgtests.7idf706t/install/lib/python/hgext/zeroconf/Zeroconf.py", line 584, in readCharacterString
+ length = ord(self.data[self.offset])
+ TypeError: ord() expected string of length 1, but int found
foo = $TESTTMP/bar
The zeroconf extension has bytes vs str problems that are obvious from
inspection alone, and nobody has complained, so I'm not going to let this block
getting CI for macOS up and running. Given that it's in the packet read code,
I suspect that this 1) requires something on the network to speak mDNS, and 2)
it is a timing issue if this is seen or not. (The bytes vs str issue itself is
real, but only happen if a response is received quickly.)
Matt Harbison <matt_harbison@yahoo.com> [Fri, 11 Oct 2024 11:03:21 -0400] rev 52154
tests: disable `test-git-interop.t` with a requirements directive
Note that the failures in this test affect all platforms.
I don't like this, but the test has been broken for awhile because of dirstate
API changes, and nobody noticed because the required `pygit2` package isn't
installed on the CI systems. I did install it on the mac CI system, which
triggers this failure. Disabling it is no worse than not running it due to the
missing package, but at least this way the CI systems can get the package
installed, and the test can be enabled and fixed eventually, without needing to
alter the CI systems.
The feature here is kind of abused. I thought about adding one specifically to
test for CI, but didn't feel like doing it at this point. Maybe if we need to
disable things to get the Windows CI off the ground (but that likely requires
testing for CI + platform).
Matt Harbison <matt_harbison@yahoo.com> [Fri, 01 Nov 2024 16:22:40 -0400] rev 52153
tests: stabilize `test-extdiff.t` on macOS
The recent change in the extdiff extension to take into account whether the GUI
is accessible in
d1b54c152673 started triggering this. I was able to run the
test cleanly without this change at the console, but somewhere along the line, I
read that the CI runner isn't able to access the GUI when not run as the root
user. This is causing CI failures, so we conditionalize these tests out where
`DISPLAY` is set to a non empty value to force `procutil.isgui()` to be True,
when it in fact doesn't have GUI access.
Raphaël Gomès <rgomes@octobus.net> [Tue, 29 Oct 2024 09:38:48 +0100] rev 52152
branching: merge stable into default
Pierre-Yves David <pierre-yves.david@octobus.net> [Sun, 27 Oct 2024 23:34:50 +0100] rev 52151
ci: build a wheel and use it to run c tests
First step into building and testing wheel automatically.
Pierre-Yves David <pierre-yves.david@octobus.net> [Sun, 27 Oct 2024 14:10:45 +0100] rev 52150
ci: split the jobs on more stage
We start to have a lot of job, grouping them help to clarifying the pipeline.
We don't actually create dependency between each stage, so everything still run
concurrently. However we are about to introduce some wheel-building job that
will be reused by some tests. So some dependencies are coming.
Pierre-Yves David <pierre-yves.david@octobus.net> [Sun, 27 Oct 2024 14:08:57 +0100] rev 52149
ci: unify the way `check-pytype` inherit the common setting
All the other job use this syntax, so lets us it there too.