Wed, 31 Jul 2024 13:35:54 +0200 rust-revlog: don't create an in-memory nodemap for filelogs from Python
Raphaël Gomès <rgomes@octobus.net> [Wed, 31 Jul 2024 13:35:54 +0200] rev 52178
rust-revlog: don't create an in-memory nodemap for filelogs from Python
Explanations inline. The benchmarks below show that this change improves the only repo where this was a problem:
```
### data-env-vars.name = mozilla-try-2024-03-26-ds2-pnm
  # benchmark.name = hg.command.cat
  # bin-env-vars.hg.flavor = rust
  # bin-env-vars.hg.py-re2-module = default
  # benchmark.variants.files = all-root
  # benchmark.variants.output = plain
  # benchmark.variants.rev = tip
default: 62.848869 ~~~~~
before-this-patch: 58.113051 (-7.54%, -4.74)
this-patch: 57.407533 (-8.66%, -5.44)

### data-env-vars.name = mozilla-try-2024-03-26-ds2-pnm
  # benchmark.name = hg.command.log
  # bin-env-vars.hg.flavor = rust
  # bin-env-vars.hg.py-re2-module = default
  # benchmark.variants.limit-rev = 10
  # benchmark.variants.patch = yes
  # benchmark.variants.rev = none
default: 3.173532 ~~~~~
before-this-patch: 3.543591 (+11.66%, +0.37)
this-patch: 3.297235 (+3.90%, +0.12)
```
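The parenthesized pairs in the benchmark output read as a relative change and an absolute change in seconds against the `default` timing. A minimal sketch of that arithmetic follows; the function name `describe_delta` is made up for illustration and is not part of the benchmark tooling.

```python
def describe_delta(baseline: float, measured: float) -> str:
    """Format a timing change as (relative percent, absolute seconds)."""
    absolute = measured - baseline
    relative = absolute / baseline * 100
    # Always show the sign so slowdowns (+) and speedups (-) stand out.
    return f"({relative:+.2f}%, {absolute:+.2f})"


if __name__ == "__main__":
    # Timings copied from the mozilla-try results in the commit message.
    print(describe_delta(62.848869, 58.113051))  # hg cat, before this patch
    print(describe_delta(62.848869, 57.407533))  # hg cat, with this patch
    print(describe_delta(3.173532, 3.297235))    # hg log --patch, with this patch
```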
Wed, 31 Jul 2024 15:02:55 +0200 rust-revlog: move non-persistent-nodemap rev lookup to the index
Raphaël Gomès <rgomes@octobus.net> [Wed, 31 Jul 2024 15:02:55 +0200] rev 52177
rust-revlog: move non-persistent-nodemap rev lookup to the index It only uses index features and does not need to be on the revlog. A later patch will make use of this function from a different context.
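As a rough illustration of why such a lookup needs nothing but the index, here is a sketch of resolving a node to a revision by scanning index entries from the tip backwards. It assumes the index can be viewed as a sequence of 20-byte node ids; `rev_from_node_linear` is a hypothetical name, and the real function in hg-core may work differently.

```python
from typing import Optional, Sequence

NULL_NODE = b"\0" * 20  # the null node id
NULL_REV = -1


def rev_from_node_linear(nodes: Sequence[bytes], node: bytes) -> Optional[int]:
    """Find the revision number of `node` without any nodemap.

    `nodes[rev]` is assumed to hold the node id stored in index entry `rev`.
    """
    if node == NULL_NODE:
        return NULL_REV
    # Scan from the tip: recent revisions are the most likely targets.
    for rev in range(len(nodes) - 1, -1, -1):
        if nodes[rev] == node:
            return rev
    return None  # unknown node
```

The scan is linear in the number of revisions, which is the cost a nodemap (persistent or in-memory) exists to avoid.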
Mon, 29 Jul 2024 20:39:34 +0200 revlog: add glue to use a pure-Rust VFS
Raphaël Gomès <rgomes@octobus.net> [Mon, 29 Jul 2024 20:39:34 +0200] rev 52176
revlog: add glue to use a pure-Rust VFS This will save us a lot of calling back into Python, which is always horribly expensive. We are now faster in all benchmarked cases except for `log --patch` specifically on mozilla-try. Fixing this will happen in a later patch. ``` ### data-env-vars.name = mercurial-devel-2024-03-22-ds2-pnm # benchmark.name = hg.command.cat # bin-env-vars.hg.flavor = rust # bin-env-vars.hg.py-re2-module = default # benchmark.variants.files = all-root # benchmark.variants.output = plain # benchmark.variants.rev = tip e679697a6ca4: 1.760765 ~~~~~ 5559d7e63ec3: 1.555513 (-11.66%, -0.21) ### data-env-vars.name = mozilla-try-2024-03-26-ds2-pnm # benchmark.name = hg.command.cat # bin-env-vars.hg.flavor = rust # bin-env-vars.hg.py-re2-module = default # benchmark.variants.files = all-root # benchmark.variants.output = plain # benchmark.variants.rev = tip e679697a6ca4: 62.848869 ~~~~~ 5559d7e63ec3: 58.113051 (-7.54%, -4.74) ### data-env-vars.name = mozilla-try-2024-03-26-ds2-pnm # benchmark.name = hg.command.log # bin-env-vars.hg.flavor = rust # bin-env-vars.hg.py-re2-module = default # benchmark.variants.limit-rev = 10 # benchmark.variants.patch = yes # benchmark.variants.rev = none e679697a6ca4: 3.173532 ~~~~~ 5559d7e63ec3: 3.543591 (+11.66%, +0.37) ### data-env-vars.name = mozilla-try-2024-03-26-ds2-pnm # benchmark.name = hg.command.log # bin-env-vars.hg.flavor = rust # bin-env-vars.hg.py-re2-module = default # benchmark.variants.limit-rev = 1000 # benchmark.variants.patch = no # benchmark.variants.rev = none e679697a6ca4: 1.214698 ~~~~~ 5559d7e63ec3: 1.192478 (-1.83%, -0.02) ### data-env-vars.name = mozilla-unified-2024-03-22-ds2-pnm # benchmark.name = hg.command.cat # bin-env-vars.hg.flavor = rust # bin-env-vars.hg.py-re2-module = default # benchmark.variants.files = all-root # benchmark.variants.output = plain # benchmark.variants.rev = tip e679697a6ca4: 56.205474 ~~~~~ 5559d7e63ec3: 51.520074 (-8.34%, -4.69) ### data-env-vars.name = 
mozilla-unified-2024-03-22-ds2-pnm # benchmark.name = hg.command.log # bin-env-vars.hg.flavor = rust # bin-env-vars.hg.py-re2-module = default # benchmark.variants.limit-rev = 10 # benchmark.variants.patch = yes # benchmark.variants.rev = none e679697a6ca4: 2.105419 ~~~~~ 5559d7e63ec3: 2.051849 (-2.54%, -0.05) ### data-env-vars.name = mozilla-unified-2024-03-22-ds2-pnm # benchmark.name = hg.command.log # bin-env-vars.hg.flavor = rust # bin-env-vars.hg.py-re2-module = default # benchmark.variants.limit-rev = 1000 # benchmark.variants.patch = no # benchmark.variants.rev = none e679697a6ca4: 0.309960 ~~~~~ 5559d7e63ec3: 0.299035 (-3.52%, -0.01) ### data-env-vars.name = tryton-public-2024-03-22-ds2-pnm # benchmark.name = hg.command.cat # bin-env-vars.hg.flavor = rust # bin-env-vars.hg.py-re2-module = default # benchmark.variants.files = all-root # benchmark.variants.output = plain # benchmark.variants.rev = tip e679697a6ca4: 1.849832 ~~~~~ 5559d7e63ec3: 1.805076 (-2.42%, -0.04) ### data-env-vars.name = tryton-public-2024-03-22-ds2-pnm # benchmark.name = hg.command.log # bin-env-vars.hg.flavor = rust # bin-env-vars.hg.py-re2-module = default # benchmark.variants.limit-rev = 10 # benchmark.variants.patch = yes # benchmark.variants.rev = none e679697a6ca4: 0.289521 ~~~~~ 5559d7e63ec3: 0.279889 (-3.33%, -0.01) ### data-env-vars.name = tryton-public-2024-03-22-ds2-pnm # benchmark.name = hg.command.log # bin-env-vars.hg.flavor = rust # bin-env-vars.hg.py-re2-module = default # benchmark.variants.limit-rev = 1000 # benchmark.variants.patch = no # benchmark.variants.rev = none e679697a6ca4: 0.332270 ~~~~~ 5559d7e63ec3: 0.323324 (-2.69%, -0.01) ```
Mon, 29 Jul 2024 20:35:44 +0200 fncache: add attribute to check whether we're using dotencode
Raphaël Gomès <rgomes@octobus.net> [Mon, 29 Jul 2024 20:35:44 +0200] rev 52175
fncache: add attribute to check whether we're using dotencode This will make it easy to know if we can use the Rust implementation that doesn't support older forms of encoding.
Mon, 29 Jul 2024 20:34:38 +0200 fncachestore: add typing information
Raphaël Gomès <rgomes@octobus.net> [Mon, 29 Jul 2024 20:34:38 +0200] rev 52174
fncachestore: add typing information This helps with autocomplete.
Mon, 29 Jul 2024 20:34:06 +0200 fncache: refactor load check into a property
Raphaël Gomès <rgomes@octobus.net> [Mon, 29 Jul 2024 20:34:06 +0200] rev 52173
fncache: refactor load check into a property This makes the intent more obvious and new callers less prone to error.
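The general pattern, sketched with hypothetical names (`LazyCache`, `is_loaded`, `ensure_loaded`) rather than the actual fncache code: callers ask a named property instead of repeating a comparison against a sentinel.

```python
class LazyCache:
    """A cache whose entries are read on first use."""

    def __init__(self, loader):
        self._loader = loader
        self._entries = None  # None means "not loaded yet"

    @property
    def is_loaded(self) -> bool:
        # A single place that knows how "loaded" is represented.
        return self._entries is not None

    def ensure_loaded(self) -> None:
        if not self.is_loaded:
            self._entries = set(self._loader())

    def __contains__(self, name) -> bool:
        self.ensure_loaded()
        return name in self._entries
```

With the property in place, a new caller writes `if cache.is_loaded:` and cannot accidentally test the wrong sentinel (for example an empty set, which is a valid loaded state).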
Mon, 29 Jul 2024 20:49:07 +0200 hg-core: add FnCacheVFS
Raphaël Gomès <rgomes@octobus.net> [Mon, 29 Jul 2024 20:49:07 +0200] rev 52172
hg-core: add FnCacheVFS This will allow us to only call back into Python to add items to the fncache, which should save us a lot of FFI overhead. This is also of course a stepping stone for more pure Rust work.
Mon, 29 Jul 2024 20:47:43 +0200 hg-core: add a complete VFS
Raphaël Gomès <rgomes@octobus.net> [Mon, 29 Jul 2024 20:47:43 +0200] rev 52171
hg-core: add a complete VFS This will be used from Python in a later change. More changes are needed in hg-core and rhg to properly clean up the APIs of the old VFS implementation but it can be done when the dust settles and we start adding more functionality to the pure Rust VFS.
Mon, 29 Jul 2024 20:28:42 +0200 hg-core: add fncache module
Raphaël Gomès <rgomes@octobus.net> [Mon, 29 Jul 2024 20:28:42 +0200] rev 52170
hg-core: add fncache module For now it's only a super simple trait. It will be used for calling back into Python soon, and later will be fleshed out into a full fncache.
Thu, 26 Sep 2024 13:55:26 +0200 rust: populate mmap by default if available
Raphaël Gomès <rgomes@octobus.net> [Thu, 26 Sep 2024 13:55:26 +0200] rev 52169
rust: populate mmap by default if available See 522b4d729e89edc76544fa549ed36de4aea0b7fb for more details. Background population to follow in a later patch.
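For readers unfamiliar with the idea, this is what "populate if available" can look like, assuming the population meant here is the `MAP_POPULATE` mmap flag. The sketch is Unix-only, `mmap_read` is a hypothetical helper, and the Rust code in the patch uses its own mmap bindings.

```python
import mmap


def mmap_read(path: str) -> mmap.mmap:
    """Map a non-empty file read-only, pre-faulting its pages when possible."""
    # MAP_POPULATE is only exposed on some platforms (Linux, Python 3.10+).
    # Falling back to 0 simply leaves the mapping lazily populated.
    populate = getattr(mmap, "MAP_POPULATE", 0)
    with open(path, "rb") as fp:
        # Length 0 maps the whole file; mmap keeps its own duplicate of the
        # file descriptor, so closing `fp` afterwards is safe.
        return mmap.mmap(
            fp.fileno(),
            0,
            flags=mmap.MAP_SHARED | populate,
            prot=mmap.PROT_READ,
        )
```

Pre-populating trades a slower mapping call for fewer page faults during later reads.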
Wed, 19 Jun 2024 18:20:22 +0200 rust-changelog: switch away from deprecated APIs for datetime use
Raphaël Gomès <rgomes@octobus.net> [Wed, 19 Jun 2024 18:20:22 +0200] rev 52168
rust-changelog: switch away from deprecated APIs for datetime use This was caught by clippy, nothing was changed aside from some light API changes.
Wed, 19 Jun 2024 19:10:49 +0200 revlog: add the glue to use the Rust `InnerRevlog` from Python
Raphaël Gomès <rgomes@octobus.net> [Wed, 19 Jun 2024 19:10:49 +0200] rev 52167
revlog: add the glue to use the Rust `InnerRevlog` from Python The performance of this has been looked at for quite some time, and some workflows are actually quite a bit faster than with the Python + C code. However, we are still (up to 20%) slower in some crucial places like cloning certain repos, log, cat, which makes this an incomplete rewrite. This is mostly due to the high amount of overhead in Python <-> Rust FFI, especially around the VFS code. A future patch series will rewrite the VFS code in pure Rust, which should hopefully get us up to par with current performance, if not better in all important cases. This is a "save state" of sorts, as this is a ton of code, and I don't want to pile up even more things in a single review. Continuing to try to match the current performance will take an extremely long time, if it's not impossible, without the aforementioned VFS work.
Wed, 19 Jun 2024 17:03:13 +0200 changelog: also set the general delta config flag in the data config
Raphaël Gomès <rgomes@octobus.net> [Wed, 19 Jun 2024 17:03:13 +0200] rev 52166
changelog: also set the general delta config flag in the data config This duplication is dubious, but that is a decision to be made at a later date; this is the fix.
Mon, 29 Jul 2024 15:03:52 +0200 rust-index: use `IndexEntry::offset` to compute read segments
Raphaël Gomès <rgomes@octobus.net> [Mon, 29 Jul 2024 15:03:52 +0200] rev 52165
rust-index: use `IndexEntry::offset` to compute read segments This only matters for inline revlogs where the impact is debatable, but this is what the C index does.
Thu, 10 Oct 2024 10:34:51 +0200 rust-revlog: add a Rust-only `InnerRevlog`
Raphaël Gomès <rgomes@octobus.net> [Thu, 10 Oct 2024 10:34:51 +0200] rev 52164
rust-revlog: add a Rust-only `InnerRevlog` This mirrors the Python `InnerRevlog` and will be used in a future patch to replace said Python implementation. This allows us to start doing more things in pure Rust, in particular reading and writing operations. A lot of changes have to be introduced all at once; it wouldn't be very useful to separate this patch IMO since all of them are either interlocked or only useful with the rest.
Thu, 10 Oct 2024 10:38:35 +0200 rust-index: fix the computation of data start
Raphaël Gomès <rgomes@octobus.net> [Thu, 10 Oct 2024 10:38:35 +0200] rev 52163
rust-index: fix the computation of data start This was falling into place instead of being correct; we clean up the logic by differentiating the on-disk offset and the actual start of the data more cleanly.
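To make the distinction concrete: in an inline revlog, index entries and revision data are interleaved in one file, so the offset recorded in an entry is not where the bytes actually start. The sketch below shows that general layout rule, assuming 64-byte revlog v1 index entries; `data_start` is a hypothetical helper and not the code from this patch.

```python
INDEX_ENTRY_SIZE = 64  # bytes per index entry (revlog v1), assumed here


def data_start(rev: int, offset: int, inline: bool) -> int:
    """Return the file position where the data of `rev` begins.

    `offset` is the value stored in the index entry, i.e. the number of
    data bytes that precede this revision.
    """
    if inline:
        # Entries 0..rev (rev + 1 of them) sit in front of this data chunk.
        return offset + (rev + 1) * INDEX_ENTRY_SIZE
    # Separate data file: the stored offset is the real position.
    return offset
```

For example, revision 0 of an inline revlog has a stored offset of 0, yet its data starts at byte 64, right after its own index entry.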
Thu, 10 Oct 2024 10:38:10 +0200 rust-index: return an error on a bad index header
Raphaël Gomès <rgomes@octobus.net> [Thu, 10 Oct 2024 10:38:10 +0200] rev 52162
rust-index: return an error on a bad index header This is more idiomatic and allows us to better handle the problem later.
Thu, 17 Oct 2024 15:22:38 +0200 rust-vfs: add a TODO to remember a decision taken about naming
Raphaël Gomès <rgomes@octobus.net> [Thu, 17 Oct 2024 15:22:38 +0200] rev 52161
rust-vfs: add a TODO to remember a decision taken about naming Explanations inline.
Wed, 25 Sep 2024 18:24:15 +0200 rust-revlog: introduce an `options` module
Raphaël Gomès <rgomes@octobus.net> [Wed, 25 Sep 2024 18:24:15 +0200] rev 52160
rust-revlog: introduce an `options` module This helps group all the relevant revlog options code and makes the `mod.rs` more readable.
Wed, 25 Sep 2024 18:10:03 +0200 rust-revlog: add file IO helpers
Raphaël Gomès <rgomes@octobus.net> [Wed, 25 Sep 2024 18:10:03 +0200] rev 52159
rust-revlog: add file IO helpers This will be useful for the upcoming `InnerRevlog`.
Wed, 25 Sep 2024 16:42:21 +0200 rust-revlog: add compression helpers
Raphaël Gomès <rgomes@octobus.net> [Wed, 25 Sep 2024 16:42:21 +0200] rev 52158
rust-revlog: add compression helpers This will be used in the upcoming `InnerRevlog` when reading/writing data.
Thu, 31 Oct 2024 17:24:18 -0400 hgweb: skip logging ConnectionAbortedError stable
Matt Harbison <matt_harbison@yahoo.com> [Thu, 31 Oct 2024 17:24:18 -0400] rev 52157
hgweb: skip logging ConnectionAbortedError Not stacktracing on `ConnectionResetError` was added in 6bbb12cba5a8 (though it was spelled differently for py2 support), but for some reason Windows occasionally triggers a `ConnectionAbortedError` here across various *.t files (notably `test-archive.t` and `test-lfs-serve-access.t`, but there are others). The payload that fails to send seems to be the html that describes the error to the client, so I suspect some code is seeing the error status code and closing the connection before the server gets to write this html. So don't log it, for test stability; there is nothing we can do anyway. FWIW, the CPython `wsgiref` handler implementation specifically ignores these two errors, plus `BrokenPipeError`, with a comment that "we expect the client to close the connection abruptly from time to time"[1]. The `BrokenPipeError` is swallowed a level up in `do_write()`, and avoids writing the response following this stacktrace. I'm puzzled why a response is being written after these connection errors are detected: the CPython code referenced doesn't, and the connection is now broken at this point. Perhaps these errors should both be handled with the `BrokenPipeError` after the freeze. (The refactoring away from py2 compat may not be desirable in the freeze, but this is much easier to read, and obviously correct given the referenced CPython code.) I suspect this is what 6bceecb28806 was attempting to fix, but it wasn't specific about the sporadic errors it was seeing. [1] https://github.com/python/cpython/blob/b2eaa75b176e07730215d76d8dce4d63fb493391/Lib/wsgiref/handlers.py#L139
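The behaviour described for the CPython handler can be sketched as follows. This is an illustration of the idea only: `send_response` and the logger name are made up, and the actual hgweb code is structured differently.

```python
import logging

# Errors that only mean "the client went away"; nothing useful can be done.
CLIENT_DISCONNECTS = (
    ConnectionAbortedError,
    ConnectionResetError,
    BrokenPipeError,
)


def send_response(write, payload: bytes,
                  log=logging.getLogger("hgweb-sketch")) -> bool:
    """Write `payload`, staying quiet if the client closed the connection.

    Returns True when the payload was written, False when the client had
    already disconnected. Any other error is logged and re-raised.
    """
    try:
        write(payload)
        return True
    except CLIENT_DISCONNECTS:
        # Expected from time to time: no traceback, and no further writes.
        return False
    except Exception:
        log.exception("unexpected error while writing the response")
        raise
```

All three exception types are subclasses of `ConnectionError`, so catching that base class would also work, at the cost of silencing `ConnectionRefusedError` as well.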
Fri, 25 Oct 2024 17:15:53 -0400 ci: add a runner for Windows 10 stable
Matt Harbison <matt_harbison@yahoo.com> [Fri, 25 Oct 2024 17:15:53 -0400] rev 52156
ci: add a runner for Windows 10 This is currently only manually invoked, and allows for failure because we only have a single runner that takes over 2h for a full run, and there are a handful of flaky tests, plus 3 known failing tests. The system being used here is running MSYS, Python, Visual Studio, etc., as installed by `install-windows-dependencies.ps1`. This script installs everything to a specific directory instead of using the defaults, so we adjust the MinGW shell path to compensate. Additionally, the script doesn't install the launcher `py.exe`. It is possible to adjust the script to install it, but it's an option to an existing python install (instead of a standalone installer), and I've had the whole python install fail and rollback when requested to install the launcher if it detects a newer one is already installed. In short, it is a point of failure for a feature we don't (yet?) need. Unlike other systems where the interpreter name includes the version, everything here is `python.exe`, so they can't all exist on `PATH` and let the script choose the desired one. (The `py.exe` launcher would accomplish this, using the registry instead of `PATH`, but that wouldn't allow for venv installs.) Because of this, switch to the absolute path of the python interpreter to be used (in this case a venv created from the py39 install, which is old, but what both pyoxidizer and TortoiseHg currently use). The `RUNTEST_ARGS` hardcodes `-j8` because this system has 4 cores, and therefore runs 4 parallel tests by default. However on Windows, using more parallel tests than cores results in better performance for whatever reason. I don't have an optimal value yet (ideally the runner itself can make the adjustment on Windows), but this results in saving ~15m on a full run that otherwise takes ~2.5h. I'm also not concerned about how it would affect other Windows machines, because we don't have any at this point, and I have no idea when we can get more. 
As far as system setup goes, the CI is run by a dedicated user that lacks admin rights. The install script was run by an admin user, and then the standard user was configured to use it. If I set this up again, I'd probably give the dedicated user admin rights to run the install script, and reset to standard user rights when done. The python interpreter failed in weird ways when run by the standard user until it was manually reinstalled by the standard user:
    Fatal Python error: init_fs_encoding: failed to get the Python codec of the filesystem encoding
Additionally, changing the environment through the Windows UI prompts to escalate to an admin user, and then setting the user level environment variables like `TEMP` and `PATH` (to try to avoid exceeding the 260 character path limit) didn't actually change the user's environment. (Likely it changed the admin user's environment, but I didn't confirm that.) I ended up having to use the registry editor for the standard user to make those changes.
Fri, 11 Oct 2024 15:04:13 -0400 tests: disable a section of `test-hgrc.t` that may hit a zeroconf bug stable
Matt Harbison <matt_harbison@yahoo.com> [Fri, 11 Oct 2024 15:04:13 -0400] rev 52155
tests: disable a section of `test-hgrc.t` that may hit a zeroconf bug This effectively re-disables the same test as cce9e7d2fb92, but unconditionally because it's not a pyoxidizer-specific problem (see below and 997c9b2069d1). I can run the test locally fine, with the same venv as CI is using, and have had multiple CI runs that don't hit this. But one failed with this:
    --- /private/tmp/mercurial-ci/tests/test-hgrc.t
    +++ /private/tmp/mercurial-ci/tests/test-hgrc.t.err
    @@ -305,5 +305,17 @@
       [255]
       $ HGRCSKIPREPO=1 hg paths --config extensions.zeroconf=
    +  Traceback (most recent call last):
    +    File "/private/tmp/hgtests.7idf706t/install/lib/python/hgext/zeroconf/Zeroconf.py", line 966, in run
    +      self.readers[sock].handle_read()
    +    File "/private/tmp/hgtests.7idf706t/install/lib/python/hgext/zeroconf/Zeroconf.py", line 1020, in handle_read
    +      msg = DNSIncoming(data)
    +    File "/private/tmp/hgtests.7idf706t/install/lib/python/hgext/zeroconf/Zeroconf.py", line 537, in __init__
    +      self.readOthers()
    +    File "/private/tmp/hgtests.7idf706t/install/lib/python/hgext/zeroconf/Zeroconf.py", line 650, in readOthers
    +      self.readCharacterString(),
    +    File "/private/tmp/hgtests.7idf706t/install/lib/python/hgext/zeroconf/Zeroconf.py", line 584, in readCharacterString
    +      length = ord(self.data[self.offset])
    +  TypeError: ord() expected string of length 1, but int found
       foo = $TESTTMP/bar
The zeroconf extension has bytes vs str problems that are obvious from inspection alone, and nobody has complained, so I'm not going to let this block getting CI for macOS up and running. Given that it's in the packet read code, I suspect that 1) this requires something on the network to speak mDNS, and 2) whether this is seen or not is a timing issue. (The bytes vs str issue itself is real, but only happens if a response is received quickly.)
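The `TypeError` in the traceback is the usual Python 3 pitfall of indexing `bytes`: indexing yields an `int`, so wrapping it in `ord()` fails. A small sketch of the failure and two ways around it; the function names are made up, and this is not a patch for the extension.

```python
def read_length_py2_style(data: bytes, offset: int) -> int:
    # On Python 2, data[offset] was a 1-character str. On Python 3 it is
    # already an int, so ord() raises TypeError.
    return ord(data[offset])


def read_length(data: bytes, offset: int) -> int:
    # Python 3: indexing bytes gives the integer value directly.
    return data[offset]


def read_length_portable(data: bytes, offset: int) -> int:
    # Slicing keeps a length-1 bytes object, which ord() accepts.
    return ord(data[offset:offset + 1])


if __name__ == "__main__":
    packet = b"\x03foo"  # length-prefixed character string, as in DNS records
    print(read_length(packet, 0), read_length_portable(packet, 0))
```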
Fri, 11 Oct 2024 11:03:21 -0400 tests: disable `test-git-interop.t` with a requirements directive stable
Matt Harbison <matt_harbison@yahoo.com> [Fri, 11 Oct 2024 11:03:21 -0400] rev 52154
tests: disable `test-git-interop.t` with a requirements directive Note that the failures in this test affect all platforms. I don't like this, but the test has been broken for a while because of dirstate API changes, and nobody noticed because the required `pygit2` package isn't installed on the CI systems. I did install it on the mac CI system, which triggers this failure. Disabling it is no worse than not running it due to the missing package, but at least this way the CI systems can get the package installed, and the test can be enabled and fixed eventually, without needing to alter the CI systems. The feature here is kind of abused. I thought about adding one specifically to test for CI, but didn't feel like doing it at this point. Maybe if we need to disable things to get the Windows CI off the ground (but that likely requires testing for CI + platform).
Fri, 01 Nov 2024 16:22:40 -0400 tests: stabilize `test-extdiff.t` on macOS stable
Matt Harbison <matt_harbison@yahoo.com> [Fri, 01 Nov 2024 16:22:40 -0400] rev 52153
tests: stabilize `test-extdiff.t` on macOS The recent change in the extdiff extension to take into account whether the GUI is accessible in d1b54c152673 started triggering this. I was able to run the test cleanly without this change at the console, but somewhere along the line, I read that the CI runner isn't able to access the GUI when not run as the root user. This is causing CI failures, so we conditionalize out the tests where `DISPLAY` is set to a non-empty value to force `procutil.isgui()` to be True when the runner in fact doesn't have GUI access.
Tue, 29 Oct 2024 09:38:48 +0100 branching: merge stable into default
Raphaël Gomès <rgomes@octobus.net> [Tue, 29 Oct 2024 09:38:48 +0100] rev 52152
branching: merge stable into default
Sun, 27 Oct 2024 23:34:50 +0100 ci: build a wheel and use it to run c tests stable
Pierre-Yves David <pierre-yves.david@octobus.net> [Sun, 27 Oct 2024 23:34:50 +0100] rev 52151
ci: build a wheel and use it to run c tests First step into building and testing wheel automatically.
Sun, 27 Oct 2024 14:10:45 +0100 ci: split the jobs on more stage stable
Pierre-Yves David <pierre-yves.david@octobus.net> [Sun, 27 Oct 2024 14:10:45 +0100] rev 52150
ci: split the jobs on more stage We are starting to have a lot of jobs, and grouping them helps clarify the pipeline. We don't actually create dependencies between the stages, so everything still runs concurrently. However, we are about to introduce some wheel-building jobs that will be reused by some tests, so some dependencies are coming.
Sun, 27 Oct 2024 14:08:57 +0100 ci: unify the way `check-pytype` inherit the common setting stable
Pierre-Yves David <pierre-yves.david@octobus.net> [Sun, 27 Oct 2024 14:08:57 +0100] rev 52149
ci: unify the way `check-pytype` inherit the common setting All the other jobs use this syntax, so let's use it there too.