Mon, 18 Dec 2023 14:51:20 -0800 matchers: use correct method for finding index in vector
Martin von Zweigbergk <martinvonz@google.com> [Mon, 18 Dec 2023 14:51:20 -0800] rev 51294
matchers: use correct method for finding index in vector The path matcher has an optimization for when all paths are `rootfilesin:`. This optimization exists in both Python and Rust. However, the Rust implementation currently has a bug that makes it fail in most cases. The bug is that it uses `rfind()` where it was clearly intended to use `rposition()`. This patch fixes that and adds a test.
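The difference between the two iterator methods can be sketched as follows. This is a minimal illustration, not the matcher code itself: the function names and the choice of `/` as the searched byte are assumptions made for the example.

```rust
/// Returns the index of the last `/` in `path`, if any.
/// (Hypothetical helper; the byte searched for is illustrative.)
fn last_slash_index(path: &[u8]) -> Option<usize> {
    // `rposition` searches from the back and yields the *index* of the match.
    path.iter().rposition(|&b| b == b'/')
}

/// What `rfind` yields for the same search: the matching *element*,
/// which says nothing about where in the slice it was found.
fn last_slash_element(path: &[u8]) -> Option<&u8> {
    path.iter().rfind(|&&b| b == b'/')
}
```

For `b"dir/sub/file"`, `last_slash_index` gives `Some(7)`, whereas `last_slash_element` only gives back a reference to the `/` byte.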
Tue, 12 Dec 2023 17:08:45 +0100 dirstate: make the `transaction` argument of `setbranch` mandatory
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 12 Dec 2023 17:08:45 +0100] rev 51293
dirstate: make the `transaction` argument of `setbranch` mandatory Calling it without a transaction has been deprecated since 6.4, so we should drop that option now.
Wed, 20 Dec 2023 14:59:31 +0100 rust-clippy: apply some more trivial fixes
Raphaël Gomès <rgomes@octobus.net> [Wed, 20 Dec 2023 14:59:31 +0100] rev 51292
rust-clippy: apply some more trivial fixes All of these were hinted at by clippy and make the code simpler.
Wed, 20 Dec 2023 14:58:36 +0100 rust-clippy: simplify `match` to `if let`
Raphaël Gomès <rgomes@octobus.net> [Wed, 20 Dec 2023 14:58:36 +0100] rev 51291
rust-clippy: simplify `match` to `if let` This was hinted at by clippy, and makes it more obvious that nothing is happening in the `None` case.
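A sketch of the kind of rewrite clippy suggests here. The helper names and types are hypothetical and not taken from the changed code.

```rust
/// Before: a `match` whose `None` arm does nothing.
/// (Hypothetical helper, for illustration only.)
fn push_parent_match(parents: &mut Vec<i32>, parent: Option<i32>) {
    match parent {
        Some(rev) => parents.push(rev),
        None => {}
    }
}

/// After: `if let` states directly that only the `Some` case matters.
fn push_parent_if_let(parents: &mut Vec<i32>, parent: Option<i32>) {
    if let Some(rev) = parent {
        parents.push(rev);
    }
}
```

Both functions behave identically; the second simply has no empty arm to read past.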
Fri, 01 Dec 2023 22:56:08 +0100 censor: accept multiple revision in a single call
Pierre-Yves David <pierre-yves.david@octobus.net> [Fri, 01 Dec 2023 22:56:08 +0100] rev 51290
censor: accept multiple revision in a single call This is useful when dealing with corruption, as all the corrupted revisions can be dealt with in one go.
Fri, 01 Dec 2023 22:46:46 +0100 censor: be more verbose about the other steps too
Pierre-Yves David <pierre-yves.david@octobus.net> [Fri, 01 Dec 2023 22:46:46 +0100] rev 51289
censor: be more verbose about the other steps too If we inform the user about head checking, we should tell them when the other operations happen too. Otherwise the user may assume we are still in the head-checking part.
Fri, 01 Dec 2023 22:44:33 +0100 censor: add a command flag to skip the head checks
Pierre-Yves David <pierre-yves.david@octobus.net> [Fri, 01 Dec 2023 22:44:33 +0100] rev 51288
censor: add a command flag to skip the head checks In some cases, spending hours checking the heads in order to censor a simple file is not good behavior, especially when censor is used to remove corrupted content.
Fri, 01 Dec 2023 22:33:35 +0100 censor: inform the user that we are spending time checking heads
Pierre-Yves David <pierre-yves.david@octobus.net> [Fri, 01 Dec 2023 22:33:35 +0100] rev 51287
censor: inform the user that we are spending time checking heads The time this can consume can be a surprise to the user, so let's be explicit about it.
Fri, 01 Dec 2023 22:25:52 +0100 censor: mention that we check the heads in the help
Pierre-Yves David <pierre-yves.david@octobus.net> [Fri, 01 Dec 2023 22:25:52 +0100] rev 51286
censor: mention that we check the heads in the help And add a message that will explain the possibly long time spent doing this.
Thu, 14 Dec 2023 09:57:25 +0100 rust-index: only access offsets if revlog is inline
Raphaël Gomès <rgomes@octobus.net> [Thu, 14 Dec 2023 09:57:25 +0100] rev 51285
rust-index: only access offsets if revlog is inline Accessing the `RwLock` ended up showing up in profiles even with no contention. Offsets only exist for inline revlogs, so gate everything behind an inline check.
Wed, 06 Dec 2023 11:04:18 +0100 rust-index: cache the head nodeids python list
Raphaël Gomès <rgomes@octobus.net> [Wed, 06 Dec 2023 11:04:18 +0100] rev 51284
rust-index: cache the head nodeids python list Same optimization as before, but for the nodeids this time.
Tue, 05 Dec 2023 14:50:05 +0100 rust-index: add fast-path for getting a list of all heads as nodes
Raphaël Gomès <rgomes@octobus.net> [Tue, 05 Dec 2023 14:50:05 +0100] rev 51283
rust-index: add fast-path for getting a list of all heads as nodes This avoids a lot of back-and-forth between Python and Rust. We forgo adding a fast-path in the `filteredchangelog` case for now. If it shows up in profiling, we might add the variant with a filter.
Wed, 29 Nov 2023 23:22:51 -0500 rust-index-cpython: cache the heads' PyList representation
Raphaël Gomès <rgomes@octobus.net> [Wed, 29 Nov 2023 23:22:51 -0500] rev 51282
rust-index-cpython: cache the heads' PyList representation This is the same optimization that the C index does, we just have more separation of the Python and native sides.
Wed, 29 Nov 2023 15:58:24 -0500 rust-index: use a `BitVec` instead of plain `Vec` for heads computation
Raphaël Gomès <rgomes@octobus.net> [Wed, 29 Nov 2023 15:58:24 -0500] rev 51281
rust-index: use a `BitVec` instead of plain `Vec` for heads computation The `Vec` method uses one byte per revision, this uses one byte per 8 revisions, which improves our memory footprint. For large graphs (10+ million revisions), this can make a measurable difference server-side. I have seen no measurable impact on execution speed.
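The memory trade-off can be sketched with a small hand-rolled bit set. This is only an illustration under stated assumptions: `RevBits` is a hypothetical stand-in for the `BitVec` named above, and `head_revs` with its `parents` layout is a simplified heads computation, not the one in the Rust index.

```rust
/// Minimal bit set: one bit per revision instead of one byte.
/// (Hypothetical stand-in for the `BitVec` mentioned in the commit.)
struct RevBits {
    words: Vec<u8>,
}

impl RevBits {
    fn new(len: usize) -> Self {
        // Round up to a whole number of bytes.
        Self { words: vec![0; (len + 7) / 8] }
    }

    fn set(&mut self, rev: usize) {
        self.words[rev / 8] |= 1u8 << (rev % 8);
    }

    fn get(&self, rev: usize) -> bool {
        self.words[rev / 8] & (1u8 << (rev % 8)) != 0
    }

    fn byte_len(&self) -> usize {
        self.words.len()
    }
}

/// A revision is a head if no other revision names it as a parent.
/// `parents[r]` holds up to two parents of revision `r` (assumed layout).
fn head_revs(parents: &[[Option<usize>; 2]]) -> Vec<usize> {
    let mut not_head = RevBits::new(parents.len());
    for ps in parents {
        for p in ps.iter().flatten() {
            not_head.set(*p);
        }
    }
    (0..parents.len()).filter(|&r| !not_head.get(r)).collect()
}
```

With this layout, 10 million revisions need 1,250,000 bytes of flags rather than 10,000,000.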
Wed, 29 Nov 2023 10:04:41 -0500 rust-index: implement faster retain heads using a vec instead of a hashset
Raphaël Gomès <rgomes@octobus.net> [Wed, 29 Nov 2023 10:04:41 -0500] rev 51280
rust-index: implement faster retain heads using a vec instead of a hashset This is the same optimization that the C index does, we're only catching up now because this showed up as slow in benchmarking.
Thu, 14 Dec 2023 11:52:05 +0100 rust-index: allow inlining VCSGraph parents across crates
Raphaël Gomès <rgomes@octobus.net> [Thu, 14 Dec 2023 11:52:05 +0100] rev 51279
rust-index: allow inlining VCSGraph parents across crates
Thu, 23 Nov 2023 18:48:07 +0100 rust-index: allow inlining `parents` across crates
Raphaël Gomès <rgomes@octobus.net> [Thu, 23 Nov 2023 18:48:07 +0100] rev 51278
rust-index: allow inlining `parents` across crates
Thu, 23 Nov 2023 18:47:42 +0100 rust-index: allow inlining `check_revision` across crates
Raphaël Gomès <rgomes@octobus.net> [Thu, 23 Nov 2023 18:47:42 +0100] rev 51277
rust-index: allow inlining `check_revision` across crates
Thu, 23 Nov 2023 03:41:58 +0100 rust-index: document safety invariants being upheld for every `unsafe` block
Raphaël Gomès <rgomes@octobus.net> [Thu, 23 Nov 2023 03:41:58 +0100] rev 51276
rust-index: document safety invariants being upheld for every `unsafe` block We've added a lot of `unsafe` code that shares Rust structs with Python. While this is unfortunate, it is also unavoidable, so let's at least systematically explain why each call to `unsafe` is sound. If any of the unsafe code ends up being wrong (because everyone screws up at some point), this change at least continues the unspoken rule of always explaining the need for `unsafe`, so we at least get a chance to think.
Sun, 29 Oct 2023 12:18:03 +0100 rust-index: renamed `MixedIndex` as `Index`
Georges Racinet on incendie.racinet.fr <georges@racinet.fr> [Sun, 29 Oct 2023 12:18:03 +0100] rev 51275
rust-index: renamed `MixedIndex` as `Index` It is simply not mixed any more, hence the name had become a future source of confusion.
Sun, 29 Oct 2023 23:54:05 +0100 rust-index: stop instantiating a C Index
Georges Racinet <georges.racinet@octobus.net> [Sun, 29 Oct 2023 23:54:05 +0100] rev 51274
rust-index: stop instantiating a C Index The only missing piece was the `cache` to be returned from `revlog.parse_index_v1_mixed`, and it really seems that it is essentially repetition of the input, if `inline` is `True`. Not worth a Rust implementation (C implementation is probably there for historical reasons).
Mon, 30 Oct 2023 21:28:30 +0100 rust-revlog: using the ad-hoc `NodeTree` in scmutil
Georges Racinet <georges.racinet@octobus.net> [Mon, 30 Oct 2023 21:28:30 +0100] rev 51273
rust-revlog: using the ad-hoc `NodeTree` in scmutil Now that we have an independent `NodeTree` class able to work natively on the pure Rust index, we use it in `mercurial.scmutil`, with automatic invalidation after mutation of the index. This code path is tested by `test-revisions.t` and `test-template-functions.t`
Mon, 30 Oct 2023 22:36:30 +0100 rust-revlog: add invalidation detection to `NodeTree` class
Georges Racinet <georges.racinet@octobus.net> [Mon, 30 Oct 2023 22:36:30 +0100] rev 51272
rust-revlog: add invalidation detection to `NodeTree` class This will be useful for callers, such as `scmutil` who reuse a `NodeTree` instance as a cache. They would otherwise get hard errors if any mutation of the index occurred since instantiation. This is something the C index does not provide.
Thu, 02 Nov 2023 15:50:13 +0100 rust-index: add support for `del index[r]`
Raphaël Gomès <rgomes@octobus.net> [Thu, 02 Nov 2023 15:50:13 +0100] rev 51271
rust-index: add support for `del index[r]` Only the `del index[r:]` syntax was supported, but the comment said otherwise. It's not actually used in core code, but the C index supports it.
Mon, 30 Oct 2023 21:26:17 +0100 rust-revlog: bare minimal NodeTree exposition
Georges Racinet <georges.racinet@octobus.net> [Mon, 30 Oct 2023 21:26:17 +0100] rev 51270
rust-revlog: bare minimal NodeTree exposition The independent `NodeTree` instances need to be associated to an index (for forward-checks of candidates) but do not need to encompass all revisions from that index. This is exactly how it is used in `scmutil.shortesthexnodeidprefix` and we restrict the implementation to the bare minimum needed there and to write convincing tests. It would of course be fairly trivial to add more.
Mon, 30 Oct 2023 21:25:28 +0100 rust-index: a property to identify the Rust index as such
Georges Racinet <georges.racinet@octobus.net> [Mon, 30 Oct 2023 21:25:28 +0100] rev 51269
rust-index: a property to identify the Rust index as such Will be useful soon in `mercurial.scmutil` and potentially elsewhere
Mon, 30 Oct 2023 15:32:33 +0100 rust-cpython-revlog: renamed NodeTree import as CoreNodeTree
Georges Racinet <georges.racinet@octobus.net> [Mon, 30 Oct 2023 15:32:33 +0100] rev 51268
rust-cpython-revlog: renamed NodeTree import as CoreNodeTree We're about to introduce a `NodeTree` Python class (hence also a Rust struct) and it would be a collision with the import
Fri, 20 Oct 2023 09:48:53 +0200 rust-index: stop using C index
Georges Racinet <georges.racinet@octobus.net> [Fri, 20 Oct 2023 09:48:53 +0200] rev 51267
rust-index: stop using C index We still keep its wrapper implementation in `hg-cpython::cindex`, because we might want to recreate ancestors handling objects using it for the case of REVLOGV2. Also, we still instantiate it (from Python code) and store it as an attribute, for the likes of `get_cindex` and the caller that relies on it, but that is soon to be removed, too.
Sun, 29 Oct 2023 12:07:05 +0100 rust-index: using `hg::index::Index` in discovery
Georges Racinet <georges.racinet@octobus.net> [Sun, 29 Oct 2023 12:07:05 +0100] rev 51266
rust-index: using `hg::index::Index` in discovery At this point the C index is not used any more: we had to remove `pyindex_to_graph()` to avoid the dead code warning.
Sun, 29 Oct 2023 12:01:57 +0100 rust-python-testing: separated base test classes
Georges Racinet on incendie.racinet.fr <georges@racinet.fr> [Sun, 29 Oct 2023 12:01:57 +0100] rev 51265
rust-python-testing: separated base test classes This will allow, e.g., to change `test-rust-discovery.py` simply by adding the appropriate base class.
Sun, 29 Oct 2023 11:21:18 +0100 rust-discovery: encapsulated conversions to vec for instance methods
Georges Racinet <georges.racinet@octobus.net> [Sun, 29 Oct 2023 11:21:18 +0100] rev 51264
rust-discovery: encapsulated conversions to vec for instance methods This new `pyiter_to_vec` is pretty trivial, and only mildly reduces code duplication. The main advantage is that it encapsulates access to the `index` attribute, which will be changed when we replace the C index with the Rust index, given as `PySharedRef`.
Sun, 29 Oct 2023 11:10:09 +0100 rust-discovery: moving most of hg-cpython methods to regular code blocks
Georges Racinet <georges.racinet@octobus.net> [Sun, 29 Oct 2023 11:10:09 +0100] rev 51263
rust-discovery: moving most of hg-cpython methods to regular code blocks The chosen methods are those with conversion of an incoming Python iterable, as they will be changed the most when we remove the C index, and `takefullsample` for consistency with `takequicksample`.
Sun, 29 Oct 2023 10:47:54 +0100 rust-index: using `hg::index::Index` in `hg-cpython::dagops`
Georges Racinet <georges.racinet@octobus.net> [Sun, 29 Oct 2023 10:47:54 +0100] rev 51262
rust-index: using `hg::index::Index` in `hg-cpython::dagops` Hooking `headrevs` to the Rust index is straightforward as long as we go the `PySharedRef` way. Direct attempts at obtaining a reference to the inner `hg::index::Index` fail for lifetime reasons: the reference is bound to the GIL, yet the `as_set` local variable is considered to be static (the borrow checker clearly does not realize or care that this set only stores `Revision` values). In `rank()`, the chosen solution is the simplest as far as `hg-cpython` is concerned, but it has the defect of removing an implementation that would be easily adaptable if the core index did implement `RankedGraph` (returning the same error as long as only `REVLOGV1` is supported), but that would introduce a direct dependency of `hg-core` on the `vcsgraph` crate.
Sat, 28 Oct 2023 22:50:10 +0200 rust-index: using `hg::index::Index` in MissingAncestors
Georges Racinet on incendie.racinet.fr <georges@racinet.fr> [Sat, 28 Oct 2023 22:50:10 +0200] rev 51261
rust-index: using `hg::index::Index` in MissingAncestors With this, the whole `hg-cpython::ancestors` module can now work without the C index.
Fri, 27 Oct 2023 22:11:05 +0200 rust-index: using the `hg::index::Index` in ancestors iterator and lazy set
Georges Racinet <georges.racinet@octobus.net> [Fri, 27 Oct 2023 22:11:05 +0200] rev 51260
rust-index: using the `hg::index::Index` in ancestors iterator and lazy set Since there is no Rust implementation for REVLOGV2/CHANGELOGv2, we declare them to be incompatible with Rust, hence indexes in these formats will use the implementations from Python `mercurial.ancestor`. If this is an unacceptable performance hit for current users of these formats, we can later on add Rust implementations based on the C index for them or implement these formats for the Rust indexes. Among the challenges that we had to meet, we wanted to avoid taking the GIL each time the inner (vcsgraph) iterator has to call the parents function. This would probably still be acceptable in terms of performance with `AncestorsIterator`, but not with `LazyAncestors` nor for the upcoming change in `MissingAncestors`. Hence we enclose the reference to the index in a `PySharedRef`, leading to more rigorous checking of mutations, which does pass now that there are no logically immutable methods of `hg::index::Index` that take a mutable reference as input.
Fri, 27 Oct 2023 23:29:29 +0200 revlog: always use a Rust index for REVLOGv1 if rustext is present
Georges Racinet on incendie.racinet.fr <georges@racinet.fr> [Fri, 27 Oct 2023 23:29:29 +0200] rev 51259
revlog: always use a Rust index for REVLOGv1 if rustext is present We are about to change classes such as `rustext.AncestorsIterator` to take a Rust index, hence we cannot have the option not to use the Rust index. Note: this can be refined depending on whether we want to keep this option or not. We will have to make two versions of `AncestorsIterator` and its sibling to support REVLOGV2 and CHANGELOGv2 anyway. Meanwhile, this is the simplest change to make the tests pass.
Sun, 29 Oct 2023 18:35:32 +0100 rust-index: disabling flagprocessor tests
Georges Racinet on incendie.racinet.fr <georges@racinet.fr> [Sun, 29 Oct 2023 18:35:32 +0100] rev 51258
rust-index: disabling flagprocessor tests The list of flags supported by the Rust index is not dynamic, hence flagprocessor has no chance to work.
Tue, 31 Oct 2023 17:58:56 +0100 rust-index: support `unionrepo`'s compressed length hack
Raphaël Gomès <rgomes@octobus.net> [Tue, 31 Oct 2023 17:58:56 +0100] rev 51257
rust-index: support `unionrepo`'s compressed length hack Explanations inline.
Fri, 27 Oct 2023 23:21:50 +0200 rust-index: honour incoming using_general_delta in `deltachain`
Georges Racinet on incendie.racinet.fr <georges@racinet.fr> [Fri, 27 Oct 2023 23:21:50 +0200] rev 51256
rust-index: honour incoming using_general_delta in `deltachain` It looks to be a leftover from some past, but the C index considers only the value passed from Python whereas up to now the Rust index was using the value of its attribute. As a middle ground, we make this argument of `deltachain` optional from the Python side, with the Rust implementation only defaulting to its attribute. This way, we reduce false leads when a difference in results is spotted.
Fri, 27 Oct 2023 21:48:45 +0200 rust-index: use interior mutability in head revs and caches
Georges Racinet on incendie.racinet.fr <georges@racinet.fr> [Fri, 27 Oct 2023 21:48:45 +0200] rev 51255
rust-index: use interior mutability in head revs and caches For upcoming changes in `hg-cpython` switching to the `hg-core` index in ancestors iterators, we will need to avoid excessive mutability, restricting the use of mutable references on `hg::index::Index` to methods that actually logically mutate it, whereas the maintenance of caches such as `head_revs` clearly does not. We illustrate that immediately by switching to immutable borrows in the corresponding methods of `hg-cpython::MixedIndex`
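A minimal sketch of the interior-mutability pattern described here, assuming a much-reduced index. `TinyIndex`, its fields, and the use of `std::sync::RwLock` for the cache are illustrative assumptions, not the actual `hg::index::Index` layout.

```rust
use std::sync::RwLock;

/// Hypothetical, much-reduced index: `parents[r]` lists the parents of `r`.
struct TinyIndex {
    parents: Vec<Vec<usize>>,
    /// Lazily filled cache; the lock lets `&self` methods update it.
    head_revs: RwLock<Option<Vec<usize>>>,
}

impl TinyIndex {
    fn new(parents: Vec<Vec<usize>>) -> Self {
        Self { parents, head_revs: RwLock::new(None) }
    }

    /// Logically read-only, so it takes `&self` even though it fills the cache.
    fn head_revs(&self) -> Vec<usize> {
        if let Some(cached) = self.head_revs.read().unwrap().as_ref() {
            return cached.clone();
        }
        let mut is_parent = vec![false; self.parents.len()];
        for ps in &self.parents {
            for &p in ps {
                is_parent[p] = true;
            }
        }
        let heads: Vec<usize> =
            (0..self.parents.len()).filter(|&r| !is_parent[r]).collect();
        *self.head_revs.write().unwrap() = Some(heads.clone());
        heads
    }

    /// A real mutation: takes `&mut self` and invalidates the cache.
    fn append(&mut self, parents: Vec<usize>) {
        self.parents.push(parents);
        *self.head_revs.get_mut().unwrap() = None;
    }
}
```

Only `append` needs a mutable reference; callers holding a shared reference can still ask for the heads, which is the property the ancestors iterators rely on.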
Thu, 26 Oct 2023 15:26:19 +0200 rust-index: add Sync bound to all relevant mmap-derived values
Raphaël Gomès <rgomes@octobus.net> [Thu, 26 Oct 2023 15:26:19 +0200] rev 51254
rust-index: add Sync bound to all relevant mmap-derived values All readonly mmaps are Sync as far as Rust is concerned. Integrity of the mmap'ed file is a concern separate from Rust's memory model, since it requires out-of-program handling via locks, etc. This will help when we start sharing the Rust Index with Python.
Tue, 31 Oct 2023 18:09:43 +0100 debugindexstats: handle the lack of Rust support better
Raphaël Gomès <rgomes@octobus.net> [Tue, 31 Oct 2023 18:09:43 +0100] rev 51253
debugindexstats: handle the lack of Rust support better We don't have any stats in the Rust index. Currently it is not known which stats would be interesting to get, so if they end up being important, we can add them later.
Tue, 31 Oct 2023 17:36:59 +0100 rust-python-index: don't panic on a corrupted index when calling from Python
Raphaël Gomès <rgomes@octobus.net> [Tue, 31 Oct 2023 17:36:59 +0100] rev 51252
rust-python-index: don't panic on a corrupted index when calling from Python This makes `test-verify.t` pass again. In an ideal world, we would find the exact commit where this test breaks and amend part of this change there, but this is a long enough series.
Tue, 31 Oct 2023 17:34:31 +0100 tests: ignore test-storage when using Rust
Raphaël Gomès <rgomes@octobus.net> [Tue, 31 Oct 2023 17:34:31 +0100] rev 51251
tests: ignore test-storage when using Rust This is only relevant for Python code and the SQLite backend, which is in a half-abandoned state.
Fri, 20 Oct 2023 09:12:22 +0200 rust-index: optimize find_gca_candidates() on less than 8 revisions
Georges Racinet <georges.racinet@octobus.net> [Fri, 20 Oct 2023 09:12:22 +0200] rev 51250
rust-index: optimize find_gca_candidates() on less than 8 revisions This is expected to be by far the most common case, given that, e.g., merging involves using it on two revisions. Using a `u8` as support for the bitset obviously divides the amount of RAM needed by 8. To state the obvious, on a repository with 10 million changesets, this spares 70MB. It is also possible that it'd be slightly faster, because it is easier to allocate and provides better cache locality. It is possible that some exhaustive listing of the traits implemented by `u8` and `u64` would avoid the added duplication, but that can be done later and would need a replacement for the `MAX` consts.
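The 70MB figure follows from keeping one bit set per changeset, as this sketch shows. The helper names are hypothetical, and the reading that one bit is kept spare (for example as a poison marker), which would explain "less than 8" rather than "up to 8", is an assumption.

```rust
use std::mem::size_of;

/// Bytes needed for one "seen" bit set per changeset, for backing type `T`.
fn seen_bytes<T>(changesets: usize) -> usize {
    changesets * size_of::<T>()
}

/// Hypothetical dispatch: width in bits of the narrowest backing integer,
/// assuming one bit per input revision plus one spare bit.
/// `None` means a wider, non-integer bit set would be needed.
fn backing_bits(input_revs: usize) -> Option<u32> {
    if input_revs < u8::BITS as usize {
        Some(u8::BITS)
    } else if input_revs < u64::BITS as usize {
        Some(u64::BITS)
    } else {
        None
    }
}
```

For 10 million changesets, `seen_bytes::<u64>` is 80,000,000 and `seen_bytes::<u8>` is 10,000,000, a difference of 70,000,000 bytes.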
Fri, 20 Oct 2023 08:54:49 +0200 rust-index: simplification in find_gca_candidates()
Georges Racinet <georges.racinet@octobus.net> [Fri, 20 Oct 2023 08:54:49 +0200] rev 51249
rust-index: simplification in find_gca_candidates() `parent_seen` can be made a mutable ref, making this part more obvious, not needing to be commented so much. The micro-optimization of avoiding the union if `parent_seen` and `current_seen` agree is pushed down in the `union()` method of the fast, `u64` based bit set implementation (in case it matters).
Fri, 20 Oct 2023 08:43:00 +0200 rust-index: avoid double negation in find_gca_candidates()
Georges Racinet <georges.racinet@octobus.net> [Fri, 20 Oct 2023 08:43:00 +0200] rev 51248
rust-index: avoid double negation in find_gca_candidates()
Fri, 20 Oct 2023 08:17:00 +0200 rust-index: avoid some cloning in find_gca_candidates()
Georges Racinet <georges.racinet@octobus.net> [Fri, 20 Oct 2023 08:17:00 +0200] rev 51247
rust-index: avoid some cloning in find_gca_candidates() Instead of keeping the information whether the current revision is poisoned on `current_seen`, we extract it as a boolean. This also allows us to simplify the explanation of `seen[r].is_poisoned()`, as the exceptional case where it is poisoned right after `r` has been determined to be a solution no longer exists.