Raphaël Gomès <rgomes@octobus.net> [Tue, 31 Oct 2023 17:58:56 +0100] rev 51236
rust-index: support `unionrepo`'s compressed length hack
Explanations inline.
Georges Racinet on incendie.racinet.fr <georges@racinet.fr> [Fri, 27 Oct 2023 23:21:50 +0200] rev 51235
rust-index: honour incoming using_general_delta in `deltachain`
It looks to be a leftover from some past, but the C index considers
only the value passed from Python whereas up to now the Rust index
was using the value of its attribute.
As a middle ground, we make this argument of `deltachain` optional from
the Python side, with the Rust implementation only defaulting to its
attribute. This way, we reduce false leads when a difference in results
is spotted.
Georges Racinet on incendie.racinet.fr <georges@racinet.fr> [Fri, 27 Oct 2023 21:48:45 +0200] rev 51234
rust-index: use interior mutability in head revs and caches
For upcoming changes in `hg-cpython` switching to the `hg-core` index in
ancestors iterators, we will need to avoid excessive mutability, restricting
the use of mutable references on `hg::index::Index` to methods that actually
logically mutate it, whereas the maintenance of caches such as `head_revs`
clearly does not. We illustrate that immediately by switching to immutable
borrows in the corresponding methods of `hg-cpython::MixedIndex`
Raphaël Gomès <rgomes@octobus.net> [Thu, 26 Oct 2023 15:26:19 +0200] rev 51233
rust-index: add Sync bound to all relevant mmap-derived values
All readonly mmaps are Sync as far as Rust is concerned. Integrity of the
mmap'ed file is a concern separate to Rust's memory model, since it requires
out-of-program handling via locks, etc.
This will help when we start sharing the Rust Index with Python.
Raphaël Gomès <rgomes@octobus.net> [Tue, 31 Oct 2023 18:09:43 +0100] rev 51232
debugindexstats: handle the lack of Rust support better
We don't have any stats in the Rust index. Currently it is not known which
stats would be interesting to get, so if they end up being important, we can
add them later.
Raphaël Gomès <rgomes@octobus.net> [Tue, 31 Oct 2023 17:36:59 +0100] rev 51231
rust-python-index: don't panic on a corrupted index when calling from Python
This makes `test-verify.t` pass again. In an ideal world, we would find
the exact commit where this test breaks and amend part of this change there,
but this is a long enough series.
Raphaël Gomès <rgomes@octobus.net> [Tue, 31 Oct 2023 17:34:31 +0100] rev 51230
tests: ignore test-storage when using Rust
This is only relevant for Python code and the SQLite backend, which is in a
half-abandoned state.
Georges Racinet <georges.racinet@octobus.net> [Fri, 20 Oct 2023 09:12:22 +0200] rev 51229
rust-index: optimize find_gca_candidates() on less than 8 revisions
This is expected to be by far the most common case, given that, e.g.,
merging involves using it on two revisions.
Using a `u8` as support for the bitset obviously divides the amount of
RAM needed by 8. To state the obvious, on a repository with 10 million
changesets, this spares 70MB. It is also possible that it'd be slightly
faster, because it is easier to allocate and provides better cache locality.
It is possible that some exhaustive listing of the traits implemented by
`u8` and `u64` would avoid the added duplication, but that can be done later
and would need a replacement for the `MAX` consts.
Georges Racinet <georges.racinet@octobus.net> [Fri, 20 Oct 2023 08:54:49 +0200] rev 51228
rust-index: simplification in find_gca_candidates()
`parent_seen` can be made a mutable ref, making this part more obvious,
not needing to be commented so much.
The micro-optimization of avoiding the union if `parent_seen` and
`current_seen` agree is pushed down in the `union()` method of the
fast, `u64` based bit set implementation (in case it matters).
Georges Racinet <georges.racinet@octobus.net> [Fri, 20 Oct 2023 08:43:00 +0200] rev 51227
rust-index: avoid double negation in find_gca_candidates()
Georges Racinet <georges.racinet@octobus.net> [Fri, 20 Oct 2023 08:17:00 +0200] rev 51226
rust-index: avoid some cloning in find_gca_candidates()
Instead of keeping the information whether the current revision is
poisoned on `current_seen`, we extract it as a boolean.
This also allows us to simplify the explanation of `seen[r].is_poisoned()`,
as the exceptional case where it is poisoned right after `r` has been
determined to be a solution does no longer exist.
Georges Racinet <georges.racinet@octobus.net> [Wed, 18 Oct 2023 15:35:38 +0200] rev 51225
rust-index: implement common_ancestors_heads() and ancestors()
The only differences betwwen `common_ancestors_heads()` and
`find_gca_candidates()` seems to be that:
- the former accepts "overlapping" input revisions (meaning with duplicates).
- limitation to 24 inputs (in the C code), that we translate to using the
arbitrary size bit sets in the Rust code because we cannot bail to Python.
Given that the input is expected to be small in most cases, we take the
heavy handed approach of going through a HashSet and wait for perfomance
assessment
In case this is used via `hg-cpython`, we can anyway absorb the overhead
by having `commonancestorheads` build a vector of unique values
directly, and introduce a thin wrapper over `find_gca_candidates`, to take
care of bit set type dispatching only.
As far as `ancestors` is concerneed, this is just chaining
`common_ancestors_heads()` with `find_deepest_revs`.
Georges Racinet on incendie.racinet.fr <georges@racinet.fr> [Tue, 17 Oct 2023 22:42:40 +0200] rev 51224
rust-index: find_gca_candidates bit sets genericization
This allows to use arbitratry size of inputs in `find_gca_candidates()`.
We're genericizing so that the common case of up to 63 inputs can be
treated with the efficient implementation backed by `u64`.
Some complications with the borrow checker came, because arbitrary sized
bit sets will not be `Copy`, hence mutating them keeps a mut ref on the `seen`
vector. This is solved by some cloning, most of which can be avoided,
preferably in a follow-up after proof that this works (hence after exposition
to Python layer).
As far as performance is concerned, calling `clone()` on a `Copy` object
(good case when number of revs is less than 64) should end up just doing a
copy, according to this excerpt of the `Clone` trait documentation:
Types that are Copy should have a trivial implementation of Clone.
More formally: if T: Copy, x: T, and y: &T, then let x = y.clone();
is equivalent to let x = *y;.
Manual implementations should be careful to uphold this invariant;
however, unsafe code must not rely on it to ensure memory safety.
We kept the general structure, hence why there are some double negations.
This also could be made nicer in a follow-up.
The `NonStaticPoisonableBitSet` is included to ensure that the
`PoisonableBitSet` trait is general enough (had to correct `vec_of_empty()` for
instance). Moving the genericization one level to encompass the `seen`
vector and not its elements would be better for performance, if worth it.
Raphaël Gomès <rgomes@octobus.net> [Thu, 02 Nov 2023 11:45:20 +0100] rev 51223
rust-index: core impl for find_gca_candidates and find_deepest
This still follows closely the C original and not able to treat more than 63
input revisions (bitset backed by `u64` and one bit reserved for poisoning).
Raphaël Gomès <rgomes@octobus.net> [Mon, 30 Oct 2023 11:57:36 +0100] rev 51222
rust-index: add support for `reachableroots2`
Exposition in `hg-cpython` done in regular impl block, again
for rustfmt support etc.
Georges Racinet <georges.racinet@octobus.net> [Thu, 02 Nov 2023 12:17:06 +0100] rev 51221
hg-cpython: rev_pyiter_collect_or_else
It will be useful to give callers the control on the generated errors
Raphaël Gomès <rgomes@octobus.net> [Mon, 30 Oct 2023 11:54:42 +0100] rev 51220
rust-index: add support for `computephasesmapsets`
Exposition in `hg-cpython` done in the regular `impl` block to enjoy
rustfmt and clearer compilartion errors.
Georges Racinet <georges.racinet@octobus.net> [Sat, 30 Sep 2023 15:59:03 +0200] rev 51219
rust-index: slicechunktodensity returns Rust result
Ready for removal of the scaffolding.
This time, we allow ourselves a minor optimization: we avoid
allocating for each chunk. Instead, we reuse the same vector,
and perform at most one allocation per chunk.
The `PyList` constructor will copy the buffer anyway.
Raphaël Gomès <rgomes@octobus.net> [Thu, 02 Nov 2023 11:40:23 +0100] rev 51218
rust-index: add support for `_slicechunktodensity`
Georges Racinet <georges.racinet@octobus.net> [Fri, 29 Sep 2023 20:51:49 +0200] rev 51217
rust-index: headrevsfiltered() returning Rust result
Raphaël Gomès <rgomes@octobus.net> [Mon, 30 Oct 2023 11:14:25 +0100] rev 51216
rust-index: add support for `headrevsfiltered`
The implementation is merged with that of `headrevs` also to make sure that
caches are up to date.
Raphaël Gomès <rgomes@octobus.net> [Tue, 19 Sep 2023 15:21:43 +0200] rev 51215
rust-index: implement headrevs
Georges Racinet <georges.racinet@octobus.net> [Sat, 30 Sep 2023 16:52:40 +0200] rev 51214
rust-index: variant of assert_py_eq with normalizer expression
The example given in doc-comment is the main use case: some methods
may require ordering insensitive comparison. This is about to be
used for `reachableroots2`
Raphaël Gomès <rgomes@octobus.net> [Thu, 03 Aug 2023 15:50:14 +0200] rev 51213
rust-index: add support for delta-chain computation
Raphaël Gomès <rgomes@octobus.net> [Thu, 03 Aug 2023 15:01:34 +0200] rev 51212
rust-index: add support for `find_snapshots`
Raphaël Gomès <rgomes@octobus.net> [Thu, 03 Aug 2023 12:05:32 +0200] rev 51211
rust-index: add `is_snapshot` method
Raphaël Gomès <rgomes@octobus.net> [Wed, 02 Aug 2023 16:49:33 +0200] rev 51210
rust-index: use the Rust index in `partialmatch`
Raphaël Gomès <rgomes@octobus.net> [Thu, 03 Aug 2023 14:50:17 +0200] rev 51209
rust-index: add missing special case for null rev
This was an oversight, it was never a problem because we didn't use the index
much for user-facing things in the past, which is the only real way of getting
to this edge case.
Raphaël Gomès <rgomes@octobus.net> [Wed, 02 Aug 2023 16:49:17 +0200] rev 51208
rust-index: use the rust index in `shortest`
Raphaël Gomès <rgomes@octobus.net> [Wed, 02 Aug 2023 14:34:21 +0200] rev 51207
rust-index: add checks that `__contains__` is synchronized