Raphaël Gomès <rgomes@octobus.net> [Thu, 23 Nov 2023 03:41:58 +0100] rev 51255
rust-index: document safety invariants being upheld for every `unsafe` block
We've added a lot of `unsafe` code that shares Rust structs with Python.
While this is unfortunate, it is also unavoidable, so let's at least
systematically explain why each call to `unsafe` is sound.
If any of the unsafe code ends up being wrong (because everyone screws up
at some point), this change at least continues the unspoken rule of always
explaining the need for `unsafe`, so we at least get a chance to think.
Georges Racinet on incendie.racinet.fr <georges@racinet.fr> [Sun, 29 Oct 2023 12:18:03 +0100] rev 51254
rust-index: renamed `MixedIndex` as `Index`
It is simply not mixed any more, hence the name had become a
future source of confusion.
Georges Racinet <georges.racinet@octobus.net> [Sun, 29 Oct 2023 23:54:05 +0100] rev 51253
rust-index: stop instantiating a C Index
The only missing piece was the `cache` to be returned from
`revlog.parse_index_v1_mixed`, and it really seems that it is
essentially repetition of the input, if `inline` is `True`.
Not worth a Rust implementation (C implementation is probably there
for historical reasons).
Georges Racinet <georges.racinet@octobus.net> [Mon, 30 Oct 2023 21:28:30 +0100] rev 51252
rust-revlog: using the ad-hoc `NodeTree` in scmutil
Now that we have an independent `NodeTree` class able to work natively
on the pure Rust index, we use it in `mercurial.scmutil`, with automatic
invalidation after mutation of the index.
This code path is tested by `test-revisions.t` and `test-template-functions.t`
Georges Racinet <georges.racinet@octobus.net> [Mon, 30 Oct 2023 22:36:30 +0100] rev 51251
rust-revlog: add invalidation detection to `NodeTree` class
This will be useful for callers, such as `scmutil` who reuse a
`NodeTree` instance as a cache. They would otherwise get hard
errors if any mutation of the index occurred since instantiation.
This is something the C index does not provide.
Raphaël Gomès <rgomes@octobus.net> [Thu, 02 Nov 2023 15:50:13 +0100] rev 51250
rust-index: add support for `del index[r]`
Only the `del index[r:]` syntax was supported, but the comment said otherwise.
It's not actually used in core code, but the C index supports it.
Georges Racinet <georges.racinet@octobus.net> [Mon, 30 Oct 2023 21:26:17 +0100] rev 51249
rust-revlog: bare minimal NodeTree exposition
The independent `NodeTree` instances needs to be associated to an
index (for forward-checks of candidates) but do not need to
encompass all revisions from that index.
This is exactly how it is used in `scmutil.shortesthenodeidprefix`
and we restrict the implementation to the bare minimum needed there
and to write convincing tests.
It would of course be fairly trivial to add more.
Georges Racinet <georges.racinet@octobus.net> [Mon, 30 Oct 2023 21:25:28 +0100] rev 51248
rust-index: a property to identify the Rust index as such
Will be useful soon in `mercurial.scmutil` and potentially elsewhere
Georges Racinet <georges.racinet@octobus.net> [Mon, 30 Oct 2023 15:32:33 +0100] rev 51247
rust-cpython-revlog: renamed NodeTree import as CoreNodeTree
We're about to introduce a `NodeTree` Python class (hence also
a Rust struct) and it would be a collision with the import
Georges Racinet <georges.racinet@octobus.net> [Fri, 20 Oct 2023 09:48:53 +0200] rev 51246
rust-index: stop using C index
We still keep its wrapper implementation in `hg-cpython::cindex`,
because we might want to recreate ancestors handling objects using
it for the case of REVLOGV2.
Also, we still instantiate it (from Python code) and store it as
attribute, for the likes of `get_cindex` and the caller that
relies on it, but that is soon to be removed, too.
Georges Racinet <georges.racinet@octobus.net> [Sun, 29 Oct 2023 12:07:05 +0100] rev 51245
rust-index: using `hg::index::Index` in discovery
At this point the C index is not used any more: we had to
remove `pyindex_to_graph()` to avoid the dead code warning.
Georges Racinet on incendie.racinet.fr <georges@racinet.fr> [Sun, 29 Oct 2023 12:01:57 +0100] rev 51244
rust-python-testing: separated base test classes
This will allow, e.g., to change `test-rust-discovery.py` simply
by adding the appropriate base class.
Georges Racinet <georges.racinet@octobus.net> [Sun, 29 Oct 2023 11:21:18 +0100] rev 51243
rust-discovery: encapsulated conversions to vec for instance methods
This new `pyiter_to_vec` is pretty trivial, and only mildly reduces
code duplication. The main advantage is that it encapsulates access
to the `index` attribute, which will be changed when we replace the
C index by the Rust index, given as `PySharedRef`.
Georges Racinet <georges.racinet@octobus.net> [Sun, 29 Oct 2023 11:10:09 +0100] rev 51242
rust-discovery: moving most of hg-cpython methods to regular code blocks
The chosen methods are those with conversion of an incoming Python iterable,
as they will be changed the most when we will remove the C index, and
`takefullsample` for consistency with `takequicksample`.
Georges Racinet <georges.racinet@octobus.net> [Sun, 29 Oct 2023 10:47:54 +0100] rev 51241
rust-index: using `hg::index::Index` in `hg-cpython::dagops`
Hooking `headrevs` to the Rust index is straightforward as long as
we go the `PySharedRef` way. Direct attempts of obtaining a reference
to the inner `hg::index::Index` fail for lifetime reasons: the reference
is bound to the GIL, yet the `as_set` local variable is considered to
be static (the borrow checker clearly does not realize or care that this
set only stores `Revision` values).
In `rank()`, the chosen solution is the simplest as far as `hg-cpython` is
concerned, but it has the defect of removing an implementation
that would be easily adaptable if the core index did implement `RankedGraph`
(returning the same error as long as only `REVLOGV1` is supported), but that
would introduce a direct dependency of `hg-core` on the ``vcsgraph` crate.
Georges Racinet on incendie.racinet.fr <georges@racinet.fr> [Sat, 28 Oct 2023 22:50:10 +0200] rev 51240
rust-index: using `hg::index::Index` in MissingAncestors
With this, the whole `hg-cpython::ancestors` module can now work without
the C index.
Georges Racinet <georges.racinet@octobus.net> [Fri, 27 Oct 2023 22:11:05 +0200] rev 51239
rust-index: using the `hg::index::Index` in ancestors iterator and lazy set
Since there is no Rust implementation for REVLOGV2/CHANGELOGv2, we declare
them to be incompatible with Rust, hence indexes in these formats will use
the implementations from Python `mercurial.ancestor`. If this is an unacceptable
performance hit for current users of these formats, we can later on add Rust
implementations based on the C index for them or implement these formats for
the Rust indexes.
Among the challenges that we had to meet, we wanted to avoid taking the GIL each
time the inner (vcsgraph) iterator has to call the parents function. This would probably
still be acceptable in terms of performance with `AncestorsIterator`, but not with
`LazyAncestors` nor for the upcoming change in `MissingAncestors`.
Hence we enclose the reference to the index in a `PySharedRef`, leading to more
rigourous checking of mutations, which does pass now that there no logically immutable
methods of `hg::index::Index` that take a mutable reference as input.
Georges Racinet on incendie.racinet.fr <georges@racinet.fr> [Fri, 27 Oct 2023 23:29:29 +0200] rev 51238
revlog: always use a Rust index for REVLOGv1 if rustext is present
We are about to change classes such as `rustext.AncestorsIterator` to
take a Rust index, hence we cannot have the option not to use the Rust
index.
Note: this can be refined depending on whether we want to keep this
option or not. We will have to make two versions of `AncestorsIterator`
and its sibling to support REVLOGV2 and CHANGELOGv2 anyway.
Meanwhile, this is the simplest change to make the tests pass.
Georges Racinet on incendie.racinet.fr <georges@racinet.fr> [Sun, 29 Oct 2023 18:35:32 +0100] rev 51237
rust-index: disabling flagprocessor tests
The list of flags supported by the Rust index is not dynamic, hence
flagprocessor has no chance to work.
Raphaël Gomès <rgomes@octobus.net> [Tue, 31 Oct 2023 17:58:56 +0100] rev 51236
rust-index: support `unionrepo`'s compressed length hack
Explanations inline.
Georges Racinet on incendie.racinet.fr <georges@racinet.fr> [Fri, 27 Oct 2023 23:21:50 +0200] rev 51235
rust-index: honour incoming using_general_delta in `deltachain`
It looks to be a leftover from some past, but the C index considers
only the value passed from Python whereas up to now the Rust index
was using the value of its attribute.
As a middle ground, we make this argument of `deltachain` optional from
the Python side, with the Rust implementation only defaulting to its
attribute. This way, we reduce false leads when a difference in results
is spotted.
Georges Racinet on incendie.racinet.fr <georges@racinet.fr> [Fri, 27 Oct 2023 21:48:45 +0200] rev 51234
rust-index: use interior mutability in head revs and caches
For upcoming changes in `hg-cpython` switching to the `hg-core` index in
ancestors iterators, we will need to avoid excessive mutability, restricting
the use of mutable references on `hg::index::Index` to methods that actually
logically mutate it, whereas the maintenance of caches such as `head_revs`
clearly does not. We illustrate that immediately by switching to immutable
borrows in the corresponding methods of `hg-cpython::MixedIndex`
Raphaël Gomès <rgomes@octobus.net> [Thu, 26 Oct 2023 15:26:19 +0200] rev 51233
rust-index: add Sync bound to all relevant mmap-derived values
All readonly mmaps are Sync as far as Rust is concerned. Integrity of the
mmap'ed file is a concern separate to Rust's memory model, since it requires
out-of-program handling via locks, etc.
This will help when we start sharing the Rust Index with Python.
Raphaël Gomès <rgomes@octobus.net> [Tue, 31 Oct 2023 18:09:43 +0100] rev 51232
debugindexstats: handle the lack of Rust support better
We don't have any stats in the Rust index. Currently it is not known which
stats would be interesting to get, so if they end up being important, we can
add them later.
Raphaël Gomès <rgomes@octobus.net> [Tue, 31 Oct 2023 17:36:59 +0100] rev 51231
rust-python-index: don't panic on a corrupted index when calling from Python
This makes `test-verify.t` pass again. In an ideal world, we would find
the exact commit where this test breaks and amend part of this change there,
but this is a long enough series.
Raphaël Gomès <rgomes@octobus.net> [Tue, 31 Oct 2023 17:34:31 +0100] rev 51230
tests: ignore test-storage when using Rust
This is only relevant for Python code and the SQLite backend, which is in a
half-abandoned state.
Georges Racinet <georges.racinet@octobus.net> [Fri, 20 Oct 2023 09:12:22 +0200] rev 51229
rust-index: optimize find_gca_candidates() on less than 8 revisions
This is expected to be by far the most common case, given that, e.g.,
merging involves using it on two revisions.
Using a `u8` as support for the bitset obviously divides the amount of
RAM needed by 8. To state the obvious, on a repository with 10 million
changesets, this spares 70MB. It is also possible that it'd be slightly
faster, because it is easier to allocate and provides better cache locality.
It is possible that some exhaustive listing of the traits implemented by
`u8` and `u64` would avoid the added duplication, but that can be done later
and would need a replacement for the `MAX` consts.
Georges Racinet <georges.racinet@octobus.net> [Fri, 20 Oct 2023 08:54:49 +0200] rev 51228
rust-index: simplification in find_gca_candidates()
`parent_seen` can be made a mutable ref, making this part more obvious,
not needing to be commented so much.
The micro-optimization of avoiding the union if `parent_seen` and
`current_seen` agree is pushed down in the `union()` method of the
fast, `u64` based bit set implementation (in case it matters).
Georges Racinet <georges.racinet@octobus.net> [Fri, 20 Oct 2023 08:43:00 +0200] rev 51227
rust-index: avoid double negation in find_gca_candidates()
Georges Racinet <georges.racinet@octobus.net> [Fri, 20 Oct 2023 08:17:00 +0200] rev 51226
rust-index: avoid some cloning in find_gca_candidates()
Instead of keeping the information whether the current revision is
poisoned on `current_seen`, we extract it as a boolean.
This also allows us to simplify the explanation of `seen[r].is_poisoned()`,
as the exceptional case where it is poisoned right after `r` has been
determined to be a solution does no longer exist.