Georges Racinet <georges.racinet@octobus.net> [Fri, 20 Oct 2023 08:54:49 +0200] rev 51228
rust-index: simplification in find_gca_candidates()
`parent_seen` can be made a mutable ref, making this part more obvious,
not needing to be commented so much.
The micro-optimization of avoiding the union if `parent_seen` and
`current_seen` agree is pushed down in the `union()` method of the
fast, `u64` based bit set implementation (in case it matters).
Georges Racinet <georges.racinet@octobus.net> [Fri, 20 Oct 2023 08:43:00 +0200] rev 51227
rust-index: avoid double negation in find_gca_candidates()
Georges Racinet <georges.racinet@octobus.net> [Fri, 20 Oct 2023 08:17:00 +0200] rev 51226
rust-index: avoid some cloning in find_gca_candidates()
Instead of keeping the information whether the current revision is
poisoned on `current_seen`, we extract it as a boolean.
This also allows us to simplify the explanation of `seen[r].is_poisoned()`,
as the exceptional case where it is poisoned right after `r` has been
determined to be a solution does no longer exist.
Georges Racinet <georges.racinet@octobus.net> [Wed, 18 Oct 2023 15:35:38 +0200] rev 51225
rust-index: implement common_ancestors_heads() and ancestors()
The only differences betwwen `common_ancestors_heads()` and
`find_gca_candidates()` seems to be that:
- the former accepts "overlapping" input revisions (meaning with duplicates).
- limitation to 24 inputs (in the C code), that we translate to using the
arbitrary size bit sets in the Rust code because we cannot bail to Python.
Given that the input is expected to be small in most cases, we take the
heavy handed approach of going through a HashSet and wait for perfomance
assessment
In case this is used via `hg-cpython`, we can anyway absorb the overhead
by having `commonancestorheads` build a vector of unique values
directly, and introduce a thin wrapper over `find_gca_candidates`, to take
care of bit set type dispatching only.
As far as `ancestors` is concerneed, this is just chaining
`common_ancestors_heads()` with `find_deepest_revs`.
Georges Racinet on incendie.racinet.fr <georges@racinet.fr> [Tue, 17 Oct 2023 22:42:40 +0200] rev 51224
rust-index: find_gca_candidates bit sets genericization
This allows to use arbitratry size of inputs in `find_gca_candidates()`.
We're genericizing so that the common case of up to 63 inputs can be
treated with the efficient implementation backed by `u64`.
Some complications with the borrow checker came, because arbitrary sized
bit sets will not be `Copy`, hence mutating them keeps a mut ref on the `seen`
vector. This is solved by some cloning, most of which can be avoided,
preferably in a follow-up after proof that this works (hence after exposition
to Python layer).
As far as performance is concerned, calling `clone()` on a `Copy` object
(good case when number of revs is less than 64) should end up just doing a
copy, according to this excerpt of the `Clone` trait documentation:
Types that are Copy should have a trivial implementation of Clone.
More formally: if T: Copy, x: T, and y: &T, then let x = y.clone();
is equivalent to let x = *y;.
Manual implementations should be careful to uphold this invariant;
however, unsafe code must not rely on it to ensure memory safety.
We kept the general structure, hence why there are some double negations.
This also could be made nicer in a follow-up.
The `NonStaticPoisonableBitSet` is included to ensure that the
`PoisonableBitSet` trait is general enough (had to correct `vec_of_empty()` for
instance). Moving the genericization one level to encompass the `seen`
vector and not its elements would be better for performance, if worth it.
Raphaël Gomès <rgomes@octobus.net> [Thu, 02 Nov 2023 11:45:20 +0100] rev 51223
rust-index: core impl for find_gca_candidates and find_deepest
This still follows closely the C original and not able to treat more than 63
input revisions (bitset backed by `u64` and one bit reserved for poisoning).
Raphaël Gomès <rgomes@octobus.net> [Mon, 30 Oct 2023 11:57:36 +0100] rev 51222
rust-index: add support for `reachableroots2`
Exposition in `hg-cpython` done in regular impl block, again
for rustfmt support etc.
Georges Racinet <georges.racinet@octobus.net> [Thu, 02 Nov 2023 12:17:06 +0100] rev 51221
hg-cpython: rev_pyiter_collect_or_else
It will be useful to give callers the control on the generated errors
Raphaël Gomès <rgomes@octobus.net> [Mon, 30 Oct 2023 11:54:42 +0100] rev 51220
rust-index: add support for `computephasesmapsets`
Exposition in `hg-cpython` done in the regular `impl` block to enjoy
rustfmt and clearer compilartion errors.
Georges Racinet <georges.racinet@octobus.net> [Sat, 30 Sep 2023 15:59:03 +0200] rev 51219
rust-index: slicechunktodensity returns Rust result
Ready for removal of the scaffolding.
This time, we allow ourselves a minor optimization: we avoid
allocating for each chunk. Instead, we reuse the same vector,
and perform at most one allocation per chunk.
The `PyList` constructor will copy the buffer anyway.
Raphaël Gomès <rgomes@octobus.net> [Thu, 02 Nov 2023 11:40:23 +0100] rev 51218
rust-index: add support for `_slicechunktodensity`
Georges Racinet <georges.racinet@octobus.net> [Fri, 29 Sep 2023 20:51:49 +0200] rev 51217
rust-index: headrevsfiltered() returning Rust result
Raphaël Gomès <rgomes@octobus.net> [Mon, 30 Oct 2023 11:14:25 +0100] rev 51216
rust-index: add support for `headrevsfiltered`
The implementation is merged with that of `headrevs` also to make sure that
caches are up to date.
Raphaël Gomès <rgomes@octobus.net> [Tue, 19 Sep 2023 15:21:43 +0200] rev 51215
rust-index: implement headrevs
Georges Racinet <georges.racinet@octobus.net> [Sat, 30 Sep 2023 16:52:40 +0200] rev 51214
rust-index: variant of assert_py_eq with normalizer expression
The example given in doc-comment is the main use case: some methods
may require ordering insensitive comparison. This is about to be
used for `reachableroots2`
Raphaël Gomès <rgomes@octobus.net> [Thu, 03 Aug 2023 15:50:14 +0200] rev 51213
rust-index: add support for delta-chain computation
Raphaël Gomès <rgomes@octobus.net> [Thu, 03 Aug 2023 15:01:34 +0200] rev 51212
rust-index: add support for `find_snapshots`
Raphaël Gomès <rgomes@octobus.net> [Thu, 03 Aug 2023 12:05:32 +0200] rev 51211
rust-index: add `is_snapshot` method
Raphaël Gomès <rgomes@octobus.net> [Wed, 02 Aug 2023 16:49:33 +0200] rev 51210
rust-index: use the Rust index in `partialmatch`
Raphaël Gomès <rgomes@octobus.net> [Thu, 03 Aug 2023 14:50:17 +0200] rev 51209
rust-index: add missing special case for null rev
This was an oversight, it was never a problem because we didn't use the index
much for user-facing things in the past, which is the only real way of getting
to this edge case.
Raphaël Gomès <rgomes@octobus.net> [Wed, 02 Aug 2023 16:49:17 +0200] rev 51208
rust-index: use the rust index in `shortest`
Raphaël Gomès <rgomes@octobus.net> [Wed, 02 Aug 2023 14:34:21 +0200] rev 51207
rust-index: add checks that `__contains__` is synchronized
Georges Racinet <georges.racinet@octobus.net> [Mon, 30 Oct 2023 11:03:57 +0100] rev 51206
rust-index: using the Rust index in nodemap updating methods
Raphaël Gomès <rgomes@octobus.net> [Thu, 02 Nov 2023 11:19:54 +0100] rev 51205
rust-index: implementation of __getitem__
Although the removed panic tends to prove if the full test suite
did pass that the case when the input is a node id does not happen,
it is best not to remove it right now.
Raising IndexError is crucial for iteration on the index to stop,
given the default CPython sequence iterator, see for instance
https://github.com/zpoint/CPython-Internals/blobs/master/BasicObject/iter/iter.md
This was spotted by `test-rust-ancestors.py`, which does simple interations on
indexes (as preflight checks).
In `revlog.c`, `index_getitem` defaults to `index_get` when called
on revision numbers, which does raise `IndexError` with the same message as
the one we are introducing here.
Georges Racinet <georges.racinet@octobus.net> [Wed, 27 Sep 2023 11:34:52 +0200] rev 51204
rust-index: optim note for post-scaffolding removal
Raphaël Gomès <rgomes@octobus.net> [Thu, 02 Nov 2023 11:16:13 +0100] rev 51203
rust-index: check that the entry bytes are the same in both indexes
This is a temporary measure to show that both the Rust and C indexes are
kept in sync.
Comes with some related documentation precisions.
For comparison of error cases, see `index_entry_binary()` in `revlog.c`.
Georges Racinet <georges.racinet@octobus.net> [Sat, 30 Sep 2023 16:15:56 +0200] rev 51202
rust-index: return variables systematic naming convention
To help knowing at a glance when a method is ready, making
us more comofortable when we are close to the final removal of
scaffolding, we introduce the systematic variable names `rust_res` and
`c_res`. The goal of this series is to always return the formet.
We take again the case of `pack_header` as example.
Our personal opinion is to usually avoid such poor semantics as `res`, but
usually accept it when it close to the actual return, which will be the
case in most methods of this series. Also, the name can simply be dropped
when we remove the scaffolding. To follow on the example, the body of
`pack_header()` should become this in the final version:
```
let index = self.index(py).borrow();
let packed = index.pack_header(args.get_item(py, 0).extract(py)?);
Ok(PyBytes::new(py, &packed).into_object());
```
in these cases it is close to the actual return and will be removed
at the end entirely.
Georges Racinet <georges.racinet@octobus.net> [Fri, 29 Sep 2023 15:51:49 +0200] rev 51201
rust-index: results comparison helper with details
This is a bit simpler to call and has the advantage of systematically log
the encountered deviation.
To avoid committing dead code, we apply it to the `pack_header` method, that
was already returning the Rust result.
Georges Racinet <georges.racinet@octobus.net> [Wed, 27 Sep 2023 10:59:04 +0200] rev 51200
rust-index: helper for revision not in index not involving nodemap
This is a good match for exceptions raised from the C implementation,
when it is not about a nodemap inconsistency.
Georges Racinet <georges.racinet@octobus.net> [Wed, 18 Oct 2023 19:54:18 +0200] rev 51199
rust-index: renamed nodemap error function for rev not in index
The function name was misleading, as the error wording mentions the
nodemap, hence would not be appropriate for missing revisions not
related to a nodemap lookup.
Raphaël Gomès <rgomes@octobus.net> [Thu, 03 Aug 2023 10:28:10 +0200] rev 51198
rust-index: add `pack_header` support
Raphaël Gomès <rgomes@octobus.net> [Mon, 30 Oct 2023 10:34:48 +0100] rev 51197
rust-index: support cache clearing
I'm not 100% sure how useful it is outside of perf, but it's still worth
implementing.
Raphaël Gomès <rgomes@octobus.net> [Thu, 29 Jun 2023 11:37:19 +0200] rev 51196
rust-index: check rindex and cindex return the same get_rev
This is a temporary safeguard while we synchronize both indexes.
Raphaël Gomès <rgomes@octobus.net> [Wed, 28 Jun 2023 16:43:39 +0200] rev 51195
rust-index: synchronize remove to Rust index
Future steps will bring the two indexes further together until we can
rip the C index entirely when running Rust code.
Raphaël Gomès <rgomes@octobus.net> [Wed, 28 Jun 2023 11:59:43 +0200] rev 51194
rust-index: remove `__setitem__` method from the mixed index
This is not defined on the Python or C one, and isn't used anywhere.
Raphaël Gomès <rgomes@octobus.net> [Wed, 28 Jun 2023 11:36:22 +0200] rev 51193
rust-index: check equality between rust and cindex for `__len__`
Raphaël Gomès <rgomes@octobus.net> [Tue, 27 Jun 2023 18:24:54 +0200] rev 51192
rust-index: synchronize append method
We now append to the Rust index just as we do to the C index. Future steps
will bring the two indexes further together until we can rip the C index
entirely when running Rust code.
Raphaël Gomès <rgomes@octobus.net> [Mon, 18 Sep 2023 17:11:11 +0200] rev 51191
rust-revlog: teach the revlog opening code to read the repo options
This will become necessary as we start writing revlog data from Rust.
Raphaël Gomès <rgomes@octobus.net> [Tue, 27 Jun 2023 17:34:51 +0200] rev 51190
rust-index: pass data down to the Rust index
This will allow us to start keeping the Rust index synchronized with the
cindex as we gradually implement more and more methods in Rust. This will
eventually be removed.
Raphaël Gomès <rgomes@octobus.net> [Tue, 27 Jun 2023 16:32:09 +0200] rev 51189
rust-index: add append method
This is the first time the Rust index has any notion of mutability.
This will be used in a future patch from Python, to start synchronizing the
Rust index and the C index.
Raphaël Gomès <rgomes@octobus.net> [Mon, 26 Jun 2023 19:16:07 +0200] rev 51188
rust-index: add an abstraction to support bytes added at runtimes
In order to support appending data to the Rust index, we need to abstract
data access away from the immutable (on-disk) bytes, to seemlessly fetch
either from the preexisting data or from the newly added data.
Raphaël Gomès <rgomes@octobus.net> [Thu, 29 Jun 2023 16:09:57 +0200] rev 51187
rust-mixed-index: move the mmap keepalive into a function
The same code will be used for keeping the new index mmap around.
Raphaël Gomès <rgomes@octobus.net> [Thu, 29 Jun 2023 15:00:46 +0200] rev 51186
rust-mixed-index: rename variable to make the next change clearer
We're going to add another mmap reference holder, so let's rename this one
first.
Georges Racinet <georges.racinet@octobus.net> [Wed, 27 Sep 2023 10:08:32 +0200] rev 51185
rust: fix cargo doc for hg-cpython
Pierre-Yves David <pierre-yves.david@octobus.net> [Fri, 15 Dec 2023 11:10:24 +0100] rev 51184
branching: merge with default
We merge with the current children of the bad merge (
37b52b938579)
Pierre-Yves David <pierre-yves.david@octobus.net> [Fri, 15 Dec 2023 11:08:41 +0100] rev 51183
branching: merge with stable
This recreates `
37b52b938579` right as a `hg branch --rev
5b186ba40001` screwed
up the content.
Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 07 Dec 2023 02:07:16 +0100] rev 51182
changelog: disallow delayed write on inline changesets
Since this will never happens, we can make the situation invalid and to stop to
handling the associated the case.
This simplify the random access file reading too.
Pierre-Yves David <pierre-yves.david@octobus.net> [Mon, 11 Dec 2023 22:27:59 +0100] rev 51181
changelog: never inline changelog
The test suite mostly use small repositories, that implies that most changelog in the
tests are inlined. As a result, non-inlined changelog are quite poorly tested.
Since non-inline changelog are most common case for serious repositories, this
lack of testing is a significant problem that results in high profile issue like
the one recently fixed by
66417f55ea33 and
849745d7da89.
Inlining the changelog does not bring much to the table, the number of total
file saved is negligible, and the changelog will be read by most operation
anyway.
So this changeset is make it so we never inline the changelog, and de-inline the
one that are still inlined whenever we touch them.
By doing that, we remove the "dual code path" situation for writing new entry to
the changelog and move to a "single code path" situation. Having a single
code path simplify the code and make sure it is covered by test (if test cover
that situation obviously)
This impact all tests that care about the number of file and the exchange size,
but there is nothing too complicated in them just a lot of churn.
The churn is made "worse" by the fact rust will use the persistent nodemap on
any changelog now. Which is overall a win as it means testing the persistent
nodemap more and having less special cases.
In short, having inline changelog is mostly useless and an endless source of
pain. We get rid of it.