Arun Kulshreshtha <akulshreshtha@janestreet.com> [Thu, 04 Jan 2024 14:45:31 -0500] rev 51309
narrow: prevent removal of ACL-defined excludes
Arun Kulshreshtha <akulshreshtha@janestreet.com> [Thu, 04 Jan 2024 14:41:18 -0500] rev 51308
narrow: add test demonstrating bug in acl exclusion enforcement
Anton Shestakov <av6@dwimlabs.net> [Mon, 08 Jan 2024 13:35:02 +0100] rev 51307
contrib: add a set of scripts to run pytype in Docker
Having a simple way for developers to run pytype can massively shorten the
development cycle. Using the same Docker image and scripts that we use on our
CI guarantees that the result achieved locally will be very similar to (if not
the same as) the output of our CI runners.
Things to note: the Dockerfile needs to do a little dance around user
permissions inside /home/ci-runner/ because:
- on one hand, creating new files on the host (e.g. .pyi files inside .pytype/)
should use host user's uid and gid
- on the other hand, when we run the image as uid:gid of host user, it needs to
be able to read/execute files inside the image that are owned by ci-runner
Since local user's uid might be different from ci-runner's uid, we execute this
very broad chmod command inside /home/ci-runner/, but then run the image as the
host user's uid:gid.
There might be a better way to do this.
Anton Shestakov <av6@dwimlabs.net> [Mon, 18 Dec 2023 15:52:17 -0300] rev 51306
pytype: use "$(hg root)" instead of `hg root` to make shellcheck happier
Anton Shestakov <av6@dwimlabs.net> [Mon, 18 Dec 2023 15:40:48 -0300] rev 51305
pytype: update check-pytype.sh to select target automatically
We have python3.11 on CI, so we can run pytype targeting that version. On the
other hand, we don't have python3.7 on CI anymore, so we can't run pytype for
3.7 anymore (interpreter not found). I think it's fine to make pytype select
the appropriate target depending on the version of the interpreter it's running
under.
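The selection logic can be sketched in Python as follows. This is only an illustration: the real `check-pytype.sh` is a shell script, and the helper name `pytype_target` is hypothetical.

```python
import sys


def pytype_target(version_info=sys.version_info):
    """Return a "major.minor" string for the given interpreter version."""
    return "%d.%d" % (version_info[0], version_info[1])


# The resulting string is the kind of value one would hand to pytype as
# its target Python version, instead of hard-coding "3.7" or "3.11".
print(pytype_target())
```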
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 19 Dec 2023 22:54:52 +0100] rev 51304
git-hgext: adjust to the lack of `changelog.heads` method
We don't have a `heads` method returning node IDs, but it is very easy to get
the same result.
This was flagged by pytype.
We can note that the fact this code did not break is probably a good sign that
it is dead code.
However, this question is outside the scope of this series.
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 19 Dec 2023 22:21:31 +0100] rev 51303
remotefilelog: drop dead code
Pytype flagged a bug in this method, which highlighted that the method is
never called anywhere.
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 19 Dec 2023 22:07:59 +0100] rev 51302
pytype: use the right signature for the `__delitem__`
Just because it is NotImplemented does not mean it should use a bad signature.
Fix it to please pytype.
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 19 Dec 2023 22:07:21 +0100] rev 51301
pytype: use the right signature for the `__setitem__`
Just because it is NotImplemented does not mean it should use a bad signature.
Fix it to please pytype.
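A minimal sketch of the idea behind the two signature fixes above, using a hypothetical `ReadOnlyMapping` class (not the class touched by these changesets): unsupported methods still take the parameters the mapping protocol expects.

```python
class ReadOnlyMapping:
    """Hypothetical mapping that refuses mutation but keeps the usual signatures."""

    def __init__(self, data):
        self._data = dict(data)

    def __getitem__(self, key):
        return self._data[key]

    # Same parameters as the standard protocol, even though the methods
    # are deliberately unsupported.
    def __setitem__(self, key, value):
        raise NotImplementedError

    def __delitem__(self, key):
        raise NotImplementedError
```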
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 19 Dec 2023 22:03:34 +0100] rev 51300
sparse: use with statement for wlock
This will avoid pytype complaining about the try/except range.
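As an illustration only, with a `threading.Lock` standing in for the working-copy lock (the real `wlock` API is not shown here), the two styles compare like this:

```python
import threading

wlock = threading.Lock()  # stand-in for a repository working-copy lock


def update_with_try(state):
    # Older style: acquire, then release in a finally clause.
    wlock.acquire()
    try:
        state["updated"] = True
    finally:
        wlock.release()


def update_with_with(state):
    # Equivalent, but the acquire/release pairing is handled by the
    # context manager protocol, so there is no try range to get wrong.
    with wlock:
        state["updated"] = True
```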
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 19 Dec 2023 22:00:47 +0100] rev 51299
remotefilelog: adjust the signature of basepack.createindex
pytype points out that the subclass signature has been updated.
Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 21 Dec 2023 00:19:19 +0100] rev 51298
pytype: add the couple annotations for pytype to understands the lrunode
After losing 2d6 SAN, I eventually understood that pytype was confused by the
method's return type. Pytype is now happy.
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 19 Dec 2023 21:40:06 +0100] rev 51297
pytype: ignore some signature mismatch in registrar
pytype is grumpy about a sub method having a different signature than the one we
use here.
pytype error:
internalmerge: Overriding method signature mismatch [signature-mismatch]
Base signature: 'def _funcregistrarbase._extrasetup(self, name, func) -> Any'.
Subclass signature: 'def internalmerge._extrasetup(self, name, func, mergetype, onfailure = None, precheck = None, binary = False, symlink = False) -> Any'.
Parameter 'mergetype' must have a default value.
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 19 Dec 2023 21:38:46 +0100] rev 51296
hgweb: update _runwsgi try/except range to be valid
The `tmpl` variable is used in the `except` and `finally`, so we need it created
before the `try` is open.
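The pattern can be sketched as below. The function and its arguments are hypothetical, not the actual `_runwsgi` code: the name used by `except` and `finally` is bound before the `try` starts.

```python
def run(make_template, log):
    # Created before the try block, so the except and finally clauses
    # can always refer to it.
    tmpl = make_template()
    try:
        return tmpl["body"].upper()
    except KeyError:
        log.append(("error", tmpl))
        return None
    finally:
        log.append(("done", tmpl))
```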
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 19 Dec 2023 21:36:52 +0100] rev 51295
pytype: add type information for `annotateresult.lines`
This seems to appease a confused pytype.
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 19 Dec 2023 21:34:47 +0100] rev 51294
pytype: ignore attribute error for time.clock
This seems to be a Windows only attribute.
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 19 Dec 2023 21:27:49 +0100] rev 51293
pytype: ignore certifi import error
This is an optional import so we should not complain about it.
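A sketch of an optional import guarded this way. The helper `ca_bundle_path` is hypothetical, and the per-line `pytype: disable` comment is assumed to be how the error is silenced.

```python
try:
    import certifi  # pytype: disable=import-error
except ImportError:
    certifi = None


def ca_bundle_path():
    """Return certifi's CA bundle path, or None when certifi is absent."""
    if certifi is None:
        return None
    return certifi.where()
```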
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 19 Dec 2023 21:26:30 +0100] rev 51292
pytype: ignore some signature mismatch in configitems
pytype is grumpy about the dict.update having a more complex signature than the
one we use here.
pytype error:
itemregister: Overriding method signature mismatch [signature-mismatch]
Base signature: 'def builtins.dict.update(self) -> None'.
Subclass signature: 'def itemregister.update(self, other) -> Any'.
Parameter 'other' must have a default value.
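For illustration, a reduced case of the kind of override that triggers this report. The class below is a hypothetical stand-in, not the real `itemregister`:

```python
class ItemRegister(dict):
    """Hypothetical registry whose update() takes exactly one argument."""

    # dict.update() can be called with no argument at all, so a checker may
    # report the required 'other' parameter as a signature mismatch.
    def update(self, other):  # pytype: disable=signature-mismatch
        for key, value in other.items():
            self[key] = value
```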
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 20 Dec 2023 16:30:32 +0100] rev 51291
pytype: only output the "pytype crashed" message on error
If pytype did not crash while generating stubs, that message is kind of
confusing. It seems simple enough to avoid it in this case.
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 20 Dec 2023 22:17:03 +0100] rev 51290
pytype: drop the now useless assert
As the imported types are now used by type annotations, these ugly asserts are
no longer needed.
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 20 Dec 2023 16:39:03 +0100] rev 51289
pytype: drop the last inline type comment
We can't assign type to the "for" variant on the fly, so we type the variable
and method used instead.
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 20 Dec 2023 11:23:09 +0100] rev 51288
pytype: convert type comment for inline variable too
Same logic as for the previous changeset, but for "type comment" annotating
variables, not function/method.
As for the previous changeset, we had to adjust some of the types to actually match what was happening.
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 19 Dec 2023 21:29:34 +0100] rev 51287
pytype: move some type comment to proper annotation
We support direct type annotations now, while pytype is starting to complain
about them.
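A before/after sketch of such a conversion, on a made-up function (the names are illustrative, not taken from the changeset):

```python
from typing import Optional


# Before: the types live in a comment that only type checkers read.
def shortnode_comment(node, length=12):
    # type: (bytes, Optional[int]) -> bytes
    return node[:length]


# After: the same information as a proper annotation.
def shortnode_annotated(node: bytes, length: Optional[int] = 12) -> bytes:
    return node[:length]
```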
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 20 Dec 2023 20:13:22 +0100] rev 51286
lock: properly convert error to bytes
Flagged by pytype when a later changeset is applied moving typing comment to annotation.
We fix this ahead of the annotation change to make sure pytype remains happy
after the change.
We have to do a fairly crazy dance for pytype to be happy. This probably comes
from the fact that IOError.filename claims to be `str` while it is actually
`bytes` if the filename raising that `IOError` is bytes.
At the same time, `IOError.strerror` is consistently `str` and should be passed
as `str` everywhere.
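The situation can be reproduced in a few lines. This sketch uses `os.fsencode` as a stand-in conversion; the helper name and the choice of encoding function are assumptions, not what the changeset does.

```python
import errno
import os


def describe_ioerror(err):
    """Return (filename as bytes or None, strerror as str) for an OSError."""
    filename = err.filename
    if isinstance(filename, str):
        # Normalise a str filename to bytes using the filesystem encoding.
        filename = os.fsencode(filename)
    return filename, err.strerror


# An error raised for a bytes path carries a bytes filename at runtime.
err = OSError(errno.ENOENT, "No such file or directory", b"foo")
print(describe_ioerror(err))
```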
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 20 Dec 2023 12:51:20 +0100] rev 51285
pytype: import typing directly
First, we no longer need the pycompat layer. Second, having the types imported
in all cases will allow us to use them more directly in type annotations,
something important for upgrading the old "type comments" to proper type
annotations.
A lot of stupid asserts are needed to keep pyflakes happy. We should be able to
remove most of them once the type comments have been upgraded.
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 08 Nov 2023 01:58:16 +0100] rev 51284
usage: configure uncompressed chunk cache through resource configuration
Let's use this new concept for what it is meant for.
This provides a sizable speedup for reading multiple revisions for some complex
repositories.
### data-env-vars.name = pypy-2018-08-01-zstd-sparse-revlog
# benchmark.name = hg.perf.read-revisions
# benchmark.variants.order = reverse
memory-medium: 1.892400
memory-high: 1.722934 (-8.61%)
# benchmark.variants.order = default
memory-medium: 1.751542
memory-high: 1.589340 (-9.49%)
Pierre-Yves David <pierre-yves.david@octobus.net> [Mon, 09 Oct 2023 15:12:16 +0200] rev 51283
usage: add configuration option to adjust resources usage
They currently do nothing, but this opens the way to actually using them.
Pierre-Yves David <pierre-yves.david@octobus.net> [Mon, 09 Oct 2023 15:06:21 +0200] rev 51282
usage: add a `usage.repository-role` config
This config will be used for behavior and performance adjustments depending on
the repository role.
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 13 Dec 2023 13:46:28 +0100] rev 51281
common-pattern: cover "elapsed time" line
These are perfect targets for the common-pattern matching.
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 27 Dec 2023 18:02:26 +0100] rev 51280
bundle: do not detect --base argument that match nothing as lack of argument
With the previous version of the code, if --base did not match anything, it was
handled as if no --base had been provided and fell back to using discovery
with the default path. This has two issues:
- The resulting bundle won't match what the user requested,
- if no default path is configured, it will crash.
We now properly distinguish between the two cases, and if the --base query does
not find any changeset, we assume that everything under --rev needs to be sent.
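The distinction boils down to "argument absent" versus "argument present but matching nothing". A sketch with hypothetical helper and callback names, not the actual `hg bundle` code:

```python
def bundle_common_revs(base, resolve, discover):
    """Pick the 'common' set for a bundle (hypothetical helper).

    base is None when --base was not given, or a query string otherwise.
    """
    if base is None:
        # No --base at all: fall back to discovery against the default path.
        return discover()
    # --base was given: honour it even when it matches nothing, in which
    # case everything under --rev has to be sent.
    return resolve(base)
```

Testing `if not resolve(base)` instead of `if base is None` would conflate the two cases, which is the misbehavior described above.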
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 27 Dec 2023 18:42:13 +0100] rev 51279
bundle: highlight misbehavior when --base does not match any revision
See next changeset for fix and details.
Pierre-Yves David <pierre-yves.david@octobus.net> [Sun, 24 Dec 2023 02:43:53 +0100] rev 51278
branching: merge with stable
I need the fix to `generate-churning-bundle.py`.
Pierre-Yves David <pierre-yves.david@octobus.net> [Sat, 18 Nov 2023 00:16:15 +0100] rev 51277
generate-churning-bundle: fix script for python3
This script has apparently not run for a long time.
Martin von Zweigbergk <martinvonz@google.com> [Sat, 16 Dec 2023 10:48:20 -0800] rev 51276
narrow: strip trailing `/` from manifest dir before matching it
Commit 17a822d7943e broke some of our internal tests at Google because the `dir`
variable contains a trailing slash since that commit. Let's restore the old
behavior by stripping that trailing slash.
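A toy version of the normalisation, assuming the matcher compares against directory paths stored without a trailing slash (the function and its arguments are hypothetical):

```python
def visit_dir(included_dirs, manifest_dir):
    """Return True when manifest_dir is one of the included directories.

    included_dirs holds paths without a trailing slash (an assumption of
    this sketch); manifest_dir may arrive as b"foo/bar/".
    """
    return manifest_dir.rstrip(b"/") in included_dirs
```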
Martin von Zweigbergk <martinvonz@google.com> [Mon, 18 Dec 2023 10:13:41 -0800] rev 51275
tests: demonstrate error when narrowing with `rootfilesin:` pattern
This demonstrates a bug introduced in 17a822d7943e.
Martin von Zweigbergk <martinvonz@google.com> [Mon, 18 Dec 2023 14:51:20 -0800] rev 51274
matchers: use correct method for finding index in vector
The path matcher has an optimization for when all paths are `rootfilesin:`. This
optimization exists in both Python and Rust. However, the Rust implementation
currently has a bug that makes it fail in most cases. The bug is that it uses
`rfind()` where it was clearly intended to use `rposition()`. This patch fixes
that and adds a test.
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 12 Dec 2023 17:08:45 +0100] rev 51273
dirstate: make the `transaction` argument of `setbranch` mandatory
This is deprecated since 6.4. We should drop it now.
Raphaël Gomès <rgomes@octobus.net> [Wed, 20 Dec 2023 14:59:31 +0100] rev 51272
rust-clippy: apply some more trivial fixes
All of these were hinted at by clippy and make the code simpler.
Raphaël Gomès <rgomes@octobus.net> [Wed, 20 Dec 2023 14:58:36 +0100] rev 51271
rust-clippy: simplify `match` to `if let`
This was hinted at by clippy, and makes it more obvious that nothing is
happening in the `None` case.
Pierre-Yves David <pierre-yves.david@octobus.net> [Fri, 01 Dec 2023 22:56:08 +0100] rev 51270
censor: accept multiple revision in a single call
This is useful when dealing with corruption, as all the corrupted revisions can
be dealt with in one go.
Pierre-Yves David <pierre-yves.david@octobus.net> [Fri, 01 Dec 2023 22:46:46 +0100] rev 51269
censor: be more verbose about the other steps too
If we inform the user about head checking, we should tell them when the other
operations happen too. Otherwise the user may think we are still in the head
checking part.
Pierre-Yves David <pierre-yves.david@octobus.net> [Fri, 01 Dec 2023 22:44:33 +0100] rev 51268
censor: add a command flag to skip the head checks
In some cases, spending hours checking the heads in order to censor a simple
file is not good behavior, especially when censor is used to remove corrupted
content.
Pierre-Yves David <pierre-yves.david@octobus.net> [Fri, 01 Dec 2023 22:33:35 +0100] rev 51267
censor: inform the user that we are spending time checking heads
The time this can consume can be a surprise to the user, let's be explicit about
it.
Pierre-Yves David <pierre-yves.david@octobus.net> [Fri, 01 Dec 2023 22:25:52 +0100] rev 51266
censor: mention that we check the heads in the help
And add a message that will explain the possibly long time spent doing this.
Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 21 Dec 2023 01:45:43 +0100] rev 51265
persistent-nodemap: respect the mmap setting when refreshing data
After writing updated data, we reload the in-memory data. However, that logic
was… wrong. We were doing a file read when mmap was requested, and when the
configuration requested not to use mmap… we were using it.
This should now be fine.
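The intended behavior, sketched with a hypothetical `load_data` helper (not the nodemap code itself): map the file only when the setting asks for it, and read it otherwise.

```python
import mmap


def load_data(path, use_mmap):
    """Return the file content, memory-mapped only when use_mmap is true.

    Note: mapping an empty file raises ValueError; this sketch does not
    handle that case.
    """
    with open(path, "rb") as fp:
        if use_mmap:
            # The mapping stays valid after the file object is closed.
            return mmap.mmap(fp.fileno(), 0, access=mmap.ACCESS_READ)
        return fp.read()
```

The bug described above amounts to having the two branches of that `if` swapped.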
Raphaël Gomès <rgomes@octobus.net> [Thu, 14 Dec 2023 09:57:25 +0100] rev 51264
rust-index: only access offsets if revlog is inline
Accessing the `RwLock` ended up showing up in profiles even with no contention.
Offsets only exist for inline revlogs, so gate everything behind an inline
check.
Raphaël Gomès <rgomes@octobus.net> [Wed, 06 Dec 2023 11:04:18 +0100] rev 51263
rust-index: cache the head nodeids python list
Same optimization as before, but for the nodeids this time.
Raphaël Gomès <rgomes@octobus.net> [Tue, 05 Dec 2023 14:50:05 +0100] rev 51262
rust-index: add fast-path for getting a list of all heads as nodes
This avoids a lot of back-and-forth between Python and Rust. We forgo adding
a fast-path in the `filteredchangelog` case for now. If it shows up in profiling,
we might add the variant with a filter.
Raphaël Gomès <rgomes@octobus.net> [Wed, 29 Nov 2023 23:22:51 -0500] rev 51261
rust-index-cpython: cache the heads' PyList representation
This is the same optimization that the C index does, we just have more
separation of the Python and native sides.
Raphaël Gomès <rgomes@octobus.net> [Wed, 29 Nov 2023 15:58:24 -0500] rev 51260
rust-index: use a `BitVec` instead of plain `Vec` for heads computation
The `Vec` method uses one byte per revision, this uses 1 per 8 revisions,
which improves our memory footprint. For large graphs (10+ millions), this
can make a measurable difference server-side.
I have seen no measurable impact on execution speed.
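To make the memory claim concrete, here is a small Python bit set (an illustration of the storage layout only, not the Rust `BitVec` type) and the resulting sizes for 10 million revisions:

```python
class BitSet:
    """Fixed-size bit set storing eight flags per byte."""

    def __init__(self, size):
        self._bits = bytearray((size + 7) // 8)

    def set(self, index):
        self._bits[index >> 3] |= 1 << (index & 7)

    def get(self, index):
        return bool(self._bits[index >> 3] & (1 << (index & 7)))

    def nbytes(self):
        return len(self._bits)


revisions = 10_000_000
one_byte_per_rev = revisions                  # one-byte flag per revision
one_bit_per_rev = BitSet(revisions).nbytes()  # eight revisions per byte
print(one_byte_per_rev, one_bit_per_rev)
```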
Raphaël Gomès <rgomes@octobus.net> [Wed, 29 Nov 2023 10:04:41 -0500] rev 51259
rust-index: implement faster retain heads using a vec instead of a hashset
This is the same optimization that the C index does, we're only catching up
now because this showed up as slow in benchmarking.
Raphaël Gomès <rgomes@octobus.net> [Thu, 14 Dec 2023 11:52:05 +0100] rev 51258
rust-index: allow inlining VCSGraph parents across crates
Raphaël Gomès <rgomes@octobus.net> [Thu, 23 Nov 2023 18:48:07 +0100] rev 51257
rust-index: allow inlining `parents` across crates
Raphaël Gomès <rgomes@octobus.net> [Thu, 23 Nov 2023 18:47:42 +0100] rev 51256
rust-index: allow inlining `check_revision` across crates
Raphaël Gomès <rgomes@octobus.net> [Thu, 23 Nov 2023 03:41:58 +0100] rev 51255
rust-index: document safety invariants being upheld for every `unsafe` block
We've added a lot of `unsafe` code that shares Rust structs with Python.
While this is unfortunate, it is also unavoidable, so let's at least
systematically explain why each call to `unsafe` is sound.
If any of the unsafe code ends up being wrong (because everyone screws up
at some point), this change at least continues the unspoken rule of always
explaining the need for `unsafe`, so we at least get a chance to think.
Georges Racinet on incendie.racinet.fr <georges@racinet.fr> [Sun, 29 Oct 2023 12:18:03 +0100] rev 51254
rust-index: renamed `MixedIndex` as `Index`
It is simply not mixed any more, hence the name had become a
future source of confusion.
Georges Racinet <georges.racinet@octobus.net> [Sun, 29 Oct 2023 23:54:05 +0100] rev 51253
rust-index: stop instantiating a C Index
The only missing piece was the `cache` to be returned from
`revlog.parse_index_v1_mixed`, and it really seems that it is
essentially repetition of the input, if `inline` is `True`.
Not worth a Rust implementation (C implementation is probably there
for historical reasons).
Georges Racinet <georges.racinet@octobus.net> [Mon, 30 Oct 2023 21:28:30 +0100] rev 51252
rust-revlog: using the ad-hoc `NodeTree` in scmutil
Now that we have an independent `NodeTree` class able to work natively
on the pure Rust index, we use it in `mercurial.scmutil`, with automatic
invalidation after mutation of the index.
This code path is tested by `test-revisions.t` and `test-template-functions.t`
Georges Racinet <georges.racinet@octobus.net> [Mon, 30 Oct 2023 22:36:30 +0100] rev 51251
rust-revlog: add invalidation detection to `NodeTree` class
This will be useful for callers, such as `scmutil` who reuse a
`NodeTree` instance as a cache. They would otherwise get hard
errors if any mutation of the index occurred since instantiation.
This is something the C index does not provide.
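One common way to detect this kind of staleness is a mutation counter. The sketch below is an assumption about the general technique, with hypothetical class names; it does not describe how the Rust `NodeTree` actually does it.

```python
class Index:
    """Toy index that counts its mutations (hypothetical)."""

    def __init__(self):
        self.nodes = []
        self.generation = 0

    def append(self, node):
        self.nodes.append(node)
        self.generation += 1


class CachedPrefixLookup:
    """Cache tied to an index; callers can check whether it is still valid."""

    def __init__(self, index):
        self._index = index
        self._generation = index.generation

    def is_valid(self):
        return self._generation == self._index.generation

    def count_matching(self, prefix):
        if not self.is_valid():
            raise RuntimeError("index mutated since the cache was created")
        return sum(1 for n in self._index.nodes if n.startswith(prefix))
```

A caller that reuses the cache checks `is_valid()` first and rebuilds the cache when the index has changed.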
Raphaël Gomès <rgomes@octobus.net> [Thu, 02 Nov 2023 15:50:13 +0100] rev 51250
rust-index: add support for `del index[r]`
Only the `del index[r:]` syntax was supported, but the comment said otherwise.
It's not actually used in core code, but the C index supports it.
Georges Racinet <georges.racinet@octobus.net> [Mon, 30 Oct 2023 21:26:17 +0100] rev 51249
rust-revlog: bare minimal NodeTree exposition
The independent `NodeTree` instances need to be associated to an
index (for forward-checks of candidates) but do not need to
encompass all revisions from that index.
This is exactly how it is used in `scmutil.shortesthexnodeidprefix`
and we restrict the implementation to the bare minimum needed there
and to write convincing tests.
It would of course be fairly trivial to add more.
Georges Racinet <georges.racinet@octobus.net> [Mon, 30 Oct 2023 21:25:28 +0100] rev 51248
rust-index: a property to identify the Rust index as such
Will be useful soon in `mercurial.scmutil` and potentially elsewhere
Georges Racinet <georges.racinet@octobus.net> [Mon, 30 Oct 2023 15:32:33 +0100] rev 51247
rust-cpython-revlog: renamed NodeTree import as CoreNodeTree
We're about to introduce a `NodeTree` Python class (hence also
a Rust struct) and it would be a collision with the import
Georges Racinet <georges.racinet@octobus.net> [Fri, 20 Oct 2023 09:48:53 +0200] rev 51246
rust-index: stop using C index
We still keep its wrapper implementation in `hg-cpython::cindex`,
because we might want to recreate ancestors handling objects using
it for the case of REVLOGV2.
Also, we still instantiate it (from Python code) and store it as
attribute, for the likes of `get_cindex` and the caller that
relies on it, but that is soon to be removed, too.
Georges Racinet <georges.racinet@octobus.net> [Sun, 29 Oct 2023 12:07:05 +0100] rev 51245
rust-index: using `hg::index::Index` in discovery
At this point the C index is not used any more: we had to
remove `pyindex_to_graph()` to avoid the dead code warning.
Georges Racinet on incendie.racinet.fr <georges@racinet.fr> [Sun, 29 Oct 2023 12:01:57 +0100] rev 51244
rust-python-testing: separated base test classes
This will allow, e.g., to change `test-rust-discovery.py` simply
by adding the appropriate base class.
Georges Racinet <georges.racinet@octobus.net> [Sun, 29 Oct 2023 11:21:18 +0100] rev 51243
rust-discovery: encapsulated conversions to vec for instance methods
This new `pyiter_to_vec` is pretty trivial, and only mildly reduces
code duplication. The main advantage is that it encapsulates access
to the `index` attribute, which will be changed when we replace the
C index by the Rust index, given as `PySharedRef`.
Georges Racinet <georges.racinet@octobus.net> [Sun, 29 Oct 2023 11:10:09 +0100] rev 51242
rust-discovery: moving most of hg-cpython methods to regular code blocks
The chosen methods are those with conversion of an incoming Python iterable,
as they will be changed the most when we remove the C index, and
`takefullsample` for consistency with `takequicksample`.
Georges Racinet <georges.racinet@octobus.net> [Sun, 29 Oct 2023 10:47:54 +0100] rev 51241
rust-index: using `hg::index::Index` in `hg-cpython::dagops`
Hooking `headrevs` to the Rust index is straightforward as long as
we go the `PySharedRef` way. Direct attempts at obtaining a reference
to the inner `hg::index::Index` fail for lifetime reasons: the reference
is bound to the GIL, yet the `as_set` local variable is considered to
be static (the borrow checker clearly does not realize or care that this
set only stores `Revision` values).
In `rank()`, the chosen solution is the simplest as far as `hg-cpython` is
concerned, but it has the defect of removing an implementation
that would be easily adaptable if the core index did implement `RankedGraph`
(returning the same error as long as only `REVLOGV1` is supported), but that
would introduce a direct dependency of `hg-core` on the `vcsgraph` crate.
Georges Racinet on incendie.racinet.fr <georges@racinet.fr> [Sat, 28 Oct 2023 22:50:10 +0200] rev 51240
rust-index: using `hg::index::Index` in MissingAncestors
With this, the whole `hg-cpython::ancestors` module can now work without
the C index.
Georges Racinet <georges.racinet@octobus.net> [Fri, 27 Oct 2023 22:11:05 +0200] rev 51239
rust-index: using the `hg::index::Index` in ancestors iterator and lazy set
Since there is no Rust implementation for REVLOGV2/CHANGELOGv2, we declare
them to be incompatible with Rust, hence indexes in these formats will use
the implementations from Python `mercurial.ancestor`. If this is an unacceptable
performance hit for current users of these formats, we can later on add Rust
implementations based on the C index for them or implement these formats for
the Rust indexes.
Among the challenges that we had to meet, we wanted to avoid taking the GIL each
time the inner (vcsgraph) iterator has to call the parents function. This would probably
still be acceptable in terms of performance with `AncestorsIterator`, but not with
`LazyAncestors` nor for the upcoming change in `MissingAncestors`.
Hence we enclose the reference to the index in a `PySharedRef`, leading to more
rigorous checking of mutations, which does pass now that there are no logically immutable
methods of `hg::index::Index` that take a mutable reference as input.
Georges Racinet on incendie.racinet.fr <georges@racinet.fr> [Fri, 27 Oct 2023 23:29:29 +0200] rev 51238
revlog: always use a Rust index for REVLOGv1 if rustext is present
We are about to change classes such as `rustext.AncestorsIterator` to
take a Rust index, hence we cannot have the option not to use the Rust
index.
Note: this can be refined depending on whether we want to keep this
option or not. We will have to make two versions of `AncestorsIterator`
and its sibling to support REVLOGV2 and CHANGELOGv2 anyway.
Meanwhile, this is the simplest change to make the tests pass.
Georges Racinet on incendie.racinet.fr <georges@racinet.fr> [Sun, 29 Oct 2023 18:35:32 +0100] rev 51237
rust-index: disabling flagprocessor tests
The list of flags supported by the Rust index is not dynamic, hence
flagprocessor has no chance to work.
Raphaël Gomès <rgomes@octobus.net> [Tue, 31 Oct 2023 17:58:56 +0100] rev 51236
rust-index: support `unionrepo`'s compressed length hack
Explanations inline.
Georges Racinet on incendie.racinet.fr <georges@racinet.fr> [Fri, 27 Oct 2023 23:21:50 +0200] rev 51235
rust-index: honour incoming using_general_delta in `deltachain`
It looks to be a leftover from some past, but the C index considers
only the value passed from Python whereas up to now the Rust index
was using the value of its attribute.
As a middle ground, we make this argument of `deltachain` optional from
the Python side, with the Rust implementation only defaulting to its
attribute. This way, we reduce false leads when a difference in results
is spotted.
Georges Racinet on incendie.racinet.fr <georges@racinet.fr> [Fri, 27 Oct 2023 21:48:45 +0200] rev 51234
rust-index: use interior mutability in head revs and caches
For upcoming changes in `hg-cpython` switching to the `hg-core` index in
ancestors iterators, we will need to avoid excessive mutability, restricting
the use of mutable references on `hg::index::Index` to methods that actually
logically mutate it, whereas the maintenance of caches such as `head_revs`
clearly does not. We illustrate that immediately by switching to immutable
borrows in the corresponding methods of `hg-cpython::MixedIndex`
Raphaël Gomès <rgomes@octobus.net> [Thu, 26 Oct 2023 15:26:19 +0200] rev 51233
rust-index: add Sync bound to all relevant mmap-derived values
All readonly mmaps are Sync as far as Rust is concerned. Integrity of the
mmap'ed file is a concern separate to Rust's memory model, since it requires
out-of-program handling via locks, etc.
This will help when we start sharing the Rust Index with Python.
Raphaël Gomès <rgomes@octobus.net> [Tue, 31 Oct 2023 18:09:43 +0100] rev 51232
debugindexstats: handle the lack of Rust support better
We don't have any stats in the Rust index. Currently it is not known which
stats would be interesting to get, so if they end up being important, we can
add them later.
Raphaël Gomès <rgomes@octobus.net> [Tue, 31 Oct 2023 17:36:59 +0100] rev 51231
rust-python-index: don't panic on a corrupted index when calling from Python
This makes `test-verify.t` pass again. In an ideal world, we would find
the exact commit where this test breaks and amend part of this change there,
but this is a long enough series.
Raphaël Gomès <rgomes@octobus.net> [Tue, 31 Oct 2023 17:34:31 +0100] rev 51230
tests: ignore test-storage when using Rust
This is only relevant for Python code and the SQLite backend, which is in a
half-abandoned state.
Georges Racinet <georges.racinet@octobus.net> [Fri, 20 Oct 2023 09:12:22 +0200] rev 51229
rust-index: optimize find_gca_candidates() on less than 8 revisions
This is expected to be by far the most common case, given that, e.g.,
merging involves using it on two revisions.
Using a `u8` as support for the bitset obviously divides the amount of
RAM needed by 8. To state the obvious, on a repository with 10 million
changesets, this spares 70MB. It is also possible that it'd be slightly
faster, because it is easier to allocate and provides better cache locality.
It is possible that some exhaustive listing of the traits implemented by
`u8` and `u64` would avoid the added duplication, but that can be done later
and would need a replacement for the `MAX` consts.
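The 70MB figure follows from simple arithmetic, assuming one bit set per changeset in the worst case. The seven-input capacity for `u8` is inferred from one bit being reserved for poisoning, as with the `u64` variant.

```python
changesets = 10_000_000
bytes_per_u64 = 8
bytes_per_u8 = 1

u64_total = changesets * bytes_per_u64   # 80_000_000 bytes
u8_total = changesets * bytes_per_u8     # 10_000_000 bytes
saved_mb = (u64_total - u8_total) / 1_000_000
print(saved_mb)  # 70.0

# With one bit reserved for poisoning, a u8 can track 8 - 1 = 7 inputs,
# hence "less than 8 revisions".
max_inputs_u8 = 8 - 1
```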
Georges Racinet <georges.racinet@octobus.net> [Fri, 20 Oct 2023 08:54:49 +0200] rev 51228
rust-index: simplification in find_gca_candidates()
`parent_seen` can be made a mutable ref, making this part more obvious,
not needing to be commented so much.
The micro-optimization of avoiding the union if `parent_seen` and
`current_seen` agree is pushed down in the `union()` method of the
fast, `u64` based bit set implementation (in case it matters).
Georges Racinet <georges.racinet@octobus.net> [Fri, 20 Oct 2023 08:43:00 +0200] rev 51227
rust-index: avoid double negation in find_gca_candidates()
Georges Racinet <georges.racinet@octobus.net> [Fri, 20 Oct 2023 08:17:00 +0200] rev 51226
rust-index: avoid some cloning in find_gca_candidates()
Instead of keeping the information whether the current revision is
poisoned on `current_seen`, we extract it as a boolean.
This also allows us to simplify the explanation of `seen[r].is_poisoned()`,
as the exceptional case where it is poisoned right after `r` has been
determined to be a solution no longer exists.
Georges Racinet <georges.racinet@octobus.net> [Wed, 18 Oct 2023 15:35:38 +0200] rev 51225
rust-index: implement common_ancestors_heads() and ancestors()
The only differences between `common_ancestors_heads()` and
`find_gca_candidates()` seem to be that:
- the former accepts "overlapping" input revisions (meaning with duplicates).
- limitation to 24 inputs (in the C code), that we translate to using the
arbitrary size bit sets in the Rust code because we cannot bail to Python.
Given that the input is expected to be small in most cases, we take the
heavy handed approach of going through a HashSet and wait for performance
assessment.
In case this is used via `hg-cpython`, we can anyway absorb the overhead
by having `commonancestorheads` build a vector of unique values
directly, and introduce a thin wrapper over `find_gca_candidates`, to take
care of bit set type dispatching only.
As far as `ancestors` is concerned, this is just chaining
`common_ancestors_heads()` with `find_deepest_revs`.
Georges Racinet on incendie.racinet.fr <georges@racinet.fr> [Tue, 17 Oct 2023 22:42:40 +0200] rev 51224
rust-index: find_gca_candidates bit sets genericization
This allows inputs of arbitrary size in `find_gca_candidates()`.
We're genericizing so that the common case of up to 63 inputs can be
treated with the efficient implementation backed by `u64`.
Some complications with the borrow checker came, because arbitrary sized
bit sets will not be `Copy`, hence mutating them keeps a mut ref on the `seen`
vector. This is solved by some cloning, most of which can be avoided,
preferably in a follow-up after proof that this works (hence after exposition
to Python layer).
As far as performance is concerned, calling `clone()` on a `Copy` object
(good case when number of revs is less than 64) should end up just doing a
copy, according to this excerpt of the `Clone` trait documentation:
Types that are Copy should have a trivial implementation of Clone.
More formally: if T: Copy, x: T, and y: &T, then let x = y.clone();
is equivalent to let x = *y;.
Manual implementations should be careful to uphold this invariant;
however, unsafe code must not rely on it to ensure memory safety.
We kept the general structure, hence why there are some double negations.
This also could be made nicer in a follow-up.
The `NonStaticPoisonableBitSet` is included to ensure that the
`PoisonableBitSet` trait is general enough (had to correct `vec_of_empty()` for
instance). Moving the genericization one level to encompass the `seen`
vector and not its elements would be better for performance, if worth it.
Raphaël Gomès <rgomes@octobus.net> [Thu, 02 Nov 2023 11:45:20 +0100] rev 51223
rust-index: core impl for find_gca_candidates and find_deepest
This still follows the C original closely and is not able to treat more than 63
input revisions (bitset backed by `u64` and one bit reserved for poisoning).
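For readers unfamiliar with the technique, here is a simplified Python sketch of a bitset-with-poison-bit search for common ancestor candidates. It is written from the general idea and is neither the Rust nor the C code. Input revisions are assumed to be distinct and the list non-empty. Python integers are unbounded, so the 63-input limit does not show up here.

```python
def find_gca_candidates(parents, revs):
    """Return candidate greatest common ancestors of the distinct revs.

    parents[r] is a pair of parent revisions, -1 meaning "no parent".
    """
    revcount = len(revs)
    all_seen = (1 << revcount) - 1   # every input reaches this revision
    poison = 1 << revcount           # extra bit: ancestor of a found candidate
    maxrev = max(revs)
    seen = [0] * (maxrev + 1)
    for i, rev in enumerate(revs):
        seen[rev] = 1 << i

    candidates = []
    interesting = revcount           # unpoisoned revisions still to visit
    for v in range(maxrev, -1, -1):
        if not interesting:
            break
        sv = seen[v]
        if sv == 0:
            continue
        if sv < poison:
            interesting -= 1
            if sv == all_seen:
                candidates.append(v)
                sv |= poison
                if v in revs:
                    break            # an input is itself the answer
        for p in parents[v]:
            if p == -1:
                continue
            sp = seen[p]
            if sv < poison:
                if sp == 0:
                    interesting += 1
                seen[p] = sp | sv
            else:
                if sp and sp < poison:
                    interesting -= 1
                seen[p] = sv         # poison propagates to ancestors
    return candidates
```

With a fixed-width integer, one bit per input plus the poison bit is what caps a `u64` at 63 inputs.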
Raphaël Gomès <rgomes@octobus.net> [Mon, 30 Oct 2023 11:57:36 +0100] rev 51222
rust-index: add support for `reachableroots2`
Exposition in `hg-cpython` done in regular impl block, again
for rustfmt support etc.
Georges Racinet <georges.racinet@octobus.net> [Thu, 02 Nov 2023 12:17:06 +0100] rev 51221
hg-cpython: rev_pyiter_collect_or_else
It will be useful to give callers control over the generated errors.
Raphaël Gomès <rgomes@octobus.net> [Mon, 30 Oct 2023 11:54:42 +0100] rev 51220
rust-index: add support for `computephasesmapsets`
Exposition in `hg-cpython` done in the regular `impl` block to enjoy
rustfmt and clearer compilation errors.
Georges Racinet <georges.racinet@octobus.net> [Sat, 30 Sep 2023 15:59:03 +0200] rev 51219
rust-index: slicechunktodensity returns Rust result
Ready for removal of the scaffolding.
This time, we allow ourselves a minor optimization: we avoid
allocating for each chunk. Instead, we reuse the same vector,
and perform at most one allocation per chunk.
The `PyList` constructor will copy the buffer anyway.
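The buffer reuse described above can be sketched as follows, assuming a
consumer that copies the slice it is handed (as the message says the `PyList`
constructor does). The function name and the representation of chunks as
ranges of `i32` are hypothetical, chosen to keep the sketch self-contained.

```rust
use std::ops::Range;

/// Hypothetical sketch: hand each chunk to `consume` through one reused
/// buffer instead of building a fresh `Vec` per chunk.
pub fn for_each_chunk_reusing_buffer<F>(chunks: &[Range<i32>], mut consume: F)
where
    F: FnMut(&[i32]),
{
    let mut buf: Vec<i32> = Vec::new();
    for chunk in chunks {
        // `clear()` drops the elements but keeps the allocated capacity,
        // so the buffer only reallocates when a chunk outgrows it.
        buf.clear();
        buf.extend(chunk.clone());
        // The consumer is expected to copy what it needs from the slice.
        consume(&buf);
    }
}
```

This is only safe because the consumer does not keep a reference to the
buffer between iterations.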
Raphaël Gomès <rgomes@octobus.net> [Thu, 02 Nov 2023 11:40:23 +0100] rev 51218
rust-index: add support for `_slicechunktodensity`
Georges Racinet <georges.racinet@octobus.net> [Fri, 29 Sep 2023 20:51:49 +0200] rev 51217
rust-index: headrevsfiltered() returning Rust result
Raphaël Gomès <rgomes@octobus.net> [Mon, 30 Oct 2023 11:14:25 +0100] rev 51216
rust-index: add support for `headrevsfiltered`
The implementation is merged with that of `headrevs` also to make sure that
caches are up to date.
Raphaël Gomès <rgomes@octobus.net> [Tue, 19 Sep 2023 15:21:43 +0200] rev 51215
rust-index: implement headrevs
Georges Racinet <georges.racinet@octobus.net> [Sat, 30 Sep 2023 16:52:40 +0200] rev 51214
rust-index: variant of assert_py_eq with normalizer expression
The example given in the doc-comment is the main use case: some methods
may require order-insensitive comparison. This is about to be
used for `reachableroots2`.
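The idea of comparing through a normalizer can be sketched in plain Rust as
below. This is not the actual `assert_py_eq` variant, which works on Python
objects; `eq_after_normalizing` and `sorted` are hypothetical names used for
illustration.

```rust
/// Hypothetical helper: compare two values after applying the same
/// normalizer to both of them.
pub fn eq_after_normalizing<T, N, F>(left: T, right: T, normalize: F) -> bool
where
    F: Fn(T) -> N,
    N: PartialEq,
{
    normalize(left) == normalize(right)
}

/// Example normalizer making the comparison insensitive to ordering.
pub fn sorted(mut v: Vec<i32>) -> Vec<i32> {
    v.sort_unstable();
    v
}
```

With `sorted` as the normalizer, `vec![3, 1, 2]` and `vec![2, 3, 1]` compare
equal, while lists with different elements still differ.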
Raphaël Gomès <rgomes@octobus.net> [Thu, 03 Aug 2023 15:50:14 +0200] rev 51213
rust-index: add support for delta-chain computation
Raphaël Gomès <rgomes@octobus.net> [Thu, 03 Aug 2023 15:01:34 +0200] rev 51212
rust-index: add support for `find_snapshots`
Raphaël Gomès <rgomes@octobus.net> [Thu, 03 Aug 2023 12:05:32 +0200] rev 51211
rust-index: add `is_snapshot` method
Raphaël Gomès <rgomes@octobus.net> [Wed, 02 Aug 2023 16:49:33 +0200] rev 51210
rust-index: use the Rust index in `partialmatch`
Raphaël Gomès <rgomes@octobus.net> [Thu, 03 Aug 2023 14:50:17 +0200] rev 51209
rust-index: add missing special case for null rev
This was an oversight. It was never a problem because we didn't use the index
much for user-facing things in the past, which is the only real way of getting
to this edge case.
Raphaël Gomès <rgomes@octobus.net> [Wed, 02 Aug 2023 16:49:17 +0200] rev 51208
rust-index: use the rust index in `shortest`
Raphaël Gomès <rgomes@octobus.net> [Wed, 02 Aug 2023 14:34:21 +0200] rev 51207
rust-index: add checks that `__contains__` is synchronized
Georges Racinet <georges.racinet@octobus.net> [Mon, 30 Oct 2023 11:03:57 +0100] rev 51206
rust-index: using the Rust index in nodemap updating methods
Raphaël Gomès <rgomes@octobus.net> [Thu, 02 Nov 2023 11:19:54 +0100] rev 51205
rust-index: implementation of __getitem__
Although the removed panic tends to prove, provided the full test suite
did pass, that the case where the input is a node id does not happen,
it is best not to remove it right now.
Raising IndexError is crucial for iteration on the index to stop,
given the default CPython sequence iterator, see for instance
https://github.com/zpoint/CPython-Internals/blobs/master/BasicObject/iter/iter.md
This was spotted by `test-rust-ancestors.py`, which does simple iterations on
indexes (as preflight checks).
In `revlog.c`, `index_getitem` defaults to `index_get` when called
on revision numbers, which does raise `IndexError` with the same message as
the one we are introducing here.
Georges Racinet <georges.racinet@octobus.net> [Wed, 27 Sep 2023 11:34:52 +0200] rev 51204
rust-index: optim note for post-scaffolding removal
Raphaël Gomès <rgomes@octobus.net> [Thu, 02 Nov 2023 11:16:13 +0100] rev 51203
rust-index: check that the entry bytes are the same in both indexes
This is a temporary measure to show that both the Rust and C indexes are
kept in sync.
Comes with some related documentation clarifications.
For comparison of error cases, see `index_entry_binary()` in `revlog.c`.
Georges Racinet <georges.racinet@octobus.net> [Sat, 30 Sep 2023 16:15:56 +0200] rev 51202
rust-index: return variables systematic naming convention
To help us know at a glance when a method is ready, making
us more comfortable when we are close to the final removal of
scaffolding, we introduce the systematic variable names `rust_res` and
`c_res`. The goal of this series is to always return the former.
We take again the case of `pack_header` as example.
Our personal opinion is to usually avoid such poor semantics as `res`, but
usually accept it when it is close to the actual return, which will be the
case in most methods of this series. Also, the name can simply be dropped
when we remove the scaffolding. To follow on the example, the body of
`pack_header()` should become this in the final version:
```
let index = self.index(py).borrow();
let packed = index.pack_header(args.get_item(py, 0).extract(py)?);
Ok(PyBytes::new(py, &packed).into_object())
```
In these cases, `res` is close to the actual return and will be removed
entirely at the end.
Georges Racinet <georges.racinet@octobus.net> [Fri, 29 Sep 2023 15:51:49 +0200] rev 51201
rust-index: results comparison helper with details
This is a bit simpler to call and has the advantage of systematically logging
the encountered deviation.
To avoid committing dead code, we apply it to the `pack_header` method, which
was already returning the Rust result.
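A comparison helper of this kind could look roughly like the sketch below.
It is not the helper introduced by this changeset: the name `check_same`, its
signature and the wording of the message are assumptions, and it returns the
description of the deviation instead of logging it, to stay self-contained.

```rust
use std::fmt::Debug;

/// Hypothetical sketch: compare the Rust and C results of a method and
/// describe the deviation, if any, with enough detail to debug it.
pub fn check_same<T: PartialEq + Debug>(
    method: &str,
    rust_res: &T,
    c_res: &T,
) -> Result<(), String> {
    if rust_res == c_res {
        Ok(())
    } else {
        Err(format!(
            "{}: Rust result {:?} differs from C result {:?}",
            method, rust_res, c_res
        ))
    }
}
```

A caller would compute both `rust_res` and `c_res`, run the check, and then
return `rust_res`.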
Georges Racinet <georges.racinet@octobus.net> [Wed, 27 Sep 2023 10:59:04 +0200] rev 51200
rust-index: helper for revision not in index not involving nodemap
This is a good match for exceptions raised from the C implementation,
when it is not about a nodemap inconsistency.
Georges Racinet <georges.racinet@octobus.net> [Wed, 18 Oct 2023 19:54:18 +0200] rev 51199
rust-index: renamed nodemap error function for rev not in index
The function name was misleading, as the error wording mentions the
nodemap, hence would not be appropriate for missing revisions not
related to a nodemap lookup.
Raphaël Gomès <rgomes@octobus.net> [Thu, 03 Aug 2023 10:28:10 +0200] rev 51198
rust-index: add `pack_header` support
Raphaël Gomès <rgomes@octobus.net> [Mon, 30 Oct 2023 10:34:48 +0100] rev 51197
rust-index: support cache clearing
I'm not 100% sure how useful it is outside of perf, but it's still worth
implementing.
Raphaël Gomès <rgomes@octobus.net> [Thu, 29 Jun 2023 11:37:19 +0200] rev 51196
rust-index: check rindex and cindex return the same get_rev
This is a temporary safeguard while we synchronize both indexes.
Raphaël Gomès <rgomes@octobus.net> [Wed, 28 Jun 2023 16:43:39 +0200] rev 51195
rust-index: synchronize remove to Rust index
Future steps will bring the two indexes further together until we can
rip out the C index entirely when running Rust code.
Raphaël Gomès <rgomes@octobus.net> [Wed, 28 Jun 2023 11:59:43 +0200] rev 51194
rust-index: remove `__setitem__` method from the mixed index
This is not defined on the Python or C one, and isn't used anywhere.
Raphaël Gomès <rgomes@octobus.net> [Wed, 28 Jun 2023 11:36:22 +0200] rev 51193
rust-index: check equality between rust and cindex for `__len__`
Raphaël Gomès <rgomes@octobus.net> [Tue, 27 Jun 2023 18:24:54 +0200] rev 51192
rust-index: synchronize append method
We now append to the Rust index just as we do to the C index. Future steps
will bring the two indexes further together until we can rip out the C index
entirely when running Rust code.
Raphaël Gomès <rgomes@octobus.net> [Mon, 18 Sep 2023 17:11:11 +0200] rev 51191
rust-revlog: teach the revlog opening code to read the repo options
This will become necessary as we start writing revlog data from Rust.
Raphaël Gomès <rgomes@octobus.net> [Tue, 27 Jun 2023 17:34:51 +0200] rev 51190
rust-index: pass data down to the Rust index
This will allow us to start keeping the Rust index synchronized with the
cindex as we gradually implement more and more methods in Rust. This will
eventually be removed.