Arseniy Alekseyev <aalekseyev@janestreet.com> [Fri, 27 Sep 2024 17:25:15 +0100] rev 51917
rust: fix the deprecation warning in NaiveDateTime::from_timestamp
This warning appears between chrono 0.4.34 and 0.4.38, so
isn't affecting the current lock file, but it would come
when we upgraded the version.
Pierre-Yves David <pierre-yves.david@octobus.net> [Fri, 27 Sep 2024 15:53:56 +0200] rev 51916
run-tests: ensure that --no-rust do not use rust
Since having a valid value in HGWITHRUSTEXT is enough to trigger the use of
rust, we need to unset it before install to be sure.
Joerg Sonnenberger <joerg@bec.de> [Sat, 20 Jul 2024 03:04:48 +0200] rev 51915
revlogutils: teach
issue6528 filtering about grandparents
During dynamic filtering, we should assume that the current repository
is correct. Therefore the parents of the delta base can tell us if that
parent has metadata without having to build the whole text.
Joerg Sonnenberger <joerg@bec.de> [Sat, 20 Jul 2024 00:43:08 +0200] rev 51914
revlogutils: remember known metadata parents for
issue6528
In the cases where the parent revs tell us for sure that the parent has
metadata, remember this fact to avoid content recomputations later.
Joerg Sonnenberger <joerg@bec.de> [Sat, 20 Jul 2024 00:44:59 +0200] rev 51913
revlogutils: for
issue6528 fix, pre-cache nullrev as metadata-free
Joerg Sonnenberger <joerg@bec.de> [Sat, 20 Jul 2024 00:59:50 +0200] rev 51912
revlogutils: for
issue6528 fix, cache results for null changes
Joerg Sonnenberger <joerg@bec.de> [Sat, 20 Jul 2024 00:41:37 +0200] rev 51911
revlogutils: fix _chunk() reference
_chunk is only found in the inner revlog object and not directly exposed
outside.
Pierre-Yves David <pierre-yves.david@octobus.net> [Mon, 02 Sep 2024 22:14:38 +0200] rev 51910
rev-branch-cache: reenable memory mapping of the revision data
Now that we are no longer truncating it, we can mmap it again.
This provide a sizeable speedup on repository with a very large amount of
revision for example for a mozilla-try clone with 5 793 383 revisions, this
provide a speedup of 5ms - 10ms. Since they happens within the "critical" locked
path during push. These miliseconds are important.
In addition, the v3 branchmap format is use the rev-branch-cache more than the
v2 branchmap cache so this will be important.
On smaller repository we consistently see an improvement of one or two percents,
but the gain in absolute time is usually < 10 ms.
#### benchmark.name = hg.command.unbundle
# benchmark.variants.
issue6528 = disabled
# benchmark.variants.reuse-external-delta-parent = yes
# benchmark.variants.revs = any-1-extra-rev
# benchmark.variants.source = unbundle
# benchmark.variants.verbosity = quiet
### data-env-vars.name = mozilla-try-2024-03-26-zstd-sparse-revlog
## bin-env-vars.hg.flavor = default
e51161b12c7e: 3.527923
ebdcfe85b070: 3.468178 (-1.69%, -0.06)
## bin-env-vars.hg.flavor = rust
e51161b12c7e: 3.580158
ebdcfe85b070: 3.480564 (-2.78%, -0.10)
### data-env-vars.name = mozilla-try-2024-03-26-ds2-pnm
## bin-env-vars.hg.flavor = rust
e51161b12c7e: 3.527923
ebdcfe85b070: 3.468178 (-1.69%, -0.06)
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 25 Sep 2024 12:42:47 +0200] rev 51909
rev-branch-cache: have debugupdatecache warm rbc too
Since the "v2" format can be more performant than the "v1" format (thanks to
mmap), it is useful to be able to make sure it is present
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 25 Sep 2024 12:49:32 +0200] rev 51908
rev-branch-cache: schedule a write of the "v2" format if we read from "v1"
The new file can be memorymapped, while the old one cannot. So there is value in
having the v2 format around as soon a possible.
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 24 Sep 2024 15:44:10 +0200] rev 51907
rev-branch-cache: fallback on "v1" data if no v2 is found
This will help smooth the transition to the v2 format for existing large
repository.
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 24 Sep 2024 03:16:35 +0200] rev 51906
rev-branch-cache: increment the version to "v2"
We want to ensure no older clients will truncate the file under us. So we need to
change their name. We don't change the rest of the format (unfortunaly).
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 24 Sep 2024 00:16:23 +0200] rev 51905
rev-branch-cache: stop truncating cache file
Truncating the file prevent the safe use of mmap. So instead of overwrite the
existing data. If more than 20% of the file is to be overwritten, we rewrite the
whole file instead.
Such whole rewrite is done by replacing the old one with a new one, so mmap of
the old file would be affected.
This prepare a more aggressive use of mmap in later patches.
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 24 Sep 2024 00:16:04 +0200] rev 51904
rev-branch-cache: make sure we close the name file we open
We were various opening without with or try. Adding a try would not hurt.
Pierre-Yves David <pierre-yves.david@octobus.net> [Mon, 23 Sep 2024 23:52:45 +0200] rev 51903
rev-branch-cache: add a way to force rewrite of the cache
This seems useful to be able to do this, for example during strip.
This align with the intended expressed in the `test-branches.t` test. This will
help use being more confident about future changes in the series.
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 24 Sep 2024 00:01:30 +0200] rev 51902
rev-branch-cache: issue more truthful "truncating" message
First, don't pretend it truncate to 40 when it actually truncate to 0. Second,
don't pretend to truncate to 0 when the file is already empty/missing.
Pierre-Yves David <pierre-yves.david@octobus.net> [Sun, 22 Sep 2024 15:55:46 +0200] rev 51901
rev-branch-cache: move the code in a dedicated module
The branchmap module is getting huge and the rev branch cache is fully
independent, lets move it elsewhere.
Matt Harbison <matt_harbison@yahoo.com> [Wed, 25 Sep 2024 01:16:47 -0400] rev 51900
statichttprepo: stop shadowing the `bytes` builtin
PyCharm flagged it, but I also misunderstood when looking at the code, because
the name implied a byte string, not a number.
Matt Harbison <matt_harbison@yahoo.com> [Wed, 25 Sep 2024 01:12:39 -0400] rev 51899
statichttprepo: fix `httprangereader.read()` for py3
It looks like there were a bunch of problems, not all of them py3 related:
1) The signature of BinaryIO.read() is -1, not None
2) The `end` variable can't be bytes and interpolate into str with "%s"
3) The `end` variable can't be an int and interpolate into str with "%s"
4) The result slicing could be out of bounds if more is requested than
returned
I guess if somebody would have called `read(-1)` (either directly or because a
wrapper defaults to that), it wouldn't have been handled correctly. The fact
that it is a valid value meaning to read everything requires some additional
changes later in the method around when it slices the byte string that was read,
but that seems to have already been broken.
Matt Harbison <matt_harbison@yahoo.com> [Wed, 25 Sep 2024 00:52:44 -0400] rev 51898
statichttprepo: use a context manager to handle a file descriptor
I'm not sure if this should be reduced to `vfs.exists()`. That would seem to be
equivalent code (since the result of the read is ignored, so we can't tell if
the file actually has content, which has been the state of things going back to
98b6c3dde237), but this is at least safer file descriptor handling.
Matt Harbison <matt_harbison@yahoo.com> [Thu, 26 Sep 2024 02:58:50 +0200] rev 51897
profiling: pass bytes to `_()` and `error.Abort()`
And of course `other_tool_name` is str too, so that needs to be converted. The
type hints from PyCharm say `sys.monitoring.get_tool()` can return None, so
handle that case explicitly before it trips up pytype.
Joerg Sonnenberger <joerg@bec.de> [Mon, 08 Jul 2024 22:46:04 +0200] rev 51896
exchange: improve computation of relevant markers for large repos
Compute the candidate nodes with relevant markers directly
from keys of the predecessors/successors/children dictionaries of
obsstore. This is faster than iterating over all nodes directly.
This test could be further improved for repositories with relative
few markers compared to the repository size, but this is no longer
hot already. With the current loop structure, the obshashrange use
works as well as before as it passes lists with a single node.
Adjust the interface by allowing revision lists as well as node lists.
This helps cases that computes ancestors as it reduces the
materialisation cost. Use this in _pushdiscoveryobsmarker and
_getbundleobsmarkerpart. Improve the latter further by directly using
ancestors().
Performance benchmarks show notable and welcome improvement to no-op push and
pull (that would also apply to other push/pull). This apply to push and pull
done without evolve.
### push/pull Benchmark parameter
# bin-env-vars.hg.flavor = default
# benchmark.variants.explicit-rev = none
# benchmark.variants.protocol = ssh
# benchmark.variants.revs = none
## benchmark.name = hg.command.pull
# data-env-vars.name = mercurial-devel-2024-03-22-zstd-sparse-revlog
before: 5.968537 seconds
after: 5.668507 seconds (-5.03%, -0.30)
# data-env-vars.name = tryton-devel-2024-03-22-zstd-sparse-revlog
before: 1.446232 seconds
after: 0.835553 seconds (-42.23%, -0.61)
# data-env-vars.name = netbsd-src-draft-2024-09-19-zstd-sparse-revlog
before: 5.777412 seconds
after: 2.523454 seconds (-56.32%, -3.25)
## benchmark.name = hg.command.push
# data-env-vars.name = mercurial-devel-2024-03-22-zstd-sparse-revlog
before: 6.155501 seconds
after: 5.885072 seconds (-4.39%, -0.27)
# data-env-vars.name = tryton-devel-2024-03-22-zstd-sparse-revlog
before: 1.491054 seconds
after: 0.934882 seconds (-37.30%, -0.56)
# data-env-vars.name = netbsd-src-draft-2024-09-19-zstd-sparse-revlog
before: 5.902494 seconds
after: 2.957644 seconds (-49.89%, -2.94)
There is not notable different in these result using the "rust" flavor instead
of the "default". The performance impact on the same operation when using
evolve were also tested and no impact was noted.
Matt Harbison <matt_harbison@yahoo.com> [Fri, 20 Sep 2024 21:31:58 -0400] rev 51895
typing: make the localrepo classes known to pytype
9d4ad05bc91c and
1b17309cdaab both mentioned making `bundlerepository` and
`unionrepository` subclass `localrepository` during the type checking phase, but
that didn't apply to pytype in practice. See
bcaa5d408657 and friends for how
the zope interfaces confuse pytype, and end up converting the classes they
decorate into `Any`.
This commit is slightly more complex though, because `localrepository` has mixin
classes applied to it when it is instantiated. Specifically, `RevlogFileStorage`
is added, which adds `def file(f)` (which isn't defined on `localrepository`).
Therefore a list of `localrepository` superclasses is provided during type
checking to account for the mixins. Without this, the `bundlerepository` class
gets flagged when it attempts to call its superclass implementation of `file()`.
Note that pytype doesn't understand these mixin superclasses (it marks the
superclass of `localrepository` as `Any`, because they are zope interfaces it
doesn't understand), but that's enough to get it to not flag `bundlerepository`.
PyCharm also stops flagging it as a missing function, though it seems like it is
able to handle the zope interfaces.
Matt Harbison <matt_harbison@yahoo.com> [Mon, 23 Sep 2024 14:58:37 -0400] rev 51894
typing: add a handful more annotations to `mercurial/vfs.py`
These came out of refactoring into a protocol class, but they can stand on their
own.
The `audit` callback is kinda screwy because the internal lambda and the callable
for `pathutil.pathauditor` have different args and a different return type. It's
conditionalized where it is called, and can be cleaned up later if desired.
Matt Harbison <matt_harbison@yahoo.com> [Sat, 21 Sep 2024 13:53:05 -0400] rev 51893
typing: make `vfs.isfileorlink_checkdir()` path arg required
The only caller to this is `merge._checkunknownfile()`, which supplies a value.
That's good, because `util.localpath()` immediately uses the value to call a
method on it on Windows. The posix implementation returns the value unaltered,
but then `pathutil.finddirs_rev_noroot()` would have exploded.
Matt Harbison <matt_harbison@yahoo.com> [Fri, 20 Sep 2024 20:16:12 -0400] rev 51892
typing: manually add type annotations to `mercurial/vfs.py`
This isn't everything, but hopefully it's close enough to hack on a protocol
class.
Matt Harbison <matt_harbison@yahoo.com> [Fri, 20 Sep 2024 16:36:28 -0400] rev 51891
typing: correct pytype mistakes in `mercurial/vfs.py`
With the previous changes in this series (prior to merging the *.pyi file), this
wasn't too bad- the only definitively wrong things were the `data` argument to
`writelines()`, and the return type on `backgroundclosing()` (both of these
errors were dropped in the previous commit; for some reason pytype doesn't like
`contextlib._GeneratorContextManager`, even though that's what it determined it
is):
File "/mnt/c/Users/Matt/hg/mercurial/vfs.py", line 411, in abstractvfs:
Bad return type 'contextlib._GeneratorContextManager' for generator function abstractvfs.backgroundclosing [bad-yield-annotation]
Expected Generator, Iterable or Iterator
PyCharm thinks this is `Generator[backgroundfilecloser], Any, None]`, which can
be reduced to `Iterator[backgroundfilecloser]`, but pytype flagged the line that
calls `yield` without an argument unless it's also `Optional`. PyCharm is happy
either way. For some reason, `Iterable` didn't work for pytype:
File "/mnt/c/Users/Matt/hg/mercurial/vfs.py", line 390, in abstractvfs:
Function contextlib.contextmanager was called with the wrong arguments [wrong-arg-types]
Expected: (func: Callable[[Any], Iterator])
Actually passed: (func: Callable[[Any, Any, Any], Iterable[Optional[Any]]])
Attributes of protocol Iterator[_T_co] are not implemented on Iterable[Optional[Any]]: __next__
Matt Harbison <matt_harbison@yahoo.com> [Fri, 20 Sep 2024 13:38:13 -0400] rev 51890
typing: run `merge-pyi` on `mercurial/vfs.py`
The *.pyi file was generated with pytype 2023.11.21. There were a few things
here that were wrong (e.g. `writelines()` takes an `Iterable[bytes]`, not
`bytes`, or inexplicable errors like importing several of the vfs classes from
this very module), and those changes have been dropped manually here.
Matt Harbison <matt_harbison@yahoo.com> [Fri, 20 Sep 2024 01:10:17 -0400] rev 51889
typing: add type annotations to `mercurial.util.makelock()`
This bubbles up into the `vfs` classes, so get this out of the way.
Matt Harbison <matt_harbison@yahoo.com> [Fri, 20 Sep 2024 00:20:24 -0400] rev 51888
util: avoid a leaked file descriptor in `util.makelock()` exceptional case
Matt Harbison <matt_harbison@yahoo.com> [Fri, 20 Sep 2024 00:04:09 -0400] rev 51887
typing: add type annotations to the `mercurial.util.filestat` class
It's referenced in the `vfs` classes, so get this out of the way to help there.
The `TypeVar` definition and its usage was copied from the existing `util.pyi`
file.
Matt Harbison <matt_harbison@yahoo.com> [Fri, 20 Sep 2024 12:15:08 -0400] rev 51886
vfs: do minor copyediting on comments and doc strings
These were flagged by PyCharm, so clear them from the gutter.
Matt Harbison <matt_harbison@yahoo.com> [Fri, 20 Sep 2024 01:16:16 -0400] rev 51885
vfs: simplify the `abstractvfs.rename()` implementation
PyCharm was yapping about `util.rename()` not returning anything, because it is
typed to return `None`, but the value was captured and returned after calling
`_avoidambig()`. Instead, drop all of that, unconditionally rename, and then
call `_avoidambig()` if appropriate.
While we're here, convert the ersatz ternary operator into a modern one to help
pytype. When a variable is initialized the old way, pytype tends to assign the
type of the LHS of the `and`. In this case, that's a bool, and it will get
confused that bool doesn't have a `stat` attribute once this method gets more
type annotations. (Currently it thinks the `checkambig` arg is `Any`, so it
doesn't care.)
Matt Harbison <matt_harbison@yahoo.com> [Fri, 20 Sep 2024 00:07:39 -0400] rev 51884
vfs: use @abstractmethod instead of homebrewing abstract methods
The latter confuses PyCharm after adding more type annotations when, for
example, `abstractvfs.rename()` calls `_auditpath()`- the latter unconditionally
raised an error, so PyCharm thought the code that came after is unreachable. It
also tricked pytype into marking the return type as `Never`, which isn't
available until Python 3.11 (outside of `typing_extensions`).
This also avoid PyCharm warnings that the call to the superclass constructor was
missed (it couldn't be called because it raised an error to prevent
instantiation).
The statichttprepo module needed to be given an override for one of the abstract
methods, so that it can be instantiated. In `abstractvfs`, this method is only
called by `rename()`, so I think we can leave this empty. We raise an error in
case somebody accidentally calls it in the future- it would have raised this
same error prior to this change.
I couldn't wrangle `import-checker.py` into accepting importing `ABC` and
`abstractmethod`- for each subsequent import, it reports something like:
stdlib import "contextlib" follows local import: abc
I suspect the problem is that near the `if fullname != '__future__'` check, if
the module doesn't fall into the error case, `seenlocal` gets set to the module
name. That causes it to be treated like a local module on the next iteration,
even though it is in `stdlib_modules`.
Matt Harbison <matt_harbison@yahoo.com> [Thu, 19 Sep 2024 21:03:10 -0400] rev 51883
vfs: modernize the detection of the main thread
There weren't a lot of good choices when py27 was supported, but starting with
py34, `threading.main_thread()` is available. This gets us away from an
undocumented, internal symbol, and drops a pytype suppression statement. It is
also apparently no longer reliable after a process fork.[1][2]
[1] https://stackoverflow.com/a/
23207116
[2] https://github.com/python/cpython/blob/v3.6.3/Lib/threading.py#L1334
Matt Harbison <matt_harbison@yahoo.com> [Sun, 22 Sep 2024 17:06:31 -0400] rev 51882
store: fix a signature mismatch for a vfs subclass
This was flagged by PyCharm. I'm not sure why pytype doesn't catch this- it's
not excluded from the modules that are currently checked.
Matt Harbison <matt_harbison@yahoo.com> [Sun, 22 Sep 2024 17:02:42 -0400] rev 51881
lfs: fix various signature mismatches for vfs subclasses
These were flagged by PyCharm. I'm not sure why pytype doesn't catch these-
only `hgext/lfs/__init__.py` in the lfs extension is excluded from being
checked.
I'm not sure if the `*insidef` arg to `join()` was meant as an internal
convencience, because I see another class that gets flagged for the same
signature problem (to be fixed next). But I don't feel bold enough to make this
an internal function, and provide a simplified public `join()` on the `vfs`
classes. That can still be done later, if desired. For now, process the
additional args and pass them along, even though there don't appear to be any
current callers that provide extra args to these classes. We need all of the
subclasses to agree on the signature, or they won't be considered to implement
the `Vfs` protocol being developed.
While we're copy/pasting from the base class, bring the type annotations along
for the ride.
Matt Harbison <matt_harbison@yahoo.com> [Sun, 22 Sep 2024 17:18:05 -0400] rev 51880
util: add a comment to suppress a PyCharm warning about a PEP 8 violation
Slowly trying to get rid of silly warnings, so that real problems aren't hidden.
Matt Harbison <matt_harbison@yahoo.com> [Sun, 22 Sep 2024 17:15:20 -0400] rev 51879
keepalive: fix a signature mismatch for a http.client.HTTPResponse subclass
Also flagged by PyCharm. This is checked by pytype too, so I'm not sure why it
misses this. I verified in py36 that this argument is documented for the
function, so maybe this is py2 legacy.
Matt Harbison <matt_harbison@yahoo.com> [Sun, 22 Sep 2024 17:11:10 -0400] rev 51878
cbor: drop a duplicate dictionary initialization entry
Flagged by PyCharm.
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 04 Sep 2024 17:08:58 +0200] rev 51877
profiling: document the py-spy value for `profiling.type`
The feature was not visible otherwise.
Matt Harbison <matt_harbison@yahoo.com> [Thu, 19 Sep 2024 18:49:04 -0400] rev 51876
tests: enable pytype checking on `mercurial/unionrepo.py`
Matt Harbison <matt_harbison@yahoo.com> [Thu, 19 Sep 2024 18:48:07 -0400] rev 51875
unionrepo: fix mismatches with revlog classes
This is a subset of
cfd30df0f8e4, applied to `unionrepository`. There are none
of the `write()` method overrides here, like `bundlerepository`.
With these changes, pytype flags the `unionrevlog` constructor:
File "/mnt/c/Users/Matt/hg/mercurial/unionrepo.py", line 55, in __init__:
No attribute '_revlog' on mercurial.changelog.changelog [attribute-error]
Called from (traceback):
line 207, in __init__
File "/mnt/c/Users/Matt/hg/mercurial/unionrepo.py", line 55, in __init__:
No attribute '_revlog' on mercurial.revlog.revlog [attribute-error]
Called from (traceback):
line 232, in __init__
But it turns out that both `changelog.changelog` and `revlog.revlog` do have a
`target` attribute, so they wouldn't trip over this. It seems weird that the
second caller to be flagged is passing the private `_revlog`, but maybe that's
how it needs to be.
Matt Harbison <matt_harbison@yahoo.com> [Thu, 19 Sep 2024 16:19:29 -0400] rev 51874
typing: make `unionrepository` subclass `localrepository` while type checking
This is the same change as
9d4ad05bc91c made for `bundlerepository`, for the
same reasons.
Also, add a comment here to suppress the PyCharm warning that the superclass
constructor is not called, that is new now that there's a simulated superclass.
That lack of a call is by design- `makeunionrepository()` does magic that
PyCharm isn't aware of. But PyCharm has been better at catching problems than
pytype in a lot of cases, so I'd like to reduce the bogus things it flags, to
make the real issues stand out.
Matt Harbison <matt_harbison@yahoo.com> [Wed, 18 Sep 2024 21:00:20 -0400] rev 51873
tests: enable pytype checking on `mercurial/bundlerepo.py`
Matt Harbison <matt_harbison@yahoo.com> [Wed, 18 Sep 2024 17:46:46 -0400] rev 51872
revlog: make `clearcaches()` signature consistent with ManifestRevlog
I'm not sure if this a newly added bug, because of using a different version of
pytype, or if the recent work around avoiding the zope interface types in the
type checking phase (see
5eb98ea78fd7 and friends)... but pytype 2023.11.21
started flagging this series since it was last pushed ~6 weeks ago:
File "/mnt/c/Users/Matt/hg/mercurial/bundlerepo.py", line 204, in <module>:
Overriding method signature mismatch [signature-mismatch]
Base signature: 'def mercurial.manifest.ManifestRevlog.clearcaches(self, clear_persisted_data: Any = ...) -> None'.
Subclass signature: 'def mercurial.revlog.revlog.clearcaches(self) -> None'.
Not enough positional parameters in overriding method.
Maybe the multiple inheritance in `bundlerepo.bundlemanifest` is bad, but it
seems like a `ManifestRevlog` is-a `revlog`, even though the class hierarchy
isn't coded that way. Additionally, it looks like `revlog.clearcaches()` is
dealing with some persistent data, so maybe this is useful to have there anyway.
Also sprinkle some trivial type hints on the method, because there are other
`clearcaches()` definitions in the codebase with these hints, and I don't feel
like waiting for another pytype run to see if it cares that specifically about
the signature matching.
Matt Harbison <matt_harbison@yahoo.com> [Sat, 03 Aug 2024 01:33:13 -0400] rev 51871
bundlerepo: fix mismatches with repository and revlog classes
Both pytype and PyCharm complained that `write()` and `_write()` in the
bundlephasecache class aren't proper overrides- indeed they seem to be missing
an argument that the base class has.
PyCharm and pytype also complained that the `revlog.revlog` class doesn't have a
`_chunk()` method. That looks like it was moved from revlog to `_InnerRevlog`
back in
e8ad6d8de8b8, and wasn't caught because this module wasn't type checked.
However, I couldn't figure out a syntax with `revlog.revlog._inner._chunk(self, rev)`,
as it complained about passing too many args. `bundlerevlog._rawtext()` uses
this `super(...)` style to call the super class, so hopefully that works, even
with the wonky dynamic subclassing. The revlog class needed the `_InnerRevlog`
field typed because it isn't set in the constructor.
Finally, the vfs type hints look broken. This initially failed with:
File "/mnt/c/Users/Matt/hg/mercurial/bundlerepo.py", line 65, in __init__: Function readonlyvfs.__init__ was called with the wrong arguments [wrong-arg-types]
Expected: (self, vfs: mercurial.vfs.vfs)
Actually passed: (self, vfs: Callable)
Called from (traceback):
line 232, in dirlog
line 214, in __init__
I don't see a raw Callable, but I tried changing some of the vfs args to be typed
as `vfsmod.abstractvfs`, but that class doesn't have `options`, so it failed
elsewhere. `readonlyvfs` isn't a subclass of `vfs` (it's a subclass of
`abstractvfs`), so I'm not sure how to handle that. It would be a shame to have
to make a union of vfs subclasses (but not all of them have `options` either).
Matt Harbison <matt_harbison@yahoo.com> [Wed, 18 Sep 2024 17:50:57 -0400] rev 51870
typing: make `bundlerepository` subclass `localrepository` while type checking
Currently, `mercurial/bundlerepo.py` is excluded from pytype, mostly because it
complains that various `ui` and `vfs` fields in `localrepository` are missing.
(`bundlerepository` dynamically subclasses `localrepository` when it is
instantiated, so it works at runtime.) This makes that class hierarchy known to
pytype.
Having a protocol for `Repository` is probably the right thing to do, but that
will be a lot of work and this still reflects the class at runtime. Subclassing
also has the benefit of making sure any method overrides have a matching
signature, so maybe this is a situation where we do both of these things. (I'm
not sure how clear the diagnostics are if a class *almost* implements a
protocol, but is missing a method argument or similar.) The subclassing is not
done outside of type checking runs to avoid any side effects on already complex
code.
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 17 Sep 2024 16:40:24 +0200] rev 51869
rust: bump rust-cpython version to 0.7.2
This version supports Python 3.12 while 0.7.1 did not.
Raphaël Gomès <rgomes@octobus.net> [Wed, 19 Jun 2024 14:49:35 +0200] rev 51868
rust: add Vfs trait
This will allow for the use of multiple vfs like in the Python implementation,
as well as hiding the details of the upcoming Python vfs wrapper to hg-core.
Raphaël Gomès <rgomes@octobus.net> [Wed, 19 Jun 2024 12:49:26 +0200] rev 51867
rust: use new revlog configs in all revlog opening code
This centralizes the more complex logic needed for the upcoming code
and creates stronger APIs with fewer booleans.
We also reuse `RevlogType` where needed.
Raphaël Gomès <rgomes@octobus.net> [Tue, 17 Sep 2024 10:18:32 +0200] rev 51866
rust-revlog: don't try to open the data file if the index is empty
This will cover the case where the data file is not present.
Raphaël Gomès <rgomes@octobus.net> [Wed, 19 Jun 2024 12:25:12 +0200] rev 51865
rust-revlog: add revlog-specific config objects
These will be used by the upcoming Rust `InnerRevlog` to better centralize
config information that is relevant to revlogs.
Matt Harbison <matt_harbison@yahoo.com> [Thu, 12 Sep 2024 16:27:58 -0400] rev 51864
typing: add `from __future__ import annotations` to remaining source files
Most of these look newer than when the original imports referenced in the
previous commit were dropped, so these weren't covered by the backout. These
were found with:
hg files mercurial hgext hgext3rd -I '**.py' -X '**/thirdparty' \
| xargs grep -L 'from __future__ import annotations'
All of the `__init__.py` files that finds are empty, so those were ignored and
the rest manually edited.
Matt Harbison <matt_harbison@yahoo.com> [Mon, 16 Sep 2024 15:36:44 +0200] rev 51863
typing: add `from __future__ import annotations` to most files
Now that py36 is no longer supported, we can postpone annotation evaluation.
This means that the quoting is usually optional (for things imported under the
guard of `if typing.TYPE_CHECKING:` to avoid circular imports), and there's less
overhead on startup[1].
There may be some missing here. I backed out
6000f5b25c9b (which removed the
`from __future__ import ...` that was supporting py2), reverted the changes in
`contrib/`, `doc/`, and `tests/`, and then ran:
$ hg status -n --change . | \
xargs sed -i -e 's/from __future__ import .*$/from __future__ import annotations/'
There were some minor tweaks needed when reviewing (mostly making the spacing
around the import consistent, and `mercurial/testing/__init__.py` had a
multiline import that wasn't fully rewritten.
[1] https://docs.python.org/3/whatsnew/3.7.html#pep-563-postponed-evaluation-of-annotations
Matt Harbison <matt_harbison@yahoo.com> [Mon, 16 Sep 2024 15:36:38 +0200] rev 51862
format: add many "missing" comma
Black was not adding them until the next changeset introduced a bunch of `from
__future__ import annotations` to most file. Since it make the next changeset
hard to read we introduce them in advance.
Matt Harbison <matt_harbison@yahoo.com> [Thu, 12 Sep 2024 12:53:00 -0400] rev 51861
typing: simplify archive.gz writing and drop a few pytype suppressions
I was waiting until 3.8 to use `Literal` to fix this, but there's also the ":"
and "|" characters that are passed along here, meant only for the non-gz archive
types. But manipulating what the local caller passes is silly- we know we're
writing, so just open it for writing. As an added bonus, PyCharm stops flagging
the call too (since it doesn't know about pytype suppression comments).
Matt Harbison <matt_harbison@yahoo.com> [Thu, 12 Sep 2024 12:38:43 -0400] rev 51860
typing: explicitly set the return type of `_InnerRevLog.raw_text()`
Somewhere between
cd72a88c5599 and
2fd44b3dcc33, pytype changed the return type
from `Tuple[_T1, Any, bool]` to `Any`. Both are wrong. `mdiff.patches()` is an
alias for `mpatch.patches()`, which is selected via module policy (and breaks
the ability to infer the types). However, `cext`, `cffi`, and `pure`
implementations all agree it returns bytes.
Matt Harbison <matt_harbison@yahoo.com> [Thu, 12 Sep 2024 12:28:27 -0400] rev 51859
typing: add explicit hints for recent pytype regressions
Somewhere between
454feddab720 and
cd72a88c5599, pytype changed how it inferred
the return type in `extdiff.py` from
Tuple[Any, List[Tuple[bytes, Any, os.stat_result]]]
to
Tuple[Any, List[nothing]]
It also changed the return type in `archival.py` from `Any` to `NoReturn`. Fix
those up, and also the obvious parameter types while we're here.
Raphaël Gomès <rgomes@octobus.net> [Wed, 19 Jun 2024 18:06:50 +0200] rev 51858
revlog: use the method to check if the revlog is being written to
This was probably fine, but it could become not fine at some point.
Raphaël Gomès <rgomes@octobus.net> [Wed, 19 Jun 2024 17:26:06 +0200] rev 51857
revlog: add an early return for getting sidedata
No point in trying to fetch sidedata if there isn't a sidedata file.
Raphaël Gomès <rgomes@octobus.net> [Wed, 19 Jun 2024 17:19:20 +0200] rev 51856
revlog: simplify rawtext return value
We're always returning a tuple even though only the raw text is being used,
and we're rebuilding another tuple again higher.
As a bonus, this will remove one tuple creation and deletion
per `raw_text` call, hence fewer gc calls, etc.
Raphaël Gomès <rgomes@octobus.net> [Wed, 19 Jun 2024 17:06:05 +0200] rev 51855
revlog: cleanup some outdated docstrings
Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 12 Sep 2024 10:09:06 +0200] rev 51854
rust-inner-revlog: always inline `get_entry`
This is a very hot function.
Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 12 Sep 2024 10:08:45 +0200] rev 51853
rust-inner-revlog: derive Debug for IndexHeaderFlags
Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 12 Sep 2024 10:08:28 +0200] rev 51852
rust-inner-revlog: drop some outdated comment
Raphaël Gomès <rgomes@octobus.net> [Wed, 19 Jun 2024 12:00:55 +0200] rev 51851
rust-config: add more ways of reading the config
These will be needed for future patches of this series to interpret more
complex/different config values.
Raphaël Gomès <rgomes@octobus.net> [Tue, 26 Mar 2024 15:51:31 +0000] rev 51850
util: make buffer readonly
There is no use of writable buffers anywhere in the code, and this helps us
make sure we don't get into unsound territory when sharing memory with Rust.
This `toreadonly` method was not available in Python 3.6, but we dropped the
support for it earlier that week, so no need for any compatibility code.
Matt Harbison <mharbison@atto.com> [Thu, 05 Sep 2024 17:12:52 -0400] rev 51849
setup: avoid the deprecated `distutils.spawn.find_executable`
I noticed this was flagged with `DeprecationWarning` in py3.12 with `setuptools`
74.1.2, and it suggested `shutil.which()` instead. The signatures aren't the
same, but the additional `mode` argument in the middle of the latter defaults to
`os.F_OK | os.X_OK`, which maintains the same semantics.
Matt Harbison <mharbison@atto.com> [Thu, 05 Sep 2024 16:59:36 -0400] rev 51848
setup: drop the hack to disable linker warning 4197 on Windows
I don't see this when building on Windows with py3.8 or py3.12, so either the
code was fixed, or (more likely) the compiler stopped warning about it some time
after VS 2008. If we do have to put this back, it would probably be better to
put a `#pragma` in a header file somewhere, and avoid `setuptools` technical
debt.
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 11 Sep 2024 00:20:07 +0200] rev 51847
ci: also offer to test 3.12 with rust
The rust-cpython binding got 3.12 support very recently, it is worse keeping on
a tighter watch.
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 28 Aug 2024 16:35:43 +0200] rev 51846
ci: add the option to test more Python versions
It seems like a good idea to be able to test the lowest version we support. And
there have been enougth issue with 3.12 that we need to be able to make sur we
do not break it. We should probably get a matrix setup for more version and
flavor, but that is a simple and efficient start.
Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 05 Sep 2024 12:37:59 +0200] rev 51845
censor: document the censor.policy option (
issue6909)
Censor is not marked as experimental and should be documented
I am not doing this on stable because the help markup change it is using seems
more suitable for default.
Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 05 Sep 2024 12:28:12 +0200] rev 51844
help: add :config-doc:`section.key` shorthand to insert documentation
The config items defined in the configitems.toml file can already hold their
documentation. Having some way to automatically insert it was a long standing
low hanging fruit. So I did a first implementation on that. It fairly simple,
but it open the door to more.
It will be used in the next changeset.
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 11 Sep 2024 20:52:51 +0200] rev 51843
bzr: attempt to stabilize the test
The test has flakyness where the order of a few commit swap. This is an attempt
at avoiding that.
Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 12 Sep 2024 02:24:20 +0200] rev 51842
branching: merge with stable
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 11 Sep 2024 12:03:39 +0200] rev 51841
profiling: use "stat" profiler to profile individual request
The ls profiler no longer works for that. As the lsprof profiler is not default
and not great is general, lets side step the issue for now.
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 11 Sep 2024 12:02:38 +0200] rev 51840
profiling: improve 3.12 error message for calling lsprof twice
Python 3.12 prevent lsprof to be enabled if it is already enabled. This break
the use of lsprof in `hg serve` as both the initial `serve` command and the
request serving want to profile.
The "stat" profiler (the default) does not have this problem, so we focus on
improving the error message for now.
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 11 Sep 2024 00:41:37 +0200] rev 51839
test: display server error log in test-profile.t
This will help us to catch error with Python 3.12
Joerg Sonnenberger <joerg@bec.de> [Wed, 15 Nov 2023 22:11:34 +0100] rev 51838
archive: defer opening the output until a file is matched
Before, if no file is matched, an error is thrown, but the archive is
created anyway. When using hgweb, an error 500 is returned as the
response body already exists when the error is seen.
Afterwards, the archive is created before the first match is emitted.
If no match is found, no archive is created. This is more consistent
behavior as an empty archive is not a representable in all output
formats, e.g. tar archives.
Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 05 Sep 2024 13:37:24 +0200] rev 51837
run-tests: add color to the progress output
More color is useful to me.
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 10 Sep 2024 22:26:23 +0200] rev 51836
python-compat: drop support for Python3.6 and 3.7
As discussed on the mailing list¹, these are old version that seems safe to
drop. Python 3.8 comes with various improvement especially regarding typing
capabilities.
[1] https://lists.mercurial-scm.org/pipermail/mercurial-devel/2024-July/297998.html
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 10 Sep 2024 21:19:36 +0200] rev 51835
ci: drop path manipulation that we do not need anymore
The CI image has a squarer setup now.
Pierre-Yves David <pierre-yves.david@octobus.net> [Fri, 06 Sep 2024 02:12:19 +0200] rev 51834
brancing: merge stable into default
Matt Harbison <mharbison@atto.com> [Thu, 05 Sep 2024 15:37:14 -0400] rev 51833
setup: handle removal of old MSVC compiler from setuptools 65.0 (
issue6910)
It was removed a few years ago[1]. When trying to reproduce locally using a
clean py3.12 as called out in the bug report, `setuptools` wasn't installed at
all, and needed a `pip install` to fix a `ModuleNotFoundError` when building
locally. Maybe that needs to be in the requirements clause now.
It looks like this "private" module was added in setuptools 48.0.[2] I can't
find a changelog of what version was included in which version of python, and
the changelog for pip has a huge gap between when it called out 67.6.1 in `pip`
23.1 (2023-04-15), and 41.4.0 in `pip` 19.3 (2019-10-14).[3] So, we'll just add
to the existing code instead of replacing it, for safety.
[1] https://github.com/pypa/setuptools/commit/
cc017c77948737d131f683e0c25cd37bc639b8fc
[2] https://github.com/pypa/setuptools/commit/
d034a5ec7f707499139f90eb846b9e720923124c
[3] https://pip.pypa.io/en/stable/news/
Joerg Sonnenberger <joerg@bec.de> [Wed, 28 Aug 2024 23:25:26 +0200] rev 51832
utils: accept bytearray arguments for escapestr
Joerg Sonnenberger <joerg@bec.de> [Sun, 30 Jun 2024 16:02:50 +0200] rev 51831
http: simplify
Joerg Sonnenberger <joerg@bec.de> [Sun, 30 Jun 2024 14:16:43 +0200] rev 51830
http: use urllib's cookie handler
Split the logic for loading the cookies based on the configuration in a
helper function and otherwise use the library implementation directly.
Joerg Sonnenberger <joerg@bec.de> [Sun, 30 Jun 2024 13:22:23 +0200] rev 51829
http: reuse Python's implementation of read/readline/readinto
Since Python 3 already provides a working implementation of readline,
there is no need for our own buffering implementation. Reduce the
code to transfer accounting only.
Joerg Sonnenberger <joerg@bec.de> [Sun, 30 Jun 2024 02:46:53 +0200] rev 51828
debugwireproto: redo logging to also work for https
Joerg Sonnenberger <joerg@bec.de> [Fri, 28 Jun 2024 16:26:06 +0200] rev 51827
urllib2: redo response.readlines addition via class patching
Matt Harbison <matt_harbison@yahoo.com> [Wed, 21 Aug 2024 22:15:05 -0400] rev 51826
typing: lock in new pytype gains from making revlog related classes typeable
These were pretty clean changes in the pyi files from earlier in this series, so
add them to the code to make it more understandable.
There's one more trivial hint that can be added to the return of
`mercurial.revlogutils.rewrite._filelog_from_filename()`, however it needs to be
imported from '..' under the conditional of `typing.TYPE_CHECKING`, and that
seems to confuse the import checker- possibly because there's already an import
block from that level. (I would have expected a message about multiple import
statements in this case, but got one about higher level imports should come
first, no matter where I put the import statement.)
Matt Harbison <matt_harbison@yahoo.com> [Tue, 20 Aug 2024 00:07:05 -0400] rev 51825
typing: add types to `revlog.revlogproblem`
These attrs showed as `Any` after the previous commit made the class visible to
pytype.
Matt Harbison <matt_harbison@yahoo.com> [Mon, 19 Aug 2024 22:46:09 -0400] rev 51824
typing: make the revlog classes known to pytype
These are the same changes as
c1d7ac70980b and
45270e286bdc made to dirstate,
for the same reasons.
Matt Harbison <matt_harbison@yahoo.com> [Mon, 19 Aug 2024 22:27:43 -0400] rev 51823
typing: make the manifest classes known to pytype
These are the same changes as
c1d7ac70980b and
45270e286bdc made to dirstate,
for the same reasons. The migration away from decorating the classes with
`@interfaceutil.implementer` was started back in
3e9a660b074a, but missed one.
Matt Harbison <matt_harbison@yahoo.com> [Mon, 19 Aug 2024 22:21:16 -0400] rev 51822
typing: make the filelog class known to pytype
These are the same changes as
c1d7ac70980b and
45270e286bdc made to dirstate,
for the same reasons.
Matt Harbison <matt_harbison@yahoo.com> [Wed, 21 Aug 2024 17:41:57 -0400] rev 51821
remotefilelog: adapt the `debugindex` command to past API changes
Pytype was missing these problems because it's currently inferring the classes
for `filelog` and `revlog` to be `Any`. When that's fixed, these were flagged,
so fix these first.
The `filelog` class used to subclass `revlog`, but that was changed back in
1541e1a8e87d (with most or all of the "lost" attributes being forwarded to the
embedded `revlog` attribute at that time). These forwarded references were
dropped over time, and this command has been broken at least as far back as
68282a7b29a7 when the `version` field was dropped. Most of the fixes were as
simple as calling the accessor for the embedded `revlog` member, but the general
delta feature detection was a bit more involved- I copied the detection for it
from `mercurial.revlogutils.debug.debug_revlog()`.
Matt Harbison <matt_harbison@yahoo.com> [Wed, 21 Aug 2024 16:13:14 -0400] rev 51820
typing: add type hints to the `opener` attributes and arguments of revlog
When making revlog and filelog classes visible to pytype, it got confused quite
a bit in `mercurial/revlogutils/rewrite.py`, thinking it had a plain `Callable`,
and flagging additional methods on it like `join()` and `rename()`. I couldn't
figure out how it reduced to that (and PyCharm flagged `opener` references as
`Any`), but this makes it happy. So make this change before making the classes
visible.
The vfs class hierarchy is a bit wonky (e.g. `filteredvfs` is not a `vfs`), so
this may need to be revisited with a Protocol class that covers all of the `vfs`
classes. But for now, everything works.
Matt Harbison <matt_harbison@yahoo.com> [Wed, 21 Aug 2024 16:09:22 -0400] rev 51819
remotefilelog: honor the `--format` arg of the `debugindex` command
Flagged by PyCharm while investigating pytype spew. The other `**opts` above
are already accessed as str. I've never used remotefilelog, and don't have a
repo to test this on, so I'm trusting the nearby code.
Manuel Jacob <me@manueljacob.de> [Wed, 07 Aug 2024 22:05:36 +0200] rev 51818
merge: sort filemap only if requested by the caller
The name `sorted` refers to a built-in function, which is always true, so the else branch of this if statement was dead code.
Because, with this fix, the function can iterate over the dict items while yielding values, the dict should not change size while the generator is running. Because of that, it is required to re-introduce code that makes a caller copy the filemap before modification, which was removed in
3c783ff08d40cbaf36eb27ffe1d296718c0f1d77 (that changeset also introduced the filemap() method including the bug that’s being fixed by this changeset).
Matt Harbison <matt_harbison@yahoo.com> [Tue, 20 Aug 2024 22:47:11 -0400] rev 51817
shelve: consistently convert exception to bytes via `stringutil.forcebytestr`
The other two places in this module use this, and past experience shows that
this method does a nicer job. I'm not sure why we're converting to bytes here-
`KeyError` is built-in and will have str attrs, and `RepoLookupError` is a
subclass of the built-in `Exception` class (not `errors.Error`, which is
allegedly the baseclass for all Mercurial exceptions).
Matt Harbison <matt_harbison@yahoo.com> [Tue, 20 Aug 2024 22:34:51 -0400] rev 51816
typing: add type hints to `mercurial.shelve`
Pytype wasn't flagging anything here yet, but PyCharm was really unhappy about
the usage of `state` objects being passed to various methods that accessed attrs
on it, without any obvious attrs on the class because there's no contructor.
Filling that out made PyCharm happy, and a few other things needed to be filled
in to make that easier, so I made a pass over the whole file and filled in the
trivial hints. The other repo, ui, context, matcher, and pats items can be
filled in after the context and match modules are typed.
Matt Harbison <matt_harbison@yahoo.com> [Tue, 20 Aug 2024 18:30:47 -0400] rev 51815
typing: lock in correct changes from pytype 2023.04.11 -> 2023.06.16
There were a handful of other changes to the pyi files generated when updating
pytype locally (and jumping from python 3.8.0 to python 3.10.11), but they were
not as clear (e.g. the embedded type in a list changing from `nothing` to `Any`
or similar). These looked obviously correct, and agreed with PyCharm's thoughts
on the signatures.
Oddly, even though pytype starting inferring `obsutil._getfilteredreason()` as
returning bytes, it (correctly) complained about the None path when it was typed
that way. Instead, raise a ProgrammingError if an unhandled fate is calculated.
(Currently, all possibilities are handled, so this isn't reachable unless
another fate is added in the future.)
Matt Harbison <matt_harbison@yahoo.com> [Tue, 20 Aug 2024 17:46:17 -0400] rev 51814
monotone: replace %s interpolation with appropriate numeric specifiers
The length is an int, and the version is a float. Neither work with bytes on
py3. This was noticed when looking at nearby code after updating pytype changed
some signatures.
Matt Harbison <matt_harbison@yahoo.com> [Tue, 20 Aug 2024 16:32:13 -0400] rev 51813
shelve: raise an error when loading a corrupt state file in an impossible case
The old return statement was flagged by pytype 2023.06.16 running under python
3.10.11. No idea why it isn't caught in CI running the same pytype with py3.7.
This function is only called by `unshelvecmd()` (which first checks that either
`--abort` or `--continue` is specified), and `hgabortunshelve()` and
`hgcontinueunshelve()`, which locally apply `--abort` or `--continue`
respectively. Therefore, there is no other way to call this, and this error
should never be seen, but pytype can't figure that out on its own. Given that
the abort case clears the state, it seems reasonable to defensively code this
and not make that a blanket `else` case, on the off chance a 3rd way of calling
this appears in the future.
Matt Harbison <matt_harbison@yahoo.com> [Tue, 20 Aug 2024 11:18:10 -0400] rev 51812
contrib: print the version of pytype used to do the type checking
This will help with CI. I don't see a way to print the version of python that's
running it. When I tried `head -n 1 $(which pytype)`, the CI run printed:
#!/usr/bin/env bash
Locally, that gives the path to the python interpreter in the venv, so IDK
what's different.
Matt Harbison <matt_harbison@yahoo.com> [Sat, 17 Aug 2024 18:43:23 -0400] rev 51811
typing: create an @overload of `phasecache` ctor to handle the copy case
In `phasecache.copy()`, it calls `self.__class__(None, None, _load=False)`, but
the constuctor is typed to take a non-None repository. For the `_load=False`
case, all args are ignored (and the copy function itself populates the attrs on
the new object), so this isn't an error. For the default `_load=True` case, it
needs a non-None repository. This is the simplest way to handle that duality.
The reason this wasn't being detected is because pytype is confused by the
interface decorators on the `localrepository` class, and is inferring the whole
class as `Any`. (See
3e9a660b074a or
c1d7ac70980b) Therefore, the type hint of
`localrepo.localrepository` here was also effectively `Any`, which disabled the
type checking entirely.
This is the first foray into using `typing_extensions` to unlock future typing
features. I think this is safe and reasonable because 1) it is only imported in
the type checking phase (so no need to vendor our own copy), and 2) pytype has
its own copy of `typing_extensions` bundled with it, so no need to alter the
test environment. When run with a version of python that supports the symbol(s)
natively, `typing_extensions` simply re-exports from `typing`, so there
shouldn't be any future headaches with this.
Matt Harbison <matt_harbison@yahoo.com> [Sat, 17 Aug 2024 17:38:35 -0400] rev 51810
typing: declare the `_phasesets` member of `phasecache` to be `Optional`
Something in this area got flagged while making the repository class visible to
pytype (instead of being typed as `Any`). A None assignment to something not
optional is wrong, and when I tried setting it to `{}` to keep it non-Optional,
some tests failed. There are checks for the attr being None elsewhere, so this
seems to have just been an oversight.
Matt Harbison <matt_harbison@yahoo.com> [Fri, 16 Aug 2024 18:11:52 -0400] rev 51809
typing: hide the interface version of `dirstate` during type checking
As noted in the previous commit, the `dirstate` type is still inferred as `Any`
by pytype, including where it is used as a base class for the largefiles
dirstate. That effectively disables most type checking. The problems fixed two
commits ago were flagged by this change.
I'm not at all clear what the benefit of the original type is, but that was what
was used at runtime, so I don't want to change the largefiles base class to the
raw class. Having both a lowercase and camelcase name for the same thing isn't
great, but given that this trivially finds problems without worrying about which
symbol clients may be using, and the non-raw type is useless to pytype anyway,
I'm not going to worry about it.
Matt Harbison <matt_harbison@yahoo.com> [Fri, 16 Aug 2024 18:02:32 -0400] rev 51808
dirstate: remove the interface decorator to help pytype
This is the same change that was made for some of the manifest classes in
3e9a660b074a. Note that `dirstate` is still inferred as `Any`, but at least we
have `DirState` with all of the expected attributes.
Matt Harbison <matt_harbison@yahoo.com> [Fri, 16 Aug 2024 17:58:17 -0400] rev 51807
largefiles: sync up `largefilesdirstate` methods with `dirstate` base class
As it currently stands, pytype infers the `dirstate` class (and anything else
decorated with `@interfaceutil.implementer`) as `Any`. When that is worked
around, it suddenly noticed that most of these methods don't exist in the
`dirstate` class anymore. Since they only called into the missing methods and
there's no test failures, we can assume these are never called, and they can be
dropped.
In addition, PyCharm flagged `set_tracked()` and `_ignore()` as not overriding
a superclass method with the same arguments. The missing default parameter for
the former was the obvious issue. I'm guessing that the latter was named wrong
because while there is `_ignore()` in the base class, it takes no arguments and
returns a matcher. The `_ignorefiles()` superclass method also takes no args,
and returns a list of bytes. The `_ignorefileandline()` superclass method DOES
take a file, but returns a tuple. Therefore, the closest match is `_dirignore()`,
which takes a file AND returns a bool. No idea why this needs to be overridden
though.
Arseniy Alekseyev <aalekseyev@janestreet.com> [Fri, 16 Aug 2024 11:12:19 +0100] rev 51806
sparse: reliably avoid writing to store without a lock
With the code as written before this patch we can still end up writing to
store in `debugsparse`. Obviously we'll write to it if by accident a store
requirement is modified, but more importantly we write to it if another
concurrent transaction modifies the requirements file on disk.
We can't rule this out since we're not holding the store lock,
so it's better to explicitly pass a permission to write instead
of inferring it based on file contents.
Arseniy Alekseyev <aalekseyev@janestreet.com> [Thu, 15 Aug 2024 13:52:14 +0100] rev 51805
debugsparse: stop taking the store lock
debugsparse is a workspace-only opperation, or it better be workspace-only.
Let's make it to stop taking the store lock.
Arseniy Alekseyev <aalekseyev@janestreet.com> [Thu, 15 Aug 2024 14:54:22 +0100] rev 51804
scmutils: read the requires file before writing to avoid unnecessary rewrite
This lets us get away without the repo lock in situations where we need
to write requirements, but we know we're not changing the store requirements.
Arseniy Alekseyev <aalekseyev@janestreet.com> [Thu, 15 Aug 2024 14:56:50 +0100] rev 51803
localrepo: remove _readrequires function in favor of scmutil.readrequires
Arseniy Alekseyev <aalekseyev@janestreet.com> [Thu, 15 Aug 2024 14:53:17 +0100] rev 51802
scmutil: add `readrequires` next to `writerequires`
The code is copied from localrepo.py.
Matt Harbison <matt_harbison@yahoo.com> [Wed, 14 Aug 2024 03:25:16 -0400] rev 51801
typing: correct a type hint in `mercurial.manifest`
Obvious typo that was flagged by PyCharm.
Matt Harbison <matt_harbison@yahoo.com> [Sat, 10 Aug 2024 14:22:26 -0400] rev 51800
typing: add hints to `mercurial.util.mktempcopy()`
Might as well, now that the previous commit indicated what types are required.
Matt Harbison <matt_harbison@yahoo.com> [Sat, 10 Aug 2024 14:18:44 -0400] rev 51799
typing: fix the hint for the `mode` argument of `platform.copymode()`
The posix module is doing a bitwise AND with this and an integer, so it can't be
bytes. The only caller that provides the argument is `util.mktempcopy()`, and
pytype infers the type as Any, which explains why this wasn't caught.
Manuel Jacob <me@manueljacob.de> [Fri, 09 Aug 2024 22:45:32 +0200] rev 51798
largefiles: fix check that ensures that --all-largefiles is only used locally
Previously, the command added in the test failed with “abort: --all-largefiles is incompatible with non-local destination existing_destination”.
The reason for the buggy behavior was the use of hg.islocal(), which does “return true if repo (or path pointing to repo) is local” and, for local paths, assumes that the path is actually pointing to an existing repository and returns whether the path is not a regular file (in which case it assumes that it is a bundlerepo, which are considered non-local).
Felipe Contreras <felipe.contreras@gmail.com> [Fri, 05 May 2023 06:08:36 -0600] rev 51797
exchange: trivial simplification
Both sides of the condition do essentially the same thing, except one
with fastpath=True.
No functional changes.
Manuel Jacob <me@manueljacob.de> [Fri, 09 Aug 2024 14:26:13 +0200] rev 51796
import: fix erroneous comparison of str with bytes
Anton Shestakov <av6@dwimlabs.net> [Thu, 08 Aug 2024 17:28:38 +0400] rev 51795
histedit: create state and acquire locks earlier
This makes chistedit (histedit with curses UI) not write any files inside repo
without wlock. It also makes sense to wrap the entire process of preparing
commands inside the curses UI inside locks because we don't want anything else
to touch wdir or repo during this time.
Manuel Jacob <me@manueljacob.de> [Tue, 06 Aug 2024 22:51:41 +0200] rev 51794
py3: use str literal instead of bytes literal
Manuel Jacob <me@manueljacob.de> [Tue, 06 Aug 2024 18:23:59 +0200] rev 51793
typing: fix type annotation
Manuel Jacob <me@manueljacob.de> [Tue, 06 Aug 2024 17:53:59 +0200] rev 51792
cffi: pass bytes instead of str to ffi.new("char[]", …)
The type annotations seem to imply that the passed values are always already bytes, but they aren’t necessarily. Before Python 3.11, the documentation stated that bytes can be used to annotate arguments whose type is actually any of bytes, bytearray, or memoryview.
Manuel Jacob <me@manueljacob.de> [Mon, 05 Aug 2024 21:21:32 +0200] rev 51791
cffi: call bytes() instead of str() on CFFI buffer instances
Manuel Jacob <me@manueljacob.de> [Mon, 05 Aug 2024 21:08:36 +0200] rev 51790
cffi: pass C type and attribute names as str instead of bytes
Manuel Jacob <me@manueljacob.de> [Mon, 05 Aug 2024 20:47:17 +0200] rev 51789
py3: fix type of some elements of __all__ lists
Pierre-Yves David <pierre-yves.david@octobus.net> [Mon, 05 Aug 2024 20:08:23 +0200] rev 51788
manifest: deprecated readdelta and readfast
These method should not have any user left.
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 06 Aug 2024 02:09:33 +0200] rev 51787
manifest: use read_delta_new_entries in verify too
This seems like the proper semantic for the usage.
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 06 Aug 2024 02:13:17 +0200] rev 51786
manifest: use read_delta_new_entries in changegroup validate
This new method have a well defined semantic and can be adjusted by narrow as it
needs. This should prevent some unwanted filelog access when running validate on
a server using narrow profile to restrict access.
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 06 Aug 2024 02:12:08 +0200] rev 51785
manifest: add a read_delta_new_entries method
This new method have a well defined semantic and can be adjusted by narrow as it
needs.
Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 01 Aug 2024 13:15:54 +0200] rev 51784
manifest: use `read_delta_parents` when adjusting linkrev
Let's use the more accurate API.
Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 01 Aug 2024 13:15:10 +0200] rev 51783
manifest: use the `read_delta_parents` method
Let's use the more accurate API.
Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 01 Aug 2024 13:12:49 +0200] rev 51782
manifest: use `read_delta_parents` when adjusting linkrev in remotefile
Let's use the more accurate API.
Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 01 Aug 2024 13:10:09 +0200] rev 51781
manifest: introduce a `read_delta_parents` method
This new method have a clearer semantic and can be used by code that need this
semantic. This should avoid bugs, allow for more targeted optimisation, and
provide a clearer API. Users will be updated in subsequent changesets.
This is also part of the wider effort to clarify and fix this API. one more
method coming.
Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 01 Aug 2024 12:14:40 +0200] rev 51780
manifest: use `read_any_fast_delta` for tag rev cache computation
This will have the benefit of using the fast path more often, and being (a bit)
less buggy. See inline comment for details.
Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 01 Aug 2024 05:37:57 +0200] rev 51779
manifest: use `read_any_fast_delta` during shallow prefetch's
We now have a better function with a clear semantic. This simplify the usage in
the remotefilelog code.
Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 01 Aug 2024 05:36:53 +0200] rev 51778
manifest: use `read_any_fast_delta` during remotefilelog's repack
We now have a better function with a clear semantic. This simplify the usage in
the remotefilelog code.
Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 01 Aug 2024 13:42:34 +0200] rev 51777
manifest: use read_any_fast_delta in changectx
The new API is clearer but also more expressive. It allow to detect case where
we did return a full read and populated the associated cache. Saving time!
Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 01 Aug 2024 13:40:46 +0200] rev 51776
manifest: allow skipping valid_bases argument to `read_any_fast_delta`
In some case it make sens to just want a delta. So we update the API to support
this.
Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 01 Aug 2024 05:35:06 +0200] rev 51775
manifest: introduce a `read_any_fast_delta` method
This method is a clearer semantic than `readbase` and `readfast` and will allow
for more accurate optimization and usage. This is part of a wider series
introducing such clearer method.
Pierre-Yves David <pierre-yves.david@octobus.net> [Mon, 05 Aug 2024 10:03:06 +0200] rev 51774
manifest: add many type annotations to the manifest module
This help to clarify the API a bit, this caught various bug in the process and
will help to catch more in the future. This also make large refactoring
significantly simpler.
Pierre-Yves David <pierre-yves.david@octobus.net> [Mon, 05 Aug 2024 10:15:10 +0200] rev 51773
manifest: help pytype to understant `writesubtrees`'s `getnode` type
Since we provide a default, the return of `_lazydirs.get` is cannot be None. We
help pytype to understand that.
Pierre-Yves David <pierre-yves.david@octobus.net> [Mon, 05 Aug 2024 10:13:31 +0200] rev 51772
manifest: use explicit None checking in `_loaddifflazy`
This helps pytype to understand what is going here with `v2` type.
Pierre-Yves David <pierre-yves.david@octobus.net> [Mon, 05 Aug 2024 10:12:37 +0200] rev 51771
manifest: use explicit None checking in `_loadlazy`
This help pytype to understand what is going on with `v` type.
Pierre-Yves David <pierre-yves.david@octobus.net> [Mon, 05 Aug 2024 10:11:51 +0200] rev 51770
manifest: clear `_lazydirs` in place in `_loadalllazy`
This avoid resetting the type of the dictionary in pytype eyes. This is
consistent with the way the dictionary is cleared bits by bits in
`_loadalllazy`
Having more accurate code will help pytype. We do it in advance to help
bisecting and avoid drowning them in the future type annotation noise.
Pierre-Yves David <pierre-yves.david@octobus.net> [Mon, 05 Aug 2024 10:10:03 +0200] rev 51769
manifest: use tuple for `delta` in `fastdelta`
This make the list content consistent and will help type annotation.
Pierre-Yves David <pierre-yves.david@octobus.net> [Mon, 05 Aug 2024 09:22:18 +0200] rev 51768
manifest: expose a version of the Class without interface decorator
The decorator confuse Pytype. Having the "raw" python class exposed will also
helps pytype when it get replaced by a native implementation. At least until we
start using `typing.Protocol` in the future.
Pierre-Yves David <pierre-yves.david@octobus.net> [Sun, 04 Aug 2024 10:50:38 +0200] rev 51767
pytype: stop ignoring manifest.py
pytype no longer complains about the file contents.
Pierre-Yves David <pierre-yves.david@octobus.net> [Sun, 04 Aug 2024 10:48:51 +0200] rev 51766
manifest: align some vfs option access on the fact we might not have options
This make the usage consistent with the other option.
Caught by pytype.
Pierre-Yves David <pierre-yves.david@octobus.net> [Sun, 04 Aug 2024 10:49:48 +0200] rev 51765
manifest: adds some type things for manifestdict.added
This appeases pytype.
Pierre-Yves David <pierre-yves.david@octobus.net> [Sun, 04 Aug 2024 10:47:29 +0200] rev 51764
manifest: type and fix unhexlify
Some part of that function seems to date back from Python 2. It raise question
about whether this function is useful or not, but let us just fix it for now.
This was caught by pytype.
Pierre-Yves David <pierre-yves.david@octobus.net> [Sun, 04 Aug 2024 10:45:31 +0200] rev 51763
docker-pytype: use version v2.1 of the CI image
It use a more recent pytype as far as I understand.
Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 01 Aug 2024 13:14:05 +0200] rev 51762
context: some gratuitous documentation improvement
I wrote it as I was reading the code.
Arseniy Alekseyev <aalekseyev@janestreet.com> [Thu, 01 Aug 2024 13:07:13 +0100] rev 51761
profiling: add a py-spy profiling backend
The recommended way to use this backend is by setting the config
`profiling.output` to point to a file because py-spy output is not
human-readable.
Arseniy Alekseyev <aalekseyev@janestreet.com> [Thu, 01 Aug 2024 11:14:58 +0100] rev 51760
copytracing: fix a bug in an edge case in metadata.compute_all_files_changes
Arseniy Alekseyev <aalekseyev@janestreet.com> [Thu, 01 Aug 2024 13:04:38 +0100] rev 51759
rhg: ignore readonly FS error when saving dirstate
The error is already ignored when the .hg directory is read-only,
so this is only fair. (the python hg is silent on readonly fs, too)
Arseniy Alekseyev <aalekseyev@janestreet.com> [Thu, 01 Aug 2024 13:38:31 +0100] rev 51758
commit: set whole manifest entries at once (node with its associated flags)
Add a new function manifest.set that sets whole manifest entries at once,
so the caller doesn't have to do two separate operations:
m[p] = n
m.set_flags(f)
becomes:
m.set(p, n, f)
This obviously saves an extra lookup by path, and it also lets the
underlying manifest implementation to be more efficient as
it doesn't have to deal with partially-specified entries.
It makes the interaction conceptually simpler, as well, since we don't
have to go through an intermediate state of incorrect
partially-written entry.
(the real motivation for this change is an alternative manifest
implementation where we batch pending writes, and dealing with
fully defined entries makes the batching logic muchsimpler while
avoiding slowdown due to alternating writes and reads)
Matt Harbison <matt_harbison@yahoo.com> [Thu, 01 Aug 2024 11:43:10 -0400] rev 51757
typing: add type hints around the matcher for subrepo archiving
Mostly this is meant to try to smoke out any other potential issues around the
matcher, since these args were mostly previously treated as `Any`, and therefore
checking wasn't done.
Matt Harbison <matt_harbison@yahoo.com> [Thu, 01 Aug 2024 01:52:11 -0400] rev 51756
subrepo: drop the default value of None for the archive matcher
This was flagged by pytype after adding hints to `match.subdirmatcher` that it
takes a non-optional matcher. That matcher argument is used without a guard in
the subdirmatcher constructor, so that's the correct restriction.
I don't think this fixes a bug in practice because the only way these are
invoked is either by a parent `hgsubrepo.archive()`, `archival.archive()`, or
the largefiles override of these. The `hgsubrepo.archive()` case (and the
largefiles override) uses what the caller provided, so the caller will
eventually be `archival.archive()` (or the largfiles override) up the call
chain. The `archival.archive()` method also has None for its matcher's default
arg. However, the three callers of that (`commands.archive()`,
`webcommands.archive()`, and `extdiff.snapshot()`) all provide a matcher
argument, so the None case can never occur unless a 3rd party extension swaps it
for None. Sadly, we can't make the argument on the `archival.archive()`
non-optional because there is a kwarg prior to it.
Even though the largefiles override of `archival.archive()` is provided a valid
matcher, we duplicate the internal creation of the matcher that the original
`archival.archive()` does for consistency. By eliminating an impossible to hit
case, we can simplify some of the subrepo code too, by dropping unreachable
code.
Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 01 Aug 2024 16:42:38 +0200] rev 51755
branching: merge stable into default
Post 6.8.1 release.
Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 01 Aug 2024 16:34:37 +0200] rev 51754
Added signature for changeset
11a9e2fc0caf
Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 01 Aug 2024 16:34:35 +0200] rev 51753
Added tag 6.8.1 for changeset
11a9e2fc0caf
Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 01 Aug 2024 15:38:24 +0200] rev 51752
relnotes: add 6.8.1
Raphaël Gomès <rgomes@octobus.net> [Thu, 01 Aug 2024 14:00:07 +0200] rev 51751
rhg: expand user and environment variables in ignore includes
This was reported by a user, and was a TODO long overdue.
Mads Kiilerich <mads@kiilerich.com> [Tue, 27 Jun 2023 13:05:03 +0200] rev 51750
utils: avoid using internal _imp.is_frozen()
imp has been deprecated for a long time, and were removed in Python 3.12 . As a
workaround, we started using the internal _imp. That is ugly and risky.
It seems less risky to get the functionality in some other way. Here, we just
inspect if 'origin' of the '__main__' module is set and 'frozen'. That seems to
work and do the same, and might be better than using the internal _imp
directly.
This way of inspecting module attributes seems to work in some test cases, but
it is a risky change. This level of importlib doesn't have much documentation,
a complicated implementation, and we are dealing with some odd use cases.
Mads Kiilerich <mads@kiilerich.com> [Mon, 22 Jul 2024 18:20:03 +0200] rev 51749
utils: fix resourceutil use of deprecated importlib.resources
Some importlib functionality was deprecated in 3.11 . The documentation on
https://docs.python.org/3.12/library/importlib.resources.html recommends using
the new .files() API that was introduced in 3.9.
Mads Kiilerich <mads@kiilerich.com> [Thu, 11 Jan 2024 20:32:07 +0100] rev 51748
cext: use sys.executable instead of deprecated Py_GetProgramFullPath
Fix warning with Python 3.13:
mercurial/cext/parsers.c: In function 'check_python_version':
mercurial/cext/parsers.c:1243:30: warning: 'Py_GetProgramFullPath' is deprecated [-Wdeprecated-declarations]
1243 | Py_GetProgramFullPath());
| ^~~~~~~~~~~~~~~~~~~~~
In file included from /usr/include/python3.13/Python.h:119,
from mercurial/cext/parsers.c:11:
/usr/include/python3.13/pylifecycle.h:43:43: note: declared here
43 | Py_DEPRECATED(3.13) PyAPI_FUNC(wchar_t *) Py_GetProgramFullPath(void);
| ^~~~~~~~~~~~~~~~~~~~~
At this point in time, the PyConfig struct memory has been released and the PyConfig API can't be used.
https://docs.python.org/3.13/c-api/init.html#c.Py_GetProgramFullPath recommands
using sys.executable instead. Let's assume that will work in all versions.
It would perhaps be better to use PySys_GetObject, but I prefer to stay
consistent with how the same function is retrieving sys.hexversion.
Mads Kiilerich <mads@kiilerich.com> [Thu, 11 Jan 2024 21:58:55 +0100] rev 51747
subrepoutil: pass re.sub 'count' argument by name
Python 3.13 started warning:
DeprecationWarning: 'count' is passed as positional argument
Mads Kiilerich <mads@kiilerich.com> [Thu, 11 Jan 2024 21:58:55 +0100] rev 51746
tests: pass re.MULTILINE to re.sub as 'flags' - not in 'count' position
This bug was caught by the new Python 3.13 warning:
DeprecationWarning: 'count' is passed as positional argument
Mads Kiilerich <mads@kiilerich.com> [Mon, 26 Jun 2023 21:31:41 +0200] rev 51745
tests: use packaging from setuptools instead of deprecated distutils
When invoking StrictVersion in 3.12 we got:
DeprecationWarning: distutils Version classes are deprecated. Use packaging.version instead.
distutils is dead in the standard library, and we have to move towards using
`setuptools` as general extern dependency. Instead of also requiring the extern
`packaging`, we will just use the packaging that is vendored in setuptools.
Mads Kiilerich <mads@kiilerich.com> [Mon, 26 Jun 2023 15:16:51 +0200] rev 51744
tests: drop test-demandimport.py distutils test that failed with warnings
The test would fail because warnings:
/usr/lib/python3.11/site-packages/_distutils_hack/__init__.py:18: UserWarning: Distutils was imported before Setuptools, but importing Setuptools also replaces the `distutils` module in `sys.modules`. This may lead to undesirable behaviors or errors. To avoid these issues, avoid using distutils directly, ensure that setuptools is installed in the traditional way (e.g. not an editable install), and/or make sure that setuptools is always imported before distutils.
warnings.warn(
/usr/lib/python3.11/site-packages/_distutils_hack/__init__.py:33: UserWarning: Setuptools is replacing distutils.
warnings.warn("Setuptools is replacing distutils.")
The test for distutils.msvc9compiler comes from
2205d00b6d2b. But since then,
distutils is going away, and this test must change somehow. It is unclear exactly
how setuptools depended on msvc9compiler, but setuptools also moved forward,
and this exact test no longer seems relevant. It thus seems like a fair
solution to remove the test while keeping the demandimport blacklist of
distutils.msvc9compiler.
Mads Kiilerich <mads@kiilerich.com> [Thu, 29 Jun 2023 20:02:27 +0200] rev 51743
utils: test coverage of makedate
Explore the scenario from
ae04af1ce78d to avoid future regressions.
This was intended to give some coverage of the change in
faccec1edc2c.
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 09 Jul 2024 20:08:48 +0200] rev 51742
mmap: populate mapping in a background thread
When possible, we populate the memory mapping in a second thread. The mmap
population does not only read the data from disk to memory. It also actually
fill the memory mapping between process memory address and the physical memory
used by the file system cache containing the mmap'ed data.
Doing so buy back the slowdown from pre-population when it matters. When most
data is accessed, only a few page fault will occurs, while the background thread
fill the memory controller. When few data is accessed, the non-blocking mmap
won't have to wait for all data to be populated.
Here is a few example of improvement seen in benchmark around unbundle and push:
### data-env-vars.name = netbeans-2018-08-01-zstd-sparse-revlog
# benchmark.name = hg.command.unbundle
# benchmark.variants.
issue6528 = disabled
# benchmark.variants.reuse-external-delta-parent = yes
# benchmark.variants.revs = any-100-extra-rev
before: 0.758101
after: 0.732129 (-3.43%, -0.03)
## data-env-vars.name = mozilla-try-2019-02-18-zstd-sparse-revlog
before: 1.519941
after: 1.503473 (-1.08%, -0.02)
### data-env-vars.name = mozilla-try-2019-02-18-zstd-sparse-revlog
# benchmark.name = hg.command.push
# bin-env-vars.hg.flavor = default
# benchmark.variants.
issue6528 = disabled
# benchmark.variants.protocol = ssh
# benchmark.variants.reuse-external-delta-parent = yes
# benchmark.variants.revs = any-1-extra-rev
before: 4.801442
after: 4.695810 (-1.46%, -0.07)
# benchmark.variants.revs = any-100-extra-rev
before: 4.848596
after: 4.794075 (-1.12%, -0.05)
# bin-env-vars.hg.flavor = rust
# benchmark.variants.revs = any-1-extra-rev
before: 4.818410
after: 4.700053 (-2.46%, -0.12)
Matt Harbison <matt_harbison@yahoo.com> [Thu, 25 Jul 2024 14:40:38 -0400] rev 51741
pure: stringify builtin exception messages
Builtin exceptions usually want strings, and display with a wierd b'' prefix if
given bytes.
Matt Harbison <matt_harbison@yahoo.com> [Mon, 29 Jul 2024 12:10:08 -0400] rev 51740
httppeer: avoid another bad reference before assignment warning
This wasn't a problem, because `b''` from the `AttributeError` handler is in
`bundle2.bundletypes`, so the following loop and conditional always run at least
once. But PyCharm can't figure that out on its own, and it took a little
exploring to figure out it wasn't a problem. The usage in `bundle2.writebundle`
is to look it up in the map of bundle types, so it will break in a more obvious
way in the unlikely event that the empty string is removed from the map in the
future.
Matt Harbison <matt_harbison@yahoo.com> [Fri, 26 Jul 2024 21:59:34 -0400] rev 51739
httppeer: move a variable to avoid a bad reference before assignment warning
No actual bug here, because the conditional used to assign is the same as the
conditional in the `finally` block that guards the reference.
Matt Harbison <matt_harbison@yahoo.com> [Fri, 26 Jul 2024 21:54:07 -0400] rev 51738
httppeer: simplify two-way stream cleanup
No need to conditionalize the cleanup if the filename is assigned outside the
exception handler. I suppose `fd` leaks if `os.fdopen()` fails, but that was
the case before too (and may trigger another exception in the `finally` block on
Windows, when the file is still open).
Raphaël Gomès <rgomes@octobus.net> [Mon, 29 Jul 2024 10:07:53 +0200] rev 51737
rustfmt: update expected Rust edition
In this case it doesn't change anything, but we've been using 2021 for a
while now.
Raphaël Gomès <rgomes@octobus.net> [Mon, 29 Jul 2024 10:04:00 +0200] rev 51736
hghave: update expected rustfmt version
We still use nightly, but have moved to a newer nightly after the last
CI image upgrade in
74f1bf147a6d and
3876d4c6c79e.
Raphaël Gomès <rgomes@octobus.net> [Mon, 29 Jul 2024 10:06:28 +0200] rev 51735
rustfmt: apply formatting expected by newer nightly version
This was missed in
3876d4c6c79ec5c71e8c51b876cc157e93a5eaac somehow.
Matt Harbison <matt_harbison@yahoo.com> [Thu, 25 Jul 2024 15:56:04 -0400] rev 51734
tests: stop skipping `mercurial/pure/osutil.py` during pytype runs
Not sure when the original issue(s) were fixed, but it works for me now.
Matt Harbison <matt_harbison@yahoo.com> [Thu, 25 Jul 2024 13:31:13 -0400] rev 51733
largefiles: avoid a potentially undefined variable in exception case
The `wlock` variable is used to release the lock in the `finally` block, so it
would be undefined if `repo.wlock()` itself failed. Caught by pytype 2024.04.11
with py3.10.11.
Matt Harbison <matt_harbison@yahoo.com> [Wed, 24 Jul 2024 22:40:22 -0400] rev 51732
typing: add trivial type hints to `mercurial.scmutil`
There's still a lot to go, but there's a lot here already, so I tried to keep it
to obvious/trivial things. I didn't bother with contexts, matchers, and
revisions that can be `bytes | int | None`.
Matt Harbison <matt_harbison@yahoo.com> [Wed, 24 Jul 2024 18:17:00 -0400] rev 51731
typing: narrow the scope of some recent disabled import warnings
These comments were added in
39e2b2d062c1, but had the effect of changing the
known type to `Any`, which cascaded through a few function signatures. Just
ignore the import error instead.
Julien Cristau <jcristau@debian.org> [Fri, 26 Jul 2024 10:52:28 +0200] rev 51730
demandimport: don't delay threading import
A recent cpython change breaks demandimport by importing threading
locally in importlib.util.LazyLoader.exec_module; add it (plus warnings
and _weakrefset, which are imported by threading) to demandimport's
ignore list.
```
Traceback (most recent call last):
File "/usr/bin/hg", line 57, in <module>
from mercurial import dispatch
File "<frozen importlib._bootstrap>", line 1360, in _find_and_load
File "<frozen importlib._bootstrap>", line 1331, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 935, in _load_unlocked
File "/usr/lib/python3/dist-packages/hgdemandimport/demandimportpy3.py", line 52, in exec_module
super().exec_module(module)
File "<frozen importlib.util>", line 257, in exec_module
File "<frozen importlib._bootstrap>", line 1360, in _find_and_load
File "<frozen importlib._bootstrap>", line 1331, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 935, in _load_unlocked
File "/usr/lib/python3/dist-packages/hgdemandimport/demandimportpy3.py", line 52, in exec_module
super().exec_module(module)
File "<frozen importlib.util>", line 267, in exec_module
AttributeError: partially initialized module 'threading' has no attribute 'RLock' (most likely due to a circular import)
```
Ref: https://github.com/python/cpython/issues/117983
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1076449
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1076747
Matt Harbison <matt_harbison@yahoo.com> [Tue, 23 Jul 2024 19:20:22 -0400] rev 51729
typing: induce pytype to use the standard `attr` instead of the vendored copy
What was previously happening with the vendored copy was that pytype would stub
out all(?) classes that were decorated with `@attr.s` as `Any`. After this, we
get a ton of classes defined, and numerous fields and methods now have proper
types.
Matt Harbison <matt_harbison@yahoo.com> [Tue, 23 Jul 2024 19:14:16 -0400] rev 51728
typing: disable some pytype errors in `mercurial.store`
These seem to be legitimate errors, since one of the two callers
(`encodedstore.data_entries()`) was previously typed to generate
`BaseStoreEntry`. However, that and the other caller only pass
`RevlogStoreEntry` objects. I can't tell if this was a WIP or what, but don't
want to get side tracked on this. So flag as a TODO for later.
Matt Harbison <matt_harbison@yahoo.com> [Tue, 23 Jul 2024 19:05:26 -0400] rev 51727
linelog: correct the default value of `annotateresult.lines`
This was flagged by pytype once it was tricked into using the standard `attr`
package instead of the vendored copy.
Matt Harbison <matt_harbison@yahoo.com> [Tue, 23 Jul 2024 19:01:16 -0400] rev 51726
phabricator: correct the default value of `phabhunk.corpus`
There's only one caller to this constructor (which does provide this argument),
and no direct assignments, so there's no runtime bug here. However, when pytype
is tricked into using the standard `attr` package instead of the vendored
version, it flags this because bytes is passed to the one constructor
invocation.
Tricking pytype into using the standard package will generate many more type
hints, noteably around `@attr.s` decorated things.
Georges Racinet <georges.racinet@cloudcrane.io> [Mon, 22 Jul 2024 18:20:29 +0200] rev 51725
rust-changelog: accessing the index
The `Index` object is currently the one providing all DAG related
algorithms, starting with simple ancestors iteration up to more
advanced ones (ranges, common ancestors…).
From pure Rust code, there was no way to access the changelog index for
a given `Repository`, probably because `rhg` does not use any such algorithm
yet.
Matt Harbison <matt_harbison@yahoo.com> [Sat, 20 Jul 2024 17:03:30 -0400] rev 51724
typing: add type hints to `mercurial.policy`
Mostly trivial, but this seems like the logical module to use to inject the
hints from `cext`, `pure`, etc, given that this file has the fallback policy.
This is a first step.
There doesn't appear to be a predefined type for a module in py3.7, so those are
omitted for now.
Matt Harbison <matt_harbison@yahoo.com> [Sat, 20 Jul 2024 01:55:09 -0400] rev 51723
cext: correct the argument handling of `b85encode()`
The type stub indicated that this argument is `Optional`, which implies None is
allowed. I don't see in the documentation where that's the case for `i`[1], and
trying it in `hg debugshell` resulted in the method failing with a TypeError. I
guess it was typed as an `int` argument because the `p` format unit wasn't added
until Python 3.3[2].
In any event, 2 clients in core (`pvec` and `obsolete`) call this with no
argument supplied, and `mdiff` calls it with True. So I guess we've avoided the
None arg case, and when no arg is supplied, it defaults to the 0 initialization
of the `pad` variable in C. Since the `p` format unit accepts both `int` and
None, as well as `bool`, I'm not bothering to bump the module version- this code
is more permissive than it was, in addition to being more correct.
Interestingly, when I first imported the `cext` and `pure` methods in the same
manner as the previous commit, it dropped the `Optional` part of the argument
type when generating `util.pyi`. No idea why.
[1] https://docs.python.org/3/c-api/arg.html#numbers
[2] https://docs.python.org/3/c-api/arg.html#other-objects
Matt Harbison <matt_harbison@yahoo.com> [Fri, 19 Jul 2024 20:09:48 -0400] rev 51722
typing: add type hints to the `charencode` module
Since this module is dynamically imported from either `mercurial.pure` or
`mercurial.cext`, these hints aren't detected in `mercurial.encoding`, and need
to be imported directly there during the type-checking phase. This keeps the
runtime selection via the policy config in place, but allows pytype to see these
as functions with proper signatures instead of just `Any`. We don't attempt to
import the `mercurial.cext` version yet because there's no types stubs for that
module, but this will get the ball rolling.
I thought this would spill over into other modules from there, but the only two
*.pyi files that changed were for `encoding` and `charencode`. Applying this to
other dynamically selected modules will clean some things up in other files, so
this is a start. I had originally redefined the functions in the type-checking
block (like some of the `os.path` aliasing in `mercurial.util`), but this is
better because we won't have another duplication of the definitions that may get
out of date.
Matt Harbison <matt_harbison@yahoo.com> [Fri, 19 Jul 2024 16:49:46 -0400] rev 51721
typing: explicitly type some `mercurial.util` eol code to avoid @overload
Unlike the previous commit, this makes a material difference in the generated
stub file- the `pycompat.identity()` aliases generated an @overload like this:
@overload
def fromnativeeol(a: _T0) -> _T0: ...
... which might fail to detect a bad argument, like str. This drops the
@overload for the 3 related methods, so there's a single definition for each.
The `typelib.BinaryIO_Proxy` is used for subclassing (the same as was done in
8147abc05794), so that it is a `BinaryIO` type during type checking, but still
inherits `object` at runtime. That way, we don't need to implement unused
abstract methods.
Matt Harbison <matt_harbison@yahoo.com> [Fri, 19 Jul 2024 16:38:53 -0400] rev 51720
typing: avoid some useless @overload definitions in `mercurial.util`
Apparently pytype considered the name as well as the type of each argument, and
generates @overload definitions if they don't match. At best this is clutter,
and can easily be removed.
Matt Harbison <matt_harbison@yahoo.com> [Thu, 18 Jul 2024 22:46:36 -0400] rev 51719
dirstate: stringify a few exception messages
Built in exceptions want str, and ProgrammingError converts bytes to str
internally (because it subclasses RuntimeError).
Matt Harbison <matt_harbison@yahoo.com> [Thu, 18 Jul 2024 20:34:35 -0400] rev 51718
typing: add type hints to `mercurial.verify._normpath()`
Since
10db46e128d4, pytype almost figured this out, going from `Any` -> `_T0`,
but the intent is obvious.
Matt Harbison <matt_harbison@yahoo.com> [Thu, 18 Jul 2024 20:16:31 -0400] rev 51717
typing: add type hints to `i18n._msgcache`
Since
10db46e128d4, pytype stopped inferring that the key is bytes.
Matt Harbison <matt_harbison@yahoo.com> [Thu, 18 Jul 2024 19:57:42 -0400] rev 51716
typing: add type hints to `mercurial.dirstatemap`
Somewhere since
10db46e128d4, pytype stopped being able to infer the type of the
`identity` field. Fill in some obvious other hints along the way.
These hints caused pytype to flag a missing attribute:
File "/mnt/c/Users/Matt/hg/mercurial/dirstatemap.py", line 714, in _v1_map:
No attribute 'stat' on mercurial.windows.cachestat [attribute-error]
In Union[Any, mercurial.posix.cachestat, mercurial.windows.cachestat]
File "/mnt/c/Users/Matt/hg/mercurial/dirstatemap.py", line 715, in _v1_map:
No attribute 'stat' on mercurial.windows.cachestat [attribute-error]
In Union[Any, mercurial.posix.cachestat, mercurial.windows.cachestat]
In practice, the `identity` field is NOT replaced with None if it isn't
cacheable, so it's probably safer to just add the field and set it to None,
since that check is already in place on line 715.
Matt Harbison <matt_harbison@yahoo.com> [Thu, 18 Jul 2024 19:55:51 -0400] rev 51715
typing: add type hints to `cmdutil.findrepo()`
Since
10db46e128d4, pytype almost figured this out, going from `Any` -> `_T0`,
but the intent is obvious.
Matt Harbison <matt_harbison@yahoo.com> [Thu, 18 Jul 2024 19:01:55 -0400] rev 51714
typing: add some type hints to fastannotate that have decayed in the last year
Somewhere since
10db46e128d4, `_knownopts` decayed to `set` for unknown reasons.
Also, `annotateopts.default` changed from bytes to str. While that is correct,
I noticed that PyCharm was flagging the member fields as undefined in
`shortstr()`, so add those to keep it happy. (There are no complaints from
pytype because that module is excluded, due to the missing reference to
`linelog.copyfrom()` that I'm not sure how to fix.)
Raphaël Gomès <rgomes@octobus.net> [Tue, 23 Jul 2024 12:12:22 +0200] rev 51713
heptapod-ci: use new v2.1 image
This is finally catching up to ~3 years of tech debt.
Raphaël Gomès <rgomes@octobus.net> [Tue, 23 Jul 2024 12:12:03 +0200] rev 51712
heptapod-ci: move version prints closer to the start
This makes debugging a lot easier if anything is to go wrong, and shows output
earlier.
Raphaël Gomès <rgomes@octobus.net> [Tue, 23 Jul 2024 12:10:31 +0200] rev 51711
pytype: only try the hacky way of finding PYTHON if not provided
This allows us to work in more environments, like when using pyenv. This
syntax is compatible with all POSIX shells.
Raphaël Gomès <rgomes@octobus.net> [Mon, 22 Jul 2024 14:42:54 +0200] rev 51710
dummysmtpd: fix EOF handling on newer versions of OpenSSL
Explanations inline.
Raphaël Gomès <rgomes@octobus.net> [Mon, 22 Jul 2024 14:19:12 +0200] rev 51709
test-install: add new glob for the upgrade notice in newer versions of pip
We only globbed for the old warning, newer versions of pip use a cleaner one.
Raphaël Gomès <rgomes@octobus.net> [Thu, 18 Jul 2024 13:36:32 +0200] rev 51708
rust: use `.cargo/config.toml` instead of `.cargo/config`
This has been deprecated for a while now and we don't support Rust versions
that only understand the old path.
Raphaël Gomès <rgomes@octobus.net> [Thu, 18 Jul 2024 13:35:39 +0200] rev 51707
rust: apply clippy lints
They are at most harmless and at best make the codebase more readable and
simpler.
Raphaël Gomès <rgomes@octobus.net> [Tue, 23 Jul 2024 14:25:23 +0200] rev 51706
rust: change minimum supported version everywhere applicable
This will help users and downstream packaging.
Raphaël Gomès <rgomes@octobus.net> [Thu, 18 Jul 2024 12:38:26 +0200] rev 51705
rustfmt: format the codebase with nightly-2024-07-16
The CI has moved to a newer nightly, which slightly changes how it wraps
comments (which is the very option we use nightly for).
Raphaël Gomès <rgomes@octobus.net> [Thu, 18 Jul 2024 12:37:13 +0200] rev 51704
hghave: update detection of black version to a newer minimum
The CI has moved to version 23.3.0, which is the last one to support 3.7 at
runtime.
Raphaël Gomès <rgomes@octobus.net> [Thu, 18 Jul 2024 12:36:12 +0200] rev 51703
black: format the codebase with 23.3.0
The CI has moved to 23.3.0, which is the last version that supports 3.7
at runtime, so we should honor this change.
# skip-blame mass-reformating only
Raphaël Gomès <rgomes@octobus.net> [Thu, 18 Jul 2024 12:03:29 +0200] rev 51702
pytype: work around wrong ImportError flagging
As documented in https://github.com/google/pytype/issues/163, newer versions
of Pytype do not understand caught `ImportError`, so we temporarily ignore
them where applicable.
Raphaël Gomès <rgomes@octobus.net> [Thu, 18 Jul 2024 12:02:01 +0200] rev 51701
zeroconf: fix boolean return value
This was (wrongly) flagged by Pytype as being undefined since it doesn't
seem to understand `try` blocks. However, the caller is expecting a boolean
and the fix to appease Pytype is simple, so we do both.
Raphaël Gomès <rgomes@octobus.net> [Tue, 23 Jul 2024 10:02:46 +0200] rev 51700
Backout accidental publication of a large range of revisions
I accidentally published
25e7f9dcad0f::
bd1483fd7088, this is the inverse.
Raphaël Gomès <rgomes@octobus.net> [Mon, 22 Jul 2024 16:49:38 +0200] rev 51699
Latest image and pytype fix
Raphaël Gomès <rgomes@octobus.net> [Mon, 22 Jul 2024 14:42:54 +0200] rev 51698
dummysmtpd: fix EOF handling on newer versions of OpenSSL
Explanations inline.
Raphaël Gomès <rgomes@octobus.net> [Mon, 22 Jul 2024 14:19:12 +0200] rev 51697
test-install: add new glob for the upgrade notice in newer versions of pip
We only globbed for the old warning, newer versions of pip use a cleaner one.
Raphaël Gomès <rgomes@octobus.net> [Thu, 18 Jul 2024 15:48:05 +0200] rev 51696
Try the full CI run
Raphaël Gomès <rgomes@octobus.net> [Thu, 18 Jul 2024 14:57:37 +0200] rev 51695
WIP test new CI image
Raphaël Gomès <rgomes@octobus.net> [Thu, 18 Jul 2024 13:36:32 +0200] rev 51694
rust: use `.cargo/config.toml` instead of `.cargo/config`
This has been deprecated for a while now and we don't support Rust versions
that only understand the old path.
Raphaël Gomès <rgomes@octobus.net> [Thu, 18 Jul 2024 13:35:39 +0200] rev 51693
rust: apply clippy lints
They are at most harmless and at best make the codebase more readable and
simpler.
Raphaël Gomès <rgomes@octobus.net> [Thu, 18 Jul 2024 12:38:26 +0200] rev 51692
rustfmt: format the codebase with nightly-2024-07-16
The CI has moved to a newer nightly, which slightly changes how it wraps
comments (which is the very option we use nightly for).
Raphaël Gomès <rgomes@octobus.net> [Thu, 18 Jul 2024 12:37:13 +0200] rev 51691
hghave: update detection of black version to a newer minimum
The CI has moved to version 23.3.0, which is the last one to support 3.7 at
runtime.
Raphaël Gomès <rgomes@octobus.net> [Thu, 18 Jul 2024 12:36:12 +0200] rev 51690
black: format the codebase with 23.3.0
The CI has moved to 23.3.0, which is the last version that supports 3.7
at runtime, so we should honor this change.
# skip-blame mass-reformating only
Raphaël Gomès <rgomes@octobus.net> [Thu, 18 Jul 2024 12:03:29 +0200] rev 51689
pytype: work around wrong ImportError flagging
As documented in https://github.com/google/pytype/issues/163, newer versions
of Pytype do not understand caught `ImportError`, so we temporarily ignore
them where applicable.
Raphaël Gomès <rgomes@octobus.net> [Thu, 18 Jul 2024 12:02:01 +0200] rev 51688
zeroconf: fix boolean return value
This was (wrongly) flagged by Pytype as being undefined since it doesn't
seem to understand `try` blocks. However, the caller is expecting a boolean
and the fix to appease Pytype is simple, so we do both.
Matt Harbison <matt_harbison@yahoo.com> [Thu, 11 Jul 2024 21:54:02 -0400] rev 51687
convert: fix various leaked file descriptors
Some of these only leaked if an exception occurred between the open and close,
but a lot of these leaked unconditionally.
A type hint is added to `parsesplicemap` because otherwise this change caused
pytype to change the return type from this to `Dict[nothing, nothing]`.
Matt Harbison <matt_harbison@yahoo.com> [Thu, 11 Jul 2024 21:16:45 -0400] rev 51686
convert: stringify `shlex` class argument
The documentation is handwavy, but typeshed says this should be `str`[1]. I'm
not sure if this is the correct encoding (vs `fsencode` or "latin1" like the
tokens returned by the proxy class).
While we're here, we can add a few more type hints that would have caused pytype
to flag the problem.
[1] https://github.com/python/typeshed/blob/
6a9b53e719a139c2d6b41cf265ed0990cf438192/stdlib/shlex.pyi#L51
Matt Harbison <matt_harbison@yahoo.com> [Thu, 11 Jul 2024 20:54:06 -0400] rev 51685
typing: add trivial type hints to the convert extension's common modules
This started as ensuring that the `encoding` and `orig_encoding` attributes has
a type other than `Any`, so pytype can catch problems where it needs to be str
for stdlib encoding and decoding. It turns out that adding the hint in
`mercurial.encoding` is what was needed, but I picked a bunch of low hanging
fruit while here. There's definitely more to do, and I see a problem where
`shlex.shlex` is being fed bytes instead of str, but there are not enough type
hints yet to make pytype notice.
Matt Harbison <matt_harbison@yahoo.com> [Thu, 11 Jul 2024 14:46:00 -0400] rev 51684
convert: drop a duplicate implementation of `dateutil.makedate()`
I noticed this because the signature generated by pytype recently changed to be
less specific. When the method was introduced back in
337d728e644f,
`util.makedate()` didn't take an optional timestamp arg. But now it does, and
the methods are the same (except the `dateutil` version validates that the
timestamp isn't a negative value). I left the old method in place in case
anyone has custom convert code that monkey patches it.
Pierre-Yves David <pierre-yves.david@octobus.net> [Mon, 08 Jul 2024 15:48:34 +0200] rev 51683
revlog: use mmap by default is pre-population is available
Using mmap has a great impact of memory usage on server, and a good impact on
performance in multiple case. Now that we pre-populate memory mapping by
default, there is case where it using mmap is slower. So we use it by default
(if pre-population is available).
Further work to reduce the performance impact of the pre-population will be done
later.
Some benchmark below (using the same setup as
522b4d729e89):
As for
522b4d729e89 the impact on small repository like Mercurial or Pypy is
tiny, ~1% best. However for large repositories we see some performance
improvement without seeing the performance regression that we could have without
pre-populate.
##### For netbeans
### data-env-vars.name = netbeans-2018-08-01-zstd-sparse-revlog
## benchmark.name = hg.command.log
# bin-env-vars.hg.flavor = rust
# benchmark.variants.limit-rev = 1
# benchmark.variants.patch = yes
no-mmap: 0.171579
mmap: 0.166311 (-3.07%, -0.01)
# bin-env-vars.hg.flavor = default
no-mmap: 0.170716
mmap: 0.165218 (-3.22%, -0.01)
# benchmark.variants.patch = no
# benchmark.variants.rev = tip
no-mmap: 0.140862
mmap: 0.137566 (-2.34%, -0.00)
## benchmark.name = hg.command.unbundle
# bin-env-vars.hg.flavor = rust
# benchmark.variants.
issue6528 = disabled
# benchmark.variants.reuse-external-delta-parent = yes
# benchmark.variants.revs = any-1-extra-rev
# benchmark.variants.source = unbundle
no-mmap: 0.238038
mmap: 0.239912
no-populate: 0.cbd4c9 (+11.71%, +0.03)
#### For Mozilla
### data-env-vars.name = mozilla-try-2019-02-18-ds2-pnm
# benchmark.name = hg.command.log
# bin-env-vars.hg.flavor = rust
# bin-env-vars.hg.py-re2-module = default
# benchmark.variants.limit-rev = 1
# benchmark.variants.patch = yes
no-mmap: 0.258440
mmap: 0.237813 (-7.98%, -0.02)
# benchmark.variants.limit-rev = 10
no-mmap: 1.235323
mmap: 1.213578 (-1.76%, -0.02)
## benchmark.name = hg.command.push
# bin-env-vars.hg.flavor = rust
# bin-env-vars.hg.py-re2-module = default
# benchmark.variants.explicit-rev = none
# benchmark.variants.
issue6528 = disabled
# benchmark.variants.protocol = ssh
# benchmark.variants.reuse-external-delta-parent = yes
# benchmark.variants.revs = any-1-extra-rev
no-mmap: 4.790135
mmap: 4.668971 (-2.53%, -0.12)
no-populate: 4.841141 (+1.06%, +0.05)
### data-env-vars.name = mozilla-try-2019-02-18-zstd-sparse-revlog
## benchmark.name = hg.command.log
# bin-env-vars.hg.flavor = default
# benchmark.variants.limit-rev = 1000
# benchmark.variants.rev = tip
no-mmap: 0.206187
mmap: 0.197348 (-4.29%, -0.01)
## benchmark.name = hg.command.push
# bin-env-vars.hg.flavor = default
# benchmark.variants.explicit-rev = none
# benchmark.variants.
issue6528 = disabled
# benchmark.variants.protocol = ssh
# benchmark.variants.reuse-external-delta-parent = yes
# benchmark.variants.revs = any-1-extra-rev
no-mmap: 4.768259
mmap: 4.798632
no-populate: 4.953295 (+3.88%, +0.19)
# benchmark.variants.revs = any-100-extra-rev
no-mmap: 4.785946
mmap: 4.903618
no-populate: 5.014963 (+4.79%, +0.23)
## benchmark.name = hg.command.unbundle
# bin-env-vars.hg.flavor = default
# benchmark.variants.
issue6528 = disabled
# benchmark.variants.reuse-external-delta-parent = yes
# benchmark.variants.revs = any-1-extra-rev
# benchmark.variants.source = unbundle
no-mmap: 1.400121
mmap: 1.423411
no-populate: 1.585365 (+13.23%, +0.19)
Pierre-Yves David <pierre-yves.david@octobus.net> [Mon, 08 Jul 2024 17:02:27 +0200] rev 51682
revlog: use an explicit config option to enable mmap usage for index
We replace the `experimental.mmapindexthreshold` with two options:
The `storage.revlog.mmap.index` is a boolean option to enable or disable the
feature. The `storage.revlog.mmap.index:size-threshold` is a bytes option that
control when we will be using mmap instead of plain reading.
Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 11 Apr 2024 00:02:07 +0200] rev 51681
mmap: populate the mapping by default
Without pre-population, accessing all data through a mmap can result in many
pagefault, reducing performance significantly. If the mmap is prepopulated, the
performance can no longer get slower than a full read.
(See benchmark number below)
In some cases were very few data is read, prepopulating can be overkill and
slower than populating on access (through page fault). So that behavior can be
controlled when the caller can pre-determine the best behavior.
(See benchmark number below)
In addition, testing with populating in a secondary thread yield great result
combining the best of each approach. This might be implemented in later
changesets.
In all cases, using mmap has a great effect on memory usage when many processes
run in parallel on the same machine.
### Benchmarks
# What did I run
A couple of month back I ran a large benchmark campaign to assess the impact of
various approach for using mmap with the revlog (and other files), it
highlighted a few benchmarks that capture the impact of the changes well. So to
validate this change I checked the following:
- log command displaying various revisions
(read the changelog index)
- log command displaying the patch of listed revisions
(read the changelog index, the manifest index and a few files indexes)
- unbundling a few revisions
(read and write changelog, manifest and few files indexes, and walk the graph
to update some cache)
- pushing a few revisions
(read and write changelog, manifest and few files indexes, walk the graph to
update some cache, performs various accesses locally and remotely during
discovery)
Benchmarks were run using the default module policy (c+py) and the rust one. No
significant difference were found between the two implementation, so we will
present result using the default policy (unless otherwise specified).
I ran them on a few repositories :
- mercurial: a "public changeset only" copy of mercurial from 2018-08-01 using
zstd compression and sparse-revlog
- pypy: a copy of pypy from 2018-08-01 using zstd compression and sparse-revlog
- netbeans: a copy of netbeans from 2018-08-01 using zstd compression and
sparse-revlog
- mozilla-try: a copy of mozilla-try from 2019-02-18 using zstd compression and
sparse-revlog
- mozilla-try persistent-nodemap: Same as the above but with a persistent
nodemap. Used for the log --patch benchmark only
# Results
For the smaller repositories (mercurial, pypy), the impact of mmap is almost
imperceptible, other cost dominating the operation. The impact of prepopulating
is undiscernible in the benchmark we ran.
For larger repositories the benchmark support explanation given above:
On netbeans, the log can be about 1% faster without repopulation (for a
difference < 100ms) but unbundle becomes a bit slower, even when small.
### data-env-vars.name = netbeans-2018-08-01-zstd-sparse-revlog
# benchmark.name = hg.command.unbundle
# benchmark.variants.
issue6528 = disabled
# benchmark.variants.reuse-external-delta-parent = yes
# benchmark.variants.revs = any-1-extra-rev
# benchmark.variants.source = unbundle
# benchmark.variants.verbosity = quiet
with-populate: 0.240157
no-populate: 0.265087 (+10.38%, +0.02)
# benchmark.variants.revs = any-100-extra-rev
with-populate: 1.459518
no-populate: 1.481290 (+1.49%, +0.02)
## benchmark.name = hg.command.push
# benchmark.variants.explicit-rev = none
# benchmark.variants.
issue6528 = disabled
# benchmark.variants.protocol = ssh
# benchmark.variants.reuse-external-delta-parent = yes
# benchmark.variants.revs = any-1-extra-rev
with-populate: 0.771919
no-populate: 0.792025 (+2.60%, +0.02)
# benchmark.variants.revs = any-100-extra-rev
with-populate: 1.459518
no-populate: 1.481290 (+1.49%, +0.02)
For mozilla-try, the "slow down" from pre-populate for small `hg log` is more
visible, but still small in absolute time. (using rust value for the persistent
nodemap value to be relevant).
### data-env-vars.name = mozilla-try-2019-02-18-ds2-pnm
# benchmark.name = hg.command.log
# bin-env-vars.hg.flavor = rust
# benchmark.variants.patch = yes
# benchmark.variants.limit-rev = 1
with-populate: 0.237813
no-populate: 0.229452 (-3.52%, -0.01)
# benchmark.variants.limit-rev = 10
# benchmark.variants.patch = yes
with-populate: 1.213578
no-populate: 1.205189
### data-env-vars.name = mozilla-try-2019-02-18-zstd-sparse-revlog
# benchmark.variants.limit-rev = 1000
# benchmark.variants.patch = no
# benchmark.variants.rev = tip
with-populate: 0.198607
no-populate: 0.195038 (-1.80%, -0.00)
However pre-populating provide a significant boost on more complex operations
like unbundle or push:
### data-env-vars.name = mozilla-try-2019-02-18-zstd-sparse-revlog
# benchmark.name = hg.command.push
# benchmark.variants.explicit-rev = none
# benchmark.variants.
issue6528 = disabled
# benchmark.variants.protocol = ssh
# benchmark.variants.reuse-external-delta-parent = yes
# benchmark.variants.revs = any-1-extra-rev
with-populate: 4.798632
no-populate: 4.953295 (+3.22%, +0.15)
# benchmark.variants.revs = any-100-extra-rev
with-populate: 4.903618
no-populate: 5.014963 (+2.27%, +0.11)
## benchmark.name = hg.command.unbundle
# benchmark.variants.revs = any-1-extra-rev
with-populate: 1.423411
no-populate: 1.585365 (+11.38%, +0.16)
# benchmark.variants.revs = any-100-extra-rev
with-populate: 1.537909
no-populate: 1.688489 (+9.79%, +0.15)
Matt Harbison <matt_harbison@yahoo.com> [Thu, 11 Jul 2024 11:10:40 -0400] rev 51680
win32mbcs: use str for encoding value
This was reported to the TortoiseHg tracker as:
https://foss.heptapod.net/mercurial/tortoisehg/thg/-/issues/5980
It doesn't look like we have any tests for this extension, but the explicit
type hints are enough to convince pytype that the module level `_encoding` attr
is str. The `encode()` and `decode()` methods are too complex to add type hints
for them.
Matt Harbison <matt_harbison@yahoo.com> [Wed, 10 Jul 2024 18:44:55 -0400] rev 51679
typing: add a trivial type hint to `mercurial/vfs.py`
Since hg
3dbc7b1ecaba, pytype stopped seeing the return value of `rmtree` as
`None`, and substituted `Any`.
Matt Harbison <matt_harbison@yahoo.com> [Wed, 10 Jul 2024 18:34:47 -0400] rev 51678
typing: add a few trivial type hints to `mercurial/templater.py`
Since hg
3dbc7b1ecaba, pytype started inferring that the second value in the
tuple is `BinaryIO`, but still hasn't been able to figure out the rest of
`open_template()`. We can be more precise.
Matt Harbison <matt_harbison@yahoo.com> [Wed, 10 Jul 2024 18:19:32 -0400] rev 51677
typing: add a few type hints to `mercurial/revlog.py`
Somewhere between hg
3dbc7b1ecaba and hg
8e3f6b5bf720, pytype stopped being able
to infer the type for `_docket_file` and `compress()`. Lock those types in
before they get lost.
Matt Harbison <matt_harbison@yahoo.com> [Wed, 10 Jul 2024 18:05:40 -0400] rev 51676
typing: add a trivial type hint to `mercurial/posix.py` to avoid an @overload
Since hg
3dbc7b1ecaba, pytype added an `@overload` for this function, without a
type on the parameter. That's wrong, and undermines the hints on the
non-trivial functions.
Matt Harbison <matt_harbison@yahoo.com> [Wed, 10 Jul 2024 17:55:14 -0400] rev 51675
typing: add some trivial type hints to `mercurial/match.py`
These were new methods since hg
3dbc7b1ecaba, but surprisingly pytype couldn't
figure them out.
Matt Harbison <matt_harbison@yahoo.com> [Wed, 10 Jul 2024 17:44:49 -0400] rev 51674
typing: add a type hint to `mercurial/hg.py`
Somewhere between hg
3dbc7b1ecaba and hg
8e3f6b5bf720, the first value of the
tuple changed from bytes to str. Let's lock this in, so that pytype flags it
if someone mistakenly adds a tuple with bytes somewhere.
Matt Harbison <matt_harbison@yahoo.com> [Wed, 10 Jul 2024 17:37:35 -0400] rev 51673
typing: restore `encoding.encoding` and `encoding.encodingmode` to bytes
Somewhere between hg
3dbc7b1ecaba and hg
8e3f6b5bf720, pytype determined the
signature of these fields changed from `bytes` to `Any`. Not sure why- the type
of `environ` then and now is: `Union[WindowsEnviron, Dict[bytes, bytes], os._Environ[bytes]]`
That said, PyCharm wasn't able to figure out the type of `environ`, and the
`WindowsEnviron` class extends `MutableMapping` without specifying bytes for the
key and value types in py3.9. But that's not changed in my setup, so I can't
explain it.
Matt Harbison <matt_harbison@yahoo.com> [Wed, 10 Jul 2024 17:16:19 -0400] rev 51672
typing: add some trivial type hints to `mercurial/bundlecaches.py`
The function is meant for extensions, but it wasn't obvious what was expected
without looking through the code. Also, pytype couldn't figure it out either.
Matt Harbison <matt_harbison@yahoo.com> [Wed, 10 Jul 2024 17:09:34 -0400] rev 51671
typing: add some type hints for bundle2 capabilities
Somewhere between hg
3dbc7b1ecaba and hg
8e3f6b5bf720, pytype determined the
signature of `bundle20.capabilities` changed from `Dict[bytes, Tuple[bytes]]` to
`Dict[bytes, Union[List[bytes], Tuple[bytes]]]`.
First, I did try to simply be explicit about the previously inferred type, but
it does seem to mix and match list/tuple now (e.g. in `writenewbundle()`). I
tried changing the new list usage to tuple, but a couple of things complained,
(and I think lists of one item are a little more clear to read anyway). So then
I typed the dict value as `Sequence[bytes]`, which worked fine. But there's
also a module level `capabilities` field, and when that's typed, pytype
complains about `Sequence[bytes]` lacking `__add__`[1]. So I gave up, and just
assigned it the type it wanted, with an alias. If somebody feels motivated to
make the type consistent, it's simple enough to change the alias.
The mutable default value to the constructor was removed to appease PyCharm's
type checking on the field. (I didn't bother running the code through pytype
prior to changing it, because we've previously made an effort to remove this
pattern anyway.)
I'm not sure why `getrepocaps()` has a default value for `role` that apparently
raises an exception. It's just flagged for now so this series can land without
risking additional problems.
[1] https://foss.heptapod.net/mercurial/mercurial-devel/-/jobs/2466903
Matt Harbison <matt_harbison@yahoo.com> [Wed, 10 Jul 2024 16:04:53 -0400] rev 51670
typing: add a few type hints to `mercurial/utils/urlutil.py`
Somewhere between hg
3dbc7b1ecaba and hg
8e3f6b5bf720, `_pathsuboptions` changed
from `Dict[bytes, Tuple[bytes, Any]]` to `Dict[bytes, Tuple[str, Any]]`, and it
caught my attention from diffing the local *.pyi files. The change is correct
based on the assertion, so let's get pytype to check for this instead of relying
on the assertion alone.
Matt Harbison <matt_harbison@yahoo.com> [Wed, 10 Jul 2024 15:49:16 -0400] rev 51669
typing: add type hints to `mercurial/utils/resourceutil.py`
The `except` path requires byte args (because of the byte based manipulation in
`_package_path()`), while the `else` case tolerates `AnyStr`. Pytype was unable
to figure this out, and we should make sure the interface is the same for all
environments.
Anton Shestakov <av6@dwimlabs.net> [Fri, 12 Jul 2024 15:29:35 +0400] rev 51668
copyright: update to 2024
Raphaël Gomès <rgomes@octobus.net> [Mon, 08 Jul 2024 17:56:54 +0200] rev 51666
branching: merge stable into default for 6.9 cycle
Raphaël Gomès <rgomes@octobus.net> [Mon, 08 Jul 2024 17:52:08 +0200] rev 51665
Added signature for changeset
11f41248595b
Raphaël Gomès <rgomes@octobus.net> [Mon, 08 Jul 2024 17:51:56 +0200] rev 51664
Added tag 6.8 for changeset
11f41248595b
Raphaël Gomès <rgomes@octobus.net> [Mon, 08 Jul 2024 17:51:04 +0200] rev 51663
relnotes: add 6.8
Pierre-Yves David <pierre-yves.david@octobus.net> [Mon, 08 Jul 2024 16:44:07 +0200] rev 51662
test-check: don't report distutils as a local import
On python 3.12 this is wrongly reported as a local import. So we adjust the
checker to avoid it.
Raphaël Gomès <rgomes@octobus.net> [Mon, 08 Jul 2024 16:20:04 +0200] rev 51661
Backed out changeset
f28c52a9f7b4
This backout and the previous are due to a large performance regression
detected in repositories with a lot of obsmarkers when performing a clone.
A better fix will come along at the start of the next cycle.
Raphaël Gomès <rgomes@octobus.net> [Mon, 08 Jul 2024 16:19:33 +0200] rev 51660
Backed out changeset
ff523675cd69
Raphaël Gomès <rgomes@octobus.net> [Mon, 08 Jul 2024 16:02:54 +0200] rev 51659
rust: use `cpython` 0.7.2 crate to add support for Python 3.12
This will give us more headroom until we can migrate to PyO3 some day.
Pierre-Yves David <pierre-yves.david@octobus.net> [Mon, 08 Jul 2024 15:52:01 +0200] rev 51658
revbranchcache: disable mmap access by default
The revbranchcache can be truncated (if some part of it is detected as invalid).
Using mmap on file we truncate is not an option at access to truncated part
would result in a SIGBUS signal.
So we disable the mmap by default until we fix this issue.
Joerg Sonnenberger <joerg@bec.de> [Mon, 24 Jun 2024 18:54:59 +0200] rev 51657
portability: fix build on Solaris-derived systemd
Current Illumos and older Solaris require _XOPEN_SOURCE for
msg_control. O_DIRECTORY doesn't exist on older systems either,
so fallback to O_RDONLY. It's good enough as a repository will
require both R and X permission anyway.
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 03 Jul 2024 12:32:57 +0200] rev 51656
mmap: only use mmap to read revlog persistent nodemap if it is safe
Cf `is_mmap_safe` docstring.
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 03 Jul 2024 12:47:08 +0200] rev 51655
mmap: fix another instance of reverse mmap logic in persistent nodemap
This fix the same kind of issue as
85d96517e650
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 03 Jul 2024 12:31:21 +0200] rev 51654
mmap: only use mmap to read rev-branch-cache data if it is safe
Cf `is_mmap_safe` docstring.
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 03 Jul 2024 12:26:57 +0200] rev 51653
mmap: only use mmap to read revlog index if it is safe
Cf `is_mmap_safe` docstring.
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 03 Jul 2024 12:22:48 +0200] rev 51652
mmap: add a `is_mmap_safe` method to vfs
This will be useful to safeguard mmap usage to void SIGBUS when repositories
lives on a NFS drive.
Raphaël Gomès <rgomes@octobus.net> [Mon, 24 Jun 2024 13:15:46 +0200] rev 51651
branching: merge stable into default
Raphaël Gomès <rgomes@octobus.net> [Mon, 24 Jun 2024 13:14:05 +0200] rev 51650
Added signature for changeset
6454c117c6a4
Raphaël Gomès <rgomes@octobus.net> [Mon, 24 Jun 2024 13:14:04 +0200] rev 51649
Added tag 6.8rc0 for changeset
6454c117c6a4
Raphaël Gomès <rgomes@octobus.net> [Mon, 24 Jun 2024 12:05:31 +0200] rev 51648
branching: merge default into stable for 6.8rc0
Raphaël Gomès <rgomes@octobus.net> [Mon, 24 Jun 2024 12:04:14 +0200] rev 51647
relnotes: add 6.8rc0
Raphaël Gomès <rgomes@octobus.net> [Mon, 24 Jun 2024 10:52:46 +0200] rev 51646
branch3: use an explicitely experimental name for files
Since this format is still experimental, we don't want to have to side-step
the `branch3` name in case people do start using it before it's stable.
Joerg Sonnenberger <joerg@bec.de> [Mon, 24 Jun 2024 03:16:52 +0200] rev 51645
obsolete: simplify relevantmarker
Drop duplicate assignment from a merge failure. Save
one loop iteration by exploiting that pendingnodes will
be seennodes after the first round anyway, so just
pre-initialize the set accordingly. From Anton Shestakov's
review on !867. Performance difference for my test case is
in the noise.
Joerg Sonnenberger <joerg@bec.de> [Tue, 11 Jun 2024 18:47:50 +0200] rev 51644
exchange: improve computation of relevant markers for large repos
Find the candidates for nodes with relevant markers by going over
all markers instead of iterating over all nodes. Most nodes will
not have markers anyway.
Further optimize the code by allowing revsets as well, which reduces the
materialization cost.
Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 13 Jun 2024 09:52:39 +0200] rev 51643
test: better glob some timing related line to avoid flakiness
If we go over 10 seconds, the number of white space changes.
Raphaël Gomès <rgomes@octobus.net> [Wed, 12 Jun 2024 11:29:11 +0200] rev 51642
branching: merge stable into default
Raphaël Gomès <rgomes@octobus.net> [Wed, 12 Jun 2024 11:27:01 +0200] rev 51641
Added signature for changeset
a1a011d4b148
Raphaël Gomès <rgomes@octobus.net> [Wed, 12 Jun 2024 11:26:57 +0200] rev 51640
Added tag 6.7.4 for changeset
a1a011d4b148
Raphaël Gomès <rgomes@octobus.net> [Wed, 12 Jun 2024 11:25:49 +0200] rev 51639
relnotes: add 6.7.4 and warn about 6.7.{1,2,3}
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 12 Jun 2024 02:16:14 +0200] rev 51638
inline-changelog: fix pending transaction visibility when splitting
We move the name back to the expected name of `changelog.i.a`.
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 12 Jun 2024 02:15:20 +0200] rev 51637
inline-changelog: fix a critical bug in write_pending that delete data
Since
a93e52f0b6ff we no longer use inline-revlog for the changelog. The goal there was to
solve the lack of testing for the two variants (inline vs split) and reduce the
complexity of the interaction with "diverted-write" on the changelog level.
However many existing repository still have inline-changelog and we
automatically move them to normal revlog as soon as we have the chances.
Unfortunately This conversion is buggy and can result in the destruction of the
changelog.i if hook triggers the "write pending" mechanism.
The bugs comes from the "revlog splitting" logic and the "write_pending" logic
stepping over each other. Ironically the change in
a93e52f0b6ff aims at no
longer having this kind of problem.
This changesets fix this issue and add associated tests.
Fixing this reveal that the transaction hooks end up not seeing the pending
transaction content, because the name is not right ("changelog.i.s.a" instead of
"changelog.i.s") we fix this in the next changeset.
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 11 Jun 2024 03:05:20 +0200] rev 51636
bookmark: fix remote bookmark deletion when the push is raced
Before this patch, running `hg push -B book` to push the `book` bookmark
sideway at the same time as a commit making it moving forward might result in
the removal of the bookmark remotely.
After this changeset, the push can still be raced, but to remove deletion
happens. This is progress.
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 11 Jun 2024 03:03:47 +0200] rev 51635
hooks: add a prewlock and a prelock hooks
This is useful for testing.
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 11 Jun 2024 11:14:13 +0200] rev 51634
exchange: fix locking to actually be scoped
The previous code was taking locks before entering with statements, so
exception before the with statement would not release the lock (except for
garbage collection).
We need to move to a try except here because the logic is more complicated.
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 11 Jun 2024 11:13:36 +0200] rev 51633
exchange: fix locking to actually be scoped
The previous code was taking locks before entering with statements, so
exception before the with statement would not release the lock (except for
garbage collection).
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 06 Sep 2023 18:23:32 +0200] rev 51632
narrow: add a test for linkrev computation done during widen
This new tests show that the linkrev computed and sent by the server might end
up being wrong during a widen operation.
Joerg Sonnenberger <joerg@bec.de> [Mon, 10 Jun 2024 13:45:57 +0200] rev 51631
obsolete: quote the feature name
This makes it at least somewhat clearer that hg is talking about some
specific feature and not just outdated code.
Pierre-Yves David <pierre-yves.david@octobus.net> [Mon, 10 Jun 2024 12:12:56 +0200] rev 51630
branching: merge stable into default
Raphaël Gomès <rgomes@octobus.net> [Mon, 10 Jun 2024 10:59:44 +0200] rev 51629
rust-status: sort the failed matches when printing them
This was making the tests flaky after the recent patch¹ that opened up
more of the code to the Rust-augmented status.
[1]
865efc020c3355dca1cbaa35db80600009c01dd5
Julien Cristau <jcristau@mozilla.com> [Thu, 23 May 2024 11:05:11 +0200] rev 51628
clonebundles: add missing newline to legacy response
This seems to have been removed in 6.5 (likely by
60f9602b413e).
Anton Shestakov <av6@dwimlabs.net> [Tue, 07 May 2024 15:15:41 +0400] rev 51627
chistedit: change action for the correct item
We have an experimental config histedit.later-commits-first from
c820866c52f9,
and when it's true, the order of commits in histedit UI is reversed, both in
text mode and in curses mode.
But before this patch key presses in curses mode would change histedit actions
in the same old order, i.e. trying to edit the latest commit (which would be
first now) would put "edit" action on the last commit in the list. This wasn't
a cosmetic issue, histedit would actually proceed to edit the first commit in
the list.
Let's map rules to display items (hopefully now correctly).
Arseniy Alekseyev <aalekseyev@janestreet.com> [Fri, 26 Apr 2024 19:10:35 +0100] rev 51626
dirstate: remove the python-side whitelist of allowed matchers
This whitelist is too permissive because it allows matchers that contain
disallowed ones deep inside, for example through `intersectionmatcher`.
It is also too restrictive because it doesn't pass through
some of the matchers we support, such as `patternmatcher`.
It's also unnecessary because unsupported matchers raise
`FallbackError` and we fall back anyway.
Making this change makes more of the tests use rust code path,
and therefore subtly change behavior. For example, rust status
in largefiles repos seems to have strange behavior.
Arseniy Alekseyev <aalekseyev@janestreet.com> [Fri, 26 Apr 2024 18:53:02 +0100] rev 51625
match: make `was_tampered_with` work recursively
This is useful if we are to use it outside of Rust, when
deciding whether or not to do some fast-path operation with
a given matcher.
Arseniy Alekseyev <aalekseyev@janestreet.com> [Fri, 26 Apr 2024 19:43:42 +0100] rev 51624
largefiles: mark more matchers as having been tampered with
These happened to slip through the cracks earlier because they
weren't caught by tests. Now that we're enabling rust fast path
more widely these start breaking.
Raphaël Gomès <rgomes@octobus.net> [Mon, 06 May 2024 18:50:21 +0200] rev 51623
branching: merge stable into default
Raphaël Gomès <rgomes@octobus.net> [Mon, 06 May 2024 18:48:37 +0200] rev 51622
Added signature for changeset
028dc3f92dbd
Raphaël Gomès <rgomes@octobus.net> [Mon, 06 May 2024 18:48:34 +0200] rev 51621
Added tag 6.7.3 for changeset
028dc3f92dbd
Raphaël Gomès <rgomes@octobus.net> [Mon, 06 May 2024 17:51:21 +0200] rev 51620
relnotes: add 6.7.3
Georges Racinet <georges.racinet@octobus.net> [Mon, 22 Apr 2024 19:47:08 +0200] rev 51619
rust: blanket implementation of Graph for Graph references
The need comes from the fact that `AncestorsIterator` and many
Graph-related algorithms take ownership of the `Graph` they work with.
This, in turn is due to them needing to accept the `Index` instances
that are provided by the Python layers (that neither rhg nor `RHGitaly`
use, of course): the fact that nowadays the Python layer holds an object
that is itself implemented in Rust does not change the core problem that
they cannot be tracked by the borrow checker.
Even though it looks like cloning `Changelog` would be cheap, it seems
hard to guarantee that on the long run. The object is already too rich
for us to be comfortable with it, when using references is the most
natural and guaranteed way of proceeding.
The added test seems a bit superfleous, but it will act as a reminder
that this feature is really useful until something in the Mercurial code
base actually uses it.
Raphaël Gomès <rgomes@octobus.net> [Mon, 06 May 2024 15:30:21 +0200] rev 51618
rust-cpython: don't swallow the dirstate error message
In case we do get a dirstate error, we want to get the full error message and
not just an opaque `Dirstate error`.
Raphaël Gomès <rgomes@octobus.net> [Mon, 06 May 2024 13:07:02 +0200] rev 51617
dirstate-v2: check that root nodes are at the root before writing
More explanations in the previous changeset.
Raphaël Gomès <rgomes@octobus.net> [Mon, 06 May 2024 13:02:52 +0200] rev 51616
dirstate-v2: add check of parent/child nodes being related when writing
This stems from a corruption seen in a private repository. We're not sure
of the source of the corruption, and it's very possible that we're seeing
compounded effects of multiple writes on a corrupted dirstate.
Adding this check is not expensive in itself and large writes of the dirstate
are not common.
This change does not catch this problem at the root node, the next one will.
Raphaël Gomès <rgomes@octobus.net> [Mon, 06 May 2024 16:29:00 +0200] rev 51615
admin-verify: expect a number of errors to be returned
It's the responsibility of the check to handle errors, we only care about
the total count to sum up the check's work.
We use `admin::verify -c dirstate` to test this path at least somewhat.
Raphaël Gomès <rgomes@octobus.net> [Mon, 06 May 2024 16:16:15 +0200] rev 51614
admin-verify: fix error message handling
`dirstate.verify` used to return tuples but does not anymore, it returns
the pre-formatted error message, which is a nicer interface anyway.
Raphaël Gomès <rgomes@octobus.net> [Mon, 06 May 2024 12:31:29 +0200] rev 51613
admin-verify: pass p1 down to the dirstate function
This was forgotten and can break with certain kinds of corruption.
Raphaël Gomès <rgomes@octobus.net> [Mon, 06 May 2024 11:27:29 +0200] rev 51612
Backed out changeset
3e0f86f09f26
Raphaël Gomès <rgomes@octobus.net> [Mon, 06 May 2024 11:26:52 +0200] rev 51611
Backed out changeset
fc317bd5b637
Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 02 May 2024 02:20:42 +0200] rev 51610
re2: make errors quiet
By default, the re2 library will output error on its own instead of keeping the
error in an exception. This make re2 printing spurious error before fallback to
the stdlib remodule that may accept the pattern or also fails to parse it and
raise a proper error that will be handled by Mercurial.
So we also pass an Option object that changes this default.
Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 02 May 2024 08:46:58 +0200] rev 51609
fold-or-prune-me: update proposal
This does the same things but with a narrower wrapping.
Felipe Resende <felipe@fcresende.dev.br> [Sun, 31 Mar 2024 17:57:46 -0300] rev 51608
subrepo: propagate non-default path on outgoing
There was already a fix made in
5dbff89cf107 for pull and push commands. I did
the same for the outgoing command.
The problem I identified is that when the parent repository has multiple paths,
the outgoing command was not respecting the parent path used and was always
using the default path for subrepositories.
Hraban Luyat <hraban@0brg.net> [Tue, 26 Mar 2024 01:27:27 -0400] rev 51607
hgrc: search XDG_CONFIG_HOME on mac
Searching for hgrc was special cased not to look through ~/.config/hg on Mac,
but that’s unnecessary: Macs support it as do other unix based systems. There
are plenty tools that use it there, e.g. git, and people expect it to work, e.g.
"https://stackoverflow.com/questions/
72499837/mercurial-on-macos-doesnt-read-config-hg-hgrc".
Initial code introduced in
354020079723.
Raphaël Gomès <rgomes@octobus.net> [Tue, 16 Apr 2024 09:51:11 +0200] rev 51606
base-revsets: use an author that actually exercises a lot of changesets
This was caught in my big find-and-replace:
d4ba4d51f85f.
The point of `base-revsets` is to give revsets that will give a good coverage
of the repository. Using Pierre-Yves as the second largest committer
(in terms of number of changesets) seems like a good idea.
Arseniy Alekseyev <aalekseyev@janestreet.com> [Tue, 16 Apr 2024 17:21:37 +0100] rev 51605
match: simplify the rust-side file pattern kind parsing
There's no need to add the ':' characters if
we're simply pattern matching against constants next.
Arseniy Alekseyev <aalekseyev@janestreet.com> [Tue, 16 Apr 2024 13:51:45 +0100] rev 51604
match: share code between includematcher and patternmatcher
No need to have this duplication.
Arseniy Alekseyev <aalekseyev@janestreet.com> [Fri, 05 Apr 2024 17:57:26 +0100] rev 51603
matchers: support patternmatcher in rust
Arseniy Alekseyev <aalekseyev@janestreet.com> [Tue, 09 Apr 2024 11:12:24 +0100] rev 51602
match: avoid rust fast path if the matcher was tampered with
Otherwise the fast path does not respect the modifications made
by the extension (concretely largefiles, but other extensions can
start using that too)
Arseniy Alekseyev <aalekseyev@janestreet.com> [Tue, 09 Apr 2024 11:00:52 +0100] rev 51601
largefiles: track if a matcher was tampered with
This is used to make sure rust fast path is not taken for the
modified matchers.
Raphaël Gomès <rgomes@octobus.net> [Wed, 17 Apr 2024 12:28:48 +0200] rev 51600
branching: merge stable into default
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 13 Mar 2024 12:02:06 +0100] rev 51599
tags-cache: directly perform a monimal walk for hgtagsfnodescache warming
We do something narrower than the path retrieving data. So lets use dedicated
code instead.
This provides further useful speedup:
### data-env-vars.name = mozilla-try-2023-03-22-zstd-sparse-revlog
# benchmark.name = hg.debug.debug-update-cache
# bin-env-vars.hg.flavor = default
# bin-env-vars.hg.py-re2-module = default
# benchmark.variants.pre-state = warm
before-this-series: 19.947581
skip-fnode-filter: 18.916804 (-5.17%, -1.03)
use-rev-num: 17.493725 (-12.30%, -2.45)
this-changesets: 15.919466 (-20.19%, -4.03)
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 13 Mar 2024 11:51:11 +0100] rev 51598
tags-cache: directly operate on rev-num warming hgtagsfnodescache
Not having to goes through nodeid speed up things notably.
### data-env-vars.name = mozilla-try-2023-03-22-zstd-sparse-revlog
# benchmark.name = hg.debug.debug-update-cache
# bin-env-vars.hg.flavor = default
# bin-env-vars.hg.py-re2-module = default
# benchmark.variants.pre-state = warm
before-this-series: 19.947581
before-this-changes: 18.916804 (-5.17%, -1.03)
this-changesets: 17.493725 (-12.30%, -2.45)
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 13 Mar 2024 11:38:28 +0100] rev 51597
tags-cache: skip the filternode step if we are not going to use it
When warming the hgtagsfnodescache, we don't need the actual result, so we can
simply skip the part that "filter" fnode we read from the cache. So provide a
quite visible speed up to the top level `hg debugupdatecache` function.
### data-env-vars.name = mozilla-try-2023-03-22-zstd-sparse-revlog
# benchmark.name = hg.debug.debug-update-cache
# bin-env-vars.hg.flavor = default
# bin-env-vars.hg.py-re2-module = default
# benchmark.variants.pre-state = warm
before: 19.947581
after: 18.916804 (-5.17%, -1.03)
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 13 Mar 2024 11:34:21 +0100] rev 51596
tags-cache: add a dedicated warm cache function to hgtagsfnodescache
Having a dedicated API point will help to optimize that specific usage. Right
doing a full phases weam takes a long time, even when the cache is already
filled.
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 09 Apr 2024 22:37:15 +0200] rev 51595
outgoing: add a simple fastpath when there is no common
This further speed up case like `hg bundle --all` for larger repository.
### data-env-vars.name = mozilla-try-2023-03-22-zstd-sparse-revlog
# benchmark.name = hg.command.bundle
# bin-env-vars.hg.flavor = default
# bin-env-vars.hg.py-re2-module = default
# benchmark.variants.revs = all
# benchmark.variants.type = none-streamv2
before: 316.749699
after: 311.165461 (-1.76%, -5.58)
There is further work to be done in this area like not doing any outgoing
computation in the stream case for example. however the recent changes already
gives use a large win for a small amount of local work.
### benchmark.name = hg.command.bundle
# bin-env-vars.hg.flavor = default
# bin-env-vars.hg.py-re2-module = default
# benchmark.variants.revs = all
# benchmark.variants.type = none-streamv2
## data-env-vars.name = mercurial-public-2024-03-22-zstd-sparse-revlog
pre-%ln-change: 1.263859
the-%ln-change: 0.700229 (-44.60%, -0.56)
prev-changeset: 0.496050 (-60.75%, -0.77)
this-changeset: 0.495243 (-60.81%, -0.77)
## data-env-vars.name = tryton-public-2024-03-22-zstd-sparse-revlog
pre-%ln-change: 2.975765
the-%ln-change: 1.870798 (-37.13%, -1.10)
prev-changeset: 1.461583 (-50.88%, -1.51)
this-changeset: 1.469185 (-50.63%, -1.51)
## data-env-vars.name = pypy-2024-03-22-zstd-sparse-revlog
pre-%ln-change: 4.540080
the-%ln-change: 3.401700 (-25.07%, -1.14)
prev-changeset: 2.915810 (-35.78%, -1.62)
this-changeset: 2.911643 (-35.87%, -1.63)
## data-env-vars.name = heptapod-public-2024-03-25-zstd-sparse-revlog
pre-%ln-change: 10.138396
the-%ln-change: 7.750458 (-23.55%, -2.39)
prev-changeset: 6.665565 (-34.25%, -3.47)
this-changeset: 6.672078 (-34.19%, -3.47)
## data-env-vars.name = mozilla-try-2023-03-22-zstd-sparse-revlog
pre-%ln-change: 399.484481
the-%ln-change: 346.508952 (-13.26%, -52.98)
prev-changeset: 316.749699 (-20.71%, -82.73)
this-changeset: 311.165461 (-22.11%, -88.32)
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 09 Apr 2024 22:36:35 +0200] rev 51594
outgoing: rework the handling of the `missingroots` case to be faster
The previous implementation was slow, to the point it was taking a significant
amount of `hg bundle --type none-streamv2` call. We rework the code to compute
the same value much faster, making the operation disappear from the `hg bundle
--type none-streamv2` profile. Someone would remark that producing a streamclone
does not requires an `outgoing` object. However that is a matter for another
day. There is other user of `missingroots` (non stream `hg bundle` call for
example), and they will also benefit from this rework.
We implement an old TODO in the process, directly computing the missing and
common attribute as we have most element at hand already.
### benchmark.name = hg.command.bundle
# bin-env-vars.hg.flavor = default
# bin-env-vars.hg.py-re2-module = default
# benchmark.variants.revs = all
# benchmark.variants.type = none-streamv2
## data-env-vars.name = heptapod-public-2024-03-25-zstd-sparse-revlog
before: 7.750458
after: 6.665565 (-14.00%, -1.08)
## data-env-vars.name = mercurial-public-2024-03-22-zstd-sparse-revlog
before: 0.700229
after: 0.496050 (-29.16%, -0.20)
## data-env-vars.name = mozilla-try-2023-03-22-zstd-sparse-revlog
before: 346.508952
after: 316.749699 (-8.59%, -29.76)
## data-env-vars.name = pypy-2024-03-22-zstd-sparse-revlog
before: 3.401700
after: 2.915810 (-14.28%, -0.49)
## data-env-vars.name = tryton-public-2024-03-22-zstd-sparse-revlog
before: 1.870798
after: 1.461583 (-21.87%, -0.41)
note: this whole `missingroots` of outgoing has a limited number of callers and
could likely be replace by something simpler (like taking an explicit
"missing_revs" set for example). However this is a wider change and we focus on
a small impact, quick rework that does not change the API for now.
Pierre-Yves David <pierre-yves.david@octobus.net> [Sun, 14 Apr 2024 02:27:10 +0200] rev 51593
proxy-vfs: also proxy the `audit` attribute
In the previous changeset, we had to do a little dance to access the useful
`audit` attribute. We now provide a proper accessors to it.
We don't update the code in `perf.py` because it has to remain compatible with
older version of Mercurial. This will just be nicer in the future.
Pierre-Yves David <pierre-yves.david@octobus.net> [Sat, 13 Apr 2024 23:40:28 +0200] rev 51592
perf: clear vfs audit_cache before each run
When generating a stream clone, we spend a large amount of time auditing path.
Before this changes, the first run was warming the vfs cache for the other
runs, leading to a large runtime difference and a "faulty" reported timing for
the operation.
We now clear this important cache between run to get a more realistic timing.
Below are some example of median time change when clearing these cases. The
maximum time for a run did not changed significantly.
### data-env-vars.name = mozilla-central-2018-08-01-zstd-sparse-revlog
# benchmark.name = hg.perf.exchange.stream.generate
# bin-env-vars.hg.flavor = default
# bin-env-vars.hg.py-re2-module = default
# benchmark.variants.version = latest
no-clearing: 17.289905
cache-clearing: 21.587965 (+24.86%, +4.30)
## data-env-vars.name = mozilla-central-2024-03-22-zstd-sparse-revlog
no-clearing: 32.670748
cache-clearing: 40.467095 (+23.86%, +7.80)
## data-env-vars.name = mozilla-try-2019-02-18-zstd-sparse-revlog
no-clearing: 37.838858
cache-clearing: 46.072749 (+21.76%, +8.23)
## data-env-vars.name = mozilla-unified-2024-03-22-zstd-sparse-revlog
no-clearing: 32.969395
cache-clearing: 39.646209 (+20.25%, +6.68)
In addition, this significantly reduce the timing difference between the
performance command, from the perf extensions and a `real `hg bundle` call
producing a stream bundle. Some significant differences remain especially on
the "mozilla-try" repositories, but they are now smaller.
Note that some of that difference will actually not be
attributable to the stream generation (like maybe phases or branch map
computation).
Below are some benchmarks done on a currently draft changeset fixing some
unrelated slowness in `hg bundle` (
34a78972af409d1ff37c29e60f6ca811ad1a457d)
### data-env-vars.name = mozilla-central-2018-08-01-zstd-sparse-revlog
# bin-env-vars.hg.flavor = default
# bin-env-vars.hg.py-re2-module = default
hg.perf.exchange.stream.generate: 21.587965
hg.command.bundle: 24.301799 (+12.57%, +2.71)
## data-env-vars.name = mozilla-central-2024-03-22-zstd-sparse-revlog
hg.perf.exchange.stream.generate: 40.467095
hg.command.bundle: 44.831317 (+10.78%, +4.36)
## data-env-vars.name = mozilla-unified-2024-03-22-zstd-sparse-revlog
hg.perf.exchange.stream.generate: 39.646209
hg.command.bundle: 45.395258 (+14.50%, +5.75)
## data-env-vars.name = mozilla-try-2019-02-18-zstd-sparse-revlog
hg.perf.exchange.stream.generate: 46.072749
hg.command.bundle: 55.882608 (+21.29%, +9.81)
## data-env-vars.name = mozilla-try-2023-03-22-zlib-general-delta
hg.perf.exchange.stream.generate: 334.716708
hg.command.bundle: 377.856767 (+12.89%, +43.14)
## data-env-vars.name = mozilla-try-2023-03-22-zstd-sparse-revlog
hg.perf.exchange.stream.generate: 302.972301
hg.command.bundle: 326.098755 (+7.63%, +23.13)
Pierre-Yves David <pierre-yves.david@octobus.net> [Sun, 14 Apr 2024 02:41:36 +0200] rev 51591
perf: start recording total time after warming
The warming might be costly and this should not affect the "time profile" of the
actual collection.
Pierre-Yves David <pierre-yves.david@octobus.net> [Sun, 14 Apr 2024 02:40:15 +0200] rev 51590
perf: run the gc before each run
The python garbage collector is a large source of performance troubles, we run
it right before the timed section to reduce the change for the gc to add noise
to the benchmark.
Pierre-Yves David <pierre-yves.david@octobus.net> [Sun, 14 Apr 2024 02:38:41 +0200] rev 51589
perf: allow profiling of more than one run
By default, we still profile the first run only. However profiling more run help
to understand side effect from one run to the other. So we add an option to be
able to do so.
Pierre-Yves David <pierre-yves.david@octobus.net> [Sun, 14 Apr 2024 02:36:55 +0200] rev 51588
profiler: flush after writing the profiler output
Otherwise, the profiler output might only partially appears until the next flush
of the buffer. Since profiling often happens for long operation, the next flush
can be a long time away.
Pierre-Yves David <pierre-yves.david@octobus.net> [Sun, 14 Apr 2024 02:33:36 +0200] rev 51587
stream-clone: disable gc for the entry listing section for the v2 format
This is similar to the change we did for the v3 format in
6e4c8366c5ce.
The benchmark bellow show this gives us a notable gains, especially on larger
repositories.
### benchmark.name = hg.perf.stream-locked-section
# benchmark.name = hg.perf.stream-locked-section
# bin-env-vars.hg.flavor = default
# bin-env-vars.hg.py-re2-module = default
# benchmark.variants.version = v2
## data-env-vars.name = pypy-2018-08-01-zstd-sparse-revlog
5e931bf8707c: 0.503820 ~~~~~
1106d1bf695e: 0.470078 (-6.70%, -0.03)
## data-env-vars.name = pypy-2024-03-22-zstd-sparse-revlog
5e931bf8707c: 0.535756 ~~~~~
1106d1bf695e: 0.490249 (-8.49%, -0.05)
## data-env-vars.name = heptapod-public-2024-03-25-zstd-sparse-revlog
5e931bf8707c: 1.327041 ~~~~~
1106d1bf695e: 1.174636 (-11.48%, -0.15)
## data-env-vars.name = netbeans-2018-08-01-zstd-sparse-revlog
5e931bf8707c: 2.439158 ~~~~~
1106d1bf695e: 2.220515 (-8.96%, -0.22)
## data-env-vars.name = netbeans-2019-11-07-zstd-sparse-revlog
5e931bf8707c: 2.630794 ~~~~~
1106d1bf695e: 2.261473 (-14.04%, -0.37)
## data-env-vars.name = mozilla-central-2018-08-01-zstd-sparse-revlog
5e931bf8707c: 5.769002 ~~~~~
1106d1bf695e: 5.062000 (-12.26%, -0.71)
## data-env-vars.name = mozilla-try-2019-02-18-zstd-sparse-revlog
5e931bf8707c: 13.351750 ~~~~~
1106d1bf695e: 12.346655 (-7.53%, -1.01)
## data-env-vars.name = mozilla-central-2024-03-22-zstd-sparse-revlog
5e931bf8707c: 10.772939 ~~~~~
1106d1bf695e: 9.495407 (-11.86%, -1.28)
## data-env-vars.name = mozilla-unified-2024-03-22-zstd-sparse-revlog
5e931bf8707c: 10.864297 ~~~~~
1106d1bf695e: 9.475597 (-12.78%, -1.39)
## data-env-vars.name = mozilla-try-2023-03-22-zstd-sparse-revlog
5e931bf8707c: 17.448335 ~~~~~
1106d1bf695e: 16.027474 (-8.14%, -1.42)
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 09 Apr 2024 02:54:19 +0200] rev 51586
phases: rework the logic of _pushdiscoveryphase to bound complexity
This rework the various graph traversal in _pushdiscoveryphase to keep the
complexity in check.
This is done though a couple of things:
- first, limiting the space we have to explore, for example, if we are not in
publishing push, we don't need to consider remote draft roots that are also
draft locally, as there is nothing to be moved there.
- avoid unbounded descendant computation, and use the faster "rev between"
computation.
This provide a massive boost to performance when exchanging with repository with
a massive amount of draft, like mozilla-try:
### data-env-vars.name = mozilla-try-2023-03-22-zstd-sparse-revlog
# benchmark.name = hg.command.push
# bin-env-vars.hg.flavor = default
# bin-env-vars.hg.py-re2-module = default
# benchmark.variants.explicit-rev = all-out-heads
# benchmark.variants.
issue6528 = disabled
# benchmark.variants.protocol = ssh
# benchmark.variants.reuse-external-delta-parent = default
## benchmark.variants.revs = any-1-extra-rev
before: 20.346590 seconds
after: 11.232059 seconds (-38.15%, -7.48 seconds)
## benchmark.variants.revs = any-100-extra-rev
before: 24.752051 seconds
after: 15.367412 seconds (-37.91%, -9.38 seconds)
After this changes, the push operation is still quite too slow. Some of this
can be attributed to general phases slowness (reading all the roots from disk
for example) and other know slowness (not using persistent-nodemap, branchmap,
tags, etc. We are also working on them, but with this series, phase discovery
during push no longer showing up in profile and this is a pretty nice and bit
low-hanging fruit out of the way.
### (same case as the above)
# benchmark.variants.revs = any-1-extra-rev
pre-%ln-change: 44.235070
this-changeset: 11.232059 seconds (-74.61%, -33.00 seconds)
# benchmark.variants.revs = any-100-extra-rev
pre-%ln-change: 49.234697
this-changeset: 15.367412 seconds (-68.79%, -33.87 seconds)
Note that with this change, the `hg push` performance is now much closer to the
`hg pull` performance, even it still lagging behind a bit. (and the overall
performance are still too slow).
### data-env-vars.name = mozilla-try-2023-03-22-ds2-pnm
# benchmark.variants.explicit-rev = all-out-heads
# benchmark.variants.
issue6528 = disabled
# benchmark.variants.protocol = ssh
# benchmark.variants.pulled-delta-reuse-policy = default
# bin-env-vars.hg.flavor = rust
## benchmark.variants.revs = any-1-extra-rev
hg.command.pull: 6.517450
hg.command.push: 11.219888
## benchmark.variants.revs = any-100-extra-rev
hg.command.pull: 10.160991
hg.command.push: 14.251107
### data-env-vars.name = mozilla-try-2023-03-22-zstd-sparse-revlog
# bin-env-vars.hg.py-re2-module = default
# benchmark.variants.explicit-rev = all-out-heads
# benchmark.variants.
issue6528 = disabled
# benchmark.variants.protocol = ssh
# benchmark.variants.pulled-delta-reuse-policy = default
## bin-env-vars.hg.flavor = default
## benchmark.variants.revs = any-1-extra-rev
hg.command.pull: 8.577772
hg.command.push: 11.232059
## bin-env-vars.hg.flavor = default
## benchmark.variants.revs = any-100-extra-rev
hg.command.pull: 13.152976
hg.command.push: 15.367412
## bin-env-vars.hg.flavor = rust
## benchmark.variants.revs = any-1-extra-rev
hg.command.pull: 8.731982
hg.command.push: 11.178751
## bin-env-vars.hg.flavor = rust
## benchmark.variants.revs = any-100-extra-rev
hg.command.pull: 13.184236
hg.command.push: 15.620843
Pierre-Yves David <pierre-yves.david@octobus.net> [Fri, 05 Apr 2024 22:47:44 +0200] rev 51585
phases: introduce a performant efficient way to access revision in a set
This will be useful in the next changesets.
Pierre-Yves David <pierre-yves.david@octobus.net> [Fri, 05 Apr 2024 14:13:47 +0200] rev 51584
phases: use revision number in `_pushdiscoveryphase`
We now reach our target checkpoint in terms of rev-num conversion. The
`_pushdiscoveryphase` function is now performing graph computation based on
revision number only. Avoiding repeated conversion from node-id to rev-num.
See previous changeset updated `new_heads` for rationnal.
Again, time saved in the 100 milliseconds order of magnitude for the mozilla-try
benchmark I have been using.
However, wow that the logic is done using revision number, we can look into having
better logic in the next changesets, which will provide a much bigger speedup.
Pierre-Yves David <pierre-yves.david@octobus.net> [Fri, 05 Apr 2024 14:11:02 +0200] rev 51583
phases: move RemotePhasesSummary to revision number
This continue our quest to align more logic on revision number instead of
node-ids. The motivation is similar to the change to `new_heads` and
`analyze_remote_phases` a few changeset earlier.
Again, we take this as an opportunity to rename the class, and the attribute to
the new naming scheme. This will highlight the need for code update for any
code using it an expecting node-ids.
Many of the rev-num → node-id conversion we had to introduce in the previous
changesets can now be removed. More will be removed in the future as we continue
to align code toward rev-num usage.
time saved in the 100 milliseconds order of magnitude for the mozilla-try
benchmark I have been using.
Pierre-Yves David <pierre-yves.david@octobus.net> [Fri, 05 Apr 2024 12:24:47 +0200] rev 51582
phases: stop using `repo.set` in `remotephasessummary`
The `repository.set` create changectx on the fly, an expensive operation. Using
`repo.revs` and a direct rev-num → node-id translation will be significantly
faster.
This is especially true as we prepare ourself to no longer do the rev-num →
node-id transalation there.
The speedup is a bit lost in the overall noisyness of the slow phase discovery algorithm, but it save a small amount of time in my benchmark.
Pierre-Yves David <pierre-yves.david@octobus.net> [Fri, 05 Apr 2024 12:02:43 +0200] rev 51581
phases: use revision number in analyze_remote_phases
Same logic as the previous change to `new_heads`, see rationnal there.
This avoids a small number of `nodes -> revs` conversion speeding thing up in
the 100 milliseconds order of magnitude for the worses cases. However, the rest
of the logic is noisy enough that it hardly matters for now.
Pierre-Yves David <pierre-yves.david@octobus.net> [Fri, 05 Apr 2024 11:33:47 +0200] rev 51580
phases: use revision number in new_heads
All graph operations will be done using revision numbers, so passing nodes only
means they will eventually get converted to revision numbers internally.
As part of an effort to align the code on using revision number we make the
`phases.newheads` function operated on revision number, taking them as input
and using them in returns, instead of the node-id it used to consume and
produce.
This is part of multiple changesets effort to translate more part of the logic,
but is done step by step to facilitate the identification of issue that might
arise in mercurial core and extensions.
To make the change simpler to handle for third party extensions, we also rename
the function, using a more modern form. This will help detecting the different
between the node-id version and the rev-num version.
I also take this as an opportunity to add some comment about possible
performance improvement for the future. They don't matter too much now, but they
are worse exploring in a while.
Pierre-Yves David <pierre-yves.david@octobus.net> [Mon, 08 Apr 2024 15:11:49 +0200] rev 51579
phases: convert remote phase root to node while reading them
This is currently a bit silly as we will convert them back to node right after,
but that is an intermediate step before doing more disruptive changes.
Pierre-Yves David <pierre-yves.david@octobus.net> [Fri, 05 Apr 2024 11:17:25 +0200] rev 51578
phases: more compact error handling in analyzeremotephases
using an intermediate variable result in more readable code, so let us use it.
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 09 Apr 2024 02:54:12 +0200] rev 51577
push: rework the computation of fallbackheads to be correct
The previous computation tried to be smart but ended up being wrong. This was
caught by phase movement test while reworking the phase discovery logic to be
faster.
The previous logic was failing to catch case where the pushed set was not based
on a common heads (i.e. when the discovery seemed to have "over discovered"
content, outside the pushed set)
In the following graph, `e` is a common head and we `hg push -r f`. We need to
detect `c` as a fallback heads and we previous failed to do so::
e
|
d f
|/
c
|
b
|
a
The performance impact of the change seems minimal. On the most impacted
repository at hand (mozilla-try), the slowdown seems mostly mixed in the
overall noise `hg push` but seems to be in the hundred of milliseconds order of
magnitude. When using rust, we seems to be a bit faster, probably because we
leverage more accelaratd internals.
I added a couple of performance related common for further investigation later
on.
Pierre-Yves David <pierre-yves.david@octobus.net> [Fri, 05 Apr 2024 11:05:54 +0200] rev 51576
revset: stop serializing node when using "%ln"
Turning hundred of thousand of node from node to hex and back can be slow… what
about we stop doing it?
In many case were we are using node id we should be using revision id. However
this is not a good reason to have a stupidly slow implementation of "%ln".
This caught my attention again because the phase discovery during push make an
extensive use of "%ln" or huge set. In absolute, that phase discovery probably
should use "%ld" and need to improves its algorithmic complexity, but improving
"%ln" seems simple and long overdue. This greatly speeds up `hg push` on
repository with many drafts.
Here are some relevant poulpe benchmarks:
### data-env-vars.name = mozilla-try-2023-03-22-zstd-sparse-revlog
# benchmark.name = hg.command.push
# bin-env-vars.hg.flavor = default
# bin-env-vars.hg.py-re2-module = default
# benchmark.variants.explicit-rev = all-out-heads
# benchmark.variants.
issue6528 = disabled
# benchmark.variants.protocol = ssh
# benchmark.variants.reuse-external-delta-parent = default
## benchmark.variants.revs = any-1-extra-rev
before: 44.235070
after: 20.416329 (-53.85%, -23.82)
## benchmark.variants.revs = any-100-extra-rev
before: 49.234697
after: 26.519829 (-46.14%, -22.71)
### benchmark.name = hg.command.bundle
# bin-env-vars.hg.flavor = default
# bin-env-vars.hg.py-re2-module = default
# benchmark.variants.revs = all
# benchmark.variants.type = none-streamv2
## data-env-vars.name = heptapod-public-2024-03-25-zstd-sparse-revlog
before: 10.138396
after: 7.750458 (-23.55%, -2.39)
## data-env-vars.name = mercurial-public-2024-03-22-zstd-sparse-revlog
before: 1.263859
after: 0.700229 (-44.60%, -0.56)
## data-env-vars.name = mozilla-try-2023-03-22-zstd-sparse-revlog
before: 399.484481
after: 346.5089 (-13.26%, -52.98)
## data-env-vars.name = pypy-2024-03-22-zstd-sparse-revlog
before: 4.540080
after: 3.401700 (-25.07%, -1.14)
## data-env-vars.name = tryton-public-2024-03-22-zstd-sparse-revlog
before: 2.975765
after: 1.870798 (-37.13%, -1.10)
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 09 Apr 2024 14:41:48 +0200] rev 51575
bundlespec: drop unused _bundlespecvariants dictionary
Why do we have a `_bundlespecvariants`?
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 09 Apr 2024 14:37:24 +0200] rev 51574
bundlespec: type the _bundlespeccontentopts dictionary
If only we had a tool to detect the kind of stupid error we just fixed… ho wait.
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 09 Apr 2024 14:36:01 +0200] rev 51573
bundlespec: fix the "streamv2" and "streamv3-exp" variant
In
c4aab3661f25, we broken this feature by adding unicode instead of bytes to
the dictionary.
On the other hand, this feature was never tested, so augment the tests to tests
this.
Arseniy Alekseyev <aalekseyev@janestreet.com> [Thu, 04 Apr 2024 14:15:32 +0100] rev 51572
wireprotoserver: ensure that output stream gets flushed on exception
Previously flush was happening due to Python finalizer being run on
`BufferedWriter`. With upgrade to Python 3.11 this started randomly
failing.
My guess is that the finalizer on the raw `FileIO` object may
be running before the finalizer of `BufferedWriter` has a chance to run.
At any rate, since we're not relying on finalizers in the happy case
we should also not rely on them in case of exception.
Arseniy Alekseyev <aalekseyev@janestreet.com> [Mon, 15 Apr 2024 16:33:37 +0100] rev 51571
match: strengthen visit_children_set invariant, Recursive means "all files"
My previous interpretation of "Recursive" was too relaxed: I thought it
instructed the caller to do something like this:
> you can stop calling `visit_children_set` because you'll need to descend into
> every directory recursively, but you should still check every file if it
> matches or not
Whereas the real instruction seems to be:
> I guarantee that everything in this subtree matches, you can stop
> querying the matcher for all files and dirs altogether.
The evidence to support this:
- the test actually passes with the stronger invariant, revealing no
exceptions from this rule
- the implementation of `visit_children_set` for `DifferenceMatcher`
clearly relies on this requirement, so it must hold for that not to
lead to bugs.
Arseniy Alekseyev <aalekseyev@janestreet.com> [Fri, 12 Apr 2024 16:09:45 +0100] rev 51570
match: fix the rust-side bug in visit_children_set for rootfilesin matchers
The fix is checked by `test_pattern_matcher_visit_children_set` test,
which is what caught the bug in the first place, but also by an end-to-end
test that I made for this purpose.
Accept the new results of Cargo tests
Many of these were already annotated with "FIXME", which is a good sign.
Arseniy Alekseyev <aalekseyev@janestreet.com> [Fri, 12 Apr 2024 15:39:21 +0100] rev 51569
match: fix the "visitdir" method on "rootfilesin" matchers
This fixes just the Python side, the fix for the rust side will follow shortly.
Arseniy Alekseyev <aalekseyev@janestreet.com> [Fri, 12 Apr 2024 14:21:14 +0100] rev 51568
match: rename RootFiles to RootFilesIn for more consistency
Arseniy Alekseyev <aalekseyev@janestreet.com> [Fri, 12 Apr 2024 14:17:10 +0100] rev 51567
match: small tweak to PatternMatcher.visit_children_set
This makes it a bit more efficient (avoid a computation in case of early
return), and in my opinion clearer.
Arseniy Alekseyev <aalekseyev@janestreet.com> [Fri, 12 Apr 2024 14:09:55 +0100] rev 51566
matchers: fix the bug in rust PatternMatcher that made it cut off early
This brings the rust output in line with the Python output.
Arseniy Alekseyev <aalekseyev@janestreet.com> [Fri, 12 Apr 2024 13:48:38 +0100] rev 51565
tests: add an end-to-end test to show a bug in `visit_children_set`
Concretely, `rootfilesin` is completely broken with respect to
`visit_children_set` optimization.
Arseniy Alekseyev <aalekseyev@janestreet.com> [Thu, 11 Apr 2024 19:57:36 +0100] rev 51564
tests: add tests and document expectations from visit_children_set in rust
The tests this patch are adding have the form of formal spec in
invariants::visit_children_set::holds,
and then a series of checks that all examples must satisfy this
formal spec.
I tried to make the spec consistent with how this function is used
and how it was originally conceived. This is in conflict with how it's
documented in Rust. Some of the implementations also fail to implement
this spec, which leads to bugs, in particular when complicated patterns
are used with `hg status`.
Arseniy Alekseyev <aalekseyev@janestreet.com> [Thu, 11 Apr 2024 15:53:23 +0100] rev 51563
tests: add a test that demonstrates a bug in rhg status pattern handling
The bug is in [visit_children_set], will be elaborated on in
follow-up changes.
Pierre-Yves David <pierre-yves.david@octobus.net> [Fri, 05 Apr 2024 01:07:46 +0200] rev 51562
bundle-spec: properly parse boolean configuration as boolean
Before this changesets "v2;revbranchcache=no" would actually request the
addition for a revbranchcache part as the non-empty string `"0"` is `True`
Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 04 Apr 2024 16:41:43 +0200] rev 51561
bundle-spec: properly identify changegroup-less bundle
It is possible to produce a bundle without changegroup. For example if we want
to only send phases or obsolescence information. However that lead to crash for
command that identifies bundle content. So we fix that.
The test will come in the next changesets, when we fix another bug preventing to
generate such bundle by hand.
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 03 Apr 2024 15:33:25 +0200] rev 51560
perf: create the temporary target next to the source in stream-consume
See inline comment for rational.
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 03 Apr 2024 16:00:37 +0200] rev 51559
setup: display return code information about failed `hg` call
This help to understand what is going wrong when things goes wrong.
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 02 Apr 2024 21:53:17 +0200] rev 51558
bundlespec: rationalize the way we specify stream bundle version
Instead of having weird dedicated option for each version (v2, v3, etc) we
reuse the same "stream" parameters. This is consistent with the ability to
request a stream clone using "none-v2;stream=v2".
This changeset introduce no user visible change, this is pure internal cleaning.
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 02 Apr 2024 17:02:39 +0200] rev 51557
bundle: do no check the changegroup version if no changegroup is included
We don't need to check the compatibility of something we will not use.
In practice this was getting in the was of `streamv2` bundles on a narrow
repository as the 'cg.version=02' value was rejected by this checks.
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 27 Mar 2024 18:51:33 +0000] rev 51556
perf-stream-consume: use the source repository config when applying
This might contains critical configuration for the benchmark, like enabling of
extensions like narrow.
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 27 Mar 2024 17:46:23 +0000] rev 51555
unbundle: move most of the logic on cmdutil to help debug::unbundle reuse
This make sure `hg debug::unbundle` focus on the core logic.
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 27 Mar 2024 17:29:48 +0000] rev 51554
postincoming: move to cmdutil
This looks like a good place for it to live.
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 27 Mar 2024 17:21:46 +0000] rev 51553
postincoming: avoid computing branchhead if no report will be posted
This otherwise defeat some of the branch v3 optimization.
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 26 Mar 2024 13:46:44 +0000] rev 51552
streamclone: stop listing files for entries that have no volatile files
This will save a lot of python related time.
This significantly boost performance. The following number comes from a large
private repository using perf::stream-locked-section:
base-line: 35.04 seconds
prev-change: 24.51 seconds (-30%)
prev-change: 20.88 seconds (-40%)
prev-change: 14.22 seconds (-60%)
this-change: 11.58 seconds (-67% from baseline; -18% from prev)
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 26 Mar 2024 13:34:05 +0000] rev 51551
stream-clone: disable gc for the initial section for the v3 format
The number of small container created turn Python in a gc-frenzy that seriously
impact performance.
This significantly boost performance. The following number comes from a large
private repository using perf::stream-locked-section:
base-line: 35.04 seconds
prev-change: 24.51 seconds (-30%)
prev-change: 20.88 seconds (-40%)
this-change: 14.22 seconds (-60% from baseline; -31% from prev)
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 26 Mar 2024 13:32:46 +0000] rev 51550
stream-clone: disable gc for `_entries_walk` duration
The number of small container created turn Python in a gc-frenzy that seriously
impact performance.
This significantly boost performance. The following number comes from a large
private repository using perf::stream-locked-section:
base-line: 35.04 seconds
prev-change: 24.51 seconds (-30%)
this-change: 20.88 seconds (-40% from baseline; -15% from previous changes)
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 26 Mar 2024 13:28:52 +0000] rev 51549
nocg: make the utility work are both a decorator and context manager
In some case, the context manager version will be simpler.
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 26 Mar 2024 11:24:20 +0000] rev 51548
stream-clone: stop getting the file size of all file in v3
The point of v3 is to do less work in the locked section. It was currently not
the case.
This significantly boost performance. The following number comes from a large
private repository using perf::stream-locked-section:
base-line: 35.03 seconds
this-change: 24.50 seconds (-30%)
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 26 Mar 2024 18:55:40 +0000] rev 51547
stream: in v3, skip the "size" fast path if the entries as some unknown size
We are about to prefetch size during the lock less in the v3 case. So we need to
avoid trying to use that prefetched size when it is not available.
See next changeset for the motivation.
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 26 Mar 2024 08:43:20 +0000] rev 51546
perf-stream-locked-section: advertise the right version key in the help
As the v3 format is still experimental, its key is "v3-exp". The help text was
not pointing that out.
(we also fix `perf::stream-generate` in the process)
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 26 Mar 2024 08:39:08 +0000] rev 51545
perf-stream-locked-section: fix the call to the v3 generator
That generator simply return chunks so we should not assign the return to a
tuple.
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 26 Mar 2024 08:36:47 +0000] rev 51544
perf-stream-locked-section: actually use v1 generation when requested
We were fetching a v1 generator but actually using the v2 function…
Raphaël Gomès <rgomes@octobus.net> [Fri, 29 Mar 2024 21:39:00 +0100] rev 51543
branching: merge stable into default
Raphaël Gomès <rgomes@octobus.net> [Fri, 29 Mar 2024 21:37:09 +0100] rev 51542
Added signature for changeset
803e61387e86
Raphaël Gomès <rgomes@octobus.net> [Fri, 29 Mar 2024 21:37:06 +0100] rev 51541
Added tag 6.7.2 for changeset
803e61387e86
Raphaël Gomès <rgomes@octobus.net> [Thu, 28 Mar 2024 14:47:20 +0000] rev 51540
relnotes: add 6.7.2
Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 28 Mar 2024 07:12:09 +0000] rev 51539
bundle2: make the "hgtagsfnodes" part advisory
This bundle2 part is about helping the client to warms its cache. There is no
reason for it to be mandatory.
So we mark it advisory.
Pierre-Yves David <pierre-yves.david@octobus.net> [Mon, 25 Mar 2024 16:27:48 +0000] rev 51538
branching: merge stable into default
Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 07 Mar 2024 10:57:16 +0100] rev 51537
branchcache: allow to detect "pure topological case" for branchmap
We don't rum this detection every time we run the branchcache, that would be
costly. However we now do it when running `hg debugupdatecache`.
This will help existing repository to benefit from the fastpath when possible.
Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 07 Mar 2024 04:15:23 +0100] rev 51536
branchcache: add a "pure topological head" fast path
In a narrow but actually quick common case, all topological heads are all on
the same branch and all open. In this case, computing the branch map is very
simple. We can quickly detect situation where this situation will not change.
So we update the V3 format to be able to express this situation and upgrade the
update code to detect we remains in that mode.
The branch cache is populated with the actual value when the branch map is
accessed, but the update_disk method can do the update without needing to
populate it.
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 06 Mar 2024 16:18:03 +0100] rev 51535
branchcache: move the processing of the new data in a dedicated method
In a future changeset, this will allow the V3 of the branch cache to use a fast
path when possible.
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 06 Mar 2024 16:10:44 +0100] rev 51534
branchcache: gather newly closed head in a dedicated set
This is part of a series to more clearly split the update in two step. This
will allow us to introduce a fast path during update in a future changeset.
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 06 Mar 2024 16:09:42 +0100] rev 51533
branchcache: gather new obsolete revision in a set
This is part of a series to more clearly split the update in two step. This
will allow us to introduce a fast path during update in a future changeset.
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 06 Mar 2024 15:54:22 +0100] rev 51532
branchcache: filter obsolete revisions sooner
Since we won't do anything with the obsolete revisions, we can just ignore them
sooner.
Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 07 Mar 2024 10:55:22 +0100] rev 51531
branchcache: skip entries that are topological heads in the on disk file
In the majority of cases, topological heads are also branch heads. We have
efficient way to get the topological heads and efficient way to retrieve
their branch information. So there is little value in putting them in the branch
cache file explicitly. On the contrary, writing them explicitly tend to create
very large cache file that are inefficient to read and update.
So the branch cache v3 format is no longer including them. This changeset focus
on the format aspect and have no focus on the performance aspect. We will cover
that later.
Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 07 Mar 2024 01:35:43 +0100] rev 51530
branchcache: simplify the branch rev cache test
We don't need that many content dump and this gets in the way in change in
access pattern (e.g. accessing revision in a different order change the order of
branches in the "names" file).
So we simplify this test in advance.
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 06 Mar 2024 11:39:44 +0100] rev 51529
branchcache: store filtered hash and obsolete hash independently for V3
This will avoid the bug covered in tests/test-branches-obsolete.t when we stop
storing all heads explicitly in V3.
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 06 Mar 2024 12:07:31 +0100] rev 51528
branchcache: show the cache file content in test-branches-obsoletes.t
This help to track the changes in format between v2 and v3.
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 06 Mar 2024 02:20:53 +0100] rev 51527
branchcache: rework the `filteredhash` logic to be more generic
We now have a more flexible `key_hashes` tuple. We duplicated various logic in
the V2 and V3 version of the cache as the goal is to start changing the logic
for V3 in the next few changesets.
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 06 Mar 2024 01:53:52 +0100] rev 51526
filteredhash: rename the filteredhash function
The new name is less ambiguous, as we are about to introduce an alternative
function it seems like a good idea to have clearer name to distinct the two.
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 06 Mar 2024 01:43:51 +0100] rev 51525
filteredhash: split the computation of revision sets
The branch2's filteredhash combines the filtered revisions and the obsolete
ones, this will creates issue for implicit reference to heads we want to
introduce for the v3 of the branch cache format. So we isolate this logic for
alternative use.
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 05 Mar 2024 15:21:18 +0100] rev 51524
filteredhash: move the hashing in its own function
This will help us to reuse this logic in variants of the hashes used for branch
cache validation.
Pierre-Yves David <pierre-yves.david@octobus.net> [Sun, 25 Feb 2024 23:31:50 +0100] rev 51523
branchcache: cleanup the final key generation after update
A lot of duplicated work seemed to be done, as we already update the tiprev and
tipnode when needed right before. So we simplify that part to focus on the
filtered hash.
See inline comment for details.
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 28 Feb 2024 12:56:08 +0100] rev 51522
branchcache: add more test for the logic around obsolescence and branch heads
While working on branch-cache-v3, we noticed some ambiguity in the
filtered+obsolete hash. However this was only caught by a rebase test by
chance.
It seems important to explicitly tests these cases.
Pierre-Yves David <pierre-yves.david@octobus.net> [Mon, 26 Feb 2024 15:44:44 +0100] rev 51521
branchcache-v3: use more explicit header line
The key-value approach is clearer and gives more rooms to have the format evolve
in a clear way. It also provides extension (like topic) simpler way to extend
the validation scheme.
This is just a small evolution, the V3 format is still a work in progress.
Pierre-Yves David <pierre-yves.david@octobus.net> [Mon, 26 Feb 2024 14:20:36 +0100] rev 51520
branchcache-v3: introduce a v3 format
For now the format is the very same, however we will start changing it in
future changesets.
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 27 Feb 2024 14:04:29 +0100] rev 51519
branchcache: use an explicit class for the v2 version
This prepare the introduction of an experimental v3 format version.
In the process, we move the description of the format in that new class.
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 27 Feb 2024 15:33:21 +0100] rev 51518
branchcache: add some blank line in a test
This helps each section to stand out.
Pierre-Yves David <pierre-yves.david@octobus.net> [Mon, 25 Mar 2024 02:09:15 +0100] rev 51517
phases: update the phase set as we go during retract boundary
Apparently iterating over the `changed_revs` dictionary is very expensive.
On mozilla-try-2019-02-18, a perf::unbundle call with a 10 000 changesets
bundle gives give use the following timing.
e57d4b868a3e: 4.6 seconds
ac1c75188440: 102.5 seconds
prev-changeset: 30.0 seconds
this-changeset: 4.6 seconds
So, the performance regression is gone.
Once again: thanks to marvelous Python!
Pierre-Yves David <pierre-yves.david@octobus.net> [Mon, 25 Mar 2024 01:50:31 +0100] rev 51516
phases: avoid a potentially costly dictionary interation in some case
If we retract for the draft phase, there is not non-public item to be retracted
and we can skip this part. This part is was apparently super costly thanks to
Python.
On mozilla-try-2019-02-18, a perf::unbundle call with a 10 000 changesets
bundle gives give use the following timing.
e57d4b868a3e: 4.6 seconds
ac1c75188440: 102.5 seconds
this-changeset: 30.0 seconds
So we recovered about ⅔ of the regression, the next changeset will give us the
rest back.
Raphaël Gomès <rgomes@octobus.net> [Thu, 21 Mar 2024 12:26:46 +0100] rev 51515
branching: merge stable into default
Raphaël Gomès <rgomes@octobus.net> [Thu, 21 Mar 2024 12:24:42 +0100] rev 51514
Added signature for changeset
2e6fde2ed01e
Raphaël Gomès <rgomes@octobus.net> [Thu, 21 Mar 2024 12:24:36 +0100] rev 51513
Added tag 6.7.1 for changeset
2e6fde2ed01e
Raphaël Gomès <rgomes@octobus.net> [Thu, 21 Mar 2024 12:23:25 +0100] rev 51512
relnotes: add 6.7.1
Felipe Resende <felipe@fcresende.dev.br> [Sat, 16 Mar 2024 21:02:19 -0300] rev 51511
subrepo: fix normalizing paths with scheme
After revision
0afe96e374a7, subrepo paths were normalized using
posixpath.normpath and that resulted in ssh paths being wrongly converted
from ssh://host/path to ssh:/host/path
This fix applies the same logic used in urlutil.url to split the path scheme
from the rest and only use posixpath.normpath to the string after scheme://
Felipe Resende <felipe@fcresende.dev.br> [Sat, 16 Mar 2024 18:37:07 -0300] rev 51510
sshpeer: fix path when handling invalid url exception
In
73ed1d13c0bf the code was refactored but the error handling seems to have
been missed (or maybe the object shoud have implemented __bytes__)
Raphaël Gomès <rgomes@octobus.net> [Mon, 18 Mar 2024 11:25:21 +0100] rev 51509
delta-search: fix crash caused by unbound variable
This code path was apparently not tested. This fixes a crash when cloning the
Tryton repo.
Raphaël Gomès <rgomes@octobus.net> [Fri, 15 Mar 2024 10:52:51 +0100] rev 51508
branching: merge stable into default
Raphaël Gomès <rgomes@octobus.net> [Fri, 15 Mar 2024 10:49:44 +0100] rev 51507
Added signature for changeset
c9ceb4f60256
Raphaël Gomès <rgomes@octobus.net> [Fri, 15 Mar 2024 10:49:40 +0100] rev 51506
Added tag 6.7 for changeset
c9ceb4f60256
Pierre-Yves David <pierre-yves.david@octobus.net> [Fri, 15 Mar 2024 01:31:57 +0100] rev 51505
phases: avoid N² behavior in `advanceboundary`
We allowed duplicated entries in the deque, which each entry could potentially
insert all its ancestors. So advancing boundary for the full repository would
mean each revision would walk all its ancestors, resulting in O(N²) iteration.
For repository of any decent size, N² is quickly insane.
We introduce a simple set to avoid this and get back to reasonable performance.
Raphaël Gomès <rgomes@octobus.net> [Thu, 14 Mar 2024 16:25:46 +0100] rev 51504
relnotes: add 6.7
Raphaël Gomès <rgomes@octobus.net> [Thu, 14 Mar 2024 11:24:52 +0100] rev 51503
admin-commands: move the chainsaw extension to the admin commands module
Activating an extension is always a little bit of a chore and the long name,
options and "chainsaw" bits are deterrent enough.
This also allows us to help the discoverability for people looking for
repo "administration" tools, with the widest semantic of "administration".
Anton Shestakov <av6@dwimlabs.net> [Wed, 13 Mar 2024 16:22:13 -0300] rev 51502
obsutil: sort metadata before comparing in geteffectflag()
This is probably less important now that we dropped Python 2. We do still
support Python 3.6 though, and the dictionaries aren't ordered there either
(that was a big change that came with 3.7).
Still, maybe it's a good idea to sort metadata explicitly.
Pierre-Yves David <pierre-yves.david@octobus.net> [Mon, 11 Mar 2024 11:11:34 +0100] rev 51501
tests: disable revlog compression in test-generaldelta.t (
issue6867)
The revlog compression makes a lot of numbers unstable. Since checking revlog
compression is not the goal of this test, we disable the compression to get
stable numbers.
This should avoid wasting more time on this kind of changes in the future.
Pierre-Yves David <pierre-yves.david@octobus.net> [Mon, 11 Mar 2024 11:09:29 +0100] rev 51500
test-general-delta: actually test optimize-delta-parent-choice=no
Since the configuration was not explicit, the case stopped testing what it
intended to test when the default value changed.
Pierre-Yves David <pierre-yves.david@octobus.net> [Mon, 11 Mar 2024 13:09:01 +0100] rev 51499
test-chg: stabilize the log checking
The "worker process exited" line have been making the CI flaky for a long time.
Lets sort this out.
Pierre-Yves David <pierre-yves.david@octobus.net> [Mon, 11 Mar 2024 12:03:40 +0100] rev 51498
tests: fix test-patchbomb-tls.t instability
The flakiness on chg is caused by a client that exit faster than the server
output log.
So actively wait for the server to issue the expected output (with a small
timeout)
Pierre-Yves David <pierre-yves.david@octobus.net> [Mon, 11 Mar 2024 16:05:28 +0100] rev 51497
test-lock: use synchronisation file instead of sleep
This will prevent the test to be flaky on load.
Pierre-Yves David <pierre-yves.david@octobus.net> [Sun, 10 Mar 2024 03:29:12 +0100] rev 51496
branchcache: use update_disk to refresh 'served' and 'served.hidden'
The `update_disk` method is dedicated to this kind of usecase. Now that the writting patterns are more consistent, we can use it to warm these two important cache.
I am dropping the first comment about "refreshing all the others" because it is
false. If a branchmap already exist for "served", none of the subset will be
updated.
Pierre-Yves David <pierre-yves.david@octobus.net> [Sun, 10 Mar 2024 03:25:04 +0100] rev 51495
branchcache: explictly update disk state only if no transaction exist
If a transaction exist the `write_dirty` call will eventually be done and the state will be synched on disk. It is better to no interfer with that.
Pierre-Yves David <pierre-yves.david@octobus.net> [Sun, 10 Mar 2024 03:32:50 +0100] rev 51494
branchcache: do not use `__getitem__` in updatecache
The `update_disk` method uses `updatecache` and the point of `update_disk` is to be able to do alternative processing to the one we do in `__getitem__`. So we calling `__getitem__` in `updatecache` defeat this purpose.
Instead we do the equivalent explicitly to preserve the spirit of `update_disk` (that we will actually put to use soon, I promise)
Pierre-Yves David <pierre-yves.david@octobus.net> [Sun, 10 Mar 2024 05:10:00 +0100] rev 51493
branchcache: explicitly track inheritence "state"
We move from a binary "dirty" flag to a three value "state": "clean", "inherited", "dirty".
The "inherited" means that the branch cache is not only "clean", but it is a
duplicate of its parent filter.
If a branch cache is "inherited", we can non only skip writing its value on
disk, but it is a good idea to delete any stale value on disk, as those will
just waste time (and possibly induce bug) in the future.
We only do this in the update related to transaction or explicit cache update
(e.g `hg debugupdatecache`). Deleting the file when we simply detected a stall
cache during a read only operation seems more dangerous.
We rename `copy` to `inherit_for` to clarify we associate a stronger semantic
to the operation.
Pierre-Yves David <pierre-yves.david@octobus.net> [Sun, 10 Mar 2024 04:53:17 +0100] rev 51492
branchcache: stop writing more branchcache file on disk than needed
Before this change, we were unconditionally writing a branchmap file for the
filter level passed to `update_disk`. This is actually counter productive if no
update were needed for this filter level. In many case, the branch cache for a
filter level is identical to its parent "subset" and it is better to simply
keep the subset update and reuse it every time instead of having to do identical
work for similar subset.
So we change the `update_disk` method to only write a file when that filter
level differ from its parent. This removes many cases where identical files were
written, requiring multiple boring update in the test suite.
The only notable changes is the change to `test-strip-branch-cache.t`, this
case was checking a scenario that no longer reproduce the bug as writing less
branchmap file result in less stalled cache on disk.
Strictly speaking, we could create a more convoluted scenario that create a
similar issue. However the next changeset would also cover that scenario so we
directly updated that test case to a "no longer buggy" state.
Pierre-Yves David <pierre-yves.david@octobus.net> [Fri, 08 Mar 2024 16:49:06 +0100] rev 51491
branchcache: do not copy the `_dirty` flag
If the inherited branch cache is dirty, it will be written on disk, and the
super-set did not need to modify it, the on disk value for the subset will be
re-useable as is. So the super set does not needs to write the very same content
itself.
Pierre-Yves David <pierre-yves.david@octobus.net> [Fri, 08 Mar 2024 16:52:08 +0100] rev 51490
branchcache: explicitly assert that copy is always about inheritance
This would catch cases where copy is used for something else if any existed.
Pierre-Yves David <pierre-yves.david@octobus.net> [Sat, 09 Mar 2024 02:07:15 +0100] rev 51489
branchcache: stop using `copy(…)` in `replace(…)`
The `copy` method is mostly used for a filter level to inherit the branchmap
from a subset. So we stop using (abusing) it in "replace" to ensure `copy` is
used only for inheritance purposes.
Since `replace` is a method of the BranchMapCache, it seems fine to do lower
level operation there.
Pierre-Yves David <pierre-yves.david@octobus.net> [Fri, 08 Mar 2024 16:47:32 +0100] rev 51488
branchcache: change the _delayed flag to an explicit `_dirty` flag
This is more consistent with the logic we use for other object and it open the way to a clearer management of the cache state.
Now, cache are created clean, cache update mark them dirty, writing them on
disk mark them clean again.
Pierre-Yves David <pierre-yves.david@octobus.net> [Fri, 08 Mar 2024 15:50:15 +0100] rev 51487
branchcache: write branchmap in subset inheritance order
This way, we can guarantee a valid subset has been written before touching the
branchmap of another filter.
This is especially useful as we are bout to start deleting outdated branchmap
file.
Pierre-Yves David <pierre-yves.david@octobus.net> [Fri, 08 Mar 2024 15:06:54 +0100] rev 51486
branchcache: do not accept "empty update"
This currently does not happens and it will be simpler that is remains that way.
If all update do something, we will be able to simply declare, in a later
changesets, that all update to result in a dirty branchcache.
Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 07 Mar 2024 11:04:34 +0100] rev 51485
branchcache: avoid created a `None` filter repoview when writing
The repoview class is not intended to be used for unfiltered repository.
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 28 Feb 2024 22:49:55 +0100] rev 51484
stream-clone-tests: stop filtering non existent warning
This filtering was introduced in
74c004a515bc, however there is already no
warning in that changeset. So I guess the warnings existed when we the patch
was created but the problem was solved in another changeset that
74c004a515bc,
rebased on.
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 28 Feb 2024 22:46:12 +0100] rev 51483
stream-clone-test: simplify case testing obsolescence
There is only two important things in this test:
- the number of file we send, to show we picked the obsstore.
- the resulting state, to show we did alter things in the process.
The rest are of the number are very fragile and consume a lot of time for little
value when adjusting formats, caches, and protocol.
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 28 Feb 2024 22:43:07 +0100] rev 51482
stream-clone-test: simplify the case testing phases
There is only two important things in this test:
- the number of file we send, to show we picked the phase roots.
- the resulting phases, to show we did not modified them.
The rest are of the number are very fragile and consume a lot of time for little
value when adjusting formats, caches, and protocol.
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 28 Feb 2024 22:39:10 +0100] rev 51481
stream-clone-test: simplify bookmark clone
The important things to test here is the number of file included (to catch that
the bookmark file was sent). So we keep that part non glob'ed but glob the
rest.
The glob'ed numbers are very fragile and consume a lot of time for little value
when adjusting formats, caches, and protocol.
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 28 Feb 2024 22:31:42 +0100] rev 51480
stream-clone-test: add a verify call to the "clone while changing" case
It seems useful to very that the clone did not result in a corrupted copy.
Pierre-Yves David <pierre-yves.david@octobus.net> [Fri, 08 Mar 2024 10:59:51 +0100] rev 51479
stream-clone-test: add title to various test cases
These case are fine as is, but as we are adding title to all the other as we
simplify them, lets add title for all cases.
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 28 Feb 2024 22:28:07 +0100] rev 51478
stream-clone-test: simplify testing of secret cloning restriction
Here, we just want to check if the streaming clone is allowed and used or not.
We do not care about the details of the clone itself.
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 28 Feb 2024 22:26:27 +0100] rev 51477
stream-clone-test: simplify the background file closing test
Here we just care about the fact the background file closing logic actually ran. We don't need to check the details of the cloning.
The details of the output is very fragile and consume a lot of time for little
value when adjusting formats, caches, and protocol. So we filter it out.
Pierre-Yves David <pierre-yves.david@octobus.net> [Fri, 08 Mar 2024 10:51:01 +0100] rev 51476
stream-clone-test: simplify the --uncompressed alias check
To check that --uncompressed is an alias we just need to check it trigger a
stream clone, we don't need to check anything else.
Pierre-Yves David <pierre-yves.david@octobus.net> [Fri, 08 Mar 2024 10:50:42 +0100] rev 51475
stream-clone-test: drop an automatic pattern replacement
That pattern is nice, but it prevent us to glob the number of bytes when we
don't care about them. We don't care about them more often that what we
currently checks so dropping this pattern will help use to simplify various
tests.
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 28 Feb 2024 22:15:33 +0100] rev 51474
stream-clone-test: simplify the test for getbundle with stream=1
The core of this tests is about checking we receive a stream bundle with such
request. We don't need to look at too much of the details of the stream itself.
Since the content of the stream if shifting overtime, Such check is very
fragile and consume a lot of time for little value when adjusting formats,
caches, and protocol.
So we reduce the size of what we check to focus on "is this a stream clone"
question.
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 28 Feb 2024 22:05:28 +0100] rev 51473
stream-clone-test: factor some piece of basic clone test out
Multiple parts of this case (listing cache, checking error) are common to all
cases and don't need to be in the conditionnal block.
This simplify the test update.
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 28 Feb 2024 22:01:09 +0100] rev 51472
stream-clone-test: simplify the case where server disabled it
We have an option to disable it, we don't need to test it with all protocol
variants.
In addition there is little value in looking at the bytes to bytes details of
the reply. Such check is very fragile and consume a lot of time for little
value when adjusting formats, caches, and protocol.
Georges Racinet <georges.racinet@octobus.net> [Mon, 11 Mar 2024 13:36:25 +0100] rev 51471
rust-matchers: raw regular expression builder
Extracting this `re_builder()` from `re_matcher()` makes it reusable
in more general cases than matching `HgPath` instances and would
help reducing code duplication in RHGitaly.
Georges Racinet <georges.racinet@octobus.net> [Mon, 11 Mar 2024 13:23:18 +0100] rev 51470
rust-filepatterns: export glob_to_re function
Making this function public should not risk freezing the internal API,
and it can be useful for all downstream code that needs to perform
glob matching against byte strings, such as RHGitaly where it will
be useful to match on branches and tags.
Pierre-Yves David <pierre-yves.david@octobus.net> [Mon, 11 Mar 2024 01:20:12 +0100] rev 51469
repoview: prevent `None` to be passed as the filtername
We let such instantiation slip in a previous commit, so we add an explicit check
to prevent it to happen in the future.
Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 07 Mar 2024 11:04:34 +0100] rev 51468
branchcache: avoid created a `None` filter repoview when writing
The repoview class is not intended to be used for unfiltered repository.
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 05 Mar 2024 15:07:47 +0100] rev 51467
rust-index: don't use mutable borrow to computed filtered heads
This does not need to mutate the index.
This is the prime suspect for some RuntimeError raised during some pushes.
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 05 Mar 2024 15:07:04 +0100] rev 51466
rust-index: don't use mutable borrow for head-diff computation
It does not needs to mutate the index.
This is one of the two suspects of RuntimeError being thrown during push.
Pierre-Yves David <pierre-yves.david@octobus.net> [Mon, 26 Feb 2024 15:26:08 +0100] rev 51465
branchcache: move head writing in a `_write_headers` method
Same rational: this will help having format variants.
Pierre-Yves David <pierre-yves.david@octobus.net> [Mon, 26 Feb 2024 15:25:41 +0100] rev 51464
branchcache: move head writing in a `_write_heads` method
Same rational: this will help having format variants.
Pierre-Yves David <pierre-yves.david@octobus.net> [Mon, 26 Feb 2024 15:23:45 +0100] rev 51463
branchcache: move the header loading in a `_load_header` class method
This will help changing header parsing in format variants.
Pierre-Yves David <pierre-yves.david@octobus.net> [Mon, 26 Feb 2024 15:15:10 +0100] rev 51462
branchcache: simplify a long line
Gratuitous change to help code readability.
Pierre-Yves David <pierre-yves.david@octobus.net> [Mon, 26 Feb 2024 15:12:20 +0100] rev 51461
branchcache: rename `load` to `_load_heads`
We are about to have more similar function, we rename the existing one to a more
meaningful name and mark it private in the process.
Pierre-Yves David <pierre-yves.david@octobus.net> [Sun, 25 Feb 2024 20:40:37 +0100] rev 51460
branchcache: move the filename to a class attribute
This prepare the introduction of more variant of cache.
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 27 Feb 2024 22:52:00 +0100] rev 51459
test-clonebundles: simplify matching to be less flavor depends
We keep the files and bytes output for the first call, but then we mostly check
that we are being served a stream-clone bundle, not the actual content and size
of the bundle. That aspect being tested by the stream clone test themselves.
Pierre-Yves David <pierre-yves.david@octobus.net> [Sun, 25 Feb 2024 23:05:33 +0100] rev 51458
repoview: fix changelog.__contains__ method
This have been around for ten years, so we can safely that this method have few
callers. However I am about to add one.
Pierre-Yves David <pierre-yves.david@octobus.net> [Mon, 08 Jan 2024 15:11:34 +0100] rev 51457
branchcache: unconditionally write delayed branchmap
Pierre-Yves David <pierre-yves.david@octobus.net> [Sun, 25 Feb 2024 16:14:15 +0100] rev 51456
branchcache: drop the unused `_verifyclosed`
This code appears dead since its introduction about 5 years ago in this three
consecutive commits:
-
6578654916ae → introduce the method with two calls
-
7c9d4cf23adf → remove first call
-
be5eeaf5c24a → remove second call
o changeset:
be5eeaf5c24a
| user: Pulkit Goyal <pulkit@yandex-team.ru>
| date: Fri Apr 05 15:57:09 2019 +0300
| summary: branchcache: don't verify closed nodes in _branchtip()
|
o changeset:
7c9d4cf23adf
| user: Pulkit Goyal <pulkit@yandex-team.ru>
| date: Fri Apr 05 15:56:33 2019 +0300
| summary: branchcache: don't verify closed nodes in iteropen()
|
o changeset:
6578654916ae
| user: Pulkit Goyal <pulkit@yandex-team.ru>
~ date: Mon Apr 01 13:56:47 2019 +0300
summary: branchcache: lazily validate nodes from the branchmap
Pierre-Yves David <pierre-yves.david@octobus.net> [Mon, 26 Feb 2024 15:46:24 +0100] rev 51455
branchcache: dispatch the code into the dedicated subclass
The code useful only to the local brancache have now been moved into the
dedicated subclass. This will help improving the branchcache code without subtle
breaking the remote variants.
Pierre-Yves David <pierre-yves.david@octobus.net> [Sun, 25 Feb 2024 14:09:36 +0100] rev 51454
branchcache: introduce a base class for branchmap
This will help define a clear boundary between the two.
Pierre-Yves David <pierre-yves.david@octobus.net> [Mon, 19 Feb 2024 12:09:06 +0100] rev 51453
branchcache: fix the copy code
We copy some internal attribute along too. This should prevent inconsistency in
the resulting branchmap.
Pierre-Yves David <pierre-yves.david@octobus.net> [Mon, 19 Feb 2024 13:11:42 +0100] rev 51452
branchcache: pass a "verify_node" attribut to __init__ instead of hasnode
The hasnode callback cannot be inherited and is dropped on copy, which seems
like a bad idea. Instead we pass the actual semantic as a parameter and let the
internal logic deal with it.
Pierre-Yves David <pierre-yves.david@octobus.net> [Mon, 19 Feb 2024 11:59:56 +0100] rev 51451
branchcache: stop storing a repository instance on the cache altogether
We did not really needed it and we do not needs it anymore at all. So lets make
things simpler for consistency and garbage collecting and stop storing it
altogether.
Pierre-Yves David <pierre-yves.david@octobus.net> [Mon, 19 Feb 2024 11:43:19 +0100] rev 51450
branchcache: pass the target repository when copying
Branchmap are usually copied to be used on a different repoview using a
different filter level. Passing the repository around means the repository in
`branchcache._repo` will drift from the actual branchmap filter.
This is currently "fine" because the repo is only used to retrieve the `nullid`
value. However, this is a fairly big trap for any extension or future code using
the `_repo` attribute.
The replace logic is now using a copy to ensure the right repository view is
used to initialized the cached value.
We add a couple of assert for make sure this inconsistency does not sneak back.
Pierre-Yves David <pierre-yves.david@octobus.net> [Fri, 19 Jan 2024 11:30:10 +0100] rev 51449
branchcache: have an explicit method to update the on disk cache
Explicit is better and will give use more flexibility for future evolution of
the storage.
Martin von Zweigbergk <martinvonz@google.com> [Thu, 29 Feb 2024 14:13:21 -0800] rev 51448
crecord: drop calls to `curses.endwin()`
We got a bug report where `curses.endwin()` failed with `_curses.error: endwin()
returned ERR`. Looking at
e306d552dfb12, it seems like we should be able to just
remove these calls.
Pierre-Yves David <pierre-yves.david@octobus.net> [Mon, 04 Mar 2024 04:16:15 +0100] rev 51447
config: move the option to mmap rev branch cache in the storage section
See previous commit for rational.
Pierre-Yves David <pierre-yves.david@octobus.net> [Mon, 04 Mar 2024 04:13:33 +0100] rev 51446
config: document the storage and format sections
This should help people to put configuration in the right section.
Pierre-Yves David <pierre-yves.david@octobus.net> [Mon, 26 Feb 2024 12:59:57 +0100] rev 51445
rust-index: drop offset_override
The inline `offsets` value diverge from the one on disk for added value, so the
offset_override tricks is not going to work well once we start having the full
revlog logic in Rust.
We remove it beforehand and align the Rust logic to the Python one (adjusting
the segment offset at read time for inline revlog).
Pierre-Yves David <pierre-yves.david@octobus.net> [Mon, 26 Feb 2024 13:41:02 +0100] rev 51444
rust-index: stop calling `with_offset` in the tests
We are not adding any data, so why are we setting any offset?
Raphaël Gomès <rgomes@octobus.net> [Fri, 23 Feb 2024 15:57:50 +0100] rev 51443
branching: merge stable into default
Raphaël Gomès <rgomes@octobus.net> [Fri, 23 Feb 2024 15:55:53 +0100] rev 51442
Added signature for changeset
d1d48d18db37
Raphaël Gomès <rgomes@octobus.net> [Fri, 23 Feb 2024 15:55:49 +0100] rev 51441
Added tag 6.7rc0 for changeset
d1d48d18db37
Raphaël Gomès <rgomes@octobus.net> [Fri, 23 Feb 2024 15:18:29 +0100] rev 51440
relnotes: add 6.7rc0
Raphaël Gomès <rgomes@octobus.net> [Fri, 23 Feb 2024 15:18:17 +0100] rev 51439
relnotes: remove outdated message from `next`
Raphaël Gomès <rgomes@octobus.net> [Fri, 23 Feb 2024 15:10:44 +0100] rev 51438
branching: merge default into stable for 6.7rc0
Raphaël Gomès <rgomes@octobus.net> [Fri, 23 Feb 2024 15:09:18 +0100] rev 51437
branching: merge stable into default
Pierre-Yves David <pierre-yves.david@octobus.net> [Fri, 23 Feb 2024 14:07:33 +0100] rev 51436
perf: add a --as-push option to perf::unbundle
This turned out to make a quite significant difference.
Pierre-Yves David <pierre-yves.david@octobus.net> [Fri, 23 Feb 2024 06:25:09 +0100] rev 51435
chainsaw-update: exit early if one of the intermediate command fails
That will prevent the user to be presented with a start that pretend to be
consistent with the request, but is not.
Pierre-Yves David <pierre-yves.david@octobus.net> [Fri, 23 Feb 2024 03:32:35 +0100] rev 51434
chainsaw-update: lock the repository for the duration of the operation
This should prevent and catch some misusage where something else try to touch
the repository.
Georges Racinet <georges.racinet@octobus.net> [Fri, 23 Feb 2024 11:41:55 +0100] rev 51433
chainsaw-update: taking care of initial cloning
Perhaps we should go just a bit lower level than this `instance()`,
since the main added value in our use-case is full path resolution,
that we need to do anyway for the rmtree cleanup.
Pierre-Yves David <pierre-yves.david@octobus.net> [Fri, 23 Feb 2024 11:30:58 +0100] rev 51432
chainsaw-update: use a graph with branching in graph
This will be relevant for the next improvement of `chainsaw-update`.
Georges Racinet <georges.racinet@octobus.net> [Wed, 17 Jan 2024 14:39:06 +0100] rev 51431
chainsaw-update: log actual locks breaking
Previously, the command would simply state that it was about
to break locks, not if there was actually some to break.
This version is race-free. It would be also possible to display
the content of the lock before hand (not race-free but informative
in almost all cases).
Georges Racinet <georges.racinet@octobus.net> [Wed, 17 Jan 2024 14:26:58 +0100] rev 51430
vfs: have tryunlink tell what it did
It is useful in certain circumstances to know whether vfs.tryunlink()
actually removed something or not, be it for logging purposes.
Georges Racinet <georges.racinet@octobus.net> [Sat, 26 Nov 2022 12:23:56 +0100] rev 51429
chainsaw: new extension for dangerous operations
The first provided command is `chainsaw-update`, whose one and single job is
to make sure that it will pull, update and purge the target repository,
no matter what may be in the way (locks, notably), see docstring for rationale.
Pierre-Yves David <pierre-yves.david@octobus.net> [Fri, 23 Feb 2024 03:45:07 +0100] rev 51428
rust: disable the RustIndex without persistent nodemap
See rational inline.
Pierre-Yves David <pierre-yves.david@octobus.net> [Fri, 23 Feb 2024 03:44:56 +0100] rev 51427
rust: stop claiming the C index is compatible with the rust code
This is no longer the case since the introduction of the pure Rust Index, and
was probably not the case since the MixedIndex itself.
So we fix the dedicated attribute value.
Raphaël Gomès <rgomes@octobus.net> [Thu, 22 Feb 2024 15:11:26 +0100] rev 51426
rust-index: remove one collect when converting back
Turns out this is slightly faster. Sending the results back to Python is still
the most costly (like 75% of the time) of the whole method, but it's about
as fast as it can be now.
hg perf::phases on mozilla-try-2023-03-22
before: 0.267114
after: 0.247101
Raphaël Gomès <rgomes@octobus.net> [Thu, 22 Feb 2024 15:06:16 +0100] rev 51425
rust-index: improve phase computation speed
While less memory efficient, using an array is *much* faster than using a
HashMap, especially with the default hasher. It even makes the code simpler,
so I'm not really sure what I was thinking in the first place, maybe it's more
obvious now.
This fix a significant performance regression when using the rust version of the
code. (however, the C code still outperform rust on this operation)
hg perf::phases on mozilla-try-2023-03-22
- 6.6.3: 0.451239 seconds
- before: 0.982495 seconds
- after: 0.265347 seconds
- C code: 0.183241 second
Pierre-Yves David <pierre-yves.david@octobus.net> [Fri, 23 Feb 2024 06:37:25 +0100] rev 51424
phases: directly update the phase sets in advanceboundary
This is similar to what we do in retractboundary. There is no need to invalidate
the cache if we have everything at hand to update it.
Pierre-Yves David <pierre-yves.david@octobus.net> [Fri, 23 Feb 2024 05:25:35 +0100] rev 51423
phases: large rework of advance boundary
In a similar spirit as the rework of retractboundary, the new algorithm is doing
an amount of work in the order of magnitude of the amount of changeset that
changes phases. (except to find new roots in impacted higher phases if any may
exists).
This result in a very significant speedup for repository with many old draft
like mozilla try.
runtime of perf:unbundle for a bundle constaining a single changeset (C code):
before 6.7 phase work: 14.497 seconds
before this change: 6.311 seconds (-55%)
with this change: 2.240 seconds (-85%)
Combined with the other patches that fixes the phases computation in the Rust
index, the rust code with a persistent nodemap get back to quite interresting
performances with 2.026 seconds for the same operation, about 10% faster than
the C code.
Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 22 Feb 2024 19:21:14 +0100] rev 51422
phases: apply similar early filtering to advanceboundary
advanceboundary is called the push's unbundle (but not the other unbundle) so
advanceboundary did not show up the profile I looked at so far.
We start with simple pre-filtering to avoid doing any work if we don't needs
too.
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 21 Feb 2024 11:09:25 +0100] rev 51421
phases: filter revision that are already in the right phase
No need to compute new roots if everything is already in order.
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 21 Feb 2024 13:05:29 +0100] rev 51420
phases: invalidate the phases set less often on retract boundary
We already have the information to update the phase set, so we do so directly
instead of invalidating the cache.
This show a sizeable speedup in our `perf::unbundle` benchmark on the
many-draft mozilla-try repository.
### data-env-vars.name = mozilla-try-2023-03-22-zstd-sparse-revlog
# benchmark.name = hg.perf.perf-unbundle
# bin-env-vars.hg.flavor = no-rust
# bin-env-vars.hg.py-re2-module = default
# benchmark.variants.
issue6528 = disabled
# benchmark.variants.revs = last-10
before: 2.055259 seconds
after: 1.887064 seconds (-8.18%)
# benchmark.variants.revs = last-100
before: 2.409239 seconds
after: 2.222429 seconds (-7.75%)
# benchmark.variants.revs = last-1000
before: 3.945648 seconds
after: 3.762480 seconds (-4.64%)
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 21 Feb 2024 13:05:23 +0100] rev 51419
phases: incrementally update the phase sets when reasonable
When the amount of manual walking is small, we update the phases set manually
instead of computing them from scratch. This should help small update. The next
changesets will make this used more often by reducing the amount of full
invalidation we do on roots upgrade.
The criteria for using an incremental upgrade are arbitrary, however, it "should
never hurt".
Pierre-Yves David <pierre-yves.david@octobus.net> [Fri, 23 Feb 2024 00:01:33 +0100] rev 51418
phasees: properly shallow caopy the phase sets dictionary
We are about to increments the set more incrementally in some case, so we need
to make a proper shallow copy of it.
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 21 Feb 2024 14:42:13 +0100] rev 51417
phases: pass an unfiltered repository to _ensure_phase_sets
It seems better for such a low level function to be able to assume it operate on
a real repository.
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 21 Feb 2024 13:01:25 +0100] rev 51416
phases: drop set building in `hasnonpublicphases`
We don't actually use the set, so why do we ensure they are built?
(we should also clean up the use of repository argument but that's a quest for later).
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 21 Feb 2024 11:59:28 +0100] rev 51415
phases: gather the logic for phasesets update in a single method
This logic is duplicated around for no good reason, we gather it in a single
place.
The conditional is the new function are a bit weird as we about going to extend it soon.
Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 22 Feb 2024 10:58:54 +0100] rev 51414
phases: change the way we warm the phasecache in repocache
Same logic as for the previous chngeset. We are about to rename and change the
method used here.
Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 22 Feb 2024 10:56:05 +0100] rev 51413
phases: use a more generic way to trigger a phases computation for perf
Querying the tip most revision will require the cache to warm the same as
calling the dedicated method. This avoid using a method that is mostly meant for
internal use and will be renamed in a coming changesets.
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 21 Feb 2024 12:01:09 +0100] rev 51412
phases: fix an overzealous invalidation of the phase sets
If `len(cl) == self._loadedrevslen` the cache is up to date.
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 21 Feb 2024 11:04:56 +0100] rev 51411
phases: type annotation for `_phasesets`
Does not hurt.
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 20 Feb 2024 23:46:21 +0100] rev 51410
phases: leverage the collected information to record phase update
Since the lower level function already gather this information, we can directly
use it.
This comes with a small change to the test that are actually fixing them. The
previous version over-reported some phase change that did not exists. In both
case, we are force revision `1` to be secret and `0` remains draft`, the
previous code wrongly reported `0` as moving to secret while it properly
remained draft in the repository.
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 21 Feb 2024 10:41:09 +0100] rev 51409
phases: large rewrite on retract boundary
The new code is still pure Python, so we still have room to going significantly
faster. However its complexity of the complex part is `O(|[min_new_draft, tip]|)` instead of
`O(|[min_draft, tip]|` which should help tremendously one repository with old
draft (like mercurial-devel or mozilla-try).
This is especially useful as the most common "retract boundary" operation
happens when we commit/rewrite new drafts or when we push new draft to a
non-publishing server. In this case, the smallest new_revs is very close to the
tip and there is very few work to do.
A few smaller optimisation could be done for these cases and will be introduced in
later changesets.
We still have iterate over large sets of roots, but this is already a great
improvement for a very small amount of work. We gather information on the
affected changeset as we go as we can put it to use in the next changesets.
This extra data collection might slowdown the `register_new` case a bit, however
for register_new, it should not really matters. The set of new nodes is either
small, so the impact is negligible, or the set of new nodes is large, and the
amount of work to do to had them will dominate the overhead the collecting
information in `changed_revs`.
As this new code compute the changes on the fly, it unlock other interesting
improvement to be done in later changeset.
Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 22 Feb 2024 15:49:21 +0100] rev 51408
phases: fast path public phase advance when everything is public
Everything is already public, so we have nothing to do here.
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 21 Feb 2024 15:24:22 +0100] rev 51407
phases: fast path retract of public phase
There are no boundary to retract, so lets do nothing.
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 20 Feb 2024 21:40:13 +0100] rev 51406
phases: keep internal state as rev-num instead of node-id
Node-id are expensive to work with, dealing with revision is much simple and
faster.
The fact we still used node-id here shows how few effort have been put into
making the phase logic fast. We tend to no longer use node-id internally for
about ten years.
This has a large impact of repository with many draft roots. For example this
Mozilla-try copy have ½ Million draft roots and `perf::unbundle` see a
significant improvement.
### data-env-vars.name = mozilla-try-2023-03-22-zstd-sparse-revlog
# benchmark.name = hg.perf.perf-unbundle
# bin-env-vars.hg.flavor = no-rust
# bin-env-vars.hg.py-re2-module = default
# benchmark.variants.
issue6528 = disabled
# benchmark.variants.revs = last-1
before:: 1.746791 seconds
after:: 1.278379 seconds (-26.82%)
# benchmark.variants.revs = last-10
before:: 3.145774 seconds
after:: 2.103735 seconds (-33.13%)
# benchmark.variants.revs = last-100
before:: 3.487635 seconds
after:: 2.446749 seconds (-29.85%)
# benchmark.variants.revs = last-1000
before:: 5.007568 seconds
after:: 3.989923 seconds (-20.32%)
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 20 Feb 2024 21:40:08 +0100] rev 51405
phases: do filtering at read time
This remove the need for the `filterunknown` method at all.
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 20 Feb 2024 21:38:01 +0100] rev 51404
phases: always write with a repo
In the future change that move the internal representation of phase-roots from
node-id to rev-num, we will use a repository to translate revision numbers back
to node at write time.
Since that future change is quite complicated already, we do this small API
change beforehand.
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 20 Feb 2024 17:18:15 +0100] rev 51403
phases: mark `phasecache.phaseroots` private
We are about to change its content from nodeid to revnum. So anyone directly
using the content might be in unexpected troubles. We start by making it private
to explicitly break any such user (and discourage them to do so).
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 20 Feb 2024 17:17:54 +0100] rev 51402
phases: check secret presence the right way during discovery
There is an official function for this, lets use it.
This will prevent the code to break in the future while we refactor the phase
code.
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 20 Feb 2024 14:21:18 +0100] rev 51401
phases: explicitly filter stripped revision at strip time
Explicit is better than implicit. The current logic is bit subtle and fragile.
It also get in the way of using something else than node-id as internal storage.
We replace it with a more explicit filtering while striping.
Pierre-Yves David <pierre-yves.david@octobus.net> [Fri, 23 Feb 2024 04:26:03 +0100] rev 51400
debug: add a debug::unbundle command that simulate the unbundle from a push
The code have different behavior when the unbundle comes from a push, so we
introduce a command that can simulate such unbundle.
For our copy of mozilla-try-2023-03-22, this make the unbundle jump from 2.5
seconds (with `hg unbundle`) to 15 seconds (with `hg debug::unbundle`).
That 15 seconds timings is consistent with the issue seen in production.
Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 22 Feb 2024 18:28:01 +0100] rev 51399
perf: support --template on perf::phases
Zeger Van de Vannet <zeger@vandevan.net> [Wed, 14 Feb 2024 08:14:46 +0100] rev 51398
annotate: limit output to range of lines
Arseniy Alekseyev <aalekseyev@janestreet.com> [Mon, 12 Feb 2024 20:01:27 +0000] rev 51397
revlog: add a Rust implementation of `headrevsdiff`
Python implementation of `headrevsdiff` can be very slow in the worst
case compared with the `heads` computation it replaces, since the
latter is done in Rust.
Even the average case of this Python implementation is still
noticeable in the profiles.
This patch makes the computation much much faster by doing it in Rust.
Arseniy Alekseyev <aalekseyev@janestreet.com> [Thu, 21 Dec 2023 20:30:03 +0000] rev 51396
revlog: add a C implementation of `headrevsdiff`
Python implementation of `headrevsdiff` can be very slow in the worst
case compared with the `heads` computation it replaces, since the
latter is done in C.
Even the average case of this Python implementation is still
noticeable in the profiles.
This patch makes the computation much much faster by doing it in C.
Arseniy Alekseyev <aalekseyev@janestreet.com> [Thu, 21 Dec 2023 17:38:04 +0000] rev 51395
unbundle: faster computation of changed heads
To compute the set of changed heads it's sufficient to look at the recent commits,
instead of looking at all heads currently in existence.
Raphaël Gomès <rgomes@octobus.net> [Wed, 21 Feb 2024 11:53:30 +0100] rev 51394
branching: merge stable into default
Arun Kulshreshtha <akulshreshtha@janestreet.com> [Tue, 20 Feb 2024 10:47:47 -0500] rev 51393
hg-core: separate timestamp and extra methods
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 21 Feb 2024 02:12:58 +0100] rev 51392
debugformat: fix formatting for compression level
`bytes(<int>)` gives a very different result as `str(<int>)` and the display
of `hg debugformat` have been broken for a while as a result.
Arun Kulshreshtha <akulshreshtha@janestreet.com> [Thu, 15 Feb 2024 11:39:18 -0500] rev 51391
hg-core: implement timestamp line parsing
Jordi Gutiérrez Hermoso <jordigh@octave.org> [Wed, 14 Feb 2024 15:21:44 -0500] rev 51390
doc: document that labels must have a dot in them to have an effect
I noticed that the `hg topics` template has a bare `topic` label with
no dot, and that makes it useless, as such a label will never receive
any effect by the colour extension.
This dot has been required for a long time, at least since 2011, but
we never formally documented it!
Arseniy Alekseyev <aalekseyev@janestreet.com> [Thu, 15 Feb 2024 18:10:41 +0000] rev 51389
tests: tweak chg test to make it fail less often
the test apparently sometimes prints the word "start" as a part of profile,
so let's no longer match "start":
CHGHG=/*/install/bin/hg (glob)
+ \x1b[90m | 50.0% 0.01s profiling.py: __enter__ line 196: self.start()\x1b[0m (esc)
+ \x1b[90m | 50.0% 0.01s profiling.py: start line 261: self._profiler.__enter__()\x1b[0m (esc)
+ \x1b[90m | 50.0% 0.01s profiling.py: statprofile line 125: statprof.start(mechanism=b'...\x1b[0m (esc)
+ \x1b[90m | 50.0% 0.01s statprof.py: start line 356: state.thread.start()\x1b[0m (esc)
+ \x1b[90m | 50.0% 0.01s threading.py: start line 852: self._started.wait()\x1b[0m (esc)
Arseniy Alekseyev <aalekseyev@janestreet.com> [Thu, 15 Feb 2024 15:21:43 +0000] rev 51388
cext: fix potential memory leaks of list items appended with PyList_Append
Also reduce the duplication in the tricky code that uses PyList_Append by
extracting it into a function `pylist_append_owned`.
Jordi Gutiérrez Hermoso <jordigh@octave.org> [Wed, 14 Feb 2024 22:55:11 -0500] rev 51387
crecord: enable search hotkeys (
issue6834)
The keys I chose here should be similar to less/vim keybindings, which
should fit the overall keybinding theme of crecord.
Jordi Gutiérrez Hermoso <jordigh@octave.org> [Wed, 14 Feb 2024 22:54:21 -0500] rev 51386
crecord: add handle(next|prev)search functions
These are now just simple wrappers around `searchdirection`
Jordi Gutiérrez Hermoso <jordigh@octave.org> [Wed, 14 Feb 2024 22:53:58 -0500] rev 51385
crecord: add a searchdirection function
If a regex has already been previously set, this function handles the
UI elements of searching again forward or backward.
Jordi Gutiérrez Hermoso <jordigh@octave.org> [Wed, 14 Feb 2024 22:50:00 -0500] rev 51384
crecord: add a handlesearch function
This function sets up some of the UI, such as getting the search
string from the user and displaying results or their absence.
Jordi Gutiérrez Hermoso <jordigh@octave.org> [Wed, 14 Feb 2024 22:48:09 -0500] rev 51383
crecord: add a showsearch function
This function takes a regex and searches either forward or backward,
moving the current item to the found item, if any, and unfolding the relevant context.
Jordi Gutiérrez Hermoso <jordigh@octave.org> [Wed, 14 Feb 2024 22:46:41 -0500] rev 51382
crecord: add a default regex to curseschunkselector
Whether there is a regex to search or not will affect if we can find
the next or the previous search hit.
Jordi Gutiérrez Hermoso <jordigh@octave.org> [Wed, 14 Feb 2024 22:43:51 -0500] rev 51381
crecord: add `content` properties to all nodes
In order to have a unified API of what can be searched, let's provide
a `content` property to each node type. This way we can search
filenames, context headers (e.g. containing function names, if
deducible from patch context) or changed lines themselves.
Jordi Gutiérrez Hermoso <jordigh@octave.org> [Wed, 14 Feb 2024 22:42:08 -0500] rev 51380
crecord: update uiheader docstring
There's no need to move anything to patch.py. The uiheader class only
has methods relevant to crecord and overrides __getattr__ in order to
use `patch.header` objects as a sort of mixin.
Jordi Gutiérrez Hermoso <jordigh@octave.org> [Wed, 14 Feb 2024 22:40:47 -0500] rev 51379
crecord: add skipfolded param to previtem
This just simplifies the API a bit so it matches `nextitem` and I
can handle both nextitem and previtem symmetrically.
Jordi Gutiérrez Hermoso <jordigh@octave.org> [Wed, 14 Feb 2024 15:23:59 -0500] rev 51378
dispatch: don't attempt to import debugger as bytestring
The __import__ thingie needs a string, not a bytestring. Guess I'm the
only one who uses this once in a while and noticed it was broken.
Jordi Gutiérrez Hermoso <jordigh@octave.org> [Wed, 14 Feb 2024 11:53:04 -0500] rev 51377
debugsetparents: fix Marmoutian docstring
Just some light proofreading.
Martin von Zweigbergk <martinvonz@google.com> [Tue, 13 Feb 2024 11:49:55 -0800] rev 51376
docs: fix broken `make` in `docs/`
We had some wrapped lines without blank lines between, which made the runrst
script think the list was not a list and it got confused about the
indentation. I added blank lines, and also some other minor styling for
consistency with the rest of the file.
Arseniy Alekseyev <aalekseyev@janestreet.com> [Wed, 10 Jan 2024 18:58:42 +0000] rev 51375
branchmap: use mmap for faster revbranchcache loading
A typical revbranchmap usage is:
- load the entire revbranchmap file into memory
- maybe do a few lookups
- add a few bytes to it
- write the addition to disk
There's no reason to load the entire revbranchmap into memory.
We can split it into a large immutable prefix and a mutable suffix,
and then memorymap the prefix, thus saving all the useless loading.
Benchmarking on some real-world pushes suggests that out of ~100s server-side
push handling revbranchcache handling is responsible for:
* ~7s with no change
* ~1.3s with the change, without mmap
* 0.04s with the change, with mmap
Manuel Jacob <me@manueljacob.de> [Fri, 02 Feb 2024 04:46:54 +0100] rev 51374
hghave: add py312 and py313
While not required in the core test suite in the moment, these could be useful
in the future or for extensions. For example, Python 3.12 removed distutils and
it might make sense to differentiate based on that.
Manuel Jacob <me@manueljacob.de> [Fri, 02 Feb 2024 04:23:07 +0100] rev 51373
hghave: use strings instead of floats for version numbers passed to checkvers
I think it’s a really bad idea to use floats for version numbers. One problem
is that 3.10 is the same as 3.1.
Manuel Jacob <me@manueljacob.de> [Sat, 03 Feb 2024 23:45:08 +0100] rev 51372
py3: fully port doctest to py3
Manuel Jacob <me@manueljacob.de> [Fri, 02 Feb 2024 04:03:15 +0100] rev 51371
import-checker: make stdlib path detection work in virtual environments
The previous logic tried to find the directory containing BaseHTTPServer, which
didn’t work as indended because it was only present on Python 2. Instead, the
argparse module is used now.
Manuel Jacob <me@manueljacob.de> [Fri, 02 Feb 2024 03:39:37 +0100] rev 51370
cleanup: remove unnecessary list constructor calls around list comprehensions
Raphaël Gomès <rgomes@octobus.net> [Mon, 12 Feb 2024 16:22:47 +0100] rev 51369
branching: merge stable into default
Raphaël Gomès <rgomes@octobus.net> [Mon, 12 Feb 2024 16:17:08 +0100] rev 51368
Added signature for changeset
3fd1efb3ad12
Raphaël Gomès <rgomes@octobus.net> [Mon, 12 Feb 2024 16:16:10 +0100] rev 51367
Added tag 6.6.3 for changeset
3fd1efb3ad12
Raphaël Gomès <rgomes@octobus.net> [Mon, 12 Feb 2024 16:14:18 +0100] rev 51366
relnotes: add 6.6.3
Arseniy Alekseyev <aalekseyev@janestreet.com> [Mon, 08 Jan 2024 15:25:33 +0000] rev 51365
tests: fix nondeterministic test failure in test-contrib-perf.t
It turns out (not too shockingly!) the kernel sometimes has some work to do,
perhaps at the very least context-switching, so asserting the system time
is 0.000000 doesn't work.
Jordi Gutiérrez Hermoso <jordigh@octave.org> [Thu, 01 Feb 2024 19:35:35 -0500] rev 51364
grep: restore usage of --include/--exclude options
The refactor in
4a73df6eb67d accidentally forgot to transform the opts
argument for walkopts into a byteskwargs. This resulted in its options
being ignored. In particular, the -X/-I pair of options was missing.
A simple fix restores its usage. Tests included, of course.
Arun Kulshreshtha <akulshreshtha@janestreet.com> [Tue, 30 Jan 2024 22:14:02 +0000] rev 51363
rust-changelog: don't panic on empty file lists
Anton Shestakov <av6@dwimlabs.net> [Wed, 24 Jan 2024 13:49:29 -0300] rev 51362
tests: use sha256line.py instead of /dev/random in test-censor.t (
issue6858)
Sometimes the systems that run our test suite don't have enough entropy and
they cannot produce target file of the expected size using /dev/random, which
results in test failures. Switching to /dev/urandom would give us way more
available data at the cost of it being less "random", but we don't really need
to use entropy for this task at all, since we only care if the file size after
compression is big enough to not be stored inline in the revlog. So let's use
something that we already have used to generate this kind of data in other
tests.
Anton Shestakov <av6@dwimlabs.net> [Wed, 24 Jan 2024 13:35:30 -0300] rev 51361
tests: make sha256line.py available for all tests
This was previously only used in test-revlog-delta-find.t, but it will be
useful (and used) in other tests that might need to generate
poorly-compressible files.
Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 23 Nov 2023 22:51:01 +0100] rev 51360
delta-find: pass the full deltainfo to the _DeltaSearch class
Having more information is better, so we pass it directly.
Pierre-Yves David <pierre-yves.david@octobus.net> [Sun, 07 Jan 2024 05:20:00 +0100] rev 51359
delta-find: move sparse-revlog pre-filtering in the associated class
Lets move the specialized code in the specialized class.
Pierre-Yves David <pierre-yves.david@octobus.net> [Sun, 07 Jan 2024 05:16:08 +0100] rev 51358
delta-find: move sparse-revlog delta checks in the associated class
Lets move the specialized code in the specialized class.
Pierre-Yves David <pierre-yves.david@octobus.net> [Sun, 07 Jan 2024 04:39:18 +0100] rev 51357
delta-find: split the _DeltaSearch class in two
We now have things sliced small enough to have two class that use different
`_iter_groups` implementation to encode their different logic.
The filtering code remains to be moved, but I would rather keep this changeset
simple and move them in the next.
Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 23 Nov 2023 22:40:11 +0100] rev 51356
delta-find: finish reworking the snapshot logic and drop more layer
The refining logic only applies to the snapshot logic, and this is now all
contained in a dedicated method.
Along the way, we drop the refined_groups // raw_groups layer as they no longer
make sense. The result is a more explicit `iter_groups` method.
This conclude the splitting and simplification of the groups generation.
We are now ready to dispatch this in more diverse classes.
Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 23 Nov 2023 22:29:02 +0100] rev 51355
delta-find: move the base of the delta search in its own function
That logic is complicated enough that is is worth puting in its own function. Another method will be introduced in the next changeset to deal with the actual refining.
Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 23 Nov 2023 21:44:51 +0100] rev 51354
delta-find: move the emotion of prev in a dedicated method
After splitting the filtering, and with the `_candidate_groups` layer removed,
we can start splitting the group generation too. This helps to organize this
code and make it easier to modifying the future.
Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 23 Nov 2023 21:51:43 +0100] rev 51353
delta-find: move the emotion of parents in a dedicated method
After splitting the filtering, and with the `_candidate_groups` layer removed,
we can start splitting the group generation too. This helps to organize this
code and make it easier to modifying the future.
Pierre-Yves David <pierre-yves.david@octobus.net> [Sun, 07 Jan 2024 03:08:46 +0100] rev 51352
delta-find: explicitly deal with usage of the cached revision
We can remove this from the general logic path and directly deal with this
corner case early.
This result in a small change in test-generaldelta.t as it turns out that:
- at commit time we (sometimes) precompute a delta against p1 and pass it as the
cached delta.
- since cached delta where going through the same filtering as everything, we
could "optimize" the base if it applied to an empty delta, resulting in not
using the pre-computed delta.
The simpler logic fix the second item, making the cached delta base always actually
tested when requested.
Note that the computation of a fast delta against p1 only is questionable, but
looking into that is out of scope for this series.
Pierre-Yves David <pierre-yves.david@octobus.net> [Sun, 07 Jan 2024 03:02:30 +0100] rev 51351
delta-find: remove the "candidate groups" layer
We have enough pieces to remove this generator and directly bear it load using
the underlying object.
Pierre-Yves David <pierre-yves.david@octobus.net> [Sun, 07 Jan 2024 03:13:36 +0100] rev 51350
delta-find: stop using heuristic to determine if we are creating a snapshot
This avoid assuming a changeset is a snapshot when it is actually something
simpler.
Pierre-Yves David <pierre-yves.david@octobus.net> [Sun, 07 Jan 2024 02:38:38 +0100] rev 51349
delta-find: explicitly track stage of the search
Being more explicit about what we are doing is going to be useful. We actually
start making use of it in later changesets.
Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 23 Nov 2023 20:09:34 +0100] rev 51348
delta-find: drop some dead debug code
Seems like it was never put to use, so lets simply remove it for now.
Pierre-Yves David <pierre-yves.david@octobus.net> [Sun, 07 Jan 2024 03:34:27 +0100] rev 51347
delta-find: introduce and use specialized _DeltaSearch class
For now, we introduce some very simple variant, but they are still useful to
display how having the class can helps keeping the simple case simple and
their special case out of more advanced logic.
Pierre-Yves David <pierre-yves.david@octobus.net> [Sun, 07 Jan 2024 01:05:10 +0100] rev 51346
delta-find: introduce a base class for _DeltaSearch
This prepare the introduction of specialized the class in the next changesets.
Pierre-Yves David <pierre-yves.david@octobus.net> [Sun, 07 Jan 2024 03:23:24 +0100] rev 51345
delta-find: simplify the delta checking function for snapshot
Since the function is all about snapshot, we can safely use an early return and
make the result simpler.
Pierre-Yves David <pierre-yves.david@octobus.net> [Sun, 07 Jan 2024 00:56:15 +0100] rev 51344
delta-find: move good delta code earlier in the class
Nothing change except the code location. This greatly helps readability of the
next future diff,
Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 04 Jan 2024 17:20:30 +0100] rev 51343
delta-find: split is_good_delta_info into more thematic function
Same logic as for candidate filtering, we group code into related sub method.
This will help clarifying later patches as some logic is pre-splitted
Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 04 Jan 2024 15:35:57 +0100] rev 51342
delta-find: clarify some comment and code in is_good_delta_info
We move the comment closer to the code it describ and we compute an
intermediate value without using the `textlen` variable, as it will stop being
defined in a future patch.
This will clarify future patches.
Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 04 Jan 2024 15:35:36 +0100] rev 51341
delta-find: move delta size check earlier in is_good_delta_info
This will clarify future patches by regrouping related logic before larger
movement.
Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 04 Jan 2024 15:04:10 +0100] rev 51340
delta-find: split the delta-chain part of `_pre_filter_rev` in a method
Since `_pre_filter_rev` contains logic from various sources of constraint, we
start splitting is in subfunction to clarify and document the grouping.
Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 04 Jan 2024 14:51:48 +0100] rev 51339
delta-find: split the "sparse" part of `_pre_filter_rev` in a method
Since `_pre_filter_rev` contains logic from various sources of constraint, we
start splitting is in subfunction to clarify and document the grouping.
Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 23 Nov 2023 18:56:31 +0100] rev 51338
delta-find: split the generic part of `_pre_filter_rev` in a method
Since `_pre_filter_rev` contains logic from various sources of constraint, we
start splitting is in subfunction to clarify and document the grouping.
Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 04 Jan 2024 14:39:10 +0100] rev 51337
delta-find: drop the temporary indent
Now that the complicated change is made, we can do the noisy one.
Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 23 Nov 2023 18:40:47 +0100] rev 51336
delta-find: move pre-filtering of individual revision in its own function
This goes one step further than the previous change by making the pre-filtering
of individual candicates revision in its own function. This will allow subclass
to easily configure this filtering with their own constrains.
The `if True:` part help the readability of this diff a lot and will be drop in
to the next changesets.
Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 23 Nov 2023 04:21:07 +0100] rev 51335
delta-find: move pre-filtering of candidates in its own function
This organise the code further and open the way to specialization via
sub-classing. Something important for the coming changes.
Pierre-Yves David <pierre-yves.david@octobus.net> [Fri, 29 Dec 2023 13:35:08 +0100] rev 51334
delta-find: move away from the generator API for _DeltaSearch
We use more explicit function call. This make operations more explicit and will
make future refactoring simpler.
Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 23 Nov 2023 21:13:14 +0100] rev 51333
delta-find: use "-1" as depth snapshot-dept for non snapshot in debug
This will help do distinct full snapshot (level 0) and normal delta (not a snapshot, no snapshot level)
Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 23 Nov 2023 21:45:45 +0100] rev 51332
delta-find: fix the computation of the `prev` value
The previous computation was "wrong" it always used the tiprev, even when computing a delta in a non-append case (mostly benchmark).
This never produced wrong delta on disk, but would misled debug or performance command. Since it does not have any actual user impact, I did not put this on stable.
With the code fixed we can now use revisions in some search and it makes the
test display more interesting behavior since the algorithm has more to work
with.
Pierre-Yves David <pierre-yves.david@octobus.net> [Fri, 22 Dec 2023 01:33:40 +0100] rev 51331
delta-find: move is_good_delta_info on the _DeltaSearch class
There is a lot of format specific code in `is_good_delta_info`, moving it on
_DeltaSearch will allow to split this into subclass soon.
Pierre-Yves David <pierre-yves.david@octobus.net> [Fri, 22 Dec 2023 01:33:33 +0100] rev 51330
delta-find: feed revinfo to _DeltaSearch
The revinfo has more information and will allow for even more function to be
turned into method.
Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 23 Nov 2023 03:23:11 +0100] rev 51329
delta-find: clarify that revisioninfo.p1/p2 constains nodeid
This clarify the content of these attributes.
Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 23 Nov 2023 03:23:41 +0100] rev 51328
delta-find: move filing of some debug data in `_one_dbg_data`
Since the `_one_dbg_data` method is meant to create a valid debug dictionnary.
We can as well prefill the relevant value to reduce the amount of debug code in
the main code.
Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 23 Nov 2023 01:28:30 +0100] rev 51327
delta-find: add more explanation to the the deltas_limit < length check
More explanations is always good.
Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 23 Nov 2023 01:13:40 +0100] rev 51326
delta-find: move tested in the _DeltaSearch.__init__
Now that we have an object we can initialize that attribute at initialization
time. This will make it available for more method in the future, allowing to
split the code.
Pierre-Yves David <pierre-yves.david@octobus.net> [Mon, 20 Nov 2023 05:05:29 +0100] rev 51325
delta-find: check DELTA_BASE_REUSE_FORCE in the _DeltaSearch.__init__
Now that we have an object we can check that DELTA_BASE_REUSE_FORCE cases does not reach this code at in a more suitable location.
Pierre-Yves David <pierre-yves.david@octobus.net> [Mon, 20 Nov 2023 05:04:23 +0100] rev 51324
delta-find: move target_rev in the _DeltaSearch.__init__
Now that we have an object we can initialize that attribute at initialization
time.
Pierre-Yves David <pierre-yves.david@octobus.net> [Mon, 20 Nov 2023 05:03:21 +0100] rev 51323
delta-find: move snapshot_cache in the _DeltaSearch.__init__
Now that we have an object we can initialize that attribute at initialization
time.
Pierre-Yves David <pierre-yves.david@octobus.net> [Mon, 20 Nov 2023 04:59:25 +0100] rev 51322
delta-find: move `_rawgroups` on the `_DeltaSearch` object
Moving more code before doing more logic changes.
Pierre-Yves David <pierre-yves.david@octobus.net> [Mon, 20 Nov 2023 04:53:11 +0100] rev 51321
delta-find: move `_refinedgroups` on the `_DeltaSearch` object
Moving more code before doing more logic changes.
Pierre-Yves David <pierre-yves.david@octobus.net> [Mon, 20 Nov 2023 04:44:40 +0100] rev 51320
delta-find: introduce a _DeltaSearch object
That object represent the search of a good delta for one revision. It will
replace the interleaved generator currently in use. It will make the logic more
explicit and easier to split into different subclass for the algorithm variant.
We will move content gradually before doing deeper rework.
For now, we only move the `_candidategroups` function here. More will follow in
the same series.
Pierre-Yves David <pierre-yves.david@octobus.net> [Fri, 22 Dec 2023 12:58:54 +0100] rev 51319
delta-find: add a small docstring to deltacomputer
As we are about to introduce another object related to finding delta. So lets
have a minimal docstring to the existing one.
Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 11 Jan 2024 16:41:54 +0100] rev 51318
revlog: stop using `atomictmp` for the split revlog
Since we already manually deal with writing on the side and delaying visibily,
we no longer need this.
Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 11 Jan 2024 16:39:31 +0100] rev 51317
changelog: drop the side_write argument to revlog splitting
The only user is now gone.
Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 11 Jan 2024 16:35:52 +0100] rev 51316
changelog: stop useless enforcing split at the end of transaction
Changelogs are no longer created inline, and existing changelogs are
automatically split. Since we now enforce splitting at the start of any write,
we don't need to enforce splitting at the end of the transaction.
This has the nice side effect of killing the only user of "side_write".
Anton Shestakov <av6@dwimlabs.net> [Sun, 14 Jan 2024 16:03:08 -0300] rev 51315
tests: don't use "status" operand of dd in test-censor.t (
issue6858)
Some implementations don't have this operand, let's just direct stderr into
/dev/null, that's pretty cross-platform.
Also specify bs=512 (the default for me), because the default might be
different on different systems. Other uses of dd in the tests do specify it, so
this is more consistent.
Raphaël Gomès <rgomes@octobus.net> [Thu, 11 Jan 2024 17:52:13 +0100] rev 51314
branching: merge stable into default
Raphaël Gomès <rgomes@octobus.net> [Thu, 11 Jan 2024 17:49:51 +0100] rev 51313
Added signature for changeset
136902b3a95d
Raphaël Gomès <rgomes@octobus.net> [Thu, 11 Jan 2024 17:49:37 +0100] rev 51312
Added tag 6.6.2 for changeset
136902b3a95d
Raphaël Gomès <rgomes@octobus.net> [Thu, 11 Jan 2024 17:49:04 +0100] rev 51311
relnotes: add 6.6.2
Georges Racinet <georges.racinet@octobus.net> [Wed, 03 Jan 2024 18:33:39 +0100] rev 51310
pycompat: fix bytestr(bytes) in Python 3.11
In Python 3.10, the `bytes` type itself does not have a `__bytes__`
attribute, but it does in 3.11. Yet `bytes(bytes)` does not give
the wished output, so we have to add an exceptional case.
The added case in the doctest reproduces the problem with Python 3.11.
Impact: error treatment in expressions such as `repo[b'invalid']` gets
broken.
Arun Kulshreshtha <akulshreshtha@janestreet.com> [Thu, 04 Jan 2024 14:45:31 -0500] rev 51309
narrow: prevent removal of ACL-defined excludes
Arun Kulshreshtha <akulshreshtha@janestreet.com> [Thu, 04 Jan 2024 14:41:18 -0500] rev 51308
narrow: add test demonstrating bug in acl exclusion enforcement
Anton Shestakov <av6@dwimlabs.net> [Mon, 08 Jan 2024 13:35:02 +0100] rev 51307
contrib: add a set of scripts to run pytype in Docker
Having a simple way to run pytype for developers can massively shorten
development cycle. Using the same Docker image and scripts that we use on our
CI guarantees that the result achieved locally will be very similar to (if not
the same as) the output of our CI runners.
Things to note: the Dockerfile needs to do a little dance around user
permissions inside /home/ci-runner/ because:
- on one hand, creating new files on the host (e.g. .pyi files inside .pytype/)
should use host user's uid and gid
- on the other hand, when we run the image as uid:gid of host user, it needs to
be able to read/execute files inside the image that are owned by ci-runner
Since local user's uid might be different from ci-runner's uid, we execute this
very broad chmod command inside /home/ci-runner/, but then run the image as the
host user's uid:gid.
There might be a better way to do this.
Anton Shestakov <av6@dwimlabs.net> [Mon, 18 Dec 2023 15:52:17 -0300] rev 51306
pytype: use "$(hg root)" instead of `hg root` to make shellcheck happier
Anton Shestakov <av6@dwimlabs.net> [Mon, 18 Dec 2023 15:40:48 -0300] rev 51305
pytype: update check-pytype.sh to select target automatically
We have python3.11 on CI, so we can run pytype targeting that version. On the
other hand, we don't have python3.7 on CI anymore, so we can't run pytype for
3.7 anymore (interpreter not found). I think it's fine to make pytype select
the appropriate target depending on the version of the interpreter it's running
under.
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 19 Dec 2023 22:54:52 +0100] rev 51304
git-hgext: adjust to the lack of `changelog.heads` method
We don't have a `heads` method returning nodeid, but this is very easy to get
the same result.
This was flagged by pytype.
We can note that the fact this code did not break is probably a good sign that
it is dead code.
However this is a question outside of the scop of this series.
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 19 Dec 2023 22:21:31 +0100] rev 51303
remotefilelog: drop dead code
As pytype flagged bug in this method it highlighted that this methode being
never called anywhere.
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 19 Dec 2023 22:07:59 +0100] rev 51302
pytype: use the right signature for the `__delitem__`
It is not because it is NotImplemented that it should use a bad signature. Fix
it to please pytype.
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 19 Dec 2023 22:07:21 +0100] rev 51301
pytype: use the right signature for the `__setitem__`
It is not because it is NotImplemented that it should use a bad signature. Fix
it to please pytype.
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 19 Dec 2023 22:03:34 +0100] rev 51300
sparse: use with statement for wlock
This will avoid pytype complaining about the try/except range.
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 19 Dec 2023 22:00:47 +0100] rev 51299
remotefilelog: adjust the signature of basepack.createindex
pytype point that the subclass signature have been updated.
Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 21 Dec 2023 00:19:19 +0100] rev 51298
pytype: add the couple annotations for pytype to understands the lrunode
After loosing 2d6 SAN, I eventually understood that pytype was confused by method
return type. Pytype is now happy.
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 19 Dec 2023 21:40:06 +0100] rev 51297
pytype: ignore some signature mismatch in registrar
pytype is grumpy about a sub method having a different signature than the one we
use here.
pytype error:
internalmerge: Overriding method signature mismatch [signature-mismatch]
Base signature: 'def _funcregistrarbase._extrasetup(self, name, func) -> Any'.
Subclass signature: 'def internalmerge._extrasetup(self, name, func, mergetype, onfailure = None, precheck = None, binary = False, symlink = False) -> Any'.
Parameter 'mergetype' must have a default value.
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 19 Dec 2023 21:38:46 +0100] rev 51296
hgweb: update _runwsgi try/except range to be valid
The `tmpl` variable is used in the `except` and `finally`, so we need it created
before the `try` is open.
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 19 Dec 2023 21:36:52 +0100] rev 51295
pytype: add type information for `annotateresult.lines`
This seems to appease a confused pytype.
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 19 Dec 2023 21:34:47 +0100] rev 51294
pytype: ignore attribute error for time.clock
This seems to be a Windows only attribute.
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 19 Dec 2023 21:27:49 +0100] rev 51293
pytype: ignore certifi import error
This is an optional import so we should not complains about it.
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 19 Dec 2023 21:26:30 +0100] rev 51292
pytype: ignore some signature mismatch in configitems
pytype is grumpy about the dict.update having a more complex signature than the
one we use here.
pytype error:
itemregister: Overriding method signature mismatch [signature-mismatch]
Base signature: 'def builtins.dict.update(self) -> None'.
Subclass signature: 'def itemregister.update(self, other) -> Any'.
Parameter 'other' must have a default value.
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 20 Dec 2023 16:30:32 +0100] rev 51291
pytype: only output the "pytype crashed" message on error
If pytype did not crash while generating stub, that message is kind of
confusing. It seems simple enough to avoid it in this case.
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 20 Dec 2023 22:17:03 +0100] rev 51290
pytype: drop the now useless assert
As the imported types are now used by type annotation, these ugly assert are
no longer needed.
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 20 Dec 2023 16:39:03 +0100] rev 51289
pytype: drop the last inline type comment
We can't assign type to the "for" variant on the fly, so we type the variable
and method used instead.
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 20 Dec 2023 11:23:09 +0100] rev 51288
pytype: convert type comment for inline variable too
Same logic as for the previous changeset, but for "type comment" annotating
variables, not function/method.
As for the previous changeset, we had to adjust for of the types to actually match what was happening.
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 19 Dec 2023 21:29:34 +0100] rev 51287
pytype: move some type comment to proper annotation
We support direct type annotations now, while pytype is starting to complains
about them.
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 20 Dec 2023 20:13:22 +0100] rev 51286
lock: properly convert error to bytes
Flagged by pytype when a later changeset is applied moving typing comment to annotation.
We fix this ahead of the annotation change to make sure pytype remains happy
after the change.
We have to do fairly crazy dance for pytype to be happy. This probably comes
from the fact IOError.filename probably claims to be `str` while it is actually
`bytes` if the filename raising that `IOError` is bytes.
At the same time, `IOError.strerror` is consistently `str` and should be passed
as `str` everywhere.
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 20 Dec 2023 12:51:20 +0100] rev 51285
pytype: import typing directly
First we no longer needs the pycompat layer, second having the types imported in
all case will allow to use them more directly in type annotation, something
important to upgrade the old "type comment" to proper type annotation.
A lot a stupid assert are needed to keep pyflakes happy. We should be able to
remove most of them once the type comment have been upgraded.
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 08 Nov 2023 01:58:16 +0100] rev 51284
usage: configure uncompressed chunk cache through resource configuration
Let's use this new concept for what it is meant for.
This provides a sizable speed up for reading multiple revision for some complexe
repositories.
### data-env-vars.name = pypy-2018-08-01-zstd-sparse-revlog
# benchmark.name = hg.perf.read-revisions
# benchmark.variants.order = reverse
memory-medium: 1.892400
memory-high: 1.722934 (-8.61%)
# benchmark.variants.order = default
memory-medium: 1.751542
memory-high: 1.589340 (-9.49%)
Pierre-Yves David <pierre-yves.david@octobus.net> [Mon, 09 Oct 2023 15:12:16 +0200] rev 51283
usage: add configuration option to adjust resources usage
They currently do nothing, but this open the way to actually use them.
Pierre-Yves David <pierre-yves.david@octobus.net> [Mon, 09 Oct 2023 15:06:21 +0200] rev 51282
usage: add a `usage.repository-role` config
This config will be used for behavior and performance adjustment depending of
the repository role.
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 13 Dec 2023 13:46:28 +0100] rev 51281
common-pattern: cover "elapsed time" line
These are perfect targets for the common-pattern matching.
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 27 Dec 2023 18:02:26 +0100] rev 51280
bundle: do not detect --base argument that match nothing as lack of argument
With the previous version of the code, if --base did not match anything, it will
be handled as if no --base was provided and will fallback to using discovery
with the default path. This has two issues :
- The resulting bundle won't match what the user requested,
- if not default path is configured, it will crash.
We now properly distinct between the two cases and if the --base query does not
find any changeset, we will assume that everything under --rev needs to be sent.
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 27 Dec 2023 18:42:13 +0100] rev 51279
bundle: highlight misbehavior when --base does not match any revision
See next changeset for fix and details.
Pierre-Yves David <pierre-yves.david@octobus.net> [Sun, 24 Dec 2023 02:43:53 +0100] rev 51278
branching: merge with stable
I need the fix to `generate-churning-bundle.py`.
Pierre-Yves David <pierre-yves.david@octobus.net> [Sat, 18 Nov 2023 00:16:15 +0100] rev 51277
generate-churning-bundle: fix script for python3
This script has apparently not run for a long time.
Martin von Zweigbergk <martinvonz@google.com> [Sat, 16 Dec 2023 10:48:20 -0800] rev 51276
narrow: strip trailing `/` from manifest dir before matching it
Commit
17a822d7943e broke some of our internal tests at Google because the `dir`
variable contains a trailing slash since that commit. Let's restore the old
behavior by stripping that trailing slash.
Martin von Zweigbergk <martinvonz@google.com> [Mon, 18 Dec 2023 10:13:41 -0800] rev 51275
tests: demonstrate error when narrowing with `rootfilesin:` pattern
This demonstrates a bug introduced in
17a822d7943e.
Martin von Zweigbergk <martinvonz@google.com> [Mon, 18 Dec 2023 14:51:20 -0800] rev 51274
matchers: use correct method for finding index in vector
The path matcher has an optimization for when all paths are `rootfilesin:`. This
optimization exists in both Python and Rust. However, the Rust implementation
currently has a bug that makes it fail in most cases. The bug is that it
`rfind()` where it was clearly intended to use `rposition()`. This patch fixes
that and adds a test.
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 12 Dec 2023 17:08:45 +0100] rev 51273
dirstate: make the `transaction` argument of `setbranch` mandatory
This is deprecated since 6.4. We should drop it now.
Raphaël Gomès <rgomes@octobus.net> [Wed, 20 Dec 2023 14:59:31 +0100] rev 51272
rust-clippy: apply some more trivial fixes
All of these were hinted at by clippy and make the code simpler.
Raphaël Gomès <rgomes@octobus.net> [Wed, 20 Dec 2023 14:58:36 +0100] rev 51271
rust-clippy: simplify `match` to `if let`
This was hinted at by clippy, and makes it more obvious that nothing is
happening in the `None` case.
Pierre-Yves David <pierre-yves.david@octobus.net> [Fri, 01 Dec 2023 22:56:08 +0100] rev 51270
censor: accept multiple revision in a single call
This is useful when dealing with corruption, as all the corrupted revision can
be dealt with in one go.
Pierre-Yves David <pierre-yves.david@octobus.net> [Fri, 01 Dec 2023 22:46:46 +0100] rev 51269
censor: be more verbose about the other steps too
If we informs the user about head checking, we should tell him when the other
operation happens too. Otherwise the user can imagine to still be in the head
checking part.
Pierre-Yves David <pierre-yves.david@octobus.net> [Fri, 01 Dec 2023 22:44:33 +0100] rev 51268
censor: add a command flag to skip the head checks
In some case we spend hours of time checking the heads to censors a simple file
is not a good behavior. Especially when censors is used to removed corrupted
content.
Pierre-Yves David <pierre-yves.david@octobus.net> [Fri, 01 Dec 2023 22:33:35 +0100] rev 51267
censor: inform the user that we are spending time checking heads
The time this can consume can be a surprise to the user, lets be explicit about
it.
Pierre-Yves David <pierre-yves.david@octobus.net> [Fri, 01 Dec 2023 22:25:52 +0100] rev 51266
censor: mention that we check the heads in the help
And add a message to will explain the possibly long time spent doing this.
Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 21 Dec 2023 01:45:43 +0100] rev 51265
persistent-nodemap: respect the mmap setting when refreshing data
After writing updated data, we reload the in-memory data. However, that logic
was… wrong. We were doing file read when mmap was requested and when the
configuration was requesting to not use mmap… we were using it.
This should now be fine.
Raphaël Gomès <rgomes@octobus.net> [Thu, 14 Dec 2023 09:57:25 +0100] rev 51264
rust-index: only access offsets if revlog is inline
Accessing the `RwLock` ended up showing up in profiles even with no contention.
Offsets only exist for inline revlogs, so gate everything behind an inline
check.
Raphaël Gomès <rgomes@octobus.net> [Wed, 06 Dec 2023 11:04:18 +0100] rev 51263
rust-index: cache the head nodeids python list
Same optimization as before, but for the nodeids this time.
Raphaël Gomès <rgomes@octobus.net> [Tue, 05 Dec 2023 14:50:05 +0100] rev 51262
rust-index: add fast-path for getting a list of all heads as nodes
This avoids a lot of back-and-forth between Python and Rust. We forgo adding
a fast-path in the `filteredchangelog` case yet. If it shows up in profiling,
we might add the variant with a filter.
Raphaël Gomès <rgomes@octobus.net> [Wed, 29 Nov 2023 23:22:51 -0500] rev 51261
rust-index-cpython: cache the heads' PyList representation
This is the same optimization that the C index does, we just have more
separation of the Python and native sides.
Raphaël Gomès <rgomes@octobus.net> [Wed, 29 Nov 2023 15:58:24 -0500] rev 51260
rust-index: use a `BitVec` instead of plain `Vec` for heads computation
The `Vec` method uses one byte per revision, this uses 1 per 8 revisions,
which improves our memory footprint. For large graphs (10+ millions), this
can make a measurable difference server-side.
I have seen no measurable impact on execution speed.
Raphaël Gomès <rgomes@octobus.net> [Wed, 29 Nov 2023 10:04:41 -0500] rev 51259
rust-index: implement faster retain heads using a vec instead of a hashset
This is the same optimization that the C index does, we're only catching up
now because this showed up as slow in benchmarking.
Raphaël Gomès <rgomes@octobus.net> [Thu, 14 Dec 2023 11:52:05 +0100] rev 51258
rust-index: allow inlining VCSGraph parents across crates
Raphaël Gomès <rgomes@octobus.net> [Thu, 23 Nov 2023 18:48:07 +0100] rev 51257
rust-index: allow inlining `parents` across crates
Raphaël Gomès <rgomes@octobus.net> [Thu, 23 Nov 2023 18:47:42 +0100] rev 51256
rust-index: allow inlining `check_revision` across crates
Raphaël Gomès <rgomes@octobus.net> [Thu, 23 Nov 2023 03:41:58 +0100] rev 51255
rust-index: document safety invariants being upheld for every `unsafe` block
We've added a lot of `unsafe` code that shares Rust structs with Python.
While this is unfortunate, it is also unavoidable, so let's at least
systematically explain why each call to `unsafe` is sound.
If any of the unsafe code ends up being wrong (because everyone screws up
at some point), this change at least continues the unspoken rule of always
explaining the need for `unsafe`, so we at least get a chance to think.
Georges Racinet on incendie.racinet.fr <georges@racinet.fr> [Sun, 29 Oct 2023 12:18:03 +0100] rev 51254
rust-index: renamed `MixedIndex` as `Index`
It is simply not mixed any more, hence the name had become a
future source of confusion.
Georges Racinet <georges.racinet@octobus.net> [Sun, 29 Oct 2023 23:54:05 +0100] rev 51253
rust-index: stop instantiating a C Index
The only missing piece was the `cache` to be returned from
`revlog.parse_index_v1_mixed`, and it really seems that it is
essentially repetition of the input, if `inline` is `True`.
Not worth a Rust implementation (C implementation is probably there
for historical reasons).
Georges Racinet <georges.racinet@octobus.net> [Mon, 30 Oct 2023 21:28:30 +0100] rev 51252
rust-revlog: using the ad-hoc `NodeTree` in scmutil
Now that we have an independent `NodeTree` class able to work natively
on the pure Rust index, we use it in `mercurial.scmutil`, with automatic
invalidation after mutation of the index.
This code path is tested by `test-revisions.t` and `test-template-functions.t`
Georges Racinet <georges.racinet@octobus.net> [Mon, 30 Oct 2023 22:36:30 +0100] rev 51251
rust-revlog: add invalidation detection to `NodeTree` class
This will be useful for callers, such as `scmutil` who reuse a
`NodeTree` instance as a cache. They would otherwise get hard
errors if any mutation of the index occurred since instantiation.
This is something the C index does not provide.
Raphaël Gomès <rgomes@octobus.net> [Thu, 02 Nov 2023 15:50:13 +0100] rev 51250
rust-index: add support for `del index[r]`
Only the `del index[r:]` syntax was supported, but the comment said otherwise.
It's not actually used in core code, but the C index supports it.
Georges Racinet <georges.racinet@octobus.net> [Mon, 30 Oct 2023 21:26:17 +0100] rev 51249
rust-revlog: bare minimal NodeTree exposition
The independent `NodeTree` instances needs to be associated to an
index (for forward-checks of candidates) but do not need to
encompass all revisions from that index.
This is exactly how it is used in `scmutil.shortesthenodeidprefix`
and we restrict the implementation to the bare minimum needed there
and to write convincing tests.
It would of course be fairly trivial to add more.
Georges Racinet <georges.racinet@octobus.net> [Mon, 30 Oct 2023 21:25:28 +0100] rev 51248
rust-index: a property to identify the Rust index as such
Will be useful soon in `mercurial.scmutil` and potentially elsewhere
Georges Racinet <georges.racinet@octobus.net> [Mon, 30 Oct 2023 15:32:33 +0100] rev 51247
rust-cpython-revlog: renamed NodeTree import as CoreNodeTree
We're about to introduce a `NodeTree` Python class (hence also
a Rust struct) and it would be a collision with the import
Georges Racinet <georges.racinet@octobus.net> [Fri, 20 Oct 2023 09:48:53 +0200] rev 51246
rust-index: stop using C index
We still keep its wrapper implementation in `hg-cpython::cindex`,
because we might want to recreate ancestors handling objects using
it for the case of REVLOGV2.
Also, we still instantiate it (from Python code) and store it as
attribute, for the likes of `get_cindex` and the caller that
relies on it, but that is soon to be removed, too.
Georges Racinet <georges.racinet@octobus.net> [Sun, 29 Oct 2023 12:07:05 +0100] rev 51245
rust-index: using `hg::index::Index` in discovery
At this point the C index is not used any more: we had to
remove `pyindex_to_graph()` to avoid the dead code warning.
Georges Racinet on incendie.racinet.fr <georges@racinet.fr> [Sun, 29 Oct 2023 12:01:57 +0100] rev 51244
rust-python-testing: separated base test classes
This will allow, e.g., to change `test-rust-discovery.py` simply
by adding the appropriate base class.
Georges Racinet <georges.racinet@octobus.net> [Sun, 29 Oct 2023 11:21:18 +0100] rev 51243
rust-discovery: encapsulated conversions to vec for instance methods
This new `pyiter_to_vec` is pretty trivial, and only mildly reduces
code duplication. The main advantage is that it encapsulates access
to the `index` attribute, which will be changed when we replace the
C index by the Rust index, given as `PySharedRef`.
Georges Racinet <georges.racinet@octobus.net> [Sun, 29 Oct 2023 11:10:09 +0100] rev 51242
rust-discovery: moving most of hg-cpython methods to regular code blocks
The chosen methods are those with conversion of an incoming Python iterable,
as they will be changed the most when we will remove the C index, and
`takefullsample` for consistency with `takequicksample`.
Georges Racinet <georges.racinet@octobus.net> [Sun, 29 Oct 2023 10:47:54 +0100] rev 51241
rust-index: using `hg::index::Index` in `hg-cpython::dagops`
Hooking `headrevs` to the Rust index is straightforward as long as
we go the `PySharedRef` way. Direct attempts of obtaining a reference
to the inner `hg::index::Index` fail for lifetime reasons: the reference
is bound to the GIL, yet the `as_set` local variable is considered to
be static (the borrow checker clearly does not realize or care that this
set only stores `Revision` values).
In `rank()`, the chosen solution is the simplest as far as `hg-cpython` is
concerned, but it has the defect of removing an implementation
that would be easily adaptable if the core index did implement `RankedGraph`
(returning the same error as long as only `REVLOGV1` is supported), but that
would introduce a direct dependency of `hg-core` on the ``vcsgraph` crate.
Georges Racinet on incendie.racinet.fr <georges@racinet.fr> [Sat, 28 Oct 2023 22:50:10 +0200] rev 51240
rust-index: using `hg::index::Index` in MissingAncestors
With this, the whole `hg-cpython::ancestors` module can now work without
the C index.
Georges Racinet <georges.racinet@octobus.net> [Fri, 27 Oct 2023 22:11:05 +0200] rev 51239
rust-index: using the `hg::index::Index` in ancestors iterator and lazy set
Since there is no Rust implementation for REVLOGV2/CHANGELOGv2, we declare
them to be incompatible with Rust, hence indexes in these formats will use
the implementations from Python `mercurial.ancestor`. If this is an unacceptable
performance hit for current users of these formats, we can later on add Rust
implementations based on the C index for them or implement these formats for
the Rust indexes.
Among the challenges that we had to meet, we wanted to avoid taking the GIL each
time the inner (vcsgraph) iterator has to call the parents function. This would probably
still be acceptable in terms of performance with `AncestorsIterator`, but not with
`LazyAncestors` nor for the upcoming change in `MissingAncestors`.
Hence we enclose the reference to the index in a `PySharedRef`, leading to more
rigourous checking of mutations, which does pass now that there no logically immutable
methods of `hg::index::Index` that take a mutable reference as input.
Georges Racinet on incendie.racinet.fr <georges@racinet.fr> [Fri, 27 Oct 2023 23:29:29 +0200] rev 51238
revlog: always use a Rust index for REVLOGv1 if rustext is present
We are about to change classes such as `rustext.AncestorsIterator` to
take a Rust index, hence we cannot have the option not to use the Rust
index.
Note: this can be refined depending on whether we want to keep this
option or not. We will have to make two versions of `AncestorsIterator`
and its sibling to support REVLOGV2 and CHANGELOGv2 anyway.
Meanwhile, this is the simplest change to make the tests pass.
Georges Racinet on incendie.racinet.fr <georges@racinet.fr> [Sun, 29 Oct 2023 18:35:32 +0100] rev 51237
rust-index: disabling flagprocessor tests
The list of flags supported by the Rust index is not dynamic, hence
flagprocessor has no chance to work.
Raphaël Gomès <rgomes@octobus.net> [Tue, 31 Oct 2023 17:58:56 +0100] rev 51236
rust-index: support `unionrepo`'s compressed length hack
Explanations inline.
Georges Racinet on incendie.racinet.fr <georges@racinet.fr> [Fri, 27 Oct 2023 23:21:50 +0200] rev 51235
rust-index: honour incoming using_general_delta in `deltachain`
It looks to be a leftover from some past, but the C index considers
only the value passed from Python whereas up to now the Rust index
was using the value of its attribute.
As a middle ground, we make this argument of `deltachain` optional from
the Python side, with the Rust implementation only defaulting to its
attribute. This way, we reduce false leads when a difference in results
is spotted.
Georges Racinet on incendie.racinet.fr <georges@racinet.fr> [Fri, 27 Oct 2023 21:48:45 +0200] rev 51234
rust-index: use interior mutability in head revs and caches
For upcoming changes in `hg-cpython` switching to the `hg-core` index in
ancestors iterators, we will need to avoid excessive mutability, restricting
the use of mutable references on `hg::index::Index` to methods that actually
logically mutate it, whereas the maintenance of caches such as `head_revs`
clearly does not. We illustrate that immediately by switching to immutable
borrows in the corresponding methods of `hg-cpython::MixedIndex`
Raphaël Gomès <rgomes@octobus.net> [Thu, 26 Oct 2023 15:26:19 +0200] rev 51233
rust-index: add Sync bound to all relevant mmap-derived values
All readonly mmaps are Sync as far as Rust is concerned. Integrity of the
mmap'ed file is a concern separate to Rust's memory model, since it requires
out-of-program handling via locks, etc.
This will help when we start sharing the Rust Index with Python.
Raphaël Gomès <rgomes@octobus.net> [Tue, 31 Oct 2023 18:09:43 +0100] rev 51232
debugindexstats: handle the lack of Rust support better
We don't have any stats in the Rust index. Currently it is not known which
stats would be interesting to get, so if they end up being important, we can
add them later.
Raphaël Gomès <rgomes@octobus.net> [Tue, 31 Oct 2023 17:36:59 +0100] rev 51231
rust-python-index: don't panic on a corrupted index when calling from Python
This makes `test-verify.t` pass again. In an ideal world, we would find
the exact commit where this test breaks and amend part of this change there,
but this is a long enough series.
Raphaël Gomès <rgomes@octobus.net> [Tue, 31 Oct 2023 17:34:31 +0100] rev 51230
tests: ignore test-storage when using Rust
This is only relevant for Python code and the SQLite backend, which is in a
half-abandoned state.
Georges Racinet <georges.racinet@octobus.net> [Fri, 20 Oct 2023 09:12:22 +0200] rev 51229
rust-index: optimize find_gca_candidates() on less than 8 revisions
This is expected to be by far the most common case, given that, e.g.,
merging involves using it on two revisions.
Using a `u8` as support for the bitset obviously divides the amount of
RAM needed by 8. To state the obvious, on a repository with 10 million
changesets, this spares 70MB. It is also possible that it'd be slightly
faster, because it is easier to allocate and provides better cache locality.
It is possible that some exhaustive listing of the traits implemented by
`u8` and `u64` would avoid the added duplication, but that can be done later
and would need a replacement for the `MAX` consts.
Georges Racinet <georges.racinet@octobus.net> [Fri, 20 Oct 2023 08:54:49 +0200] rev 51228
rust-index: simplification in find_gca_candidates()
`parent_seen` can be made a mutable ref, making this part more obvious,
not needing to be commented so much.
The micro-optimization of avoiding the union if `parent_seen` and
`current_seen` agree is pushed down in the `union()` method of the
fast, `u64` based bit set implementation (in case it matters).
Georges Racinet <georges.racinet@octobus.net> [Fri, 20 Oct 2023 08:43:00 +0200] rev 51227
rust-index: avoid double negation in find_gca_candidates()
Georges Racinet <georges.racinet@octobus.net> [Fri, 20 Oct 2023 08:17:00 +0200] rev 51226
rust-index: avoid some cloning in find_gca_candidates()
Instead of keeping the information whether the current revision is
poisoned on `current_seen`, we extract it as a boolean.
This also allows us to simplify the explanation of `seen[r].is_poisoned()`,
as the exceptional case where it is poisoned right after `r` has been
determined to be a solution does no longer exist.
Georges Racinet <georges.racinet@octobus.net> [Wed, 18 Oct 2023 15:35:38 +0200] rev 51225
rust-index: implement common_ancestors_heads() and ancestors()
The only differences betwwen `common_ancestors_heads()` and
`find_gca_candidates()` seems to be that:
- the former accepts "overlapping" input revisions (meaning with duplicates).
- limitation to 24 inputs (in the C code), that we translate to using the
arbitrary size bit sets in the Rust code because we cannot bail to Python.
Given that the input is expected to be small in most cases, we take the
heavy handed approach of going through a HashSet and wait for perfomance
assessment
In case this is used via `hg-cpython`, we can anyway absorb the overhead
by having `commonancestorheads` build a vector of unique values
directly, and introduce a thin wrapper over `find_gca_candidates`, to take
care of bit set type dispatching only.
As far as `ancestors` is concerneed, this is just chaining
`common_ancestors_heads()` with `find_deepest_revs`.
Georges Racinet on incendie.racinet.fr <georges@racinet.fr> [Tue, 17 Oct 2023 22:42:40 +0200] rev 51224
rust-index: find_gca_candidates bit sets genericization
This allows to use arbitratry size of inputs in `find_gca_candidates()`.
We're genericizing so that the common case of up to 63 inputs can be
treated with the efficient implementation backed by `u64`.
Some complications with the borrow checker came, because arbitrary sized
bit sets will not be `Copy`, hence mutating them keeps a mut ref on the `seen`
vector. This is solved by some cloning, most of which can be avoided,
preferably in a follow-up after proof that this works (hence after exposition
to Python layer).
As far as performance is concerned, calling `clone()` on a `Copy` object
(good case when number of revs is less than 64) should end up just doing a
copy, according to this excerpt of the `Clone` trait documentation:
Types that are Copy should have a trivial implementation of Clone.
More formally: if T: Copy, x: T, and y: &T, then let x = y.clone();
is equivalent to let x = *y;.
Manual implementations should be careful to uphold this invariant;
however, unsafe code must not rely on it to ensure memory safety.
We kept the general structure, hence why there are some double negations.
This also could be made nicer in a follow-up.
The `NonStaticPoisonableBitSet` is included to ensure that the
`PoisonableBitSet` trait is general enough (had to correct `vec_of_empty()` for
instance). Moving the genericization one level to encompass the `seen`
vector and not its elements would be better for performance, if worth it.
Raphaël Gomès <rgomes@octobus.net> [Thu, 02 Nov 2023 11:45:20 +0100] rev 51223
rust-index: core impl for find_gca_candidates and find_deepest
This still follows closely the C original and not able to treat more than 63
input revisions (bitset backed by `u64` and one bit reserved for poisoning).
Raphaël Gomès <rgomes@octobus.net> [Mon, 30 Oct 2023 11:57:36 +0100] rev 51222
rust-index: add support for `reachableroots2`
Exposition in `hg-cpython` done in regular impl block, again
for rustfmt support etc.
Georges Racinet <georges.racinet@octobus.net> [Thu, 02 Nov 2023 12:17:06 +0100] rev 51221
hg-cpython: rev_pyiter_collect_or_else
It will be useful to give callers the control on the generated errors
Raphaël Gomès <rgomes@octobus.net> [Mon, 30 Oct 2023 11:54:42 +0100] rev 51220
rust-index: add support for `computephasesmapsets`
Exposition in `hg-cpython` done in the regular `impl` block to enjoy
rustfmt and clearer compilartion errors.
Georges Racinet <georges.racinet@octobus.net> [Sat, 30 Sep 2023 15:59:03 +0200] rev 51219
rust-index: slicechunktodensity returns Rust result
Ready for removal of the scaffolding.
This time, we allow ourselves a minor optimization: we avoid
allocating for each chunk. Instead, we reuse the same vector,
and perform at most one allocation per chunk.
The `PyList` constructor will copy the buffer anyway.
Raphaël Gomès <rgomes@octobus.net> [Thu, 02 Nov 2023 11:40:23 +0100] rev 51218
rust-index: add support for `_slicechunktodensity`
Georges Racinet <georges.racinet@octobus.net> [Fri, 29 Sep 2023 20:51:49 +0200] rev 51217
rust-index: headrevsfiltered() returning Rust result
Raphaël Gomès <rgomes@octobus.net> [Mon, 30 Oct 2023 11:14:25 +0100] rev 51216
rust-index: add support for `headrevsfiltered`
The implementation is merged with that of `headrevs` also to make sure that
caches are up to date.
Raphaël Gomès <rgomes@octobus.net> [Tue, 19 Sep 2023 15:21:43 +0200] rev 51215
rust-index: implement headrevs
Georges Racinet <georges.racinet@octobus.net> [Sat, 30 Sep 2023 16:52:40 +0200] rev 51214
rust-index: variant of assert_py_eq with normalizer expression
The example given in doc-comment is the main use case: some methods
may require ordering insensitive comparison. This is about to be
used for `reachableroots2`
Raphaël Gomès <rgomes@octobus.net> [Thu, 03 Aug 2023 15:50:14 +0200] rev 51213
rust-index: add support for delta-chain computation
Raphaël Gomès <rgomes@octobus.net> [Thu, 03 Aug 2023 15:01:34 +0200] rev 51212
rust-index: add support for `find_snapshots`
Raphaël Gomès <rgomes@octobus.net> [Thu, 03 Aug 2023 12:05:32 +0200] rev 51211
rust-index: add `is_snapshot` method
Raphaël Gomès <rgomes@octobus.net> [Wed, 02 Aug 2023 16:49:33 +0200] rev 51210
rust-index: use the Rust index in `partialmatch`
Raphaël Gomès <rgomes@octobus.net> [Thu, 03 Aug 2023 14:50:17 +0200] rev 51209
rust-index: add missing special case for null rev
This was an oversight, it was never a problem because we didn't use the index
much for user-facing things in the past, which is the only real way of getting
to this edge case.
Raphaël Gomès <rgomes@octobus.net> [Wed, 02 Aug 2023 16:49:17 +0200] rev 51208
rust-index: use the rust index in `shortest`
Raphaël Gomès <rgomes@octobus.net> [Wed, 02 Aug 2023 14:34:21 +0200] rev 51207
rust-index: add checks that `__contains__` is synchronized
Georges Racinet <georges.racinet@octobus.net> [Mon, 30 Oct 2023 11:03:57 +0100] rev 51206
rust-index: using the Rust index in nodemap updating methods
Raphaël Gomès <rgomes@octobus.net> [Thu, 02 Nov 2023 11:19:54 +0100] rev 51205
rust-index: implementation of __getitem__
Although the removed panic tends to prove if the full test suite
did pass that the case when the input is a node id does not happen,
it is best not to remove it right now.
Raising IndexError is crucial for iteration on the index to stop,
given the default CPython sequence iterator, see for instance
https://github.com/zpoint/CPython-Internals/blobs/master/BasicObject/iter/iter.md
This was spotted by `test-rust-ancestors.py`, which does simple interations on
indexes (as preflight checks).
In `revlog.c`, `index_getitem` defaults to `index_get` when called
on revision numbers, which does raise `IndexError` with the same message as
the one we are introducing here.
Georges Racinet <georges.racinet@octobus.net> [Wed, 27 Sep 2023 11:34:52 +0200] rev 51204
rust-index: optim note for post-scaffolding removal
Raphaël Gomès <rgomes@octobus.net> [Thu, 02 Nov 2023 11:16:13 +0100] rev 51203
rust-index: check that the entry bytes are the same in both indexes
This is a temporary measure to show that both the Rust and C indexes are
kept in sync.
Comes with some related documentation precisions.
For comparison of error cases, see `index_entry_binary()` in `revlog.c`.
Georges Racinet <georges.racinet@octobus.net> [Sat, 30 Sep 2023 16:15:56 +0200] rev 51202
rust-index: return variables systematic naming convention
To help knowing at a glance when a method is ready, making
us more comofortable when we are close to the final removal of
scaffolding, we introduce the systematic variable names `rust_res` and
`c_res`. The goal of this series is to always return the formet.
We take again the case of `pack_header` as example.
Our personal opinion is to usually avoid such poor semantics as `res`, but
usually accept it when it close to the actual return, which will be the
case in most methods of this series. Also, the name can simply be dropped
when we remove the scaffolding. To follow on the example, the body of
`pack_header()` should become this in the final version:
```
let index = self.index(py).borrow();
let packed = index.pack_header(args.get_item(py, 0).extract(py)?);
Ok(PyBytes::new(py, &packed).into_object());
```
in these cases it is close to the actual return and will be removed
at the end entirely.
Georges Racinet <georges.racinet@octobus.net> [Fri, 29 Sep 2023 15:51:49 +0200] rev 51201
rust-index: results comparison helper with details
This is a bit simpler to call and has the advantage of systematically log
the encountered deviation.
To avoid committing dead code, we apply it to the `pack_header` method, that
was already returning the Rust result.
Georges Racinet <georges.racinet@octobus.net> [Wed, 27 Sep 2023 10:59:04 +0200] rev 51200
rust-index: helper for revision not in index not involving nodemap
This is a good match for exceptions raised from the C implementation,
when it is not about a nodemap inconsistency.
Georges Racinet <georges.racinet@octobus.net> [Wed, 18 Oct 2023 19:54:18 +0200] rev 51199
rust-index: renamed nodemap error function for rev not in index
The function name was misleading, as the error wording mentions the
nodemap, hence would not be appropriate for missing revisions not
related to a nodemap lookup.
Raphaël Gomès <rgomes@octobus.net> [Thu, 03 Aug 2023 10:28:10 +0200] rev 51198
rust-index: add `pack_header` support
Raphaël Gomès <rgomes@octobus.net> [Mon, 30 Oct 2023 10:34:48 +0100] rev 51197
rust-index: support cache clearing
I'm not 100% sure how useful it is outside of perf, but it's still worth
implementing.
Raphaël Gomès <rgomes@octobus.net> [Thu, 29 Jun 2023 11:37:19 +0200] rev 51196
rust-index: check rindex and cindex return the same get_rev
This is a temporary safeguard while we synchronize both indexes.
Raphaël Gomès <rgomes@octobus.net> [Wed, 28 Jun 2023 16:43:39 +0200] rev 51195
rust-index: synchronize remove to Rust index
Future steps will bring the two indexes further together until we can
rip the C index entirely when running Rust code.
Raphaël Gomès <rgomes@octobus.net> [Wed, 28 Jun 2023 11:59:43 +0200] rev 51194
rust-index: remove `__setitem__` method from the mixed index
This is not defined on the Python or C one, and isn't used anywhere.
Raphaël Gomès <rgomes@octobus.net> [Wed, 28 Jun 2023 11:36:22 +0200] rev 51193
rust-index: check equality between rust and cindex for `__len__`
Raphaël Gomès <rgomes@octobus.net> [Tue, 27 Jun 2023 18:24:54 +0200] rev 51192
rust-index: synchronize append method
We now append to the Rust index just as we do to the C index. Future steps
will bring the two indexes further together until we can rip the C index
entirely when running Rust code.
Raphaël Gomès <rgomes@octobus.net> [Mon, 18 Sep 2023 17:11:11 +0200] rev 51191
rust-revlog: teach the revlog opening code to read the repo options
This will become necessary as we start writing revlog data from Rust.
Raphaël Gomès <rgomes@octobus.net> [Tue, 27 Jun 2023 17:34:51 +0200] rev 51190
rust-index: pass data down to the Rust index
This will allow us to start keeping the Rust index synchronized with the
cindex as we gradually implement more and more methods in Rust. This will
eventually be removed.
Raphaël Gomès <rgomes@octobus.net> [Tue, 27 Jun 2023 16:32:09 +0200] rev 51189
rust-index: add append method
This is the first time the Rust index has any notion of mutability.
This will be used in a future patch from Python, to start synchronizing the
Rust index and the C index.
Raphaël Gomès <rgomes@octobus.net> [Mon, 26 Jun 2023 19:16:07 +0200] rev 51188
rust-index: add an abstraction to support bytes added at runtimes
In order to support appending data to the Rust index, we need to abstract
data access away from the immutable (on-disk) bytes, to seemlessly fetch
either from the preexisting data or from the newly added data.
Raphaël Gomès <rgomes@octobus.net> [Thu, 29 Jun 2023 16:09:57 +0200] rev 51187
rust-mixed-index: move the mmap keepalive into a function
The same code will be used for keeping the new index mmap around.
Raphaël Gomès <rgomes@octobus.net> [Thu, 29 Jun 2023 15:00:46 +0200] rev 51186
rust-mixed-index: rename variable to make the next change clearer
We're going to add another mmap reference holder, so let's rename this one
first.
Georges Racinet <georges.racinet@octobus.net> [Wed, 27 Sep 2023 10:08:32 +0200] rev 51185
rust: fix cargo doc for hg-cpython
Pierre-Yves David <pierre-yves.david@octobus.net> [Fri, 15 Dec 2023 11:10:24 +0100] rev 51184
branching: merge with default
We merge with the current children of the bad merge (
37b52b938579)
Pierre-Yves David <pierre-yves.david@octobus.net> [Fri, 15 Dec 2023 11:08:41 +0100] rev 51183
branching: merge with stable
This recreates `
37b52b938579` right as a `hg branch --rev
5b186ba40001` screwed
up the content.
Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 07 Dec 2023 02:07:16 +0100] rev 51182
changelog: disallow delayed write on inline changesets
Since this will never happens, we can make the situation invalid and to stop to
handling the associated the case.
This simplify the random access file reading too.
Pierre-Yves David <pierre-yves.david@octobus.net> [Mon, 11 Dec 2023 22:27:59 +0100] rev 51181
changelog: never inline changelog
The test suite mostly use small repositories, that implies that most changelog in the
tests are inlined. As a result, non-inlined changelog are quite poorly tested.
Since non-inline changelog are most common case for serious repositories, this
lack of testing is a significant problem that results in high profile issue like
the one recently fixed by
66417f55ea33 and
849745d7da89.
Inlining the changelog does not bring much to the table, the number of total
file saved is negligible, and the changelog will be read by most operation
anyway.
So this changeset is make it so we never inline the changelog, and de-inline the
one that are still inlined whenever we touch them.
By doing that, we remove the "dual code path" situation for writing new entry to
the changelog and move to a "single code path" situation. Having a single
code path simplify the code and make sure it is covered by test (if test cover
that situation obviously)
This impact all tests that care about the number of file and the exchange size,
but there is nothing too complicated in them just a lot of churn.
The churn is made "worse" by the fact rust will use the persistent nodemap on
any changelog now. Which is overall a win as it means testing the persistent
nodemap more and having less special cases.
In short, having inline changelog is mostly useless and an endless source of
pain. We get rid of it.
Pierre-Yves David <pierre-yves.david@octobus.net> [Mon, 11 Dec 2023 11:50:55 +0100] rev 51180
test-transaction-safety: glog out irrelevant flag
The test is focussing on the inline flag, so we glob out the other to highlight
that fact and prevent noise in the future.
Pierre-Yves David <pierre-yves.david@octobus.net> [Mon, 11 Dec 2023 11:43:32 +0100] rev 51179
test-transaction-safety: perform the test on a filelog
This test previously checked the transaction safety of splitting the changelog.
The changelog is a special case, with delayed/diverted writes and we will stop
inlining it soon. So we keep testing that transaction is safe around inline on
another revlog type : a filelog.
Minor comestic adjustement will be done in the next changesets.
Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 07 Dec 2023 03:40:37 +0100] rev 51178
test: clarify test-parseindex offsets
We will make this revlog non-inline, so we clarify the code to make sure it is
simple to adjust the test later.
Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 07 Dec 2023 06:05:18 +0100] rev 51177
test: use more globing for perf timing
Not sure why we kept the number here.
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 12 Dec 2023 12:29:12 +0100] rev 51176
branching: merge with stable
Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 07 Dec 2023 03:49:48 +0100] rev 51175
persistent-nodemap: avoid writing nodemap for empty revlog
The format cannot encode the lack of tip_rev.
There is currently nothing known to write such empty nodemap right now, but the
change we are preparing on default reveal this issue. So I had rather fix it on
stable.
Julien Cristau <jcristau@mozilla.com> [Tue, 12 Dec 2023 11:47:48 +0100] rev 51174
histedit: remove superfluous echo() and endwin() calls (
issue6859)
ncurses patchlevel
20231111 started returning an error from endwin() if
called twice without a intervening screen update.
Per Sven Joachim in
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1058041#17: "AFAICS,
invoking curses.echo() and curses.endwin() is superfluous
because curses.wrapper already does that for you, and calling
curses.endwin() twice throws an error with the newer ncurses. Removing
those two lines should fix the problem."
Martin von Zweigbergk <martinvonz@google.com> [Thu, 07 Dec 2023 09:31:07 -0800] rev 51173
statprof: handle `lineno == None` in more cases
This continues the work from
972f3e5c94b8. We saw a crash on line 956 but I
updated lots of other places as well.
Raphaël Gomès <rgomes@octobus.net> [Thu, 07 Dec 2023 14:28:31 +0100] rev 51172
branching: merge stable into default
Raphaël Gomès <rgomes@octobus.net> [Thu, 07 Dec 2023 14:22:55 +0100] rev 51171
Added signature for changeset
71bd09bebbe3
Raphaël Gomès <rgomes@octobus.net> [Thu, 07 Dec 2023 14:22:46 +0100] rev 51170
Added tag 6.6.1 for changeset
71bd09bebbe3
Raphaël Gomès <rgomes@octobus.net> [Thu, 07 Dec 2023 14:19:02 +0100] rev 51169
relnotes: add 6.6.1
Anton Shestakov <av6@dwimlabs.net> [Sat, 02 Dec 2023 15:10:28 -0300] rev 51168
procutil: move stdin assignment outside of try-finally block
There is an stdin variable in the global scope of this module. And in the
`finally` block of this try-finally statement we're checking `if stdin is not
None`. Let's make sure we don't confuse code check tools into thinking we want
to use global stdin by moving this line of code outside of `try`.
This was caught by pytype 2023.11.21 on Python 3.11.2.
Anton Shestakov <av6@dwimlabs.net> [Sat, 02 Dec 2023 15:02:03 -0300] rev 51167
zeroconf: give inet_aton() str instead of bytes
All other uses of this function in this extension are already fixed (i.e. use
strings instead of bytes).
This was caught by pytype 2023.11.21 on Python 3.11.2.
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 06 Dec 2023 16:29:43 +0100] rev 51166
revlog: avoid wrongly updating the data file location on "divert"
If we are in the inline case, we need to align the location of the "data" file
with the temporary location of the file (i.e. "00changelog.i.a"). However we
should not do that for non-inline case… and before this changeset we had been
doing it. In addition `index_file` is already a property taking care of updating
the "segment file" filename when needed. So we can simply remove all that code.
As a result, code trying to read the diverted data before they were committed
ended deeply confused as the "00changelog.i.a" file is nothing like the
"00changelog.d" file.
However nothing corrupted data as all writing where properly handled outside of
the "segment file".
In "best" cases this small in-memory corruption of the filename when unnoticed
until the transaction was committed or rolled back and in the worse case, some
data reading was failing during the transaction and resulted in the transaction
to be rolled back. However wrong data never reached the disk, so this bug should
be have corrupted any repository.
This is not catch by tests because most test use a small repository and
therefor an inline revlog. In addition the bug only triggers when a
changelog read is done in the following "rare" situation:
- after some delayed write
- after that data have been written in a "divert" file (i.e. `00.changelog.i.a`)
- before transaction commit
- outside of a "writing" context
The issue was introduced in
d83d788590a8
Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 07 Dec 2023 00:34:08 +0100] rev 51165
revlog: avoid exposing delayed index entry too widely in non-inline revlog
Before this change, the index entry would be seen as "appended" to the data
file. It did not hurt too much as there are never accessed for reading, but this
was odd. So lets stop doing so.
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 06 Dec 2023 15:38:15 +0100] rev 51164
revlog: add one more assert about state of thing when splitting
This assert is currently happy, but it does not hurt to adds it to clarify
expected state and catch potential error in the future.
Martin von Zweigbergk <martinvonz@google.com> [Wed, 29 Nov 2023 08:32:24 -0800] rev 51163
add: don't attempt to add back removed files unless explicitly listed
This fixes the bug demonstrated by the previous patch.
Martin von Zweigbergk <martinvonz@google.com> [Tue, 28 Nov 2023 22:44:04 -0800] rev 51162
tests: show failure to `hg add -I` a dir->symlink transition
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 11 Apr 2023 21:56:16 +0200] rev 51161
setup: try a non-pure version of the local Mercurial if the pure fails
Things like `zstd` can make the pure version fails.
Pierre-Yves David <pierre-yves.david@octobus.net> [Fri, 01 Dec 2023 22:13:37 +0100] rev 51160
setup: make debug simpler by adding a `__repr__` to `hgcommand`
This help when trying to debug this logic.
Pierre-Yves David <pierre-yves.david@octobus.net> [Sat, 02 Dec 2023 02:13:23 +0100] rev 51159
censor: fix things around inlining
The temporary revlog cannot go through the inline → split process as this would
break at transaction commit. (that might be fixable, but lets keep things
simple for now). We introduce a cleaner way to enforce this as the previous one
was broken in 6.6
On the way we remove multiple weird, fragile and broken overwrite of revlog
attributes and we focus on passing the configuration across.
We also had to update the test to actually create a non-inline revlog.
Pierre-Yves David <pierre-yves.david@octobus.net> [Sat, 02 Dec 2023 02:12:21 +0100] rev 51158
revlog: add a `may_inline` argument to revlog
This allow for a clean skipping of the inline feature when needed, for example
by censor.
Pierre-Yves David <pierre-yves.david@octobus.net> [Sat, 02 Dec 2023 02:11:20 +0100] rev 51157
revlog: allow explicit passing of config to revlog
This will be useful to fix censor in a later changeset.
Pierre-Yves David <pierre-yves.david@octobus.net> [Sat, 02 Dec 2023 01:06:35 +0100] rev 51156
censor: show that the `not-inline` → `inline` test is broken
The source revlog should not be inlined and it is…
Arseniy Alekseyev <aalekseyev@janestreet.com> [Wed, 15 Nov 2023 18:43:03 +0000] rev 51155
rhg: support rhg status --rev --rev
Arseniy Alekseyev <aalekseyev@janestreet.com> [Wed, 15 Nov 2023 18:41:33 +0000] rev 51154
rust: add a utility function to merge ordered fallible iterators
Adding a function merge_join_results_by, a version of
itertools::merge_join_by that works on "fallible" iterators
(iterators that can produce errors)
Arseniy Alekseyev <aalekseyev@janestreet.com> [Mon, 16 Oct 2023 18:56:40 +0100] rev 51153
rhg: refactor hg status, make the display code usable for non-dirstate status
Pierre-Yves David <pierre-yves.david@octobus.net> [Sun, 03 Dec 2023 04:49:49 +0100] rev 51152
perf-tags: fix clear_cache_fnodes to actually clear that cache
The function was not doing it what it advertise for a long time. So we fix it
and we add a way for the perf extensions to detect broken version.
Pierre-Yves David <pierre-yves.david@octobus.net> [Sun, 03 Dec 2023 04:43:08 +0100] rev 51151
perf-tags: fix the --clear-fnode-cache-rev code
It seems like this code never run?
Arseniy Alekseyev <aalekseyev@janestreet.com> [Mon, 04 Dec 2023 17:20:31 +0000] rev 51150
tests: do not fail tests in a state with uncommitted .py file removal
The problem is that [hg locate] lists removed files too.
We use [hg files] instead because that does not list removed files.
Matt Harbison <matt_harbison@yahoo.com> [Sat, 02 Dec 2023 00:52:37 -0500] rev 51149
tests: fill in the Windows pattern for `$EADDRNOTAVAIL$` matching
This fixes test-https.t on Windows.
It looks like the real error translation is "Cannot assign requested address.",
and the message here is the start of a longer description, so I'm not sure why
this part is emitted. But it's not worth digging into, as it's evidently the
same failure.