Sat, 20 Jul 2024 00:41:37 +0200 revlogutils: fix _chunk() reference
Joerg Sonnenberger <joerg@bec.de> [Sat, 20 Jul 2024 00:41:37 +0200] rev 51911
revlogutils: fix _chunk() reference _chunk is only found in the inner revlog object and not directly exposed outside.
Mon, 02 Sep 2024 22:14:38 +0200 rev-branch-cache: reenable memory mapping of the revision data
Pierre-Yves David <pierre-yves.david@octobus.net> [Mon, 02 Sep 2024 22:14:38 +0200] rev 51910
rev-branch-cache: reenable memory mapping of the revision data Now that we are no longer truncating it, we can mmap it again. This provide a sizeable speedup on repository with a very large amount of revision for example for a mozilla-try clone with 5 793 383 revisions, this provide a speedup of 5ms - 10ms. Since they happens within the "critical" locked path during push. These miliseconds are important. In addition, the v3 branchmap format is use the rev-branch-cache more than the v2 branchmap cache so this will be important. On smaller repository we consistently see an improvement of one or two percents, but the gain in absolute time is usually < 10 ms. #### benchmark.name = hg.command.unbundle # benchmark.variants.issue6528 = disabled # benchmark.variants.reuse-external-delta-parent = yes # benchmark.variants.revs = any-1-extra-rev # benchmark.variants.source = unbundle # benchmark.variants.verbosity = quiet ### data-env-vars.name = mozilla-try-2024-03-26-zstd-sparse-revlog ## bin-env-vars.hg.flavor = default e51161b12c7e: 3.527923 ebdcfe85b070: 3.468178 (-1.69%, -0.06) ## bin-env-vars.hg.flavor = rust e51161b12c7e: 3.580158 ebdcfe85b070: 3.480564 (-2.78%, -0.10) ### data-env-vars.name = mozilla-try-2024-03-26-ds2-pnm ## bin-env-vars.hg.flavor = rust e51161b12c7e: 3.527923 ebdcfe85b070: 3.468178 (-1.69%, -0.06)
Wed, 25 Sep 2024 12:42:47 +0200 rev-branch-cache: have debugupdatecache warm rbc too
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 25 Sep 2024 12:42:47 +0200] rev 51909
rev-branch-cache: have debugupdatecache warm rbc too Since the "v2" format can be more performant than the "v1" format (thanks to mmap), it is useful to be able to make sure it is present
Wed, 25 Sep 2024 12:49:32 +0200 rev-branch-cache: schedule a write of the "v2" format if we read from "v1"
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 25 Sep 2024 12:49:32 +0200] rev 51908
rev-branch-cache: schedule a write of the "v2" format if we read from "v1" The new file can be memorymapped, while the old one cannot. So there is value in having the v2 format around as soon a possible.
Tue, 24 Sep 2024 15:44:10 +0200 rev-branch-cache: fallback on "v1" data if no v2 is found
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 24 Sep 2024 15:44:10 +0200] rev 51907
rev-branch-cache: fallback on "v1" data if no v2 is found This will help smooth the transition to the v2 format for existing large repository.
Tue, 24 Sep 2024 03:16:35 +0200 rev-branch-cache: increment the version to "v2"
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 24 Sep 2024 03:16:35 +0200] rev 51906
rev-branch-cache: increment the version to "v2" We want to ensure no older clients will truncate the file under us. So we need to change their name. We don't change the rest of the format (unfortunaly).
Tue, 24 Sep 2024 00:16:23 +0200 rev-branch-cache: stop truncating cache file
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 24 Sep 2024 00:16:23 +0200] rev 51905
rev-branch-cache: stop truncating cache file Truncating the file prevent the safe use of mmap. So instead of overwrite the existing data. If more than 20% of the file is to be overwritten, we rewrite the whole file instead. Such whole rewrite is done by replacing the old one with a new one, so mmap of the old file would be affected. This prepare a more aggressive use of mmap in later patches.
Tue, 24 Sep 2024 00:16:04 +0200 rev-branch-cache: make sure we close the name file we open
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 24 Sep 2024 00:16:04 +0200] rev 51904
rev-branch-cache: make sure we close the name file we open We were various opening without with or try. Adding a try would not hurt.
Mon, 23 Sep 2024 23:52:45 +0200 rev-branch-cache: add a way to force rewrite of the cache
Pierre-Yves David <pierre-yves.david@octobus.net> [Mon, 23 Sep 2024 23:52:45 +0200] rev 51903
rev-branch-cache: add a way to force rewrite of the cache This seems useful to be able to do this, for example during strip. This align with the intended expressed in the `test-branches.t` test. This will help use being more confident about future changes in the series.
Tue, 24 Sep 2024 00:01:30 +0200 rev-branch-cache: issue more truthful "truncating" message
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 24 Sep 2024 00:01:30 +0200] rev 51902
rev-branch-cache: issue more truthful "truncating" message First, don't pretend it truncate to 40 when it actually truncate to 0. Second, don't pretend to truncate to 0 when the file is already empty/missing.
Sun, 22 Sep 2024 15:55:46 +0200 rev-branch-cache: move the code in a dedicated module
Pierre-Yves David <pierre-yves.david@octobus.net> [Sun, 22 Sep 2024 15:55:46 +0200] rev 51901
rev-branch-cache: move the code in a dedicated module The branchmap module is getting huge and the rev branch cache is fully independent, lets move it elsewhere.
Wed, 25 Sep 2024 01:16:47 -0400 statichttprepo: stop shadowing the `bytes` builtin
Matt Harbison <matt_harbison@yahoo.com> [Wed, 25 Sep 2024 01:16:47 -0400] rev 51900
statichttprepo: stop shadowing the `bytes` builtin PyCharm flagged it, but I also misunderstood when looking at the code, because the name implied a byte string, not a number.
Wed, 25 Sep 2024 01:12:39 -0400 statichttprepo: fix `httprangereader.read()` for py3
Matt Harbison <matt_harbison@yahoo.com> [Wed, 25 Sep 2024 01:12:39 -0400] rev 51899
statichttprepo: fix `httprangereader.read()` for py3 It looks like there were a bunch of problems, not all of them py3 related: 1) The signature of BinaryIO.read() is -1, not None 2) The `end` variable can't be bytes and interpolate into str with "%s" 3) The `end` variable can't be an int and interpolate into str with "%s" 4) The result slicing could be out of bounds if more is requested than returned I guess if somebody would have called `read(-1)` (either directly or because a wrapper defaults to that), it wouldn't have been handled correctly. The fact that it is a valid value meaning to read everything requires some additional changes later in the method around when it slices the byte string that was read, but that seems to have already been broken.
Wed, 25 Sep 2024 00:52:44 -0400 statichttprepo: use a context manager to handle a file descriptor
Matt Harbison <matt_harbison@yahoo.com> [Wed, 25 Sep 2024 00:52:44 -0400] rev 51898
statichttprepo: use a context manager to handle a file descriptor I'm not sure if this should be reduced to `vfs.exists()`. That would seem to be equivalent code (since the result of the read is ignored, so we can't tell if the file actually has content, which has been the state of things going back to 98b6c3dde237), but this is at least safer file descriptor handling.
Thu, 26 Sep 2024 02:58:50 +0200 profiling: pass bytes to `_()` and `error.Abort()`
Matt Harbison <matt_harbison@yahoo.com> [Thu, 26 Sep 2024 02:58:50 +0200] rev 51897
profiling: pass bytes to `_()` and `error.Abort()` And of course `other_tool_name` is str too, so that needs to be converted. The type hints from PyCharm say `sys.monitoring.get_tool()` can return None, so handle that case explicitly before it trips up pytype.
Mon, 08 Jul 2024 22:46:04 +0200 exchange: improve computation of relevant markers for large repos
Joerg Sonnenberger <joerg@bec.de> [Mon, 08 Jul 2024 22:46:04 +0200] rev 51896
exchange: improve computation of relevant markers for large repos Compute the candidate nodes with relevant markers directly from keys of the predecessors/successors/children dictionaries of obsstore. This is faster than iterating over all nodes directly. This test could be further improved for repositories with relative few markers compared to the repository size, but this is no longer hot already. With the current loop structure, the obshashrange use works as well as before as it passes lists with a single node. Adjust the interface by allowing revision lists as well as node lists. This helps cases that computes ancestors as it reduces the materialisation cost. Use this in _pushdiscoveryobsmarker and _getbundleobsmarkerpart. Improve the latter further by directly using ancestors(). Performance benchmarks show notable and welcome improvement to no-op push and pull (that would also apply to other push/pull). This apply to push and pull done without evolve. ### push/pull Benchmark parameter # bin-env-vars.hg.flavor = default # benchmark.variants.explicit-rev = none # benchmark.variants.protocol = ssh # benchmark.variants.revs = none ## benchmark.name = hg.command.pull # data-env-vars.name = mercurial-devel-2024-03-22-zstd-sparse-revlog before: 5.968537 seconds after: 5.668507 seconds (-5.03%, -0.30) # data-env-vars.name = tryton-devel-2024-03-22-zstd-sparse-revlog before: 1.446232 seconds after: 0.835553 seconds (-42.23%, -0.61) # data-env-vars.name = netbsd-src-draft-2024-09-19-zstd-sparse-revlog before: 5.777412 seconds after: 2.523454 seconds (-56.32%, -3.25) ## benchmark.name = hg.command.push # data-env-vars.name = mercurial-devel-2024-03-22-zstd-sparse-revlog before: 6.155501 seconds after: 5.885072 seconds (-4.39%, -0.27) # data-env-vars.name = tryton-devel-2024-03-22-zstd-sparse-revlog before: 1.491054 seconds after: 0.934882 seconds (-37.30%, -0.56) # data-env-vars.name = netbsd-src-draft-2024-09-19-zstd-sparse-revlog before: 5.902494 seconds after: 2.957644 seconds (-49.89%, -2.94) There is not notable different in these result using the "rust" flavor instead of the "default". The performance impact on the same operation when using evolve were also tested and no impact was noted.
Fri, 20 Sep 2024 21:31:58 -0400 typing: make the localrepo classes known to pytype
Matt Harbison <matt_harbison@yahoo.com> [Fri, 20 Sep 2024 21:31:58 -0400] rev 51895
typing: make the localrepo classes known to pytype 9d4ad05bc91c and 1b17309cdaab both mentioned making `bundlerepository` and `unionrepository` subclass `localrepository` during the type checking phase, but that didn't apply to pytype in practice. See bcaa5d408657 and friends for how the zope interfaces confuse pytype, and end up converting the classes they decorate into `Any`. This commit is slightly more complex though, because `localrepository` has mixin classes applied to it when it is instantiated. Specifically, `RevlogFileStorage` is added, which adds `def file(f)` (which isn't defined on `localrepository`). Therefore a list of `localrepository` superclasses is provided during type checking to account for the mixins. Without this, the `bundlerepository` class gets flagged when it attempts to call its superclass implementation of `file()`. Note that pytype doesn't understand these mixin superclasses (it marks the superclass of `localrepository` as `Any`, because they are zope interfaces it doesn't understand), but that's enough to get it to not flag `bundlerepository`. PyCharm also stops flagging it as a missing function, though it seems like it is able to handle the zope interfaces.
Mon, 23 Sep 2024 14:58:37 -0400 typing: add a handful more annotations to `mercurial/vfs.py`
Matt Harbison <matt_harbison@yahoo.com> [Mon, 23 Sep 2024 14:58:37 -0400] rev 51894
typing: add a handful more annotations to `mercurial/vfs.py` These came out of refactoring into a protocol class, but they can stand on their own. The `audit` callback is kinda screwy because the internal lambda and the callable for `pathutil.pathauditor` have different args and a different return type. It's conditionalized where it is called, and can be cleaned up later if desired.
Sat, 21 Sep 2024 13:53:05 -0400 typing: make `vfs.isfileorlink_checkdir()` path arg required
Matt Harbison <matt_harbison@yahoo.com> [Sat, 21 Sep 2024 13:53:05 -0400] rev 51893
typing: make `vfs.isfileorlink_checkdir()` path arg required The only caller to this is `merge._checkunknownfile()`, which supplies a value. That's good, because `util.localpath()` immediately uses the value to call a method on it on Windows. The posix implementation returns the value unaltered, but then `pathutil.finddirs_rev_noroot()` would have exploded.
Fri, 20 Sep 2024 20:16:12 -0400 typing: manually add type annotations to `mercurial/vfs.py`
Matt Harbison <matt_harbison@yahoo.com> [Fri, 20 Sep 2024 20:16:12 -0400] rev 51892
typing: manually add type annotations to `mercurial/vfs.py` This isn't everything, but hopefully it's close enough to hack on a protocol class.
Fri, 20 Sep 2024 16:36:28 -0400 typing: correct pytype mistakes in `mercurial/vfs.py`
Matt Harbison <matt_harbison@yahoo.com> [Fri, 20 Sep 2024 16:36:28 -0400] rev 51891
typing: correct pytype mistakes in `mercurial/vfs.py` With the previous changes in this series (prior to merging the *.pyi file), this wasn't too bad- the only definitively wrong things were the `data` argument to `writelines()`, and the return type on `backgroundclosing()` (both of these errors were dropped in the previous commit; for some reason pytype doesn't like `contextlib._GeneratorContextManager`, even though that's what it determined it is): File "/mnt/c/Users/Matt/hg/mercurial/vfs.py", line 411, in abstractvfs: Bad return type 'contextlib._GeneratorContextManager' for generator function abstractvfs.backgroundclosing [bad-yield-annotation] Expected Generator, Iterable or Iterator PyCharm thinks this is `Generator[backgroundfilecloser], Any, None]`, which can be reduced to `Iterator[backgroundfilecloser]`, but pytype flagged the line that calls `yield` without an argument unless it's also `Optional`. PyCharm is happy either way. For some reason, `Iterable` didn't work for pytype: File "/mnt/c/Users/Matt/hg/mercurial/vfs.py", line 390, in abstractvfs: Function contextlib.contextmanager was called with the wrong arguments [wrong-arg-types] Expected: (func: Callable[[Any], Iterator]) Actually passed: (func: Callable[[Any, Any, Any], Iterable[Optional[Any]]]) Attributes of protocol Iterator[_T_co] are not implemented on Iterable[Optional[Any]]: __next__
Fri, 20 Sep 2024 13:38:13 -0400 typing: run `merge-pyi` on `mercurial/vfs.py`
Matt Harbison <matt_harbison@yahoo.com> [Fri, 20 Sep 2024 13:38:13 -0400] rev 51890
typing: run `merge-pyi` on `mercurial/vfs.py` The *.pyi file was generated with pytype 2023.11.21. There were a few things here that were wrong (e.g. `writelines()` takes an `Iterable[bytes]`, not `bytes`, or inexplicable errors like importing several of the vfs classes from this very module), and those changes have been dropped manually here.
Fri, 20 Sep 2024 01:10:17 -0400 typing: add type annotations to `mercurial.util.makelock()`
Matt Harbison <matt_harbison@yahoo.com> [Fri, 20 Sep 2024 01:10:17 -0400] rev 51889
typing: add type annotations to `mercurial.util.makelock()` This bubbles up into the `vfs` classes, so get this out of the way.
Fri, 20 Sep 2024 00:20:24 -0400 util: avoid a leaked file descriptor in `util.makelock()` exceptional case
Matt Harbison <matt_harbison@yahoo.com> [Fri, 20 Sep 2024 00:20:24 -0400] rev 51888
util: avoid a leaked file descriptor in `util.makelock()` exceptional case
Fri, 20 Sep 2024 00:04:09 -0400 typing: add type annotations to the `mercurial.util.filestat` class
Matt Harbison <matt_harbison@yahoo.com> [Fri, 20 Sep 2024 00:04:09 -0400] rev 51887
typing: add type annotations to the `mercurial.util.filestat` class It's referenced in the `vfs` classes, so get this out of the way to help there. The `TypeVar` definition and its usage was copied from the existing `util.pyi` file.
Fri, 20 Sep 2024 12:15:08 -0400 vfs: do minor copyediting on comments and doc strings
Matt Harbison <matt_harbison@yahoo.com> [Fri, 20 Sep 2024 12:15:08 -0400] rev 51886
vfs: do minor copyediting on comments and doc strings These were flagged by PyCharm, so clear them from the gutter.
Fri, 20 Sep 2024 01:16:16 -0400 vfs: simplify the `abstractvfs.rename()` implementation
Matt Harbison <matt_harbison@yahoo.com> [Fri, 20 Sep 2024 01:16:16 -0400] rev 51885
vfs: simplify the `abstractvfs.rename()` implementation PyCharm was yapping about `util.rename()` not returning anything, because it is typed to return `None`, but the value was captured and returned after calling `_avoidambig()`. Instead, drop all of that, unconditionally rename, and then call `_avoidambig()` if appropriate. While we're here, convert the ersatz ternary operator into a modern one to help pytype. When a variable is initialized the old way, pytype tends to assign the type of the LHS of the `and`. In this case, that's a bool, and it will get confused that bool doesn't have a `stat` attribute once this method gets more type annotations. (Currently it thinks the `checkambig` arg is `Any`, so it doesn't care.)
Fri, 20 Sep 2024 00:07:39 -0400 vfs: use @abstractmethod instead of homebrewing abstract methods
Matt Harbison <matt_harbison@yahoo.com> [Fri, 20 Sep 2024 00:07:39 -0400] rev 51884
vfs: use @abstractmethod instead of homebrewing abstract methods The latter confuses PyCharm after adding more type annotations when, for example, `abstractvfs.rename()` calls `_auditpath()`- the latter unconditionally raised an error, so PyCharm thought the code that came after is unreachable. It also tricked pytype into marking the return type as `Never`, which isn't available until Python 3.11 (outside of `typing_extensions`). This also avoid PyCharm warnings that the call to the superclass constructor was missed (it couldn't be called because it raised an error to prevent instantiation). The statichttprepo module needed to be given an override for one of the abstract methods, so that it can be instantiated. In `abstractvfs`, this method is only called by `rename()`, so I think we can leave this empty. We raise an error in case somebody accidentally calls it in the future- it would have raised this same error prior to this change. I couldn't wrangle `import-checker.py` into accepting importing `ABC` and `abstractmethod`- for each subsequent import, it reports something like: stdlib import "contextlib" follows local import: abc I suspect the problem is that near the `if fullname != '__future__'` check, if the module doesn't fall into the error case, `seenlocal` gets set to the module name. That causes it to be treated like a local module on the next iteration, even though it is in `stdlib_modules`.
(0) -30000 -10000 -3000 -1000 -300 -100 -50 -28 +28 +50 +100 +300 tip