Fri, 23 Feb 2024 05:25:35 +0100 phases: large rework of advance boundary
Pierre-Yves David <pierre-yves.david@octobus.net> [Fri, 23 Feb 2024 05:25:35 +0100] rev 51421
phases: large rework of advance boundary In a similar spirit as the rework of retractboundary, the new algorithm is doing an amount of work in the order of magnitude of the amount of changeset that changes phases. (except to find new roots in impacted higher phases if any may exists). This result in a very significant speedup for repository with many old draft like mozilla try. runtime of perf:unbundle for a bundle constaining a single changeset (C code): before 6.7 phase work: 14.497 seconds before this change: 6.311 seconds (-55%) with this change: 2.240 seconds (-85%) Combined with the other patches that fixes the phases computation in the Rust index, the rust code with a persistent nodemap get back to quite interresting performances with 2.026 seconds for the same operation, about 10% faster than the C code.
Thu, 22 Feb 2024 19:21:14 +0100 phases: apply similar early filtering to advanceboundary
Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 22 Feb 2024 19:21:14 +0100] rev 51420
phases: apply similar early filtering to advanceboundary advanceboundary is called the push's unbundle (but not the other unbundle) so advanceboundary did not show up the profile I looked at so far. We start with simple pre-filtering to avoid doing any work if we don't needs too.
Wed, 21 Feb 2024 11:09:25 +0100 phases: filter revision that are already in the right phase
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 21 Feb 2024 11:09:25 +0100] rev 51419
phases: filter revision that are already in the right phase No need to compute new roots if everything is already in order.
Wed, 21 Feb 2024 13:05:29 +0100 phases: invalidate the phases set less often on retract boundary
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 21 Feb 2024 13:05:29 +0100] rev 51418
phases: invalidate the phases set less often on retract boundary We already have the information to update the phase set, so we do so directly instead of invalidating the cache. This show a sizeable speedup in our `perf::unbundle` benchmark on the many-draft mozilla-try repository. ### data-env-vars.name = mozilla-try-2023-03-22-zstd-sparse-revlog # benchmark.name = hg.perf.perf-unbundle # bin-env-vars.hg.flavor = no-rust # bin-env-vars.hg.py-re2-module = default # benchmark.variants.issue6528 = disabled # benchmark.variants.revs = last-10 before: 2.055259 seconds after: 1.887064 seconds (-8.18%) # benchmark.variants.revs = last-100 before: 2.409239 seconds after: 2.222429 seconds (-7.75%) # benchmark.variants.revs = last-1000 before: 3.945648 seconds after: 3.762480 seconds (-4.64%)
Wed, 21 Feb 2024 13:05:23 +0100 phases: incrementally update the phase sets when reasonable
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 21 Feb 2024 13:05:23 +0100] rev 51417
phases: incrementally update the phase sets when reasonable When the amount of manual walking is small, we update the phases set manually instead of computing them from scratch. This should help small update. The next changesets will make this used more often by reducing the amount of full invalidation we do on roots upgrade. The criteria for using an incremental upgrade are arbitrary, however, it "should never hurt".
Fri, 23 Feb 2024 00:01:33 +0100 phasees: properly shallow caopy the phase sets dictionary
Pierre-Yves David <pierre-yves.david@octobus.net> [Fri, 23 Feb 2024 00:01:33 +0100] rev 51416
phasees: properly shallow caopy the phase sets dictionary We are about to increments the set more incrementally in some case, so we need to make a proper shallow copy of it.
Wed, 21 Feb 2024 14:42:13 +0100 phases: pass an unfiltered repository to _ensure_phase_sets
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 21 Feb 2024 14:42:13 +0100] rev 51415
phases: pass an unfiltered repository to _ensure_phase_sets It seems better for such a low level function to be able to assume it operate on a real repository.
Wed, 21 Feb 2024 13:01:25 +0100 phases: drop set building in `hasnonpublicphases`
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 21 Feb 2024 13:01:25 +0100] rev 51414
phases: drop set building in `hasnonpublicphases` We don't actually use the set, so why do we ensure they are built? (we should also clean up the use of repository argument but that's a quest for later).
Wed, 21 Feb 2024 11:59:28 +0100 phases: gather the logic for phasesets update in a single method
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 21 Feb 2024 11:59:28 +0100] rev 51413
phases: gather the logic for phasesets update in a single method This logic is duplicated around for no good reason, we gather it in a single place. The conditional is the new function are a bit weird as we about going to extend it soon.
Thu, 22 Feb 2024 10:58:54 +0100 phases: change the way we warm the phasecache in repocache
Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 22 Feb 2024 10:58:54 +0100] rev 51412
phases: change the way we warm the phasecache in repocache Same logic as for the previous chngeset. We are about to rename and change the method used here.
Thu, 22 Feb 2024 10:56:05 +0100 phases: use a more generic way to trigger a phases computation for perf
Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 22 Feb 2024 10:56:05 +0100] rev 51411
phases: use a more generic way to trigger a phases computation for perf Querying the tip most revision will require the cache to warm the same as calling the dedicated method. This avoid using a method that is mostly meant for internal use and will be renamed in a coming changesets.
Wed, 21 Feb 2024 12:01:09 +0100 phases: fix an overzealous invalidation of the phase sets
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 21 Feb 2024 12:01:09 +0100] rev 51410
phases: fix an overzealous invalidation of the phase sets If `len(cl) == self._loadedrevslen` the cache is up to date.
Wed, 21 Feb 2024 11:04:56 +0100 phases: type annotation for `_phasesets`
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 21 Feb 2024 11:04:56 +0100] rev 51409
phases: type annotation for `_phasesets` Does not hurt.
Tue, 20 Feb 2024 23:46:21 +0100 phases: leverage the collected information to record phase update
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 20 Feb 2024 23:46:21 +0100] rev 51408
phases: leverage the collected information to record phase update Since the lower level function already gather this information, we can directly use it. This comes with a small change to the test that are actually fixing them. The previous version over-reported some phase change that did not exists. In both case, we are force revision `1` to be secret and `0` remains draft`, the previous code wrongly reported `0` as moving to secret while it properly remained draft in the repository.
Wed, 21 Feb 2024 10:41:09 +0100 phases: large rewrite on retract boundary
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 21 Feb 2024 10:41:09 +0100] rev 51407
phases: large rewrite on retract boundary The new code is still pure Python, so we still have room to going significantly faster. However its complexity of the complex part is `O(|[min_new_draft, tip]|)` instead of `O(|[min_draft, tip]|` which should help tremendously one repository with old draft (like mercurial-devel or mozilla-try). This is especially useful as the most common "retract boundary" operation happens when we commit/rewrite new drafts or when we push new draft to a non-publishing server. In this case, the smallest new_revs is very close to the tip and there is very few work to do. A few smaller optimisation could be done for these cases and will be introduced in later changesets. We still have iterate over large sets of roots, but this is already a great improvement for a very small amount of work. We gather information on the affected changeset as we go as we can put it to use in the next changesets. This extra data collection might slowdown the `register_new` case a bit, however for register_new, it should not really matters. The set of new nodes is either small, so the impact is negligible, or the set of new nodes is large, and the amount of work to do to had them will dominate the overhead the collecting information in `changed_revs`. As this new code compute the changes on the fly, it unlock other interesting improvement to be done in later changeset.
Thu, 22 Feb 2024 15:49:21 +0100 phases: fast path public phase advance when everything is public
Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 22 Feb 2024 15:49:21 +0100] rev 51406
phases: fast path public phase advance when everything is public Everything is already public, so we have nothing to do here.
Wed, 21 Feb 2024 15:24:22 +0100 phases: fast path retract of public phase
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 21 Feb 2024 15:24:22 +0100] rev 51405
phases: fast path retract of public phase There are no boundary to retract, so lets do nothing.
Tue, 20 Feb 2024 21:40:13 +0100 phases: keep internal state as rev-num instead of node-id
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 20 Feb 2024 21:40:13 +0100] rev 51404
phases: keep internal state as rev-num instead of node-id Node-id are expensive to work with, dealing with revision is much simple and faster. The fact we still used node-id here shows how few effort have been put into making the phase logic fast. We tend to no longer use node-id internally for about ten years. This has a large impact of repository with many draft roots. For example this Mozilla-try copy have ½ Million draft roots and `perf::unbundle` see a significant improvement. ### data-env-vars.name = mozilla-try-2023-03-22-zstd-sparse-revlog # benchmark.name = hg.perf.perf-unbundle # bin-env-vars.hg.flavor = no-rust # bin-env-vars.hg.py-re2-module = default # benchmark.variants.issue6528 = disabled # benchmark.variants.revs = last-1 before:: 1.746791 seconds after:: 1.278379 seconds (-26.82%) # benchmark.variants.revs = last-10 before:: 3.145774 seconds after:: 2.103735 seconds (-33.13%) # benchmark.variants.revs = last-100 before:: 3.487635 seconds after:: 2.446749 seconds (-29.85%) # benchmark.variants.revs = last-1000 before:: 5.007568 seconds after:: 3.989923 seconds (-20.32%)
Tue, 20 Feb 2024 21:40:08 +0100 phases: do filtering at read time
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 20 Feb 2024 21:40:08 +0100] rev 51403
phases: do filtering at read time This remove the need for the `filterunknown` method at all.
Tue, 20 Feb 2024 21:38:01 +0100 phases: always write with a repo
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 20 Feb 2024 21:38:01 +0100] rev 51402
phases: always write with a repo In the future change that move the internal representation of phase-roots from node-id to rev-num, we will use a repository to translate revision numbers back to node at write time. Since that future change is quite complicated already, we do this small API change beforehand.
Tue, 20 Feb 2024 17:18:15 +0100 phases: mark `phasecache.phaseroots` private
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 20 Feb 2024 17:18:15 +0100] rev 51401
phases: mark `phasecache.phaseroots` private We are about to change its content from nodeid to revnum. So anyone directly using the content might be in unexpected troubles. We start by making it private to explicitly break any such user (and discourage them to do so).
Tue, 20 Feb 2024 17:17:54 +0100 phases: check secret presence the right way during discovery
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 20 Feb 2024 17:17:54 +0100] rev 51400
phases: check secret presence the right way during discovery There is an official function for this, lets use it. This will prevent the code to break in the future while we refactor the phase code.
Tue, 20 Feb 2024 14:21:18 +0100 phases: explicitly filter stripped revision at strip time
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 20 Feb 2024 14:21:18 +0100] rev 51399
phases: explicitly filter stripped revision at strip time Explicit is better than implicit. The current logic is bit subtle and fragile. It also get in the way of using something else than node-id as internal storage. We replace it with a more explicit filtering while striping.
Fri, 23 Feb 2024 04:26:03 +0100 debug: add a debug::unbundle command that simulate the unbundle from a push
Pierre-Yves David <pierre-yves.david@octobus.net> [Fri, 23 Feb 2024 04:26:03 +0100] rev 51398
debug: add a debug::unbundle command that simulate the unbundle from a push The code have different behavior when the unbundle comes from a push, so we introduce a command that can simulate such unbundle. For our copy of mozilla-try-2023-03-22, this make the unbundle jump from 2.5 seconds (with `hg unbundle`) to 15 seconds (with `hg debug::unbundle`). That 15 seconds timings is consistent with the issue seen in production.
Wed, 14 Feb 2024 08:14:46 +0100 annotate: limit output to range of lines
Zeger Van de Vannet <zeger@vandevan.net> [Wed, 14 Feb 2024 08:14:46 +0100] rev 51397
annotate: limit output to range of lines
Mon, 12 Feb 2024 20:01:27 +0000 revlog: add a Rust implementation of `headrevsdiff`
Arseniy Alekseyev <aalekseyev@janestreet.com> [Mon, 12 Feb 2024 20:01:27 +0000] rev 51396
revlog: add a Rust implementation of `headrevsdiff` Python implementation of `headrevsdiff` can be very slow in the worst case compared with the `heads` computation it replaces, since the latter is done in Rust. Even the average case of this Python implementation is still noticeable in the profiles. This patch makes the computation much much faster by doing it in Rust.
Thu, 21 Dec 2023 20:30:03 +0000 revlog: add a C implementation of `headrevsdiff`
Arseniy Alekseyev <aalekseyev@janestreet.com> [Thu, 21 Dec 2023 20:30:03 +0000] rev 51395
revlog: add a C implementation of `headrevsdiff` Python implementation of `headrevsdiff` can be very slow in the worst case compared with the `heads` computation it replaces, since the latter is done in C. Even the average case of this Python implementation is still noticeable in the profiles. This patch makes the computation much much faster by doing it in C.
Thu, 21 Dec 2023 17:38:04 +0000 unbundle: faster computation of changed heads
Arseniy Alekseyev <aalekseyev@janestreet.com> [Thu, 21 Dec 2023 17:38:04 +0000] rev 51394
unbundle: faster computation of changed heads To compute the set of changed heads it's sufficient to look at the recent commits, instead of looking at all heads currently in existence.
Wed, 21 Feb 2024 11:53:30 +0100 branching: merge stable into default
Raphaël Gomès <rgomes@octobus.net> [Wed, 21 Feb 2024 11:53:30 +0100] rev 51393
branching: merge stable into default
Tue, 20 Feb 2024 10:47:47 -0500 hg-core: separate timestamp and extra methods
Arun Kulshreshtha <akulshreshtha@janestreet.com> [Tue, 20 Feb 2024 10:47:47 -0500] rev 51392
hg-core: separate timestamp and extra methods
(0) -30000 -10000 -3000 -1000 -300 -100 -50 -30 +30 +50 +100 tip