Raphaël Gomès <rgomes@octobus.net> [Fri, 23 Feb 2024 15:10:44 +0100] rev 51438
branching: merge default into stable for 6.7rc0
Raphaël Gomès <rgomes@octobus.net> [Fri, 23 Feb 2024 15:09:18 +0100] rev 51437
branching: merge stable into default
Pierre-Yves David <pierre-yves.david@octobus.net> [Fri, 23 Feb 2024 14:07:33 +0100] rev 51436
perf: add a --as-push option to perf::unbundle
This turned out to make a quite significant difference.
Pierre-Yves David <pierre-yves.david@octobus.net> [Fri, 23 Feb 2024 06:25:09 +0100] rev 51435
chainsaw-update: exit early if one of the intermediate commands fails
This prevents the user from being presented with a state that pretends to be
consistent with the request, but is not.
Pierre-Yves David <pierre-yves.david@octobus.net> [Fri, 23 Feb 2024 03:32:35 +0100] rev 51434
chainsaw-update: lock the repository for the duration of the operation
This should prevent and catch some misuse where something else tries to touch
the repository.
Georges Racinet <georges.racinet@octobus.net> [Fri, 23 Feb 2024 11:41:55 +0100] rev 51433
chainsaw-update: taking care of initial cloning
Perhaps we should go just a bit lower level than this `instance()`,
since the main added value in our use-case is full path resolution,
that we need to do anyway for the rmtree cleanup.
Pierre-Yves David <pierre-yves.david@octobus.net> [Fri, 23 Feb 2024 11:30:58 +0100] rev 51432
chainsaw-update: use a graph with branching in graph
This will be relevant for the next improvement of `chainsaw-update`.
Georges Racinet <georges.racinet@octobus.net> [Wed, 17 Jan 2024 14:39:06 +0100] rev 51431
chainsaw-update: log actual locks breaking
Previously, the command would simply state that it was about
to break locks, not whether there were actually any to break.
This version is race-free. It would also be possible to display
the content of the lock beforehand (not race-free but informative
in almost all cases).
Georges Racinet <georges.racinet@octobus.net> [Wed, 17 Jan 2024 14:26:58 +0100] rev 51430
vfs: have tryunlink tell what it did
It is useful in certain circumstances to know whether vfs.tryunlink()
actually removed something or not, be it for logging purposes.
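A minimal sketch of the idea, not the actual `vfs` code: the standalone `tryunlink` helper below is hypothetical and only assumes that "telling what it did" means returning a boolean.

```python
import os


def tryunlink(path):
    """Remove `path` if it exists.

    Returns True only if a file was actually removed, False if there was
    nothing to remove. Hypothetical standalone helper, for illustration.
    """
    try:
        os.unlink(path)
    except FileNotFoundError:
        # Nothing was there: report it instead of staying silent.
        return False
    return True
```

A caller can then log only when something was really removed, for example `if tryunlink(lock_path): log("broke lock")`, where `lock_path` and `log` are placeholders.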
Georges Racinet <georges.racinet@octobus.net> [Sat, 26 Nov 2022 12:23:56 +0100] rev 51429
chainsaw: new extension for dangerous operations
The first provided command is `chainsaw-update`, whose one and single job is
to make sure that it will pull, update and purge the target repository,
no matter what may be in the way (locks, notably), see docstring for rationale.
Pierre-Yves David <pierre-yves.david@octobus.net> [Fri, 23 Feb 2024 03:45:07 +0100] rev 51428
rust: disable the RustIndex without persistent nodemap
See rationale inline.
Pierre-Yves David <pierre-yves.david@octobus.net> [Fri, 23 Feb 2024 03:44:56 +0100] rev 51427
rust: stop claiming the C index is compatible with the rust code
This is no longer the case since the introduction of the pure Rust Index, and
was probably not the case since the MixedIndex itself.
So we fix the dedicated attribute value.
Raphaël Gomès <rgomes@octobus.net> [Thu, 22 Feb 2024 15:11:26 +0100] rev 51426
rust-index: remove one collect when converting back
Turns out this is slightly faster. Sending the results back to Python is still
the most costly part (about 75% of the time) of the whole method, but it's about
as fast as it can be now.
hg perf::phases on mozilla-try-2023-03-22
before: 0.267114
after: 0.247101
Raphaël Gomès <rgomes@octobus.net> [Thu, 22 Feb 2024 15:06:16 +0100] rev 51425
rust-index: improve phase computation speed
While less memory efficient, using an array is *much* faster than using a
HashMap, especially with the default hasher. It even makes the code simpler,
so I'm not really sure what I was thinking in the first place, maybe it's more
obvious now.
This fixes a significant performance regression when using the Rust version of the
code (however, the C code still outperforms Rust on this operation).
hg perf::phases on mozilla-try-2023-03-22
- 6.6.3: 0.451239 seconds
- before: 0.982495 seconds
- after: 0.265347 seconds
- C code: 0.183241 seconds
Pierre-Yves David <pierre-yves.david@octobus.net> [Fri, 23 Feb 2024 06:37:25 +0100] rev 51424
phases: directly update the phase sets in advanceboundary
This is similar to what we do in retractboundary. There is no need to invalidate
the cache if we have everything at hand to update it.
Pierre-Yves David <pierre-yves.david@octobus.net> [Fri, 23 Feb 2024 05:25:35 +0100] rev 51423
phases: large rework of advance boundary
In a similar spirit to the rework of retractboundary, the new algorithm does
an amount of work on the order of magnitude of the number of changesets that
change phase (except for finding new roots in impacted higher phases, if any
exist).
This results in a very significant speedup for repositories with many old drafts,
like mozilla-try.
runtime of perf::unbundle for a bundle containing a single changeset (C code):
before 6.7 phase work: 14.497 seconds
before this change: 6.311 seconds (-55%)
with this change: 2.240 seconds (-85%)
Combined with the other patches that fix the phase computation in the Rust
index, the Rust code with a persistent nodemap gets back to quite interesting
performance with 2.026 seconds for the same operation, about 10% faster than
the C code.
Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 22 Feb 2024 19:21:14 +0100] rev 51422
phases: apply similar early filtering to advanceboundary
advanceboundary is called by the push's unbundle (but not the other unbundle), so
advanceboundary did not show up in the profiles I had looked at so far.
We start with simple pre-filtering to avoid doing any work if we don't need
to.
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 21 Feb 2024 11:09:25 +0100] rev 51421
phases: filter revisions that are already in the right phase
No need to compute new roots if everything is already in order.
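A minimal sketch of this kind of pre-filtering, assuming the retract direction (moving revisions to a higher phase). The phase numbers match Mercurial's (public = 0, draft = 1, secret = 2), but the `revs_needing_retract` helper and the `phase_of` mapping are hypothetical.

```python
PUBLIC, DRAFT, SECRET = 0, 1, 2


def revs_needing_retract(phase_of, revs, target_phase):
    """Keep only the revisions whose phase is still below `target_phase`.

    `phase_of` maps a revision number to its current phase. Revisions
    already at or above the target need no work, so they are dropped
    before any root computation would happen.
    """
    return [r for r in revs if phase_of[r] < target_phase]
```

If the filtered list comes back empty, the caller can return immediately without computing new roots.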
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 21 Feb 2024 13:05:29 +0100] rev 51420
phases: invalidate the phases set less often on retract boundary
We already have the information to update the phase set, so we do so directly
instead of invalidating the cache.
This shows a sizeable speedup in our `perf::unbundle` benchmark on the
many-draft mozilla-try repository.
### data-env-vars.name = mozilla-try-2023-03-22-zstd-sparse-revlog
# benchmark.name = hg.perf.perf-unbundle
# bin-env-vars.hg.flavor = no-rust
# bin-env-vars.hg.py-re2-module = default
# benchmark.variants.issue6528 = disabled
# benchmark.variants.revs = last-10
before: 2.055259 seconds
after: 1.887064 seconds (-8.18%)
# benchmark.variants.revs = last-100
before: 2.409239 seconds
after: 2.222429 seconds (-7.75%)
# benchmark.variants.revs = last-1000
before: 3.945648 seconds
after: 3.762480 seconds (-4.64%)
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 21 Feb 2024 13:05:23 +0100] rev 51419
phases: incrementally update the phase sets when reasonable
When the amount of manual walking is small, we update the phase sets manually
instead of computing them from scratch. This should help small updates. The next
changesets will make this path used more often by reducing the amount of full
invalidation we do on roots upgrade.
The criteria for using an incremental upgrade are arbitrary; however, it "should
never hurt".
Pierre-Yves David <pierre-yves.david@octobus.net> [Fri, 23 Feb 2024 00:01:33 +0100] rev 51418
phases: properly shallow copy the phase sets dictionary
We are about to update the sets incrementally in some cases, so we need
to make a proper shallow copy of the dictionary.
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 21 Feb 2024 14:42:13 +0100] rev 51417
phases: pass an unfiltered repository to _ensure_phase_sets
It seems better for such a low level function to be able to assume it operates on
a real repository.
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 21 Feb 2024 13:01:25 +0100] rev 51416
phases: drop set building in `hasnonpublicphases`
We don't actually use the set, so why do we ensure they are built?
(we should also clean up the use of the repository argument, but that's a quest for later).
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 21 Feb 2024 11:59:28 +0100] rev 51415
phases: gather the logic for phasesets update in a single method
This logic is duplicated around for no good reason, so we gather it in a single
place.
The conditionals in the new function are a bit weird as we are about to extend it soon.
Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 22 Feb 2024 10:58:54 +0100] rev 51414
phases: change the way we warm the phasecache in repocache
Same logic as for the previous changeset. We are about to rename and change the
method used here.
Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 22 Feb 2024 10:56:05 +0100] rev 51413
phases: use a more generic way to trigger a phases computation for perf
Querying the tip-most revision requires the cache to warm up, the same as
calling the dedicated method. This avoids using a method that is mostly meant for
internal use and will be renamed in an upcoming changeset.
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 21 Feb 2024 12:01:09 +0100] rev 51412
phases: fix an overzealous invalidation of the phase sets
If `len(cl) == self._loadedrevslen` the cache is up to date.
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 21 Feb 2024 11:04:56 +0100] rev 51411
phases: type annotation for `_phasesets`
Does not hurt.
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 20 Feb 2024 23:46:21 +0100] rev 51410
phases: leverage the collected information to record phase update
Since the lower level function already gathers this information, we can directly
use it.
This comes with a small change to the tests that actually fixes them. The
previous version over-reported some phase changes that did not exist. In both
cases, we force revision `1` to be secret and `0` remains draft; the
previous code wrongly reported `0` as moving to secret while it properly
remained draft in the repository.
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 21 Feb 2024 10:41:09 +0100] rev 51409
phases: large rewrite on retract boundary
The new code is still pure Python, so we still have room to go significantly
faster. However, the complexity of its complex part is `O(|[min_new_draft, tip]|)` instead of
`O(|[min_draft, tip]|)`, which should help tremendously on repositories with old
drafts (like mercurial-devel or mozilla-try).
This is especially useful as the most common "retract boundary" operation
happens when we commit/rewrite new drafts or when we push new drafts to a
non-publishing server. In this case, the smallest new_revs is very close to the
tip and there is very little work to do.
A few smaller optimisations could be done for these cases and will be introduced in
later changesets.
We still have to iterate over large sets of roots, but this is already a great
improvement for a very small amount of work. We gather information on the
affected changesets as we go, as we can put it to use in the next changesets.
This extra data collection might slow down the `register_new` case a bit; however,
for `register_new`, it should not really matter. Either the set of new nodes is
small, so the impact is negligible, or the set of new nodes is large, and the
amount of work needed to add them will dominate the overhead of collecting
information in `changed_revs`.
As this new code computes the changes on the fly, it unlocks other interesting
improvements to be done in later changesets.
Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 22 Feb 2024 15:49:21 +0100] rev 51408
phases: fast path public phase advance when everything is public
Everything is already public, so we have nothing to do here.
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 21 Feb 2024 15:24:22 +0100] rev 51407
phases: fast path retract of public phase
There is no boundary to retract, so let's do nothing.
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 20 Feb 2024 21:40:13 +0100] rev 51406
phases: keep internal state as rev-num instead of node-id
Node-ids are expensive to work with; dealing with revision numbers is much simpler and
faster.
The fact that we still used node-ids here shows how little effort has been put into
making the phase logic fast. We have tended not to use node-ids internally for
about ten years.
This has a large impact on repositories with many draft roots. For example, this
Mozilla-try copy has ½ million draft roots and `perf::unbundle` sees a
significant improvement.
### data-env-vars.name = mozilla-try-2023-03-22-zstd-sparse-revlog
# benchmark.name = hg.perf.perf-unbundle
# bin-env-vars.hg.flavor = no-rust
# bin-env-vars.hg.py-re2-module = default
# benchmark.variants.issue6528 = disabled
# benchmark.variants.revs = last-1
before:: 1.746791 seconds
after:: 1.278379 seconds (-26.82%)
# benchmark.variants.revs = last-10
before:: 3.145774 seconds
after:: 2.103735 seconds (-33.13%)
# benchmark.variants.revs = last-100
before:: 3.487635 seconds
after:: 2.446749 seconds (-29.85%)
# benchmark.variants.revs = last-1000
before:: 5.007568 seconds
after:: 3.989923 seconds (-20.32%)
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 20 Feb 2024 21:40:08 +0100] rev 51405
phases: do filtering at read time
This removes the need for the `filterunknown` method entirely.
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 20 Feb 2024 21:38:01 +0100] rev 51404
phases: always write with a repo
In the future change that moves the internal representation of phase-roots from
node-id to rev-num, we will use a repository to translate revision numbers back
to nodes at write time.
Since that future change is quite complicated already, we do this small API
change beforehand.
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 20 Feb 2024 17:18:15 +0100] rev 51403
phases: mark `phasecache.phaseroots` private
We are about to change its content from nodeid to revnum, so anyone directly
using the content might run into unexpected trouble. We start by making it private
to explicitly break any such users (and discourage them from doing so).
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 20 Feb 2024 17:17:54 +0100] rev 51402
phases: check secret presence the right way during discovery
There is an official function for this, let's use it.
This will prevent the code from breaking in the future while we refactor the phase
code.
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 20 Feb 2024 14:21:18 +0100] rev 51401
phases: explicitly filter stripped revisions at strip time
Explicit is better than implicit. The current logic is a bit subtle and fragile.
It also gets in the way of using something other than node-id as internal storage.
We replace it with a more explicit filtering while stripping.
Pierre-Yves David <pierre-yves.david@octobus.net> [Fri, 23 Feb 2024 04:26:03 +0100] rev 51400
debug: add a debug::unbundle command that simulates the unbundle from a push
The code has different behavior when the unbundle comes from a push, so we
introduce a command that can simulate such an unbundle.
For our copy of mozilla-try-2023-03-22, this makes the unbundle jump from 2.5
seconds (with `hg unbundle`) to 15 seconds (with `hg debug::unbundle`).
That 15-second timing is consistent with the issue seen in production.
Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 22 Feb 2024 18:28:01 +0100] rev 51399
perf: support --template on perf::phases
Zeger Van de Vannet <zeger@vandevan.net> [Wed, 14 Feb 2024 08:14:46 +0100] rev 51398
annotate: limit output to range of lines
Arseniy Alekseyev <aalekseyev@janestreet.com> [Mon, 12 Feb 2024 20:01:27 +0000] rev 51397
revlog: add a Rust implementation of `headrevsdiff`
Python implementation of `headrevsdiff` can be very slow in the worst
case compared with the `heads` computation it replaces, since the
latter is done in Rust.
Even the average case of this Python implementation is still
noticeable in the profiles.
This patch makes the computation much much faster by doing it in Rust.
Arseniy Alekseyev <aalekseyev@janestreet.com> [Thu, 21 Dec 2023 20:30:03 +0000] rev 51396
revlog: add a C implementation of `headrevsdiff`
Python implementation of `headrevsdiff` can be very slow in the worst
case compared with the `heads` computation it replaces, since the
latter is done in C.
Even the average case of this Python implementation is still
noticeable in the profiles.
This patch makes the computation much much faster by doing it in C.
Arseniy Alekseyev <aalekseyev@janestreet.com> [Thu, 21 Dec 2023 17:38:04 +0000] rev 51395
unbundle: faster computation of changed heads
To compute the set of changed heads it's sufficient to look at the recent commits,
instead of looking at all heads currently in existence.
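A toy illustration of why looking only at the recent commits is enough, not the actual implementation. The `heads_diff` function, the `parents` list and the `old_heads` set are hypothetical, and revisions are assumed to be numbered so that parents always come before children.

```python
def heads_diff(parents, old_heads, start, stop):
    """Return (removed_heads, added_heads) after adding revs [start, stop).

    `parents[rev]` is a tuple of parent revisions, with -1 meaning "no
    parent". Only the new revisions are walked, so the cost depends on
    how many commits were added, not on how many heads already exist.
    """
    removed = set()
    added = set()
    for rev in range(start, stop):
        added.add(rev)
        for p in parents[rev]:
            if p < 0:
                continue
            if p >= start:
                # A new revision that gained a child is not a head.
                added.discard(p)
            elif p in old_heads:
                # An old head that gained a child stops being a head.
                removed.add(p)
    return removed, added
```

For example, with old heads `{1, 2}` and three new revisions where 3 is a child of 1, 4 is a child of 3, and 5 is a child of 0, only head 1 disappears and the new heads are 4 and 5.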
Raphaël Gomès <rgomes@octobus.net> [Wed, 21 Feb 2024 11:53:30 +0100] rev 51394
branching: merge stable into default
Arun Kulshreshtha <akulshreshtha@janestreet.com> [Tue, 20 Feb 2024 10:47:47 -0500] rev 51393
hg-core: separate timestamp and extra methods
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 21 Feb 2024 02:12:58 +0100] rev 51392
debugformat: fix formatting for compression level
`bytes(<int>)` gives a very different result than `str(<int>)`, and the display
of `hg debugformat` has been broken for a while as a result.
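The difference is easy to see in plain Python 3. The variable names below are illustrative, and `b'%d'` formatting is shown only as one way to get the expected bytes, not necessarily what the fix uses.

```python
level = 3

# str() renders the number as text.
as_text = str(level)

# bytes() with an int argument builds that many zero bytes instead.
as_zero_bytes = bytes(level)

# Bytes %-formatting yields the ASCII digits as bytes.
as_ascii_bytes = b'%d' % level

print(as_text, as_zero_bytes, as_ascii_bytes)
```

Here `as_zero_bytes` is `b'\x00\x00\x00'` while `as_ascii_bytes` is `b'3'`.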
Arun Kulshreshtha <akulshreshtha@janestreet.com> [Thu, 15 Feb 2024 11:39:18 -0500] rev 51391
hg-core: implement timestamp line parsing
Jordi Gutiérrez Hermoso <jordigh@octave.org> [Wed, 14 Feb 2024 15:21:44 -0500] rev 51390
doc: document that labels must have a dot in them to have an effect
I noticed that the `hg topics` template has a bare `topic` label with
no dot, and that makes it useless, as such a label will never receive
any effect by the colour extension.
This dot has been required for a long time, at least since 2011, but
we never formally documented it!
Arseniy Alekseyev <aalekseyev@janestreet.com> [Thu, 15 Feb 2024 18:10:41 +0000] rev 51389
tests: tweak chg test to make it fail less often
The test apparently sometimes prints the word "start" as part of the profile,
so let's no longer match "start":
CHGHG=/*/install/bin/hg (glob)
+ \x1b[90m | 50.0% 0.01s profiling.py: __enter__ line 196: self.start()\x1b[0m (esc)
+ \x1b[90m | 50.0% 0.01s profiling.py: start line 261: self._profiler.__enter__()\x1b[0m (esc)
+ \x1b[90m | 50.0% 0.01s profiling.py: statprofile line 125: statprof.start(mechanism=b'...\x1b[0m (esc)
+ \x1b[90m | 50.0% 0.01s statprof.py: start line 356: state.thread.start()\x1b[0m (esc)
+ \x1b[90m | 50.0% 0.01s threading.py: start line 852: self._started.wait()\x1b[0m (esc)
Arseniy Alekseyev <aalekseyev@janestreet.com> [Thu, 15 Feb 2024 15:21:43 +0000] rev 51388
cext: fix potential memory leaks of list items appended with PyList_Append
Also reduce the duplication in the tricky code that uses PyList_Append by
extracting it into a function `pylist_append_owned`.
Jordi Gutiérrez Hermoso <jordigh@octave.org> [Wed, 14 Feb 2024 22:55:11 -0500] rev 51387
crecord: enable search hotkeys (issue6834)
The keys I chose here should be similar to less/vim keybindings, which
should fit the overall keybinding theme of crecord.
Jordi Gutiérrez Hermoso <jordigh@octave.org> [Wed, 14 Feb 2024 22:54:21 -0500] rev 51386
crecord: add handle(next|prev)search functions
These are now just simple wrappers around `searchdirection`.
Jordi Gutiérrez Hermoso <jordigh@octave.org> [Wed, 14 Feb 2024 22:53:58 -0500] rev 51385
crecord: add a searchdirection function
If a regex has already been previously set, this function handles the
UI elements of searching again forward or backward.
Jordi Gutiérrez Hermoso <jordigh@octave.org> [Wed, 14 Feb 2024 22:50:00 -0500] rev 51384
crecord: add a handlesearch function
This function sets up some of the UI, such as getting the search
string from the user and displaying results or their absence.
Jordi Gutiérrez Hermoso <jordigh@octave.org> [Wed, 14 Feb 2024 22:48:09 -0500] rev 51383
crecord: add a showsearch function
This function takes a regex and searches either forward or backward,
moving the current item to the found item, if any, and unfolding the relevant context.
Jordi Gutiérrez Hermoso <jordigh@octave.org> [Wed, 14 Feb 2024 22:46:41 -0500] rev 51382
crecord: add a default regex to curseschunkselector
Whether there is a regex to search or not will affect if we can find
the next or the previous search hit.
Jordi Gutiérrez Hermoso <jordigh@octave.org> [Wed, 14 Feb 2024 22:43:51 -0500] rev 51381
crecord: add `content` properties to all nodes
In order to have a unified API of what can be searched, let's provide
a `content` property to each node type. This way we can search
filenames, context headers (e.g. containing function names, if
deducible from patch context) or changed lines themselves.
Jordi Gutiérrez Hermoso <jordigh@octave.org> [Wed, 14 Feb 2024 22:42:08 -0500] rev 51380
crecord: update uiheader docstring
There's no need to move anything to patch.py. The uiheader class only
has methods relevant to crecord and overrides __getattr__ in order to
use `patch.header` objects as a sort of mixin.
Jordi Gutiérrez Hermoso <jordigh@octave.org> [Wed, 14 Feb 2024 22:40:47 -0500] rev 51379
crecord: add skipfolded param to previtem
This just simplifies the API a bit so it matches `nextitem` and I
can handle both nextitem and previtem symmetrically.