Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 06 Jun 2023 01:43:48 +0200] rev 50674
perf: add a function to find a stream version generator
The logic is clearer and can be reused for other commands in the future.
Arseniy Alekseyev <aalekseyev@janestreet.com> [Thu, 18 May 2023 19:23:59 +0100] rev 50673
treemanifest: make `updatecaches` update the nodemaps for all directories
Without this, if the cache for a nested directory is in a bad state,
it's very hard to repair it.
Arseniy Alekseyev <aalekseyev@janestreet.com> [Wed, 31 May 2023 10:37:55 +0100] rev 50672
stream-clone: avoid opening a revlog in case we do not need it
Opening an revlog has a cost, especially if it is inline as we have to scan the
file and construct an index.
To prevent the associated slowdown, we just do a minimal scan to check that an
inline file is still inline, and simply stream the file without creating a
revlog when we can.
This provides a big boost compared to the previous changeset, even if the full
generation is still penalized by the initial gathering of information.
All benchmarks are run on linux with Python 3.10.7.
# benchmark.name = hg.exchange.stream.generate
# benchmark.variants.version = v2
### Compared to the previous changesets
We get a large win all across the board!
# mercurial-2018-08-01-zstd-sparse-revlog
before: 0.250694 seconds
after: 0.105986 seconds (-57.72%)
# pypy-2018-08-01-zstd-sparse-revlog
before: 3.885657 seconds
after: 1.709748 seconds (-56.00%)
# netbeans-2018-08-01-zstd-sparse-revlog
before: 16.679371 seconds
after: 7.687469 seconds (-53.91%)
# mozilla-central-2018-08-01-zstd-sparse-revlog
before: 38.575482 seconds
after: 17.520316 seconds (-54.58%)
# mozilla-try-2019-02-18-zstd-sparse-revlog
before: 81.160994 seconds
after: 37.073753 seconds (-54.32%)
### Compared to 6.4.3
We are still significantly slower than 6.4.3, the extra time is usually twice
slower than the extra time we observe on the locked section, which is a quite
interesting information.
Except for mercurial-central that is much faster. That discrepancy is not really
explained yet.
# mercurial-2018-08-01-zstd-sparse-revlog
6.4.3: 0.072560 seconds
after: 0.105986 seconds (+46.07%) (- 0.03 seconds)
# pypy-2018-08-01-zstd-sparse-revlog
6.4.3: 1.211193 seconds
after: 1.709748 seconds (+41.16%) (-0.45 seconds)
# netbeans-2018-08-01-zstd-sparse-revlog
6.4.3: 4.932843 seconds
after: 7.687469 seconds (+55.84%) (-2.75 seconds)
# mozilla-central-2018-08-01-zstd-sparse-revlog
6.4.3: 34.012226 seconds
after: 17.520316 seconds (-48.49%) (-16.49 seconds)
# mozilla-try-2019-02-18-zstd-sparse-revlog
6.4.3: 23.850555 seconds
after: 37.073753 seconds (+55.44%) (+13.22 seconds)
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 30 May 2023 17:43:59 +0100] rev 50671
store: stop relying on a `revlog_type` property
We want to know if a file is related to a revlog, but the rest is dealt with
differently already, so we simplify things further.
as a bonus, this cleanup This provides a small but noticeable speedup.
The number below use `hg perf::stream-locked-section` to measure the time spend
in the locked section of the streaming clone. Number are run on various
repository and compare different steps.:
1) the effect of this patchs,
2) the effect of the cleanup series,
2) current state compared to because large refactoring.
All benchmarks are run on linux with Python 3.10.7.
### Effect of this patch
# mercurial-2018-08-01-zstd-sparse-revlog
# benchmark.name = perf-stream-locked-section
before: 0.030246 seconds
after: 0.029274 seconds (-3.21%)
# pypy-2018-08-01-zstd-sparse-revlog
before: 0.545012 seconds
after: 0.520872 seconds (-4.43%)
# netbeans-2018-08-01-zstd-sparse-revlog
before: 2.719939 seconds
after: 2.626791 seconds (-3.42%)
# mozilla-central-2018-08-01-zstd-sparse-revlog
before: 6.304179 seconds
after: 6.096700 seconds (-3.29%)
# mozilla-try-2019-02-18-zstd-sparse-revlog
before: 14.142687 seconds
after: 13.640779 seconds (-3.55%)
### Effect of this series
A small but sizeable speedup
# mercurial-2018-08-01-zstd-sparse-revlog
before: 0.031122 seconds
after: 0.029274 seconds (-5.94%)
# pypy-2018-08-01-zstd-sparse-revlog
before: 0.589970 seconds
after: 0.520872 seconds (-11.71%)
# netbeans-2018-08-01-zstd-sparse-revlog
before: 2.980300 seconds
after: 2.626791 seconds (-11.86%)
# mozilla-central-2018-08-01-zstd-sparse-revlog
before: 6.863204 seconds
after: 6.096700 seconds (-11.17%)
# mozilla-try-2019-02-18-zstd-sparse-revlog
before: 14.921393 seconds
after: 13.640779 seconds (-8.58%)
### Current state compared to the pre-refactoring state
The refactoring introduced multiple string manipulation and dictionary creation
that seems to induce a signifiant slowdown
Slowdown
# mercurial-2018-08-01-zstd-sparse-revlog
6.4.3: 0.019459 seconds
after: 0.029274 seconds (+50.44%)
## pypy-2018-08-01-zstd-sparse-revlog
6.4.3: 0.290715 seconds
after: 0.520872 seconds (+79.17%)
# netbeans-2018-08-01-zstd-sparse-revlog
6.4.3: 1.403447 seconds
after: 2.626791 seconds (+87.17%)
# mozilla-central-2018-08-01-zstd-sparse-revlog
6.4.3: 3.163549 seconds
after: 6.096700 seconds (+92.72%)
# mozilla-try-2019-02-18-zstd-sparse-revlog
6.4.3: 6.702184 seconds
after: 13.640779 seconds (+103.53%)
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 30 May 2023 16:38:13 +0100] rev 50670
store: directly pass the filesize in the `details` of revlog
The dictionary only contains 1 (or 0) entries, we can directly store that
information (or None).
Moving to a simpler argument passing result in a noticable speedup (because
Python)
The number below use `hg perf::stream-locked-section` to measure the time spend
in the locked section of the streaming clone. Number are run on various
repository.
### mercurial-2018-08-01-zstd-sparse-revlog
before: 0.031247 seconds
after: 0.030246 seconds (-3.20%)
### mozilla-central-2018-08-01-zstd-sparse-revlog
before: 6.718968 seconds
after: 6.304179 seconds (-6.17%)
### mozilla-try-2019-02-18-zstd-sparse-revlog
before: 14.631343 seconds
after: 14.142687 seconds (-3.34%)
### netbeans-2018-08-01-zstd-sparse-revlog
before: 2.895584 seconds
after: 2.719939 seconds (-6.07%)
### pypy-2018-08-01-zstd-sparse-revlog
before: 0.561843 seconds
after: 0.543034 seconds (-3.35%)
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 30 May 2023 16:35:10 +0100] rev 50669
store: explicitly pass file_size when creating StoreFile
A small cleanup before large cleanup in the next patch.
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 30 May 2023 16:33:28 +0100] rev 50668
store: have the revlog determine which files are volatile itself
This is a first step toward simplifying the walk step.
Raphaël Gomès <rgomes@octobus.net> [Mon, 12 Jun 2023 10:50:00 +0200] rev 50667
test-dirstate-version-fallback: future-proof the test for a different default
Dirstate-v2 will become the default at some point, which would cause this
test to fail. Let's save someone else the headache later.
Mathias De Mare <mathias.de_mare@nokia.com> [Wed, 08 Mar 2023 14:23:43 +0100] rev 50666
clonebundles: add support for inline (streaming) clonebundles
The idea behind inline clonebundles is to send them through
the ssh or https connection to the Mercurial server.
We've been using this specifically for streaming clonebundles,
although it works for 'regular' clonebundles as well
(but is less relevant, since pullbundles exist).
We've had this enabled for around 9 months for a part
of our users.
A few benefits are:
- no need to secure an external system,
since everything goes through the same Mercurial server
- easier scaling (in our case: no risk of inconsistencies
between multiple mercurial-server mirrors and nginx clonebundles hosts)
Remaining topics/questions right now:
- The inline clonebundles don't work for https yet.
This is because httppeer doesn't seem to support sending client
capabilities.
I didn't focus on that as my main goal was to get this working
for ssh.
Raphaël Gomès <rgomes@octobus.net> [Thu, 08 Jun 2023 17:02:04 +0200] rev 50665
Added signature for changeset
da372c745e0f
Raphaël Gomès <rgomes@octobus.net> [Thu, 08 Jun 2023 17:02:00 +0200] rev 50664
Added tag 6.4.4 for changeset
da372c745e0f
Raphaël Gomès <rgomes@octobus.net> [Thu, 08 Jun 2023 17:01:29 +0200] rev 50663
relnotes: add 6.4.4
Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 08 Jun 2023 14:28:21 +0200] rev 50662
revlog: avoid possible collision between directory and temporary index
Since 6.4, we create a temporary index file to write the split data without
overwriting the inline version too early. However, the store encoding does not
prevent these new `.i.s` file to collide with a directory with the same name.
While the odds for such a collision to happens are fairly low, the collision
would prevent Mercurial from working.
The store encoding have a mitigation solution in place to prevent such
collisions from happening for `.i` and `.d` files, but not for other extensions.
We cannot update this encoding scheme to solve the issue since it would diverge
from older version of Mercurial.
Instead, we create an alternative directory tree dedicated to such files.
The use of the `.i` extension combined with store encoding will prevent
collisions there.
Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 08 Jun 2023 11:08:19 +0200] rev 50661
revlog: move the computation of the split_index path in a property
This is about to become more complex, so we gather the logic in a single place.
Raphaël Gomès <rgomes@octobus.net> [Mon, 05 Jun 2023 16:43:27 +0200] rev 50660
rust-dirstate: fall back to v1 if reading v2 failed
This will help us not fail when a v1 dirstate is present on disk while a v2
was expected (which could happen with a racy/interrupted upgrade).
Raphaël Gomès <rgomes@octobus.net> [Mon, 05 Jun 2023 17:29:52 +0200] rev 50659
dirstate: add test showing dirstate version mismatch causes an error
We should fall back to trying dirstate v1 when v2 fails to read.
Raphaël Gomès <rgomes@octobus.net> [Mon, 05 Jun 2023 16:30:25 +0200] rev 50658
rust-dirstate: rename `has_dirstate_v2` to `use_dirstate_v2`
It is closer to the right semantics. I added a docstring to better explain
the reasonning. In the next patch(es), I will address the underlying issue
of finding the "wrong" version of the dirstate on disk.
Pierre-Yves David <pierre-yves.david@octobus.net> [Mon, 05 Jun 2023 03:11:26 +0200] rev 50657
delta-find: fix pulled-delta-reuse-policy=forced behavior
The code that select delta still has too many oportunity to discard the delta
is has been forcibly asked to reuse. However is is fairly easy to use a
dedicated fastpath for this case. So we do so.
Cleaning other code that tries to enforce that policy will be done on default.
Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 08 Jun 2023 03:49:44 +0200] rev 50656
delta-find: display more information about the search in some case
This will be useful to access the effect of the delta reuse policy.
Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 08 Jun 2023 03:05:10 +0200] rev 50655
deltafind: issue debug information when we fast-path rivial case too
More debug options never hurts.
Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 08 Jun 2023 03:11:51 +0200] rev 50654
delta-find: gather the condition to blindly use a full snapshot together
We are about to make the `if` body bigger, so having only one of them is simpler/
Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 08 Jun 2023 02:49:10 +0200] rev 50653
delta-find: initialize the debug information much sooner (when possible)
This help us to record debug information in alternative path.
Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 08 Jun 2023 02:42:28 +0200] rev 50652
delta-find: fix `parents` round detection
We should compare integer with integer, instead of bytes (node).
Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 08 Jun 2023 02:35:03 +0200] rev 50651
delta-find: intrduce a `_one_dbg_data` method
This helps with the initialisation of the expected debug information.