Tue, 06 Jun 2023 01:48:10 +0200 perf: add support for stream-v3 during benchmark
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 06 Jun 2023 01:48:10 +0200] rev 50683
perf: add support for stream-v3 during benchmark This is getting important as the v3 protocol will diverge from the v2 protocol.
Tue, 06 Jun 2023 01:43:48 +0200 perf: add a function to find a stream version generator
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 06 Jun 2023 01:43:48 +0200] rev 50682
perf: add a function to find a stream version generator The logic is clearer and can be reused for other commands in the future.
Thu, 18 May 2023 19:23:59 +0100 treemanifest: make `updatecaches` update the nodemaps for all directories
Arseniy Alekseyev <aalekseyev@janestreet.com> [Thu, 18 May 2023 19:23:59 +0100] rev 50681
treemanifest: make `updatecaches` update the nodemaps for all directories Without this, if the cache for a nested directory is in a bad state, it's very hard to repair it.
Wed, 31 May 2023 10:37:55 +0100 stream-clone: avoid opening a revlog in case we do not need it
Arseniy Alekseyev <aalekseyev@janestreet.com> [Wed, 31 May 2023 10:37:55 +0100] rev 50680
stream-clone: avoid opening a revlog in case we do not need it Opening a revlog has a cost, especially if it is inline, as we have to scan the file and construct an index. To prevent the associated slowdown, we just do a minimal scan to check that an inline file is still inline, and simply stream the file without creating a revlog when we can. This provides a big boost compared to the previous changeset, even if the full generation is still penalized by the initial gathering of information.
All benchmarks are run on linux with Python 3.10.7.
# benchmark.name = hg.exchange.stream.generate
# benchmark.variants.version = v2
### Compared to the previous changeset
We get a large win across the board!
# mercurial-2018-08-01-zstd-sparse-revlog
before: 0.250694 seconds
after: 0.105986 seconds (-57.72%)
# pypy-2018-08-01-zstd-sparse-revlog
before: 3.885657 seconds
after: 1.709748 seconds (-56.00%)
# netbeans-2018-08-01-zstd-sparse-revlog
before: 16.679371 seconds
after: 7.687469 seconds (-53.91%)
# mozilla-central-2018-08-01-zstd-sparse-revlog
before: 38.575482 seconds
after: 17.520316 seconds (-54.58%)
# mozilla-try-2019-02-18-zstd-sparse-revlog
before: 81.160994 seconds
after: 37.073753 seconds (-54.32%)
### Compared to 6.4.3
We are still significantly slower than 6.4.3. The extra time is usually about twice the extra time we observe on the locked section, which is interesting information. The exception is mozilla-central, which is much faster. That discrepancy is not really explained yet.
# mercurial-2018-08-01-zstd-sparse-revlog
6.4.3: 0.072560 seconds
after: 0.105986 seconds (+46.07%) (+0.03 seconds)
# pypy-2018-08-01-zstd-sparse-revlog
6.4.3: 1.211193 seconds
after: 1.709748 seconds (+41.16%) (+0.50 seconds)
# netbeans-2018-08-01-zstd-sparse-revlog
6.4.3: 4.932843 seconds
after: 7.687469 seconds (+55.84%) (+2.75 seconds)
# mozilla-central-2018-08-01-zstd-sparse-revlog
6.4.3: 34.012226 seconds
after: 17.520316 seconds (-48.49%) (-16.49 seconds)
# mozilla-try-2019-02-18-zstd-sparse-revlog
6.4.3: 23.850555 seconds
after: 37.073753 seconds (+55.44%) (+13.22 seconds)
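A "minimal scan" of this kind could look roughly like the sketch below. It assumes the revlog v1 index layout, where the first four bytes of the `.i` file form a big-endian header whose bit 16 is the inline-data flag. The names `INLINE_FLAG` and `looks_inline` are illustrative and this is not the code from the changeset.

```python
import struct

# Bit 16 of the 4-byte revlog v1 header marks inline data (index and data
# interleaved in the ".i" file). Both names below are illustrative.
INLINE_FLAG = 1 << 16


def looks_inline(index_path):
    """Return True if the revlog index at `index_path` is flagged inline.

    Only the 4-byte header is read; no index is parsed or built.
    """
    with open(index_path, "rb") as fh:
        header = fh.read(4)
    if len(header) < 4:
        # Empty or truncated index: treat it as not inline in this sketch.
        return False
    (value,) = struct.unpack(">I", header)
    return bool(value & INLINE_FLAG)
```

A caller could run such a check just before streaming and only fall back to opening a full revlog when the answer has changed since the initial walk.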
Tue, 30 May 2023 17:43:59 +0100 store: stop relying on a `revlog_type` property
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 30 May 2023 17:43:59 +0100] rev 50679
store: stop relying on a `revlog_type` property We want to know if a file is related to a revlog, but the rest is dealt with differently already, so we simplify things further. As a bonus, this cleanup provides a small but noticeable speedup.
The numbers below use `hg perf::stream-locked-section` to measure the time spent in the locked section of the streaming clone. Numbers are run on various repositories and compare different steps: 1) the effect of this patch, 2) the effect of the cleanup series, 3) the current state compared to before the large refactoring.
All benchmarks are run on linux with Python 3.10.7.
# benchmark.name = perf-stream-locked-section
### Effect of this patch
# mercurial-2018-08-01-zstd-sparse-revlog
before: 0.030246 seconds
after: 0.029274 seconds (-3.21%)
# pypy-2018-08-01-zstd-sparse-revlog
before: 0.545012 seconds
after: 0.520872 seconds (-4.43%)
# netbeans-2018-08-01-zstd-sparse-revlog
before: 2.719939 seconds
after: 2.626791 seconds (-3.42%)
# mozilla-central-2018-08-01-zstd-sparse-revlog
before: 6.304179 seconds
after: 6.096700 seconds (-3.29%)
# mozilla-try-2019-02-18-zstd-sparse-revlog
before: 14.142687 seconds
after: 13.640779 seconds (-3.55%)
### Effect of this series
A small but sizeable speedup.
# mercurial-2018-08-01-zstd-sparse-revlog
before: 0.031122 seconds
after: 0.029274 seconds (-5.94%)
# pypy-2018-08-01-zstd-sparse-revlog
before: 0.589970 seconds
after: 0.520872 seconds (-11.71%)
# netbeans-2018-08-01-zstd-sparse-revlog
before: 2.980300 seconds
after: 2.626791 seconds (-11.86%)
# mozilla-central-2018-08-01-zstd-sparse-revlog
before: 6.863204 seconds
after: 6.096700 seconds (-11.17%)
# mozilla-try-2019-02-18-zstd-sparse-revlog
before: 14.921393 seconds
after: 13.640779 seconds (-8.58%)
### Current state compared to the pre-refactoring state
The refactoring introduced multiple string manipulations and dictionary creations that seem to induce a significant slowdown.
# mercurial-2018-08-01-zstd-sparse-revlog
6.4.3: 0.019459 seconds
after: 0.029274 seconds (+50.44%)
# pypy-2018-08-01-zstd-sparse-revlog
6.4.3: 0.290715 seconds
after: 0.520872 seconds (+79.17%)
# netbeans-2018-08-01-zstd-sparse-revlog
6.4.3: 1.403447 seconds
after: 2.626791 seconds (+87.17%)
# mozilla-central-2018-08-01-zstd-sparse-revlog
6.4.3: 3.163549 seconds
after: 6.096700 seconds (+92.72%)
# mozilla-try-2019-02-18-zstd-sparse-revlog
6.4.3: 6.702184 seconds
after: 13.640779 seconds (+103.53%)
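The percentages quoted in these benchmark tables are plain relative changes, (after - before) / before. A small helper to recompute them might look like this; the function name `relative_change` is illustrative and the timings are copied from the tables above.

```python
def relative_change(before, after):
    """Relative change from `before` to `after`, in percent."""
    return (after - before) / before * 100.0


# mozilla-try, 6.4.3 versus the current state (timings copied from above)
print(f"{relative_change(6.702184, 13.640779):+.2f}%")  # prints +103.53%
```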
Tue, 30 May 2023 16:38:13 +0100 store: directly pass the filesize in the `details` of revlog
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 30 May 2023 16:38:13 +0100] rev 50678
store: directly pass the filesize in the `details` of revlog The dictionary only contains 1 (or 0) entries, so we can directly store that information (or None). Moving to simpler argument passing results in a noticeable speedup (because Python).
The numbers below use `hg perf::stream-locked-section` to measure the time spent in the locked section of the streaming clone. Numbers are run on various repositories.
### mercurial-2018-08-01-zstd-sparse-revlog
before: 0.031247 seconds
after: 0.030246 seconds (-3.20%)
### mozilla-central-2018-08-01-zstd-sparse-revlog
before: 6.718968 seconds
after: 6.304179 seconds (-6.17%)
### mozilla-try-2019-02-18-zstd-sparse-revlog
before: 14.631343 seconds
after: 14.142687 seconds (-3.34%)
### netbeans-2018-08-01-zstd-sparse-revlog
before: 2.895584 seconds
after: 2.719939 seconds (-6.07%)
### pypy-2018-08-01-zstd-sparse-revlog
before: 0.561843 seconds
after: 0.543034 seconds (-3.35%)
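The shape of this simplification can be sketched as follows. Both classes and the `"file_size"` key are hypothetical stand-ins, not Mercurial's actual store classes; the point is only that a one-entry dictionary can be replaced by the value itself (or None).

```python
from typing import Optional


class EntryWithDetails:
    """Older shape (hypothetical): the size travels inside a dictionary."""

    def __init__(self, path: bytes, details: Optional[dict] = None):
        self.path = path
        self._details = details if details is not None else {}

    @property
    def file_size(self) -> Optional[int]:
        # One dictionary lookup per access, plus one dict per entry.
        return self._details.get("file_size")


class EntryWithSize:
    """Simpler shape (hypothetical): the size, or None, is passed directly."""

    def __init__(self, path: bytes, file_size: Optional[int] = None):
        self.path = path
        self.file_size = file_size
```

When millions of entries are created during a walk, dropping one dictionary allocation per entry is the kind of change that can show up in timings like those above.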
Tue, 30 May 2023 16:35:10 +0100 store: explicitly pass file_size when creating StoreFile
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 30 May 2023 16:35:10 +0100] rev 50677
store: explicitly pass file_size when creating StoreFile A small cleanup before a larger cleanup in the next patch.
Tue, 30 May 2023 16:33:28 +0100 store: have the revlog determine which files are volatile itself
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 30 May 2023 16:33:28 +0100] rev 50676
store: have the revlog determine which files are volatile itself This is a first step toward simplifying the walk step.
Wed, 08 Mar 2023 14:23:43 +0100 clonebundles: add support for inline (streaming) clonebundles
Mathias De Mare <mathias.de_mare@nokia.com> [Wed, 08 Mar 2023 14:23:43 +0100] rev 50675
clonebundles: add support for inline (streaming) clonebundles The idea behind inline clonebundles is to send them through the ssh or https connection to the Mercurial server. We've been using this specifically for streaming clonebundles, although it works for 'regular' clonebundles as well (but is less relevant, since pullbundles exist). We've had this enabled for around 9 months for a part of our users.
A few benefits are:
- no need to secure an external system, since everything goes through the same Mercurial server
- easier scaling (in our case: no risk of inconsistencies between multiple mercurial-server mirrors and nginx clonebundles hosts)
Remaining topics/questions right now:
- The inline clonebundles don't work for https yet. This is because httppeer doesn't seem to support sending client capabilities. I didn't focus on that as my main goal was to get this working for ssh.
Wed, 31 May 2023 18:08:56 +0100 tree-manifest: allow `debugupgraderepo` to run on tree manifest repo
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 31 May 2023 18:08:56 +0100] rev 50674
tree-manifest: allow `debugupgraderepo` to run on tree manifest repo There does not seem to be anything wrong with running the current logic on them, so we remove the limitation.
Wed, 31 May 2023 16:04:16 +0100 stream-clone: update debugcreatestreamclonebundle helps
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 31 May 2023 16:04:16 +0100] rev 50673
stream-clone: update debugcreatestreamclonebundle helps People need to stop using streamv1, so we should point them to alternatives in the place where they might find it.
Thu, 25 May 2023 00:23:05 +0200 rewrite: simplify the `retained_extras` extra logic
Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 25 May 2023 00:23:05 +0200] rev 50672
rewrite: simplify the `retained_extras` extra logic First, we move the definition of the value outside of the rebase extension, as this applies to all rebase-like operations and some live in other places (like evolve). Second, we make it a simple set, so that it is easy for an extension to add a new value to it. Third, we move the associated logic into core too. That makes it easily available to other extensions. Fourth, we simplify its usage: the verbose version of the filtering is just a handful of lines long and we are just going to test all the values for updates, so the Projection overlay is not bringing much here. Note that we make it a module-level set: if a key is worth preserving, it is probably worth preserving in all cases. This was already the behavior prior to this change.
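A module-level set with a short filtering helper, as described above, could be sketched like this. The names `RETAINED_EXTRAS` and `preserve_extras` and the example keys are assumptions made for illustration, not necessarily the identifiers used in core.

```python
# Hypothetical module-level set of extra keys that survive a rewrite.
# An extension can register another key with RETAINED_EXTRAS.add(b"my-key").
RETAINED_EXTRAS = {b"source", b"intermediate-source"}


def preserve_extras(old_extra, new_extra):
    """Copy every retained key present in `old_extra` into `new_extra`."""
    for key in RETAINED_EXTRAS:
        value = old_extra.get(key)
        if value is not None:
            new_extra[key] = value
    return new_extra
```

Because the set lives at module level, a key added by one extension is preserved by every rebase-like operation that goes through the helper.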
Mon, 29 May 2023 18:41:58 +0200 stream-clone: smoothly detect and handle a case where a revlog is split
Pierre-Yves David <pierre-yves.david@octobus.net> [Mon, 29 May 2023 18:41:58 +0200] rev 50671
stream-clone: smoothly detect and handle a case where a revlog is split This detects and handles the most common case of a race condition around stream and revlog splitting: the one where the revlog is split between the initial collection of data and the time where we start streaming that data. When this happens, we patch an inlined version of that revlog back together. This is necessary as stream-v2 promised a specific number of bytes and a specific number of files to the client. In stream-v3, we will have the opportunity to just send a split revlog instead. Getting a better version of the protocol for stream-v3 is still useful, but it is no longer a blocker to fix that race condition. Note that another, rarer race condition exists, where the revlog is split while we are creating the revlog and extracting content from it. This can be dealt with later.
Mon, 29 May 2023 14:07:58 +0200 stream-clone: implement dedicated `get_streams` method for revlog
Pierre-Yves David <pierre-yves.david@octobus.net> [Mon, 29 May 2023 14:07:58 +0200] rev 50670
stream-clone: implement dedicated `get_streams` method for revlog For revlogs, we can do better using the maximum linkrev expected. This approach opens the way to dealing with a much larger set of non-trivial changes, like splitting of inline revlogs. We will actually tackle this issue in the next changesets (thanks to this one).
Sun, 28 May 2023 05:52:58 +0200 stream-clone: make it the responsibility of the store entry to stream content
Pierre-Yves David <pierre-yves.david@octobus.net> [Sun, 28 May 2023 05:52:58 +0200] rev 50669
stream-clone: make it the responsibility of the store entry to stream content The store entry has more context; this will especially be true when it comes to revlogs. So we move the details of how to retrieve binary content to the StoreEntry. The stream clone code now focuses on the protocol bits.
Mon, 29 May 2023 11:42:16 +0200 store: declare a `files` method on BaseStoreEntry
Pierre-Yves David <pierre-yves.david@octobus.net> [Mon, 29 May 2023 11:42:16 +0200] rev 50668
store: declare a `files` method on BaseStoreEntry This will help pytype to type check. We have to move `StoreFile` earlier in the file to use it in the type declaration.
Sun, 28 May 2023 05:23:46 +0200 revlog: add a `get_revlog` method
Pierre-Yves David <pierre-yves.david@octobus.net> [Sun, 28 May 2023 05:23:46 +0200] rev 50667
revlog: add a `get_revlog` method This might seem weird, but I actually think we have been needing this for a long time. There are multiple objects that kind of pretend to be revlogs while actually wrapping the actual revlog, and multiple pieces of code need to access the actual revlog. See documentation for more details. Expect cleanup of various places once the current series is done.
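The idea of a uniform accessor on both the real object and its wrappers can be sketched as below. `Revlog` and `FilelogLike` here are toy stand-ins written for illustration, not Mercurial's classes.

```python
class Revlog:
    """Toy stand-in for the real storage object."""

    def get_revlog(self):
        # The real thing simply returns itself.
        return self


class FilelogLike:
    """Toy wrapper that behaves like a revlog but delegates to one."""

    def __init__(self, revlog):
        self._revlog = revlog

    def get_revlog(self):
        # Delegate, so that nested wrappers also resolve to the real revlog.
        return self._revlog.get_revlog()
```

With this in place, calling code no longer needs to know whether it holds a wrapper or the underlying object: `obj.get_revlog()` works for both.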
Mon, 29 May 2023 04:26:39 +0200 stream-clone: drop the _emit_v2 function
Pierre-Yves David <pierre-yves.david@octobus.net> [Mon, 29 May 2023 04:26:39 +0200] rev 50666
stream-clone: drop the _emit_v2 function It has no user left.
Mon, 29 May 2023 04:24:39 +0200 stream-clone: directly use `_entries_walk` to generate stream-v2
Pierre-Yves David <pierre-yves.david@octobus.net> [Mon, 29 May 2023 04:24:39 +0200] rev 50665
stream-clone: directly use `_entries_walk` to generate stream-v2 This does not require many changes and will give us much more flexibility, like improving revlog handling to gracefully handle race conditions.
Mon, 29 May 2023 04:12:30 +0200 stream-clone: pre-indent some code
Pierre-Yves David <pierre-yves.david@octobus.net> [Mon, 29 May 2023 04:12:30 +0200] rev 50664
stream-clone: pre-indent some code This makes the next changeset clearer.
Sun, 28 May 2023 04:12:10 +0200 local-clone: perform the hardlink/copy based from _entries_walk returns
Pierre-Yves David <pierre-yves.david@octobus.net> [Sun, 28 May 2023 04:12:10 +0200] rev 50663
local-clone: perform the hardlink/copy based from _entries_walk returns We previously used `_v2_walk`. However, it is not bringing us much, so let's use the higher level function instead. This will offer us more flexibility with the `_v2_walk` function… like deleting it eventually.
Mon, 29 May 2023 04:24:29 +0200 store: cache the file_size when we get it from disk
Pierre-Yves David <pierre-yves.david@octobus.net> [Mon, 29 May 2023 04:24:29 +0200] rev 50662
store: cache the file_size when we get it from disk The point of caching `files` is to ensure consistency and avoid redoing expensive work. So we cache the file_size once retrieved.
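Caching a size on first access can be sketched as follows. The class `CachedSizeFile` is hypothetical and uses a plain `os.stat` call where the real code would go through its own filesystem layer.

```python
import os


class CachedSizeFile:
    """Hypothetical file record that remembers the first size it reports."""

    def __init__(self, path, file_size=None):
        self.path = path
        self._file_size = file_size

    def file_size(self):
        if self._file_size is None:
            # Hit the disk once; later calls return the same value even if
            # the file grows in the meantime.
            self._file_size = os.stat(self.path).st_size
        return self._file_size
```

Returning the same number on every call matters for streaming, where the size announced to the client has to match the number of bytes sent afterwards.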
Sun, 28 May 2023 03:46:48 +0200 store: cache the `files()` return for store entries
Pierre-Yves David <pierre-yves.david@octobus.net> [Sun, 28 May 2023 03:46:48 +0200] rev 50661
store: cache the `files()` return for store entries This makes it more efficient to directly use the entries list to retrieve data in various locations. It also makes the entry record the file size it previously promised to user code, especially the stream clone code.
Sat, 27 May 2023 04:22:18 +0200 stream-clone: introduce a richer TempCopyManager object
Pierre-Yves David <pierre-yves.david@octobus.net> [Sat, 27 May 2023 04:22:18 +0200] rev 50660
stream-clone: introduce a richer TempCopyManager object This replaces the previous `copy` callable with a richer object that allows access to the backup path. This will simplify the user code, as it won't need to keep and pass around the backup path explicitly.
Mon, 29 May 2023 13:29:01 +0200 store: properly compute the target_id of manifestlog in no-fncache walk
Pierre-Yves David <pierre-yves.david@octobus.net> [Mon, 29 May 2023 13:29:01 +0200] rev 50659
store: properly compute the target_id of manifestlog in no-fncache walk Creating RevlogStoreEntry is good, but we need to drop the final `00manifest` part to create something correct.
Mon, 29 May 2023 13:28:33 +0200 store: do not drop the final `/` when creating manifestlog instance
Pierre-Yves David <pierre-yves.david@octobus.net> [Mon, 29 May 2023 13:28:33 +0200] rev 50658
store: do not drop the final `/` when creating manifestlog instance This bug, inherited from the upgrade code, leads to the access/creation of a broken revlog named `DIRECTORY00manifest.i` instead of `DIRECTORY/00manifest.i`. We fix it in its own changeset to preserve the "pure code movement" aspect of the previous changesets.
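The effect of dropping the trailing `/` can be reproduced with plain bytes concatenation. The two helper functions below are hypothetical and exist only to show how the broken name arises.

```python
def manifest_index_path_buggy(directory):
    # Stripping the trailing "/" before concatenating fuses the two names.
    return directory.rstrip(b"/") + b"00manifest.i"


def manifest_index_path_fixed(directory):
    # Keep the directory as given: b"" for the root, b"DIRECTORY/" otherwise.
    return directory + b"00manifest.i"


print(manifest_index_path_buggy(b"DIRECTORY/"))  # b'DIRECTORY00manifest.i'
print(manifest_index_path_fixed(b"DIRECTORY/"))  # b'DIRECTORY/00manifest.i'
```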
Sat, 27 May 2023 04:01:17 +0200 store: add a `get_revlog_instance` method on revlog entries
Pierre-Yves David <pierre-yves.david@octobus.net> [Sat, 27 May 2023 04:01:17 +0200] rev 50657
store: add a `get_revlog_instance` method on revlog entries The upgrade code needs this a lot, and the stream code is about to need it too. So we start by moving the upgrade code to a more generic location.
Mon, 29 May 2023 02:22:20 +0200 stream-clone: add a test that highlight crash on revlog splitting
Pierre-Yves David <pierre-yves.david@octobus.net> [Mon, 29 May 2023 02:22:20 +0200] rev 50656
stream-clone: add a test that highlight crash on revlog splitting This has been a long-running problem; we should have a test for it.
Mon, 29 May 2023 01:38:59 +0200 stream-clone: remove unused code in test-clone-stream.t
Pierre-Yves David <pierre-yves.david@octobus.net> [Mon, 29 May 2023 01:38:59 +0200] rev 50655
stream-clone: remove unused code in test-clone-stream.t We are not using the extension we create inline; we are using `tests/testlib/ext-stream-clone-steps.py`. So let us delete the unused version.
Mon, 29 May 2023 01:38:34 +0200 stream-clone: document the ext-stream-clone-steps.py utility extension
Pierre-Yves David <pierre-yves.david@octobus.net> [Mon, 29 May 2023 01:38:34 +0200] rev 50654
stream-clone: document the ext-stream-clone-steps.py utility extension This extension is useful, let us clarify how to use it.
Mon, 29 May 2023 12:15:10 +0200 test-treemanifest: cleanup the test to more easily show server side error
Pierre-Yves David <pierre-yves.david@octobus.net> [Mon, 29 May 2023 12:15:10 +0200] rev 50653
test-treemanifest: cleanup the test to more easily show server side error This made my life easier debugging.
Thu, 02 Feb 2023 17:26:10 +0100 safehasattr: pass attribute name as string instead of bytes
Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 02 Feb 2023 17:26:10 +0100] rev 50652
safehasattr: pass attribute name as string instead of bytes This is a step toward replacing `util.safehasattr` usage with plain `hasattr`. The builtin function behaves poorly in Python 2, but this was fixed in Python 3. These changes are done one by one as they tend to have a small chance of triggering puzzling breakage.
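The reason attribute names have to become `str` before switching to the builtin can be checked directly in Python 3: `hasattr` rejects a bytes name with a `TypeError`, and it only swallows `AttributeError`, so other errors raised by a property propagate. The `Thing` class and the helper below are illustrative only.

```python
class Thing:
    value = 1

    @property
    def broken(self):
        raise RuntimeError("unrelated failure")


def bytes_name_rejected(obj):
    """Return True if builtin hasattr refuses a bytes attribute name."""
    try:
        hasattr(obj, b"value")
    except TypeError:
        return True
    return False


thing = Thing()
print(hasattr(thing, "value"))     # True
print(bytes_name_rejected(thing))  # True
```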