emitrevision: consider ancestors of revisions to emit as available bases
This should make more delta bases valid. This notably affects:
* cases where we skipped some parent with an empty delta to delta directly
  against an ancestor
* cases where an intermediate snapshot is stored.
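As a rough illustration of the idea (not the actual `emitrevision` code), the availability check can be sketched as a closure over a set that starts with the ancestors of the revisions to emit and grows as revisions are emitted. The names `collect_ancestors`, `make_base_checker` and the toy `PARENTS` table are hypothetical.

```python
NULLREV = -1


def collect_ancestors(parentrevs, revs):
    """Return the set of all ancestors of `revs` (toy, non-lazy version)."""
    seen = set()
    stack = list(revs)
    while stack:
        rev = stack.pop()
        for p in parentrevs(rev):
            if p != NULLREV and p not in seen:
                seen.add(p)
                stack.append(p)
    return seen


def make_base_checker(parentrevs, revs_to_emit):
    """Build the closure deciding whether a revision is a usable delta base.

    Assumption: the receiver already has every ancestor that is not itself
    part of the emitted set.
    """
    available = collect_ancestors(parentrevs, revs_to_emit) - set(revs_to_emit)

    def is_usable_base(base):
        return base in available

    def mark_emitted(rev):
        # once a revision has been sent, later deltas may build on it
        available.add(rev)

    return is_usable_base, mark_emitted


# toy history: 0 <- 1 <- 2 <- 3, and 4 is a merge of 2 and 3
PARENTS = {0: (-1, -1), 1: (0, -1), 2: (1, -1), 3: (2, -1), 4: (2, 3)}
is_usable_base, mark_emitted = make_base_checker(PARENTS.__getitem__, [3, 4])
```

With this sketch, revision 1 is an acceptable base for revision 3 even though it is not one of its direct parents, which is the "skipped parent" case from the list above.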
This change means we could send largish intermediate snapshots over the wire.
However, this is actually a sub-goal here. Sending snapshots over the wire means
the client has a high chance of simply storing the pre-computed delta instead of
doing a lengthy process that will… end up producing the same intermediate
snapshot. In addition, the overall size of snapshots (of any level) is "only" a
fraction of the overall delta size (0.17% for my mercurial clone, 20% for my
clone of Mozilla try). So sending them over the wire is unlikely to have a large
impact on the bandwidth used.
If we decide that minimising the bandwidth is an explicit goal, we should
introduce new logic to filter out snapshots as deltas. The current code has no
explicit notion of snapshots so far; they just tended to fall into the wobbly
filtering options.
In some cases, this patch can yield a large improvement in bundling time:
### data-env-vars.name = mozilla-try-2019-02-18-zstd-sparse-revlog
# benchmark.name = perf-bundle
# benchmark.variants.revs = last-100000
before: 68.787066 seconds
after: 47.552677 seconds (-30.87%)
That translates to a large improvement in pull time:
### data-env-vars.name = mozilla-try-2019-02-18-zstd-sparse-revlog
# benchmark.name = pull
# benchmark.variants.issue6528 = disabled
# benchmark.variants.revs = last-100000
before: 142.186625 seconds
after: 75.897745 seconds (-46.62%)
No significant negative impact has been observed.
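For reference, the percentages quoted in the two benchmarks above are plain relative changes of the timings. A minimal check, where the helper name `relative_change` is only illustrative:

```python
def relative_change(before, after):
    """Return the change from `before` to `after` as a percentage."""
    return (after - before) / before * 100.0


# timings (in seconds) copied from the benchmarks above
bundle = relative_change(68.787066, 47.552677)
pull = relative_change(142.186625, 75.897745)
print(f"perf-bundle: {bundle:+.2f}%")
print(f"pull:        {pull:+.2f}%")
```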
sqlitestore: add an `ancestors` method
We will need it during bundling.
The implementation mirrors the one in revlog.
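To give an idea of what such a method does, here is a toy sketch of an `ancestors` generator over a parent table. It is not the sqlitestore or revlog code; the `ToyStore` class is hypothetical and the `stoprev` and `inclusive` parameters are assumed to behave as their names suggest.

```python
import heapq

NULLREV = -1


class ToyStore:
    """Hypothetical in-memory store mapping rev -> (p1, p2)."""

    def __init__(self, parents):
        self._parents = parents

    def parentrevs(self, rev):
        return self._parents[rev]

    def ancestors(self, revs, stoprev=0, inclusive=False):
        """Yield ancestors of `revs`, highest revision number first."""
        heap = []
        seen = set()

        def push(rev):
            if rev != NULLREV and rev >= stoprev and rev not in seen:
                seen.add(rev)
                # negate so that the min-heap pops the highest revision
                heapq.heappush(heap, -rev)

        for rev in revs:
            if inclusive:
                push(rev)
            else:
                for p in self.parentrevs(rev):
                    push(p)
        while heap:
            rev = -heapq.heappop(heap)
            yield rev
            for p in self.parentrevs(rev):
                push(p)


store = ToyStore({0: (-1, -1), 1: (0, -1), 2: (1, -1), 3: (1, -1), 4: (2, 3)})
```

Being a generator, it only walks as far into history as the caller consumes.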
emitrevision: if we need to compute a delta on the fly, try p1 or p2 first
Falling back to `prev` does not yield any real value on modern storage and
results in pathological changes being created on the other side. Doing a delta
against a parent will likely be smaller (helping the network) and will be safer
to apply on the client (helping future pulls by triggering intermediate
snapshots where they will be needed by later deltas).
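The selection order can be sketched as below. This is an illustration under assumptions, not the real code: the function name is hypothetical, `prev` is assumed to remain a last resort after both parents, and returning `NULLREV` is assumed to mean "send the full text".

```python
NULLREV = -1


def pick_computed_delta_base(rev, p1, p2, is_usable_base):
    """Pick the base for a delta computed on the fly (illustrative only)."""
    # prefer a parent: the delta is likely smaller and safer to apply
    for candidate in (p1, p2):
        if candidate != NULLREV and is_usable_base(candidate):
            return candidate
    # assumption: `prev` is only tried once both parents have been ruled out
    prev = rev - 1
    if prev >= 0 and is_usable_base(prev):
        return prev
    # nothing usable: the caller would have to send a full text
    return NULLREV
```

For example, with `is_usable_base = {3, 4}.__contains__`, revision 5 with parents 2 and 3 gets base 3 rather than `prev` (4).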
emitrevision: simplify the fallback to computed delta
Not using the stored delta and having a full snapshot on disk behave the same
way, so let's use the same code path for both. This is simpler, and updating it
will be simpler too.
emitrevision: also check the parents in the availability closure
One of the points of having a closure is to gather the logic in it. So we gather
the logic.
The `parents[:]` part is a bit ugly but will be replaced by better code soon
anyway.
emitrevision: add a small closure to check if a base is usable
We will make more use of this and make it more complex too.
chg: scale the timeout in test with the rest
This should avoid some flakiness where the logs report the server shutting down.
hghave: we might need py310 and py311 at some point
Some tests are already showing slightly different results on Python 3.11. The
better idea would be to make them more portable, but if that's not possible,
now we can use hghave detection for certain lines.
I wonder if there will ever be Python 31.0 and 31.1 though.
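A version check of this kind can be sketched with a tuple comparison, which sidesteps the ambiguity that a name like `py310` carries. This is not the hghave implementation; the helper name `python_at_least` and the "at least this version" semantics are assumptions.

```python
import sys


def python_at_least(major, minor, version_info=None):
    """Return True if the interpreter is at least version major.minor."""
    if version_info is None:
        version_info = sys.version_info
    # compare (major, minor) tuples, so 3.10 sorts after 3.9 as expected
    return tuple(version_info[:2]) >= (major, minor)


print("py310:", python_at_least(3, 10))
print("py311:", python_at_least(3, 11))
```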
hghave: detect Python 3.10 and 3.11 as well
Noticed because test-contrib-relnotes.t was skipped.
extensions: load help from hgext.__index__ as a fallback this time
Prior to 843418dc0b1b, `hgext.__index__` was consulted first if present, which
caused the longer help from the extension modules to be ignored, even when
available. But that change causes a bunch of test failures when the pyoxidized
binary bundles *.pyc in the binary, saying that there's no help topic for
`hg help $disabled_extension` and suggesting the use of `--keyword`, rather than
showing a summary and indicating that it is disabled. Current failures were in
test-check-help.t, test-extension.t, test-help.t, and test-qrecord.t.
Ideally, we would read the various *.pyc files from memory and slurp in the
docstring, but I know that they used to not be readable as resources, and I
can't figure out how to make it work now. So maybe 3.9 and/or the current
PyOxidizer doesn't support it yet. I got closer in py2exe with
`importlib.resources.open_binary("hgext", "rebase.pyc")`, but `open_binary()` on
*.pyc fails in pyoxidizer.[1] Either way, the *.pyc can't be passed to
`ast.parse()` as `extensions._disabledcmdtable()` is doing, so I'm setting that
aside for now.
[1] https://github.com/indygreg/PyOxidizer/issues/649
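To make the `ast.parse()` limitation concrete: extracting a module docstring this way needs the Python source text, which is exactly what a bundled *.pyc no longer provides. A minimal standalone sketch, where `module_docstring` and `EXAMPLE_SOURCE` are illustrative names and not part of `extensions.py`:

```python
import ast


def module_docstring(source):
    """Return the module docstring from Python source text, or None."""
    return ast.get_docstring(ast.parse(source))


# stand-in for the source of a disabled extension
EXAMPLE_SOURCE = (
    '"""toy extension: short summary\n'
    '\n'
    'Longer help text.\n'
    '"""\n'
    'cmdtable = {}\n'
)

print(module_docstring(EXAMPLE_SOURCE).splitlines()[0])
```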
extensions: process disabled external paths when `hgext` package is in-memory
This fixes `hg help -e ambiguous` in test-help.t:2055 with the
`ambiguous = !./ambiguous.py` configuration, when `hgext` is not in the
filesystem (e.g. pyoxidizer builds with in-memory resources, or TortoiseHg with
py2exe), but the disabled external extension is. Now instead of aborting with a
suggestion to try `--keyword`, the help command prints text for the extension.
hg: show the correct message when cloning an LFS repo with extension disabled
The `extensions._disabledpaths()` doesn't handle fetching help from `__index__`,
so it returns an empty dictionary of paths. That means None is always returned
from `extensions.disabled_help()` when embedding resources inside the pyoxidizer
or py2exe binary, regardless of the arg or whether it is an external extension stored in
the filesystem. And that means wrongly telling the user with an explicitly
disabled LFS extension that it will be enabled locally upon cloning from an LFS
remote. That causes test-lfs-serve.t:295 to fail.
This effectively reverts most of the rest of 843418dc0b1b, while keeping the
help text change in place (which was specifically identified as a problem).
demandimport: fix a crash in LazyFinder.__delattr__
I was tinkering with `with hgdemandimport.deactivated()` wrapped around loading
the keyring module, and got spew that seemed to be confirmed by PyCharm. But I
can't believe we haven't seen this before (and phabricator uses the same
pattern):
** Unknown exception encountered with possibly-broken third-party extension "mercurial_keyring" 1.4.3 (keyring 23.11.0, backend unknown)
** which supports versions unknown of Mercurial.
** Please disable "mercurial_keyring" and try your action again.
** If that fixes the bug please report it to https://foss.heptapod.net/mercurial/mercurial_keyring/issues
** Python 3.9.15 (main, Oct 13 2022, 04:28:25) [GCC 7.5.0]
** Mercurial Distributed SCM (version 6.3.1)
** Extensions loaded: absorb, attorc 20220315, blackbox, eol, extdiff, fastannotate, lfs, mercurial_keyring 1.4.3 (keyring 23.11.0, backend unknown), phabblocker 20220315, phabricator 20220315, purge, rebase, schemes, share, show, strip, uncommit
Traceback (most recent call last):
File "/usr/local/bin/hg", line 59, in <module>
dispatch.run()
File "/usr/local/lib/python3.9/site-packages/mercurial/dispatch.py", line 143, in run
status = dispatch(req)
File "/usr/local/lib/python3.9/site-packages/mercurial/dispatch.py", line 232, in dispatch
status = _rundispatch(req)
File "/usr/local/lib/python3.9/site-packages/mercurial/dispatch.py", line 276, in _rundispatch
ret = _runcatch(req) or 0
File "/usr/local/lib/python3.9/site-packages/mercurial/dispatch.py", line 451, in _runcatch
return _callcatch(ui, _runcatchfunc)
File "/usr/local/lib/python3.9/site-packages/mercurial/dispatch.py", line 461, in _callcatch
return scmutil.callcatch(ui, func)
File "/usr/local/lib/python3.9/site-packages/mercurial/scmutil.py", line 153, in callcatch
return func()
File "/usr/local/lib/python3.9/site-packages/mercurial/dispatch.py", line 441, in _runcatchfunc
return _dispatch(req)
File "/usr/local/lib/python3.9/site-packages/mercurial/dispatch.py", line 1265, in _dispatch
return runcommand(
File "/usr/local/lib/python3.9/site-packages/mercurial/dispatch.py", line 899, in runcommand
ret = _runcommand(ui, options, cmd, d)
File "/usr/local/lib/python3.9/site-packages/mercurial/dispatch.py", line 1277, in _runcommand
return cmdfunc()
File "/usr/local/lib/python3.9/site-packages/mercurial/dispatch.py", line 1263, in <lambda>
d = lambda: util.checksignature(func)(ui, *args, **strcmdopt)
File "/usr/local/lib/python3.9/site-packages/mercurial/util.py", line 1880, in check
return func(*args, **kwargs)
File "/root/mercurial_keyring/mercurial_keyring/mercurial_keyring.py", line 962, in cmd_keyring_check
user, pwd, source, final_url = handler.get_credentials(
File "/root/mercurial_keyring/mercurial_keyring/mercurial_keyring.py", line 497, in get_credentials
keyring_pwd = password_store.get_http_password(keyring_url, actual_user)
File "/root/mercurial_keyring/mercurial_keyring/mercurial_keyring.py", line 287, in get_http_password
return self._read_password_from_keyring(
File "/root/mercurial_keyring/mercurial_keyring/mercurial_keyring.py", line 335, in _read_password_from_keyring
keyring = import_keyring()
>> `with hgdemandimport.deactivated()` inserted here
File "/root/mercurial_keyring/mercurial_keyring/mercurial_keyring.py", line 120, in import_keyring
return _import_keyring()
File "/root/mercurial_keyring/mercurial_keyring/mercurial_keyring.py", line 133, in _import_keyring
mod, was_imported_now = meu.direct_import_ext(
File "/usr/lib/python3.9/site-packages/mercurial_extension_utils.py", line 1381, in direct_import_ext
__import__(module_name)
File "<frozen importlib._bootstrap>", line 1007, in _find_and_load
File "<frozen importlib._bootstrap>", line 986, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 680, in _load_unlocked
File "/usr/local/lib/python3.9/site-packages/hgdemandimport/demandimportpy3.py", line 46, in exec_module
self.loader.exec_module(module)
File "/usr/lib/python3.9/site-packages/keyring/__init__.py", line 1, in <module>
from .core import (
File "<frozen importlib._bootstrap>", line 1007, in _find_and_load
File "<frozen importlib._bootstrap>", line 986, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 680, in _load_unlocked
File "/usr/local/lib/python3.9/site-packages/hgdemandimport/demandimportpy3.py", line 46, in exec_module
self.loader.exec_module(module)
File "/usr/lib/python3.9/site-packages/keyring/core.py", line 11, in <module>
from . import backend, credentials
File "<frozen importlib._bootstrap>", line 1007, in _find_and_load
File "<frozen importlib._bootstrap>", line 986, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 680, in _load_unlocked
File "/usr/local/lib/python3.9/site-packages/hgdemandimport/demandimportpy3.py", line 46, in exec_module
self.loader.exec_module(module)
File "/usr/lib/python3.9/site-packages/keyring/backend.py", line 13, in <module>
from .py312compat import metadata
File "<frozen importlib._bootstrap>", line 1007, in _find_and_load
File "<frozen importlib._bootstrap>", line 986, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 680, in _load_unlocked
File "/usr/local/lib/python3.9/site-packages/hgdemandimport/demandimportpy3.py", line 46, in exec_module
self.loader.exec_module(module)
File "/usr/lib/python3.9/site-packages/keyring/py312compat.py", line 10, in <module>
import importlib_metadata as metadata # type: ignore
File "<frozen importlib._bootstrap>", line 1007, in _find_and_load
File "<frozen importlib._bootstrap>", line 986, in _find_and_load_unlocked
File "<frozen importlib._bootstrap>", line 680, in _load_unlocked
File "/usr/local/lib/python3.9/site-packages/hgdemandimport/demandimportpy3.py", line 46, in exec_module
self.loader.exec_module(module)
File "/usr/lib/python3.9/site-packages/importlib_metadata/__init__.py", line 715, in <module>
class MetadataPathFinder(NullFinder, DistributionFinder):
File "/usr/lib/python3.9/site-packages/importlib_metadata/_compat.py", line 24, in install
disable_stdlib_finder()
File "/usr/lib/python3.9/site-packages/importlib_metadata/_compat.py", line 43, in disable_stdlib_finder
del finder.find_distributions
File "/usr/local/lib/python3.9/site-packages/hgdemandimport/demandimportpy3.py", line 88, in __delattr__
return delattr(object.__getattribute__(self, "_finder"))
TypeError: delattr expected 2 arguments, got 1
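The last frame shows the problem: `delattr()` takes both the object and the attribute name, and only the object was passed. A simplified, standalone sketch of a forwarding proxy with the name passed along (a toy stand-in, not the actual `demandimportpy3.LazyFinder`; `ToyFinder` is hypothetical):

```python
class LazyFinder:
    """Toy proxy forwarding attribute access to a wrapped finder."""

    __slots__ = ("_finder",)

    def __init__(self, finder):
        object.__setattr__(self, "_finder", finder)

    def __getattribute__(self, name):
        return getattr(object.__getattribute__(self, "_finder"), name)

    def __setattr__(self, name, value):
        return setattr(object.__getattribute__(self, "_finder"), name, value)

    def __delattr__(self, name):
        # delattr() needs the attribute name as its second argument;
        # leaving it out raises the TypeError seen in the traceback
        return delattr(object.__getattribute__(self, "_finder"), name)


class ToyFinder:
    """Hypothetical finder used only to exercise the proxy."""


finder = ToyFinder()
finder.find_distributions = lambda: []
proxy = LazyFinder(finder)
del proxy.find_distributions  # forwarded to the wrapped finder
```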
bundle: emit full snapshot as is, without doing a redelta
With the new `forced` delta-reuse policy, it becomes important to be able to
send full snapshots where full snapshots are needed. Otherwise, the fallback
delta will simply be used on the client side… creating monstrous delta chains,
since revisions that are meant to reset a delta chain that has become too
complex simply add a new full delta-tree on the leaf of another one.
In the `non-forced` cases, the client processes full snapshots from the bundle
differently from deltas, so the client will still try to convert the full
snapshot into a delta if possible. So this will not lead to pathological storage
explosion.
I have considered making this configurable, but the impact seems limited enough
that it does not seem to be worth it. Especially with the current
sparse-revlog format that uses "delta-tree" with multiple levels of snapshots,
full snapshots are much less frequent and not that different from the other
intermediate snapshots that we are already sending over the wire anyway.
CPU-wise, this will help the bundling side a little as it will not need to
reconstruct revisions and compute deltas. The unbundling side might save a tiny
amount of CPU as it won't need to reconstruct the delta base to reconstruct the
revision full text. This is only slightly visible in some of the benchmarks and
has no real impact on most of them.
### data-env-vars.name = pypy-2018-08-01-zstd-sparse-revlog
# benchmark.name = perf-bundle
# benchmark.variants.revs = last-40000
before: 11.467186 seconds
just-emit-full: 11.190576 seconds (-2.41%)
with-pull-force: 11.041091 seconds (-3.72%)
# benchmark.name = perf-unbundle
# benchmark.variants.revs = last-40000
before: 16.744862 seconds
just-emit-full: 16.561036 seconds (-1.10%)
with-pull-force: 16.389344 seconds (-2.12%)
# benchmark.name = pull
# benchmark.variants.revs = last-40000
before: 26.870569 seconds
just-emit-full: 26.391188 seconds (-1.78%)
with-pull-force: 25.633184 seconds (-4.60%)
Space-wise (and so network-wise) the impact is fairly small when taking
compression into account.
Below are the sizes of `hg bundle --all` for a handful of benchmark repositories
(with bzip compression, with zstd compression, and without compression).
This shows a small increase in the bundle size, but nothing really significant
except maybe for mozilla-try (+12%), which nobody really pulls large chunks of
anyway. Mozilla-try is also the repository that benefits the most from not
having to recompute deltas client side.
### mercurial
bzip-before: 26 406 342 bytes
bzip-after: 26 691 543 bytes +1.08%
zstd-before: 27 918 645 bytes
zstd-after: 28 075 896 bytes +0.56%
none-before: 98 675 601 bytes
none-after: 100 411 237 bytes +1.76%
### pypy
bzip-before: 201 295 752 bytes
bzip-after: 209 780 282 bytes +4.21%
zstd-before: 202 974 795 bytes
zstd-after: 205 165 780 bytes +1.08%
none-before: 871 070 261 bytes
none-after: 993 595 057 bytes +14.07%
### netbeans
bzip-before: 601 314 330 bytes
bzip-after: 614 246 241 bytes +2.15%
zstd-before: 604 745 136 bytes
zstd-after: 615 497 705 bytes +1.78%
none-before: 3 338 238 571 bytes
none-after: 3 439 422 535 bytes +3.03%
### mozilla-central
bzip-before: 1 493 006 921 bytes
bzip-after: 1 549 650 570 bytes +3.79%
zstd-before: 1 481 910 102 bytes
zstd-after: 1 513 052 415 bytes +2.10%
none-before: 6 535 929 910 bytes
none-after: 7 010 191 342 bytes +7.26%
### mozilla-try
bzip-before: 6 583 425 999 bytes
bzip-after: 7 423 536 928 bytes +12.76%
zstd-before: 6 021 009 212 bytes
zstd-after: 6 674 922 420 bytes +10.86%
none-before: 22 954 739 558 bytes
none-after: 26 013 854 771 bytes +13.32%
bundle: when forcing acceptance of incoming delta also accept snapshot
Snapshots were never considered reusable and the unbundling side always tried
to find a delta from them. In the `forced` mode this is counter-productive
because it will either connect two delta-trees that should not be connected or
it will potentially spend a lot of time before creating a full snapshot
anyway.
So in this mode, we accept the full snapshot as is.
This changeset is benchmarked with its children so please do not split them
apart when landing.
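The acceptance rule described above can be summarised with a small decision function. It is only a sketch: `reuse_incoming` and `delta_is_good` are hypothetical names, and the handling of non-forced deltas is an assumption made to keep the example complete.

```python
NULLREV = -1


def reuse_incoming(delta_base, forced, delta_is_good):
    """Decide whether incoming data is stored as is (illustrative only)."""
    is_full_snapshot = delta_base == NULLREV
    if forced:
        # forced mode: deltas and full snapshots alike are taken as sent
        return True
    if is_full_snapshot:
        # non-forced mode: still try to turn the snapshot into a delta
        return False
    # assumption: outside forced mode an incoming delta is reused only if
    # some local check (`delta_is_good`) approves its base
    return delta_is_good(delta_base)
```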
delta-find: properly report full snapshot used from cache as such
The number of tries and the delta base are reported differently so we missed
their detection initially.