Thu, 22 Sep 2022 15:34:27 -0400 rhg: show a bug where repeated [hg status] is needed to cache everything stable
Arseniy Alekseyev <aalekseyev@janestreet.com> [Thu, 22 Sep 2022 15:34:27 -0400] rev 49545
rhg: show a bug where repeated [hg status] is needed to cache everything
Fri, 04 Nov 2022 16:15:12 -0400 upgrade: no longer keep all revlogs in memory at any point stable
Pierre-Yves David <pierre-yves.david@octobus.net> [Fri, 04 Nov 2022 16:15:12 -0400] rev 49544
upgrade: no longer keep all revlogs in memory at any point Keeping all objects open is unsustainable, so we will open them on demand. This means opening them multiple times, but this is the lesser evil. Each revlog consumes a small amount of memory (index content, associated nodemap, etc). While there are few "big" revlogs, the sheer number of small filelogs can become a significant issue memory-wise, consuming multiple GB of memory. If you combine this extra usage with the use of multiprocessing, memory usage can quickly get out of control. This can effectively block the upgrade of larger repositories. This changeset fixes this issue.
Wed, 02 Nov 2022 14:23:09 -0400 demandimport: convert ignored modules from bytes -> str in extensions stable
Matt Harbison <matt_harbison@yahoo.com> [Wed, 02 Nov 2022 14:23:09 -0400] rev 49543
demandimport: convert ignored modules from bytes -> str in extensions The entries in the default list of ignored modules are str, and the test for bypassing the lazy import is `module.__name__ in ignores`, so these were effectively NOT ignored. Most of these date back to the grand byteification in 687b865b95ad, with some subsequent additions that followed the existing example. I have no idea if these modules in fact need to bypass lazy importing, but at least it follows the intent of the code.
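A minimal sketch of why a bytes entry never matches in that membership test. The `is_ignored` helper and the one-element ignore sets are hypothetical and only mirror the `module.__name__ in ignores` check quoted above; they are not the demandimport code itself.

```python
import collections

# Hypothetical ignore lists: one stored as bytes, one converted to str.
ignores_as_bytes = {b"collections"}
ignores_as_str = {name.decode("ascii") for name in ignores_as_bytes}


def is_ignored(module, ignores):
    # Same shape as the test quoted in the commit message.
    # module.__name__ is always a str on Python 3.
    return module.__name__ in ignores


# A str never compares equal to bytes, so the bytes entry is never found.
bytes_result = is_ignored(collections, ignores_as_bytes)
str_result = is_ignored(collections, ignores_as_str)
print(bytes_result, str_result)
```

On Python 3 the bytes-based lookup silently reports "not ignored", which matches the behaviour described in the commit message.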
Wed, 26 Oct 2022 18:46:56 +0200 dirstate-v2: fix edge case where entries aren't sorted stable
Raphaël Gomès <rgomes@octobus.net> [Wed, 26 Oct 2022 18:46:56 +0200] rev 49542
dirstate-v2: fix edge case where entries aren't sorted See previous commit for more details.
Wed, 26 Oct 2022 18:24:34 +0200 dirstate-v2: highlight a bug when Python-packed but used in `rhg` stable
Raphaël Gomès <rgomes@octobus.net> [Wed, 26 Oct 2022 18:24:34 +0200] rev 49541
dirstate-v2: highlight a bug when Python-packed but used in `rhg` The Python packer creates unsorted entries in the edge case that a file starts with the same name as a sibling folder. This bug has no effect on the Python `hg status` since Python ignores directories. `rhg` assumes that all on-disk entries are sorted (which is a property of the format), including folders, hence the issue highlighted. This is also technically broken in Rust-augmented `hg status`, but it makes setting up the test more complex than necessary, since it requires the packing to be Python only (which it isn't if you have Rust extensions). Fix is in the next commit.
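The commit message does not spell out the mechanism, so the following is only an illustration under an assumption: that entries are emitted in the order of their full paths while a reader expects sibling names to be sorted. The paths `foo/bar` and `foo.txt` are made up for the example.

```python
# Hypothetical tracked paths: a folder "foo" and a sibling file "foo.txt".
paths = [b"foo/bar", b"foo.txt"]

# Sorting by full path: b"." (0x2e) sorts before b"/" (0x2f),
# so the file comes before the entry inside the folder.
flat_order = sorted(paths)

# Top-level names in the order a flat walk would first encounter them.
first_components = [p.split(b"/", 1)[0] for p in flat_order]

# Order expected when sibling names themselves are sorted.
sibling_order = sorted(set(first_components))

print(flat_order)
print(first_components)
print(sibling_order)
```

Under that assumption the flat walk meets `foo.txt` before `foo`, while sorted siblings put `foo` first, which is the kind of mismatch a reader relying on sorted entries would trip over.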
Wed, 26 Oct 2022 12:20:23 +0200 dirstate-v2: correct documented return values of `pack_dirstate` stable
Raphaël Gomès <rgomes@octobus.net> [Wed, 26 Oct 2022 12:20:23 +0200] rev 49540
dirstate-v2: correct documented return values of `pack_dirstate`
Wed, 26 Oct 2022 12:19:47 +0200 dirstate-v2: fix typos in docstrings stable
Raphaël Gomès <rgomes@octobus.net> [Wed, 26 Oct 2022 12:19:47 +0200] rev 49539
dirstate-v2: fix typos in docstrings
Fri, 04 Nov 2022 14:52:16 -0400 dirstate-v2: update constant that wasn't kept in sync stable
Raphaël Gomès <rgomes@octobus.net> [Fri, 04 Nov 2022 14:52:16 -0400] rev 49538
dirstate-v2: update constant that wasn't kept in sync Despite the best efforts of the comment, this constant wasn't kept in sync when the flags were being rewritten. The fact that this doesn't actually break anything in the Rust implementation too much (which does use directories) relies on the fact that all nodes can have children and that dirstate traversal is not based on that flag, but for metadata in optimizations. However the bug could become more serious should we start encoding stronger guarantees using a combination of flags including this one.
Tue, 18 Oct 2022 13:56:45 -0400 lfs: avoid closing connections when the worker doesn't fork stable
Matt Harbison <matt_harbison@yahoo.com> [Tue, 18 Oct 2022 13:56:45 -0400] rev 49537
lfs: avoid closing connections when the worker doesn't fork Probably not much more than a minor optimization, but could be useful in the case of `hg verify` where missing blobs are fetched one at a time.
Tue, 18 Oct 2022 13:36:33 -0400 lfs: fix blob corruption when transferring with workers on posix stable
Matt Harbison <matt_harbison@yahoo.com> [Tue, 18 Oct 2022 13:36:33 -0400] rev 49536
lfs: fix blob corruption when transferring with workers on posix The problem seems to be that the connection used to request the location of the blobs is sitting in the connection pool, and then when workers are forked, they all see and attempt to use the same connection. This garbles everything. I have no clue how this ever worked reliably (but it seems to, even on Linux, with SCM Manager 1.58). See previous discussion when worker support was added[1]. It shouldn't be a problem on Windows, since the workers are just threads in the same process, and can see which connections are marked available and which are in use. (The fact that `mercurial.keepalive.ConnectionManager.set_ready()` doesn't acquire a lock does give me some pause though.) [1] https://phab.mercurial-scm.org/D1568#31621
Tue, 18 Oct 2022 12:58:34 -0400 keepalive: add `__repr__()` to the HTTPConnection class to ease debugging stable
Matt Harbison <matt_harbison@yahoo.com> [Tue, 18 Oct 2022 12:58:34 -0400] rev 49535
keepalive: add `__repr__()` to the HTTPConnection class to ease debugging By default, this just printed the class name and memory address. By displaying the address and ports on both sides of the socket, it makes it easier to figure out what's in the ConnectionManager, and correlate with WireShark traces. It looks like the two connections mentioned in the previous commit come about because the LFS POST request to access the blobs opens connection 1, and gets a 401. Then for some reason, the follow up with credentials opens a new socket, instead of using the existing one in the pool. I have no clue why. This can be seen with something like this in the blobstore:
```
for h in self.urlopener.handlers:
    if hasattr(h, "close_all"):
        print('open connections on %s in pid %d' % (type(h), os.getpid()))
        for host, conns in h._cm.get_all().items():
            for c in conns:
                print('connection: %r' % c)
```
Tue, 18 Oct 2022 11:54:58 -0400 keepalive: ensure `close_all()` actually closes all cached connections stable
Matt Harbison <matt_harbison@yahoo.com> [Tue, 18 Oct 2022 11:54:58 -0400] rev 49534
keepalive: ensure `close_all()` actually closes all cached connections While debugging why LFS blob downloads are getting corrupted with workers, I noticed that prior to spinning up the workers, the ConnectionManager has 2 connections to the server and calling `KeepAliveHandler.close_all()` left one behind. The reason is the value component of `self._cm.get_all().items()` is a list, and `self._cm.remove()` modifies said list while the caller is iterating over it. Now `get_all()` is a deep copy of both the dict and lists in all cases.
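The failure mode described above can be reproduced with a small stand-alone sketch. `FakeConnection`, `FakeConnectionManager`, and the two `get_all_*` variants are hypothetical stand-ins written for this illustration; they are not the `mercurial.keepalive` classes.

```python
class FakeConnection:
    def __init__(self, name):
        self.name = name
        self.closed = False

    def close(self):
        self.closed = True


class FakeConnectionManager:
    def __init__(self):
        self._hostmap = {}

    def add(self, host, conn):
        self._hostmap.setdefault(host, []).append(conn)

    def remove(self, host, conn):
        self._hostmap[host].remove(conn)

    def get_all_shared(self):
        # Copies the dict only: the per-host lists are still the live ones.
        return dict(self._hostmap)

    def get_all_copied(self):
        # Copies the dict and every list, so callers may mutate freely.
        return {host: list(conns) for host, conns in self._hostmap.items()}


def close_all(cm, get_all):
    for host, conns in get_all().items():
        for conn in conns:
            # With a shared list, this removal shifts the remaining items
            # and the loop skips the next connection.
            cm.remove(host, conn)
            conn.close()


def run(use_copy):
    cm = FakeConnectionManager()
    conns = [FakeConnection("c1"), FakeConnection("c2")]
    for conn in conns:
        cm.add("example.invalid", conn)
    close_all(cm, cm.get_all_copied if use_copy else cm.get_all_shared)
    return [conn.name for conn in conns if not conn.closed]


still_open_shared = run(use_copy=False)
still_open_copied = run(use_copy=True)
print(still_open_shared, still_open_copied)
```

With two cached connections, iterating the live list leaves the second one open, while iterating a copy closes both.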
Wed, 02 Nov 2022 16:46:46 -0400 localrepo: byteify the requirements.DIRSTATE_TRACKED_HINT_Vx warning message stable
Matt Harbison <matt_harbison@yahoo.com> [Wed, 02 Nov 2022 16:46:46 -0400] rev 49533
localrepo: byteify the requirements.DIRSTATE_TRACKED_HINT_Vx warning message Flagged by PyCharm.
Mon, 31 Oct 2022 16:15:54 +0000 rhg: fallback to slow path on invalid patterns in hgignore stable
Arseniy Alekseyev <aalekseyev@janestreet.com> [Mon, 31 Oct 2022 16:15:54 +0000] rev 49532
rhg: fallback to slow path on invalid patterns in hgignore
Mon, 31 Oct 2022 16:15:30 +0000 rhg: add a test involving hgignore lookaround stable
Arseniy Alekseyev <aalekseyev@janestreet.com> [Mon, 31 Oct 2022 16:15:30 +0000] rev 49531
rhg: add a test involving hgignore lookaround
Mon, 24 Oct 2022 18:07:22 +0200 relnotes: add 6.3 stable
Raphaël Gomès <rgomes@octobus.net> [Mon, 24 Oct 2022 18:07:22 +0200] rev 49530
relnotes: add 6.3
Mon, 24 Oct 2022 17:30:44 +0200 Added signature for changeset a3356ab610fc stable
Raphaël Gomès <rgomes@octobus.net> [Mon, 24 Oct 2022 17:30:44 +0200] rev 49529
Added signature for changeset a3356ab610fc
Mon, 24 Oct 2022 17:30:19 +0200 Added tag 6.3rc0 for changeset a3356ab610fc stable
Raphaël Gomès <rgomes@octobus.net> [Mon, 24 Oct 2022 17:30:19 +0200] rev 49528
Added tag 6.3rc0 for changeset a3356ab610fc
Mon, 24 Oct 2022 15:32:14 +0200 branching: merge default into stable stable 6.3rc0
Raphaël Gomès <rgomes@octobus.net> [Mon, 24 Oct 2022 15:32:14 +0200] rev 49527
branching: merge default into stable This marks the feature freeze for the 6.3 release
Wed, 19 Oct 2022 16:16:47 -0400 shelve: re-wrap now that the line fits
Jason R. Coombs <jaraco@jaraco.com> [Wed, 19 Oct 2022 16:16:47 -0400] rev 49526
shelve: re-wrap now that the line fits
Wed, 19 Oct 2022 16:14:50 -0400 shelve: avoid setting overloading tmpwctx
Jason R. Coombs <jaraco@jaraco.com> [Wed, 19 Oct 2022 16:14:50 -0400] rev 49525
shelve: avoid setting overloading tmpwctx
Mon, 10 Oct 2022 14:48:39 +0100 dirstate-v2: skip evaluation of hgignore regex on cached directories
Arseniy Alekseyev <aalekseyev@janestreet.com> [Mon, 10 Oct 2022 14:48:39 +0100] rev 49524
dirstate-v2: skip evaluation of hgignore regex on cached directories By making the computation of [has_ignored_ancestor] lazy we're eliding its computation in the common case when none of its descendants have changed on disk. On a ~400k files repo, with a cached status, we saw a ~64% reduction in CPU time, resulting in a speedup of ~10-15% (on ZFS), and a speedup of ~38% on XFS (XFS has faster stat operations for some reason).
Fri, 30 Sep 2022 09:05:48 -0600 releasenotes: use re.MULTILINE mode when checking admonitions
Craig Ozancin <c.ozancin@gmail.com> [Fri, 30 Sep 2022 09:05:48 -0600] rev 49523
releasenotes: use re.MULTILINE mode when checking admonitions Release note admonitions must start at the beginning of a line within the changeset description:

    .. admonitions::

The checkadmonitions function searches for and validates admonitions. Unfortunately, since the ctx.description is multi-line, the regex search always fails unless the admonition is on the first line. This changeset adds re.MULTILINE to the re.compile to make the re object multi-line.
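A short sketch of the difference the flag makes. The pattern and the sample description are hypothetical and simplified; they are not the expression used by the releasenotes extension.

```python
import re

# Hypothetical changeset description with the directive on the third line.
description = (
    "releasenotes: summary line\n"
    "\n"
    ".. feature::\n"
    "\n"
    "   Body of the release note.\n"
)

# Simplified, assumed pattern: a directive at the beginning of a line.
pattern = r"^\.\. ([a-zA-Z0-9_]+)::"

# Without re.MULTILINE, "^" only matches at the start of the whole string.
single_line = re.compile(pattern).search(description)

# With re.MULTILINE, "^" also matches right after every newline.
multi_line = re.compile(pattern, re.MULTILINE).search(description)

print(single_line)
print(multi_line.group(1))
```

The first search finds nothing because the directive is not on the first line; the second finds the `feature` directive.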
Wed, 05 Oct 2022 15:45:05 -0400 rhg: parallelize computation of [unsure_is_modified]
Arseniy Alekseyev <aalekseyev@janestreet.com> [Wed, 05 Oct 2022 15:45:05 -0400] rev 49522
rhg: parallelize computation of [unsure_is_modified] [unsure_is_modified] is called for every file for which we can't determine its status based on its size and mtime alone. In particular, this happens if the mtime of the file changes without its contents changing. Parallelizing this improves performance significantly when we have many of these files. Here's an example run (on a repo with ~400k files after dropping FS caches)
```
before:
real 0m53.901s
user 0m27.806s
sys 0m31.325s

after:
real 0m32.017s
user 0m34.277s
sys 1m26.250s
```
Another example run (a different FS):
```
before:
real 3m28.479s
user 0m31.800s
sys 0m25.324s

after:
real 0m29.751s
user 0m41.814s
sys 1m15.387s
```
Wed, 21 Sep 2022 10:14:29 -0400 rhg: enable in case ui.statuscopies=True
Arseniy Alekseyev <aalekseyev@janestreet.com> [Wed, 21 Sep 2022 10:14:29 -0400] rev 49521
rhg: enable in case ui.statuscopies=True rhg already has code to support ui.statuscopies, but it's disabled, for seemingly no good reason.
Thu, 22 Sep 2022 18:44:28 -0400 rhg: share some code
Arseniy Alekseyev <aalekseyev@janestreet.com> [Thu, 22 Sep 2022 18:44:28 -0400] rev 49520
rhg: share some code
Tue, 20 Sep 2022 18:28:25 -0400 rhg: support tweakdefaults
Arseniy Alekseyev <aalekseyev@janestreet.com> [Tue, 20 Sep 2022 18:28:25 -0400] rev 49519
rhg: support tweakdefaults
Thu, 22 Sep 2022 17:16:54 -0400 rhg: centralize PlainInfo
Arseniy Alekseyev <aalekseyev@janestreet.com> [Thu, 22 Sep 2022 17:16:54 -0400] rev 49518
rhg: centralize PlainInfo
Tue, 20 Sep 2022 18:16:50 -0400 rhg: central treatment of PLAIN and PLAINEXCEPT
Arseniy Alekseyev <aalekseyev@janestreet.com> [Tue, 20 Sep 2022 18:16:50 -0400] rev 49517
rhg: central treatment of PLAIN and PLAINEXCEPT
Tue, 04 Oct 2022 12:34:50 -0400 revset: handle wdir() in `sort(..., -topo)`
Matt Harbison <matt_harbison@yahoo.com> [Tue, 04 Oct 2022 12:34:50 -0400] rev 49516
revset: handle wdir() in `sort(..., -topo)` The last apparent usage of `repo.changelog.parentrevs` in revsets is in `children()`, but since the sets being operated on never include wdir(), it's never called with `wdirrev` and the wdir() arg on the command line is effectively ignored instead of aborting there. I'm not sure how to fix that.

Before (on a clone of hg):

    $ python3.8 hg perf::revset --config extensions.perf=contrib/perf.py 'sort(all(), -topo)'
    ! wall 0.123663 comb 0.130000 user 0.130000 sys 0.000000 (best of 76)

After:

    $ python3.8 hg perf::revset --config extensions.perf=contrib/perf.py 'sort(all(), -topo)'
    ! wall 0.123838 comb 0.130000 user 0.130000 sys 0.000000 (best of 75)