Matt Harbison <matt_harbison@yahoo.com> [Tue, 18 Oct 2022 13:56:45 -0400] rev 49545
lfs: avoid closing connections when the worker doesn't fork
Probably not much more than a minor optimization, but could be useful in the
case of `hg verify` where missing blobs are fetched one at a time.
Matt Harbison <matt_harbison@yahoo.com> [Tue, 18 Oct 2022 13:36:33 -0400] rev 49544
lfs: fix blob corruption when transferring with workers on posix
The problem seems to be that the connection used to request the location of the
blobs is sitting in the connection pool, and then when workers are forked, they
all see and attempt to use the same connection. This garbles everything. I
have no clue how this ever worked reliably (but it seems to, even on Linux, with
SCM Manager 1.58). See previous discussion when worker support was added[1].
It shouldn't be a problem on Windows, since the workers are just threads in the
same process, and can see which connections are marked available and which are
in use. (The fact that `mercurial.keepalive.ConnectionManager.set_ready()`
doesn't acquire a lock does give me some pause though.)
[1] https://phab.mercurial-scm.org/D1568#31621
Matt Harbison <matt_harbison@yahoo.com> [Tue, 18 Oct 2022 12:58:34 -0400] rev 49543
keepalive: add `__repr__()` to the HTTPConnection class to ease debugging
By default, this just printed the class name and memory address. By displaying
the address and ports on both sides of the socket, it makes it easier to figure
out what's in the ConnectionManager, and correlate with WireShark traces.
It looks like the two connections mentioned in the previous commit come about
because the LFS POST request to access the blobs opens connection 1, and gets a
401. Then for some reason, the follow up with credentials opens a new socket,
instead of using the existing one in the pool. I have no clue why.
This can be seen with something like this in the blobstore:
```
for h in self.urlopener.handlers:
    if hasattr(h, "close_all"):
        print('open connections on %s in pid %d' % (type(h), os.getpid()))
        for host, conns in h._cm.get_all().items():
            for c in conns:
                print('connection: %r' % c)
```
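A minimal sketch of the kind of `__repr__()` described above, assuming a connection object that keeps its socket in a `sock` attribute. The `DebugConnection` class, its output format and the `demo()` helper are hypothetical, not the actual `keepalive.HTTPConnection` code:

```python
import socket


class DebugConnection:
    """Hypothetical stand-in for a pooled HTTP connection."""

    def __init__(self, sock):
        self.sock = sock

    def __repr__(self):
        name = type(self).__name__
        if self.sock is None:
            return "<%s (no socket)>" % name
        try:
            # The first two items are (host, port) for IPv4 and IPv6 sockets.
            local = "%s:%d" % self.sock.getsockname()[:2]
            peer = "%s:%d" % self.sock.getpeername()[:2]
        except OSError:
            # Raised when the socket is closed or not connected.
            return "<%s (not connected)>" % name
        return "<%s %s <--> %s>" % (name, local, peer)


def demo():
    """Connect to a throwaway loopback listener and return (port, repr)."""
    server = socket.create_server(("127.0.0.1", 0))
    try:
        port = server.getsockname()[1]
        client = socket.create_connection(("127.0.0.1", port))
        try:
            return port, repr(DebugConnection(client))
        finally:
            client.close()
    finally:
        server.close()
```

Showing both endpoints is what makes it possible to match a pooled connection against a specific TCP stream in a packet capture.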
Matt Harbison <matt_harbison@yahoo.com> [Tue, 18 Oct 2022 11:54:58 -0400] rev 49542
keepalive: ensure `close_all()` actually closes all cached connections
While debugging why LFS blob downloads are getting corrupted with workers, I
noticed that prior to spinning up the workers, the ConnectionManager has 2
connections to the server and calling `KeepAliveHandler.close_all()` left one
behind. The reason is the value component of `self._cm.get_all().items()` is a
list, and `self._cm.remove()` modifies said list while the caller is iterating
over it. Now `get_all()` is a deep copy of both the dict and lists in all
cases.
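The failure mode can be reproduced outside of Mercurial. The sketch below uses a hypothetical `ConnectionManager` stand-in (not the real `keepalive` class) to show why removing entries from the list being iterated leaves one connection behind, and how handing out copies avoids it:

```python
class ConnectionManager:
    """Hypothetical stand-in for the connection pool."""

    def __init__(self):
        self._hostmap = {}  # host -> list of connections

    def add(self, host, conn):
        self._hostmap.setdefault(host, []).append(conn)

    def remove(self, host, conn):
        self._hostmap[host].remove(conn)

    def get_all_shared(self):
        # Buggy variant: exposes the internal dict and lists directly.
        return self._hostmap

    def get_all(self):
        # Fixed variant: copy the dict and every list it holds.
        return {host: list(conns) for host, conns in self._hostmap.items()}


def close_all(cm, snapshot):
    """Remove every connection listed in `snapshot` from the manager."""
    for host, conns in snapshot.items():
        for c in conns:
            cm.remove(host, c)


def leftover(use_copy):
    cm = ConnectionManager()
    cm.add("server", "conn1")
    cm.add("server", "conn2")
    close_all(cm, cm.get_all() if use_copy else cm.get_all_shared())
    return cm.get_all()["server"]
```

With the shared list, removing `conn1` shifts `conn2` into the slot the iterator has already passed, so the loop ends early and `conn2` stays in the pool.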
Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 22 Sep 2022 16:27:17 +0200] rev 49541
tests: remove non-python3 line matching and tests block
We don't support Python 2 anymore.
Matt Harbison <matt_harbison@yahoo.com> [Wed, 02 Nov 2022 16:46:46 -0400] rev 49540
localrepo: byteify the requirements.DIRSTATE_TRACKED_HINT_Vx warning message
Flagged by PyCharm.
Arseniy Alekseyev <aalekseyev@janestreet.com> [Mon, 31 Oct 2022 16:15:54 +0000] rev 49539
rhg: fallback to slow path on invalid patterns in hgignore
Arseniy Alekseyev <aalekseyev@janestreet.com> [Mon, 31 Oct 2022 16:15:30 +0000] rev 49538
rhg: add a test involving hgignore lookaround
Anton Shestakov <av6@dwimlabs.net> [Mon, 31 Oct 2022 16:50:22 +0400] rev 49537
pywatchman: remove obsolete comments about importing from future
See 6000f5b25c9b.
Anton Shestakov <av6@dwimlabs.net> [Mon, 31 Oct 2022 16:36:00 +0400] rev 49536
demandimport: remove an obsolete comment about importing from future
See 6000f5b25c9b.
Matt Harbison <matt_harbison@yahoo.com> [Wed, 19 Oct 2022 11:10:54 -0400] rev 49535
mr-template: wrap the instructions inside a comment block
At least in preview mode, this hides the text so the user doesn't have to delete
it. It's still visible in edit mode, so the user sees it.
Matt Harbison <matt_harbison@yahoo.com> [Wed, 19 Oct 2022 11:50:40 -0400] rev 49534
revlog: use the user facing filename as the display_id for filelogs
I had trouble isolating some LFS blob corruption detected by `hg verify` because
the traceback referenced a file, but with the `data/` prefix in the `.hg/store`
path, so it couldn't be located with the `file()` revset:
```
Traceback (most recent call last):
  File "/mnt/d/mercurial/mercurial/revlog.py", line 3209, in verifyintegrity
    _verify_revision(self, skipflags, state, node)
  File "/mnt/d/mercurial/hgext/lfs/wrapper.py", line 246, in _verify_revision
    orig(rl, skipflags, state, node)
  File "/mnt/d/mercurial/mercurial/revlog.py", line 158, in _verify_revision
    rl.revision(node)
  File "/mnt/d/mercurial/mercurial/revlog.py", line 1816, in revision
    return self._revisiondata(nodeorrev, _df)
  File "/mnt/d/mercurial/mercurial/revlog.py", line 1870, in _revisiondata
    self.checkhash(text, node, rev=rev)
  File "/mnt/d/mercurial/mercurial/revlog.py", line 1996, in checkhash
    % (self.display_id, pycompat.bytestr(revornode))
mercurial.error.RevlogError: integrity check failed on data/EXE/PPC/shrinksrec.exe:0
```
(I'm a little surprised it resulted in a stacktrace instead of just a message,
but that's a different issue. I'm also not sure how to trigger the simplestore
case, since IIUC, it's also a revlog based store.)
It's not clear how to handle the changelog and manifest (because the user
doesn't interact with them as a file), so those cases are left alone. The other
thing that would be nice to improve somehow is to indicate that the ":0" is a
revlog revision, not the changeset revision that users are used to. I'm not
sure how to handle the "or node" part though.
Matt Harbison <matt_harbison@yahoo.com> [Wed, 19 Oct 2022 11:24:20 -0400] rev 49533
revlog: drop an unused variable assignment
It's assigned again 2 lines later.
Matt Harbison <matt_harbison@yahoo.com> [Thu, 20 Oct 2022 13:12:37 -0400] rev 49532
lfs: improve an exception message for blob corruption detected on transfer
The message about the server crash originated in 0ee0a3f6a990 (after support for
serving blobs was added), but was copied from the Facebook repo that forked
prior to server side support. Therefore, this message only displayed in their
client, so it was safe to assume the server crashed. But that was never the
case for vanilla Mercurial, as I saw this in a server log.
Also, display the blob reference so that it's easier to figure out where the
problem was when a bunch of blobs are transferred at once.
Raphaël Gomès <rgomes@octobus.net> [Mon, 24 Oct 2022 18:07:22 +0200] rev 49531
relnotes: add 6.3
Raphaël Gomès <rgomes@octobus.net> [Mon, 24 Oct 2022 17:30:44 +0200] rev 49530
Added signature for changeset a3356ab610fc
Raphaël Gomès <rgomes@octobus.net> [Mon, 24 Oct 2022 17:30:19 +0200] rev 49529
Added tag 6.3rc0 for changeset a3356ab610fc
Raphaël Gomès <rgomes@octobus.net> [Mon, 24 Oct 2022 17:35:30 +0200] rev 49528
branching: merge stable into default
Raphaël Gomès <rgomes@octobus.net> [Mon, 24 Oct 2022 15:32:14 +0200] rev 49527
branching: merge default into stable
This marks the feature freeze for the 6.3 release
Matt Harbison <matt_harbison@yahoo.com> [Thu, 20 Oct 2022 12:05:17 -0400] rev 49526
lfs: fix interpolation of int and %s in an exception case
Seen in the wild in a server log when MS antivirus was quarantining a file on
the client side.
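For illustration, assuming the message was a bytes format string (as most Mercurial messages are), this is how an `int` paired with `%s` fails on Python 3. The message text and the `size` value are made up for the example:

```python
size = 1024  # hypothetical integer argument

try:
    # On bytes, %s only accepts bytes-like objects (or objects that
    # implement __bytes__), so an int raises TypeError on Python 3.
    b"unexpected size: %s" % size
except TypeError as exc:
    error = exc
else:
    error = None

# %d accepts the int directly.
fixed = b"unexpected size: %d" % size
```

Because the formatting happens while the error message is being built, the `TypeError` would hide the problem the message was meant to report.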
Anton Shestakov <av6@dwimlabs.net> [Wed, 19 Oct 2022 17:00:03 +0400] rev 49525
tests: catch "Can't assign requested address" in test-https.t (issue6726)
Anton Shestakov <av6@dwimlabs.net> [Wed, 19 Oct 2022 16:55:46 +0400] rev 49524
tests: add another variation of EADDRNOTAVAIL message (e.g. from NetBSD)
Jason R. Coombs <jaraco@jaraco.com> [Wed, 19 Oct 2022 16:16:47 -0400] rev 49523
shelve: re-wrap now that the line fits
Jason R. Coombs <jaraco@jaraco.com> [Wed, 19 Oct 2022 16:14:50 -0400] rev 49522
shelve: avoid overloading tmpwctx
Matt Harbison <matt_harbison@yahoo.com> [Tue, 18 Oct 2022 19:49:31 -0400] rev 49521
configitems: change the `verify.skipflags` default value to avoid a py3 crash
The revlog and LFS modules use various `&` and `&=` operations with this value,
which no longer treats `None` as 0. Since nothing cares if it was actually set
in the config or not, just default to 0 for simplicity.
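A small sketch of the crash, using a hypothetical flag constant and helper rather than the real revlog code. A `None` default fails as soon as it meets `&`, while a default of 0 means "no flags set":

```python
EXAMPLE_FLAG = 1 << 13  # hypothetical flag bit, for illustration only


def should_skip(skipflags, flag):
    """Return True if `flag` is set in `skipflags`."""
    return bool(skipflags & flag)


def check(default):
    try:
        return should_skip(default, EXAMPLE_FLAG)
    except TypeError:
        # `None & int` is not a supported operation on Python 3.
        return "crash"
```

Here `check(None)` hits the `TypeError` branch, whereas `check(0)` simply reports that the flag is not set.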
Arseniy Alekseyev <aalekseyev@janestreet.com> [Mon, 10 Oct 2022 14:48:39 +0100] rev 49520
dirstate-v2: skip evaluation of hgignore regex on cached directories
By making the computation of [has_ignored_ancestor] lazy we're eliding
its computation in the common case when none of its descendants have
changed on disk.
On a ~400k files repo, with a cached status, we saw a ~64% reduction
in CPU time, resulting in a speedup of ~10-15% (on ZFS), and a speedup
of ~38% on XFS (XFS has faster stat operations for some reason).
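The actual change is in the Rust status code. The idea can be sketched in Python under assumed names (`Lazy`, `visit_directory` and `expensive_ignore_check` are all hypothetical): the ignore check is wrapped so that it only runs when a directory actually has to be re-examined.

```python
class Lazy:
    """Compute a value on first use and remember it afterwards."""

    def __init__(self, compute):
        self._compute = compute
        self._done = False
        self._value = None

    def get(self):
        if not self._done:
            self._value = self._compute()
            self._done = True
        return self._value


calls = []  # records how often the expensive check actually ran


def expensive_ignore_check():
    # Stand-in for evaluating the hgignore regex against ancestor paths.
    calls.append(1)
    return False


def visit_directory(changed_on_disk, has_ignored_ancestor):
    if not changed_on_disk:
        # Cached directory: the lazy value is never forced.
        return "cached"
    return "ignored" if has_ignored_ancestor.get() else "scanned"
```

In the common case where nothing changed on disk, `visit_directory` returns without ever calling `get()`, so the regex cost is not paid.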
Craig Ozancin <c.ozancin@gmail.com> [Fri, 30 Sep 2022 09:05:48 -0600] rev 49519
releasenotes: use re.MULTILINE mode when checking admonitions
Release note admonitions must start at the beginning of a line within
the changeset description:
.. admonitions::
The checkadmonitions function searches for and validates admonitions.
Unfortunately, since the ctx.description is multi-line, the regex search
always fails unless the admonition is on the first line.
This changeset adds re.MULTILINE to the re.compile call so that the pattern
can match at the start of any line.
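The effect of the flag can be seen with a simplified pattern. The regex and the description text below are made up for the example and are not copied from the releasenotes extension:

```python
import re

# Hypothetical, simplified admonition pattern anchored at line start.
PATTERN = r"^\.\. ([a-zA-Z0-9_]+)::"

description = (
    "releasenotes: fix something\n"
    "\n"
    ".. fix::\n"
    "\n"
    "   Details of the fix.\n"
)

single_line = re.compile(PATTERN)
multi_line = re.compile(PATTERN, re.MULTILINE)

# Without re.MULTILINE, ^ only matches at the very start of the string.
missed = single_line.search(description)

# With re.MULTILINE, ^ also matches right after each newline.
found = multi_line.search(description)
```

Here `missed` is `None` because the description starts with the summary line, while `found.group(1)` yields the admonition name from the third line.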
Matt Harbison <matt_harbison@yahoo.com> [Mon, 10 Oct 2022 11:28:19 -0400] rev 49518
windows: gracefully handle when the username cannot be determined
This assumes implementation details, but I don't see any other way than to check
the environment variables ourselves (which would miss out on any future
enhancements that Python may make). This was originally reported as
https://foss.heptapod.net/mercurial/tortoisehg/thg/-/issues/5835.