Matt Harbison <matt_harbison@yahoo.com> [Tue, 18 Oct 2022 13:56:45 -0400] rev 49537
lfs: avoid closing connections when the worker doesn't fork
Probably not much more than an minor optimization, but could be useful in the
case of `hg verify` where missing blobs are fetched one at a time.
Matt Harbison <matt_harbison@yahoo.com> [Tue, 18 Oct 2022 13:36:33 -0400] rev 49536
lfs: fix blob corruption when tranferring with workers on posix
The problem seems to be that the connection used to request the location of the
blobs is sitting in the connection pool, and then when workers are forked, they
all see and attempt to use the same connection. This garbles everything. I
have no clue how this ever worked reliably (but it seems to, even on Linux, with
SCM Manager 1.58). See previous discussion when worker support was added[1].
It shouldn't be a problem on Windows, since the workers are just threads in the
same process, and can see which connections are marked available and which are
in use. (The fact that `mercurial.keepalive.ConnectionManager.set_ready()`
doesn't acquire a lock does give me some pause though.)
[1] https://phab.mercurial-scm.org/D1568#31621
Matt Harbison <matt_harbison@yahoo.com> [Tue, 18 Oct 2022 12:58:34 -0400] rev 49535
keepalive: add `__repr__()` to the HTTPConnection class to ease debugging
By default, this just printed the class name and memory address. By displaying
the address and ports on both sides of the socket, it makes it easier to figure
out what's in the ConnectionManager, and correlate with WireShark traces.
It looks like the two connections mentioned in the previous commit come about
because the LFS POST request to access the blobs opens connection 1, and gets a
401. Then for some reason, the follow up with credentials opens a new socket,
instead of using the existing one in the pool. I have no clue why.
This can be seen with something like this in the blobstore:
```
for h in self.urlopener.handlers:
if hasattr(h, "close_all"):
print('open connections on %s in pid %d' % (type(h), os.getpid()))
for host, conns in h._cm.get_all().items():
for c in conns:
print('connection: %r' % c)
```