chg: handle EOF reading data block
We recently discovered a case in production that chg uses 100% CPU and is
trying to read data forever:
recvfrom(4, "",
1814012019, 0, NULL, NULL) = 0
Using gdb, apparently readchannel() got wrong data. It was reading in an
infinite loop because rsize == 0 does not exit the loop, while the server
process had ended.
(gdb) bt
#0 ... in recv () at /lib64/libc.so.6
#1 ... in readchannel (...) at /usr/include/bits/socket2.h:45
#2 ... in readchannel (hgc=...) at hgclient.c:129
#3 ... in handleresponse (hgc=...) at hgclient.c:255
#4 ... in hgc_runcommand (hgc=..., args=<optimized>, argsize=<optimized>)
#5 ... in main (argc=...
486922636, argv=..., envp=...) at chg.c:661
(gdb) frame 2
(gdb) p *hgc
$1 = {sockfd = 4, pid = 381152, ctx = {ch = 108 'l',
data = 0x
7fb05164f010 "st):\nTraceback (most recent call last):\n"
"Traceback (most recent call last):\ne", maxdatasize =
1814065152,"
" datasize =
1814064225}, capflags = 16131}
This patch addresses the infinite loop issue by detecting continuously empty
responses and abort in that case.
Note that datasize can be translated to ['l', ' ', 'l', 'a']. Concatenate
datasize and data, it forms part of "Traceback (most recent call last):".
This may indicate a server-side channeledoutput issue. If it is a race
condition, we may want to use flock to protect the channels.
sslutil: more robustly detect protocol support
The Python ssl module conditionally sets the TLS 1.1 and TLS 1.2
constants depending on whether HAVE_TLSv1_2 is defined. Yes, these
are both tied to the same constant (I would think there would be
separate constants for each version). Perhaps support for TLS 1.1
and 1.2 were added at the same time and the assumption is that
OpenSSL either has neither or both. I don't know.
As part of developing this patch, it was discovered that Apple's
/usr/bin/python2.7 does not support TLS 1.1 and 1.2 (only TLS 1.0)!
On OS X 10.11, Apple Python has the modern ssl module including
SSLContext, but it doesn't appear to negotiate TLS 1.1+ nor does
it expose the constants related to TLS 1.1+. Since this code is
doing more robust feature detection (and not assuming modern ssl
implies TLS 1.1+ support), we now get TLS 1.0 warnings when running
on Apple Python. Hence the test changes.
I'm not super thrilled about shipping a Mercurial that always
whines about TLS 1.0 on OS X. We may want a follow-up patch to
suppress this warning.