changeset 36426:23d12524a202
http: drop custom http client logic
Eight and a half years ago, as my starter bug on code.google.com, I
investigated a mysterious "broken pipe" error from seemingly random
clients[0]. That investigation revealed a tragic story: the Python
standard library's httplib was (and remains) barely functional. During
large POSTs, if a server responds early with an error (even a
permission denied error!) the client only notices that the server
closed the connection and everything breaks. Such server behavior is
implicitly legal under RFC 2616 (the latest HTTP RFC as of when I was
last working on this), and my understanding is that later RFCs have
made it explicitly legal to respond early with any status code outside
the 2xx range.
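To make the early-response problem concrete, here is a minimal sketch of a sender that checks for incoming data while it is still writing a large body. It is not the code removed by this changeset; the names `send_watching_for_early_response` and `demo` are hypothetical, and `socket.socketpair()` stands in for a real server that rejects a request without reading the body.

```python
import select
import socket


def send_watching_for_early_response(sock, payload, chunk_size=4096):
    """Send payload, but stop as soon as the peer has sent anything back.

    Returns (bytes_sent, early_response), where early_response is None
    if the whole payload went out before the peer said anything.
    """
    sent = 0
    while sent < len(payload):
        readable, writable, _ = select.select([sock], [sock], [], 5.0)
        if readable:
            # The peer answered (or closed) before we finished sending,
            # so stop writing and hand back whatever it sent.
            return sent, sock.recv(65536)
        if not writable:
            raise TimeoutError('peer is neither readable nor writable')
        sent += sock.send(payload[sent:sent + chunk_size])
    return sent, None


def demo():
    # Assumption: a connected socket pair stands in for client and server.
    client, server = socket.socketpair()
    try:
        # Hypothetical server: rejects the request without reading the body.
        server.sendall(b'HTTP/1.1 403 Forbidden\r\nContent-Length: 0\r\n\r\n')
        body = b'x' * (1 << 20)
        return send_watching_for_early_response(client, body)
    finally:
        client.close()
        server.close()
```

A client that only writes and never looks at the read side would instead keep sending until the connection is torn down, which is the "broken pipe" symptom described above.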
I embarked, probably foolishly, on a journey to write a new http
library with better overall behavior. The http library appears to work
well in most cases, but it can get confused in the presence of
proxies, and it depends on select(2), which limits its utility if a lot
of file descriptors are open. I haven't touched the http library in
almost two years, and in the interim the Python community has
discovered a better way[1] of writing network code. In theory some day
urllib3 will have its own home-grown http library built on h11[2], or
we could do that. Either way, it's time to drop our current
confusingly-named "http2" client logic and move on. I do hope to
revisit this some day: it's still garbage that we can't even respond
with a 401 or 403 without reading the entire POST body from the
client, but the goalposts on writing a new http client library have
moved substantially. We're almost certainly better off just switching
to requests and eventually picking up their http fixes than trying to
live with something that realistically only we'll ever use. Another
approach would be to write an adapter so that Mercurial can use pycurl
if it's installed. Neither of those approaches seems like it should
be investigated prior to a release of Mercurial that works on Python
3: that's where the mindshare is going to be for any improvements to
the state of the http client art.
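As a rough illustration of the sans-io idea referenced in [1], the sketch below keeps protocol parsing completely separate from socket handling: bytes go in, parsed state comes out, and the caller decides how to do I/O. This is a toy written for this note, not h11's API; the class name `StatusLineParser` and its `receive_data` method are hypothetical.

```python
class StatusLineParser:
    """Toy sans-io parser: bytes in, parsed status out, no sockets involved."""

    def __init__(self):
        self._buffer = b''
        self.status = None   # int once the status line has been parsed
        self.reason = None   # str once the status line has been parsed

    def receive_data(self, data):
        """Feed raw bytes; return True once the status line is complete."""
        if self.status is not None:
            return True
        self._buffer += data
        line, sep, _rest = self._buffer.partition(b'\r\n')
        if not sep:
            return False  # status line not complete yet, need more bytes
        parts = line.split(b' ', 2)
        if len(parts) < 2:
            raise ValueError('malformed status line: %r' % (line,))
        self.status = int(parts[1])
        # The reason phrase is optional; decode as latin-1 per HTTP/1.1 custom.
        self.reason = parts[2].decode('latin-1') if len(parts) == 3 else ''
        return True


# The same parser works whether the bytes come from select(), asyncio,
# a test fixture, or anything else, because it never touches a socket.
parser = StatusLineParser()
for piece in (b'HTTP/1.1 4', b'03 Forbidden\r', b'\nContent-Length: 0\r\n\r\n'):
    done = parser.receive_data(piece)
```

Because the parser owns no file descriptors, the select(2) limitation mentioned above becomes a property of whatever event loop drives it rather than of the protocol code.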
0: http://web.archive.org/web/20130501031801/http://code.google.com/p/support/issues/detail?id=2716
1: http://sans-io.readthedocs.io/
2: https://github.com/njsmith/h11
Differential Revision: https://phab.mercurial-scm.org/D2444
author | Augie Fackler <augie@google.com> |
---|---|
date | Sun, 25 Feb 2018 23:51:32 -0500 |
parents | 24c2c760c1cb |
children | 247b473f408e |
files | mercurial/configitems.py mercurial/httpclient/__init__.py mercurial/httpclient/_readers.py mercurial/httpconnection.py mercurial/httppeer.py mercurial/url.py setup.py tests/test-check-code.t tests/test-commandserver.t |
diffstat | 9 files changed, 3 insertions(+), 1368 deletions(-) |
--- a/mercurial/configitems.py	Sun Feb 25 23:34:58 2018 -0500
+++ b/mercurial/configitems.py	Sun Feb 25 23:51:32 2018 -0500
@@ -1026,9 +1026,6 @@
 coreconfigitem('ui', 'graphnodetemplate',
     default=None,
 )
-coreconfigitem('ui', 'http2debuglevel',
-    default=None,
-)
 coreconfigitem('ui', 'interactive',
     default=None,
 )
@@ -1127,9 +1124,6 @@
 coreconfigitem('ui', 'tweakdefaults',
     default=False,
 )
-coreconfigitem('ui', 'usehttp2',
-    default=False,
-)
 coreconfigitem('ui', 'username',
     alias=[('ui', 'user')]
 )
--- a/mercurial/httpclient/__init__.py Sun Feb 25 23:34:58 2018 -0500 +++ /dev/null Thu Jan 01 00:00:00 1970 +0000 @@ -1,912 +0,0 @@ -# Copyright 2010, Google Inc. -# All rights reserved. -# -# Redistribution and use in source and binary forms, with or without -# modification, are permitted provided that the following conditions are -# met: -# -# * Redistributions of source code must retain the above copyright -# notice, this list of conditions and the following disclaimer. -# * Redistributions in binary form must reproduce the above -# copyright notice, this list of conditions and the following disclaimer -# in the documentation and/or other materials provided with the -# distribution. -# * Neither the name of Google Inc. nor the names of its -# contributors may be used to endorse or promote products derived from -# this software without specific prior written permission. - -# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS -# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT -# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR -# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT -# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, -# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT -# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE, -# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY -# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT -# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE -# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE. 
-"""Improved HTTP/1.1 client library - -This library contains an HTTPConnection which is similar to the one in -httplib, but has several additional features: - - * supports keepalives natively - * uses select() to block for incoming data - * notices when the server responds early to a request - * implements ssl inline instead of in a different class -""" -from __future__ import absolute_import - -# Many functions in this file have too many arguments. -# pylint: disable=R0913 -import email -import email.message -import errno -import inspect -import logging -import select -import socket -import ssl -import sys - -try: - import cStringIO as io - io.StringIO -except ImportError: - import io - -try: - import httplib - httplib.HTTPException -except ImportError: - import http.client as httplib - -from . import ( - _readers, -) - -logger = logging.getLogger(__name__) - -__all__ = ['HTTPConnection', 'HTTPResponse'] - -HTTP_VER_1_0 = b'HTTP/1.0' -HTTP_VER_1_1 = b'HTTP/1.1' - -OUTGOING_BUFFER_SIZE = 1 << 15 -INCOMING_BUFFER_SIZE = 1 << 20 - -HDR_ACCEPT_ENCODING = 'accept-encoding' -HDR_CONNECTION_CTRL = 'connection' -HDR_CONTENT_LENGTH = 'content-length' -HDR_XFER_ENCODING = 'transfer-encoding' - -XFER_ENCODING_CHUNKED = 'chunked' - -CONNECTION_CLOSE = 'close' - -EOL = b'\r\n' -_END_HEADERS = EOL * 2 - -# Based on some searching around, 1 second seems like a reasonable -# default here. -TIMEOUT_ASSUME_CONTINUE = 1 -TIMEOUT_DEFAULT = None - -if sys.version_info > (3, 0): - _unicode = str -else: - _unicode = unicode - -def _ensurebytes(data): - if not isinstance(data, (_unicode, bytes)): - data = str(data) - if not isinstance(data, bytes): - try: - return data.encode('latin-1') - except UnicodeEncodeError as err: - raise UnicodeEncodeError( - err.encoding, - err.object, - err.start, - err.end, - '%r is not valid Latin-1 Use .encode("utf-8") ' - 'if sending as utf-8 is desired.' 
% ( - data[err.start:err.end],)) - return data - -class _CompatMessage(email.message.Message): - """Workaround for rfc822.Message and email.message.Message API diffs.""" - - @classmethod - def from_string(cls, s): - if sys.version_info > (3, 0): - # Python 3 can't decode headers from bytes, so we have to - # trust RFC 2616 and decode the headers as iso-8859-1 - # bytes. - s = s.decode('iso-8859-1') - headers = email.message_from_string(s, _class=_CompatMessage) - # Fix multi-line headers to match httplib's behavior from - # Python 2.x, since email.message.Message handles them in - # slightly different ways. - if sys.version_info < (3, 0): - new = [] - for h, v in headers._headers: - if '\r\n' in v: - v = '\n'.join([' ' + x.lstrip() for x in v.split('\r\n')])[1:] - new.append((h, v)) - headers._headers = new - return headers - - def getheaders(self, key): - return self.get_all(key) - - def getheader(self, key, default=None): - return self.get(key, failobj=default) - - -class HTTPResponse(object): - """Response from an HTTP server. - - The response will continue to load as available. If you need the - complete response before continuing, check the .complete() method. - """ - def __init__(self, sock, timeout, method): - self.sock = sock - self.method = method - self.raw_response = b'' - self._headers_len = 0 - self.headers = None - self.will_close = False - self.status_line = b'' - self.status = None - self.continued = False - self.http_version = None - self.reason = None - self._reader = None - - self._read_location = 0 - self._eol = EOL - - self._timeout = timeout - - @property - def _end_headers(self): - return self._eol * 2 - - def complete(self): - """Returns true if this response is completely loaded. - - Note that if this is a connection where complete means the - socket is closed, this will nearly always return False, even - in cases where all the data has actually been loaded. 
- """ - if self._reader: - return self._reader.done() - - def _close(self): - if self._reader is not None: - # We're a friend of the reader class here. - # pylint: disable=W0212 - self._reader._close() - - def getheader(self, header, default=None): - return self.headers.getheader(header, default=default) - - def getheaders(self): - if sys.version_info < (3, 0): - return [(k.lower(), v) for k, v in self.headers.items()] - # Starting in Python 3, headers aren't lowercased before being - # returned here. - return self.headers.items() - - def readline(self): - """Read a single line from the response body. - - This may block until either a line ending is found or the - response is complete. - """ - blocks = [] - while True: - self._reader.readto(b'\n', blocks) - - if blocks and blocks[-1][-1:] == b'\n' or self.complete(): - break - - self._select() - - return b''.join(blocks) - - def read(self, length=None): - """Read data from the response body.""" - # if length is None, unbounded read - while (not self.complete() # never select on a finished read - and (not length # unbounded, so we wait for complete() - or length > self._reader.available_data)): - self._select() - if not length: - length = self._reader.available_data - r = self._reader.read(length) - if self.complete() and self.will_close: - self.sock.close() - return r - - def _select(self): - r, unused_write, unused_err = select.select( - [self.sock], [], [], self._timeout) - if not r: - # socket was not readable. If the response is not - # complete, raise a timeout. 
- if not self.complete(): - logger.info('timed out with timeout of %s', self._timeout) - raise HTTPTimeoutException('timeout reading data') - try: - data = self.sock.recv(INCOMING_BUFFER_SIZE) - except ssl.SSLError as e: - if e.args[0] != ssl.SSL_ERROR_WANT_READ: - raise - logger.debug('SSL_ERROR_WANT_READ in _select, should retry later') - return True - logger.debug('response read %d data during _select', len(data)) - # If the socket was readable and no data was read, that means - # the socket was closed. Inform the reader (if any) so it can - # raise an exception if this is an invalid situation. - if not data: - if self._reader: - # We're a friend of the reader class here. - # pylint: disable=W0212 - self._reader._close() - return False - else: - self._load_response(data) - return True - - # This method gets replaced by _load later, which confuses pylint. - def _load_response(self, data): # pylint: disable=E0202 - # Being here implies we're not at the end of the headers yet, - # since at the end of this method if headers were completely - # loaded we replace this method with the load() method of the - # reader we created. - self.raw_response += data - # This is a bogus server with bad line endings - if self._eol not in self.raw_response: - for bad_eol in (b'\n', b'\r'): - if (bad_eol in self.raw_response - # verify that bad_eol is not the end of the incoming data - # as this could be a response line that just got - # split between \r and \n. 
- and (self.raw_response.index(bad_eol) < - (len(self.raw_response) - 1))): - logger.info('bogus line endings detected, ' - 'using %r for EOL', bad_eol) - self._eol = bad_eol - break - # exit early if not at end of headers - if self._end_headers not in self.raw_response or self.headers: - return - - # handle 100-continue response - hdrs, body = self.raw_response.split(self._end_headers, 1) - unused_http_ver, status = hdrs.split(b' ', 1) - if status.startswith(b'100'): - self.raw_response = body - self.continued = True - logger.debug('continue seen, setting body to %r', body) - return - - # arriving here means we should parse response headers - # as all headers have arrived completely - hdrs, body = self.raw_response.split(self._end_headers, 1) - del self.raw_response - if self._eol in hdrs: - self.status_line, hdrs = hdrs.split(self._eol, 1) - else: - self.status_line = hdrs - hdrs = b'' - # TODO HTTP < 1.0 support - (self.http_version, self.status, - self.reason) = self.status_line.split(b' ', 2) - self.status = int(self.status) - if self._eol != EOL: - hdrs = hdrs.replace(self._eol, b'\r\n') - headers = _CompatMessage.from_string(hdrs) - content_len = None - if HDR_CONTENT_LENGTH in headers: - content_len = int(headers[HDR_CONTENT_LENGTH]) - if self.http_version == HTTP_VER_1_0: - self.will_close = True - elif HDR_CONNECTION_CTRL in headers: - self.will_close = ( - headers[HDR_CONNECTION_CTRL].lower() == CONNECTION_CLOSE) - if (HDR_XFER_ENCODING in headers - and headers[HDR_XFER_ENCODING].lower() == XFER_ENCODING_CHUNKED): - self._reader = _readers.ChunkedReader(self._eol) - logger.debug('using a chunked reader') - else: - # HEAD responses are forbidden from returning a body, and - # it's implausible for a CONNECT response to use - # close-is-end logic for an OK response. 
- if (self.method == b'HEAD' or - (self.method == b'CONNECT' and content_len is None)): - content_len = 0 - if content_len is not None: - logger.debug('using a content-length reader with length %d', - content_len) - self._reader = _readers.ContentLengthReader(content_len) - else: - # Response body had no length specified and is not - # chunked, so the end of the body will only be - # identifiable by the termination of the socket by the - # server. My interpretation of the spec means that we - # are correct in hitting this case if - # transfer-encoding, content-length, and - # connection-control were left unspecified. - self._reader = _readers.CloseIsEndReader() - logger.debug('using a close-is-end reader') - self.will_close = True - - if body: - # We're a friend of the reader class here. - # pylint: disable=W0212 - self._reader._load(body) - logger.debug('headers complete') - self.headers = headers - # We're a friend of the reader class here. - # pylint: disable=W0212 - self._load_response = self._reader._load - -def _foldheaders(headers): - """Given some headers, rework them so we can safely overwrite values. - - >>> _foldheaders({'Accept-Encoding': 'wat'}) - {'accept-encoding': ('Accept-Encoding', 'wat')} - """ - return dict((k.lower(), (k, v)) for k, v in headers.items()) - -try: - inspect.signature - def _handlesarg(func, arg): - """ Try to determine if func accepts arg - - If it takes arg, return True - If it happens to take **args, then it could do anything: - * It could throw a different TypeError, just for fun - * It could throw an ArgumentError or anything else - * It could choose not to throw an Exception at all - ... 
return 'unknown' - - Otherwise, return False - """ - params = inspect.signature(func).parameters - if arg in params: - return True - for p in params: - if params[p].kind == inspect._ParameterKind.VAR_KEYWORD: - return 'unknown' - return False -except AttributeError: - def _handlesarg(func, arg): - """ Try to determine if func accepts arg - - If it takes arg, return True - If it happens to take **args, then it could do anything: - * It could throw a different TypeError, just for fun - * It could throw an ArgumentError or anything else - * It could choose not to throw an Exception at all - ... return 'unknown' - - Otherwise, return False - """ - spec = inspect.getargspec(func) - if arg in spec.args: - return True - if spec.keywords: - return 'unknown' - return False - -class HTTPConnection(object): - """Connection to a single http server. - - Supports 100-continue and keepalives natively. Uses select() for - non-blocking socket operations. - """ - http_version = HTTP_VER_1_1 - response_class = HTTPResponse - - def __init__(self, host, port=None, use_ssl=None, ssl_validator=None, - timeout=TIMEOUT_DEFAULT, - continue_timeout=TIMEOUT_ASSUME_CONTINUE, - proxy_hostport=None, proxy_headers=None, - ssl_wrap_socket=None, **ssl_opts): - """Create a new HTTPConnection. - - Args: - host: The host to which we'll connect. - port: Optional. The port over which we'll connect. Default 80 for - non-ssl, 443 for ssl. - use_ssl: Optional. Whether to use ssl. Defaults to False if port is - not 443, true if port is 443. - ssl_validator: a function(socket) to validate the ssl cert - timeout: Optional. Connection timeout, default is TIMEOUT_DEFAULT. - continue_timeout: Optional. Timeout for waiting on an expected - "100 Continue" response. Default is TIMEOUT_ASSUME_CONTINUE. - proxy_hostport: Optional. Tuple of (host, port) to use as an http - proxy for the connection. Default is to not use a proxy. 
- proxy_headers: Optional dict of header keys and values to send to - a proxy when using CONNECT. For compatibility with - httplib, the Proxy-Authorization header may be - specified in headers for request(), which will clobber - any such header specified here if specified. Providing - this option and not proxy_hostport will raise an - ValueError. - ssl_wrap_socket: Optional function to use for wrapping - sockets. If unspecified, the one from the ssl module will - be used if available, or something that's compatible with - it if on a Python older than 2.6. - - Any extra keyword arguments to this function will be provided - to the ssl_wrap_socket method. If no ssl - """ - host = _ensurebytes(host) - if port is None and host.count(b':') == 1 or b']:' in host: - host, port = host.rsplit(b':', 1) - port = int(port) - if b'[' in host: - host = host[1:-1] - if ssl_wrap_socket is not None: - _wrap_socket = ssl_wrap_socket - else: - _wrap_socket = ssl.wrap_socket - call_wrap_socket = None - handlesubar = _handlesarg(_wrap_socket, 'server_hostname') - if handlesubar is True: - # supports server_hostname - call_wrap_socket = _wrap_socket - handlesnobar = _handlesarg(_wrap_socket, 'serverhostname') - if handlesnobar is True and handlesubar is not True: - # supports serverhostname - def call_wrap_socket(sock, server_hostname=None, **ssl_opts): - return _wrap_socket(sock, serverhostname=server_hostname, - **ssl_opts) - if handlesubar is False and handlesnobar is False: - # does not support either - def call_wrap_socket(sock, server_hostname=None, **ssl_opts): - return _wrap_socket(sock, **ssl_opts) - if call_wrap_socket is None: - # we assume it takes **args - def call_wrap_socket(sock, **ssl_opts): - if 'server_hostname' in ssl_opts: - ssl_opts['serverhostname'] = ssl_opts['server_hostname'] - return _wrap_socket(sock, **ssl_opts) - self._ssl_wrap_socket = call_wrap_socket - if use_ssl is None and port is None: - use_ssl = False - port = 80 - elif use_ssl is None: - use_ssl = 
(port == 443) - elif port is None: - port = (use_ssl and 443 or 80) - self.port = port - self.ssl = use_ssl - self.ssl_opts = ssl_opts - self._ssl_validator = ssl_validator - self.host = host - self.sock = None - self._current_response = None - self._current_response_taken = False - if proxy_hostport is None: - self._proxy_host = self._proxy_port = None - if proxy_headers: - raise ValueError( - 'proxy_headers may not be specified unless ' - 'proxy_hostport is also specified.') - else: - self._proxy_headers = {} - else: - self._proxy_host, self._proxy_port = proxy_hostport - self._proxy_headers = _foldheaders(proxy_headers or {}) - - self.timeout = timeout - self.continue_timeout = continue_timeout - - def _connect(self, proxy_headers): - """Connect to the host and port specified in __init__.""" - if self.sock: - return - if self._proxy_host is not None: - logger.info('Connecting to http proxy %s:%s', - self._proxy_host, self._proxy_port) - sock = socket.create_connection((self._proxy_host, - self._proxy_port)) - if self.ssl: - data = self._buildheaders(b'CONNECT', b'%s:%d' % (self.host, - self.port), - proxy_headers, HTTP_VER_1_0) - sock.send(data) - sock.setblocking(0) - r = self.response_class(sock, self.timeout, b'CONNECT') - timeout_exc = HTTPTimeoutException( - 'Timed out waiting for CONNECT response from proxy') - while not r.complete(): - try: - # We're a friend of the response class, so let - # us use the private attribute. - # pylint: disable=W0212 - if not r._select(): - if not r.complete(): - raise timeout_exc - except HTTPTimeoutException: - # This raise/except pattern looks goofy, but - # _select can raise the timeout as well as the - # loop body. I wish it wasn't this convoluted, - # but I don't have a better solution - # immediately handy. 
- raise timeout_exc - if r.status != 200: - raise HTTPProxyConnectFailedException( - 'Proxy connection failed: %d %s' % (r.status, - r.read())) - logger.info('CONNECT (for SSL) to %s:%s via proxy succeeded.', - self.host, self.port) - else: - sock = socket.create_connection((self.host, self.port)) - if self.ssl: - # This is the default, but in the case of proxied SSL - # requests the proxy logic above will have cleared - # blocking mode, so re-enable it just to be safe. - sock.setblocking(1) - logger.debug('wrapping socket for ssl with options %r', - self.ssl_opts) - sock = self._ssl_wrap_socket(sock, server_hostname=self.host, - **self.ssl_opts) - if self._ssl_validator: - self._ssl_validator(sock) - sock.setblocking(0) - self.sock = sock - - def _buildheaders(self, method, path, headers, http_ver): - if self.ssl and self.port == 443 or self.port == 80: - # default port for protocol, so leave it out - hdrhost = self.host - else: - # include nonstandard port in header - if b':' in self.host: # must be IPv6 - hdrhost = b'[%s]:%d' % (self.host, self.port) - else: - hdrhost = b'%s:%d' % (self.host, self.port) - if self._proxy_host and not self.ssl: - # When talking to a regular http proxy we must send the - # full URI, but in all other cases we must not (although - # technically RFC 2616 says servers must accept our - # request if we screw up, experimentally few do that - # correctly.) - assert path[0:1] == b'/', 'path must start with a /' - path = b'http://%s%s' % (hdrhost, path) - outgoing = [b'%s %s %s%s' % (method, path, http_ver, EOL)] - headers[b'host'] = (b'Host', hdrhost) - headers[HDR_ACCEPT_ENCODING] = (HDR_ACCEPT_ENCODING, 'identity') - for hdr, val in sorted((_ensurebytes(h), _ensurebytes(v)) - for h, v in headers.values()): - outgoing.append(b'%s: %s%s' % (hdr, val, EOL)) - outgoing.append(EOL) - return b''.join(outgoing) - - def close(self): - """Close the connection to the server. - - This is a no-op if the connection is already closed. 
The - connection may automatically close if requested by the server - or required by the nature of a response. - """ - if self.sock is None: - return - self.sock.close() - self.sock = None - logger.info('closed connection to %s on %s', self.host, self.port) - - def busy(self): - """Returns True if this connection object is currently in use. - - If a response is still pending, this will return True, even if - the request has finished sending. In the future, - HTTPConnection may transparently juggle multiple connections - to the server, in which case this will be useful to detect if - any of those connections is ready for use. - """ - cr = self._current_response - if cr is not None: - if self._current_response_taken: - if cr.will_close: - self.sock = None - self._current_response = None - return False - elif cr.complete(): - self._current_response = None - return False - return True - return False - - def _reconnect(self, where, pheaders): - logger.info('reconnecting during %s', where) - self.close() - self._connect(pheaders) - - def request(self, method, path, body=None, headers=None, - expect_continue=False): - """Send a request to the server. - - For increased flexibility, this does not return the response - object. Future versions of HTTPConnection that juggle multiple - sockets will be able to send (for example) 5 requests all at - once, and then let the requests arrive as data is - available. Use the `getresponse()` method to retrieve the - response. - """ - if headers is None: - headers = {} - method = _ensurebytes(method) - path = _ensurebytes(path) - if self.busy(): - raise httplib.CannotSendRequest( - 'Can not send another request before ' - 'current response is read!') - self._current_response_taken = False - - logger.info('sending %s request for %s to %s on port %s', - method, path, self.host, self.port) - - hdrs = _foldheaders(headers) - # Figure out headers that have to be computed from the request - # body. 
- chunked = False - if body and HDR_CONTENT_LENGTH not in hdrs: - if getattr(body, '__len__', False): - hdrs[HDR_CONTENT_LENGTH] = (HDR_CONTENT_LENGTH, - b'%d' % len(body)) - elif getattr(body, 'read', False): - hdrs[HDR_XFER_ENCODING] = (HDR_XFER_ENCODING, - XFER_ENCODING_CHUNKED) - chunked = True - else: - raise BadRequestData('body has no __len__() nor read()') - # Figure out expect-continue header - if hdrs.get('expect', ('', ''))[1].lower() == b'100-continue': - expect_continue = True - elif expect_continue: - hdrs['expect'] = (b'Expect', b'100-Continue') - # httplib compatibility: if the user specified a - # proxy-authorization header, that's actually intended for a - # proxy CONNECT action, not the real request, but only if - # we're going to use a proxy. - pheaders = dict(self._proxy_headers) - if self._proxy_host and self.ssl: - pa = hdrs.pop('proxy-authorization', None) - if pa is not None: - pheaders['proxy-authorization'] = pa - # Build header data - outgoing_headers = self._buildheaders( - method, path, hdrs, self.http_version) - - # If we're reusing the underlying socket, there are some - # conditions where we'll want to retry, so make a note of the - # state of self.sock - fresh_socket = self.sock is None - self._connect(pheaders) - response = None - first = True - - while ((outgoing_headers or body) - and not (response and response.complete())): - select_timeout = self.timeout - out = outgoing_headers or body - blocking_on_continue = False - if expect_continue and not outgoing_headers and not ( - response and (response.headers or response.continued)): - logger.info( - 'waiting up to %s seconds for' - ' continue response from server', - self.continue_timeout) - select_timeout = self.continue_timeout - blocking_on_continue = True - out = False - if out: - w = [self.sock] - else: - w = [] - r, w, x = select.select([self.sock], w, [], select_timeout) - # if we were expecting a 100 continue and it's been long - # enough, just go ahead and assume it's ok. 
This is the - # recommended behavior from the RFC. - if r == w == x == []: - if blocking_on_continue: - expect_continue = False - logger.info('no response to continue expectation from ' - 'server, optimistically sending request body') - else: - raise HTTPTimeoutException('timeout sending data') - was_first = first - - # incoming data - if r: - try: - try: - data = r[0].recv(INCOMING_BUFFER_SIZE) - except ssl.SSLError as e: - if e.args[0] != ssl.SSL_ERROR_WANT_READ: - raise - logger.debug('SSL_ERROR_WANT_READ while sending ' - 'data, retrying...') - continue - if not data: - logger.info('socket appears closed in read') - self.sock = None - self._current_response = None - if response is not None: - # We're a friend of the response class, so let - # us use the private attribute. - # pylint: disable=W0212 - response._close() - # This if/elif ladder is a bit subtle, - # comments in each branch should help. - if response is not None and response.complete(): - # Server responded completely and then - # closed the socket. We should just shut - # things down and let the caller get their - # response. - logger.info('Got an early response, ' - 'aborting remaining request.') - break - elif was_first and response is None: - # Most likely a keepalive that got killed - # on the server's end. Commonly happens - # after getting a really large response - # from the server. - logger.info( - 'Connection appeared closed in read on first' - ' request loop iteration, will retry.') - self._reconnect('read', pheaders) - continue - else: - # We didn't just send the first data hunk, - # and either have a partial response or no - # response at all. There's really nothing - # meaningful we can do here. 
- raise HTTPStateError( - 'Connection appears closed after ' - 'some request data was written, but the ' - 'response was missing or incomplete!') - logger.debug('read %d bytes in request()', len(data)) - if response is None: - response = self.response_class( - r[0], self.timeout, method) - # We're a friend of the response class, so let us - # use the private attribute. - # pylint: disable=W0212 - response._load_response(data) - # Jump to the next select() call so we load more - # data if the server is still sending us content. - continue - except socket.error as e: - if e[0] != errno.EPIPE and not was_first: - raise - - # outgoing data - if w and out: - try: - if getattr(out, 'read', False): - # pylint guesses the type of out incorrectly here - # pylint: disable=E1103 - data = out.read(OUTGOING_BUFFER_SIZE) - if not data: - continue - if len(data) < OUTGOING_BUFFER_SIZE: - if chunked: - body = b'0' + EOL + EOL - else: - body = None - if chunked: - # This encode is okay because we know - # hex() is building us only 0-9 and a-f - # digits. - asciilen = hex(len(data))[2:].encode('ascii') - out = asciilen + EOL + data + EOL - else: - out = data - amt = w[0].send(out) - except socket.error as e: - if e[0] == ssl.SSL_ERROR_WANT_WRITE and self.ssl: - # This means that SSL hasn't flushed its buffer into - # the socket yet. - # TODO: find a way to block on ssl flushing its buffer - # similar to selecting on a raw socket. - continue - if e[0] == errno.EWOULDBLOCK or e[0] == errno.EAGAIN: - continue - elif (e[0] not in (errno.ECONNRESET, errno.EPIPE) - and not first): - raise - self._reconnect('write', pheaders) - amt = self.sock.send(out) - logger.debug('sent %d', amt) - first = False - if out is body: - body = out[amt:] - else: - outgoing_headers = out[amt:] - # End of request-sending loop. 
- - # close if the server response said to or responded before eating - # the whole request - if response is None: - response = self.response_class(self.sock, self.timeout, method) - if not fresh_socket: - if not response._select(): - # This means the response failed to get any response - # data at all, and in all probability the socket was - # closed before the server even saw our request. Try - # the request again on a fresh socket. - logger.debug('response._select() failed during request().' - ' Assuming request needs to be retried.') - self.sock = None - # Call this method explicitly to re-try the - # request. We don't use self.request() because - # some tools (notably Mercurial) expect to be able - # to subclass and redefine request(), and they - # don't have the same argspec as we do. - # - # TODO restructure sending of requests to avoid - # this recursion - return HTTPConnection.request( - self, method, path, body=body, headers=headers, - expect_continue=expect_continue) - data_left = bool(outgoing_headers or body) - if data_left: - logger.info('stopped sending request early, ' - 'will close the socket to be safe.') - response.will_close = True - if response.will_close: - # The socket will be closed by the response, so we disown - # the socket - self.sock = None - self._current_response = response - - def getresponse(self): - """Returns the response to the most recent request.""" - if self._current_response is None: - raise httplib.ResponseNotReady() - r = self._current_response - while r.headers is None: - # We're a friend of the response class, so let us use the - # private attribute. 
-            # pylint: disable=W0212
-            if not r._select() and not r.complete():
-                raise _readers.HTTPRemoteClosedError()
-        if r.will_close:
-            self.sock = None
-            self._current_response = None
-        elif r.complete():
-            self._current_response = None
-        else:
-            self._current_response_taken = True
-        return r
-
-
-class HTTPTimeoutException(httplib.HTTPException):
-    """A timeout occurred while waiting on the server."""
-
-
-class BadRequestData(httplib.HTTPException):
-    """Request body object has neither __len__ nor read."""
-
-
-class HTTPProxyConnectFailedException(httplib.HTTPException):
-    """Connecting to the HTTP proxy failed."""
-
-
-class HTTPStateError(httplib.HTTPException):
-    """Invalid internal state encountered."""
-
-# Forward this exception type from _readers since it needs to be part
-# of the public API.
-HTTPRemoteClosedError = _readers.HTTPRemoteClosedError
-# no-check-code
--- a/mercurial/httpclient/_readers.py Sun Feb 25 23:34:58 2018 -0500
+++ /dev/null Thu Jan 01 00:00:00 1970 +0000
@@ -1,239 +0,0 @@
-# Copyright 2011, Google Inc.
-# All rights reserved.
-#
-# Redistribution and use in source and binary forms, with or without
-# modification, are permitted provided that the following conditions are
-# met:
-#
-#     * Redistributions of source code must retain the above copyright
-# notice, this list of conditions and the following disclaimer.
-#     * Redistributions in binary form must reproduce the above
-# copyright notice, this list of conditions and the following disclaimer
-# in the documentation and/or other materials provided with the
-# distribution.
-#     * Neither the name of Google Inc. nor the names of its
-# contributors may be used to endorse or promote products derived from
-# this software without specific prior written permission.
-
-# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS
-# "AS IS" AND ANY EXPRESS OR IMPLIED WARRANTIES, INCLUDING, BUT NOT
-# LIMITED TO, THE IMPLIED WARRANTIES OF MERCHANTABILITY AND FITNESS FOR
-# A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE COPYRIGHT
-# OWNER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL,
-# SPECIAL, EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT
-# LIMITED TO, PROCUREMENT OF SUBSTITUTE GOODS OR SERVICES; LOSS OF USE,
-# DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED AND ON ANY
-# THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT
-# (INCLUDING NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE
-# OF THIS SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
-"""Reader objects to abstract out different body response types.
-
-This module is package-private. It is not expected that these will
-have any clients outside of httpplus.
-"""
-from __future__ import absolute_import
-
-try:
-    import httplib
-    httplib.HTTPException
-except ImportError:
-    import http.client as httplib
-
-import logging
-
-logger = logging.getLogger(__name__)
-
-
-class ReadNotReady(Exception):
-    """Raised when read() is attempted but not enough data is loaded."""
-
-
-class HTTPRemoteClosedError(httplib.HTTPException):
-    """The server closed the remote socket in the middle of a response."""
-
-
-class AbstractReader(object):
-    """Abstract base class for response readers.
-
-    Subclasses must implement _load, and should implement _close if
-    it's not an error for the server to close their socket without
-    some termination condition being detected during _load.
-    """
-    def __init__(self):
-        self._finished = False
-        self._done_chunks = []
-        self.available_data = 0
-
-    def _addchunk(self, data):
-        self._done_chunks.append(data)
-        self.available_data += len(data)
-
-    def _pushchunk(self, data):
-        self._done_chunks.insert(0, data)
-        self.available_data += len(data)
-
-    def _popchunk(self):
-        b = self._done_chunks.pop(0)
-        self.available_data -= len(b)
-
-        return b
-
-    def done(self):
-        """Returns true if the response body is entirely read."""
-        return self._finished
-
-    def read(self, amt):
-        """Read amt bytes from the response body."""
-        if self.available_data < amt and not self._finished:
-            raise ReadNotReady()
-        blocks = []
-        need = amt
-        while self._done_chunks:
-            b = self._popchunk()
-            if len(b) > need:
-                nb = b[:need]
-                self._pushchunk(b[need:])
-                b = nb
-            blocks.append(b)
-            need -= len(b)
-            if need == 0:
-                break
-        result = b''.join(blocks)
-        assert len(result) == amt or (self._finished and len(result) < amt)
-
-        return result
-
-    def readto(self, delimstr, blocks = None):
-        """return available data chunks up to the first one in which
-        delimstr occurs. No data will be returned after delimstr --
-        the chunk in which it occurs will be split and the remainder
-        pushed back onto the available data queue. If blocks is
-        supplied chunks will be added to blocks, otherwise a new list
-        will be allocated.
-        """
-        if blocks is None:
-            blocks = []
-
-        while self._done_chunks:
-            b = self._popchunk()
-            i = b.find(delimstr) + len(delimstr)
-            if i:
-                if i < len(b):
-                    self._pushchunk(b[i:])
-                blocks.append(b[:i])
-                break
-            else:
-                blocks.append(b)
-
-        return blocks
-
-    def _load(self, data): # pragma: no cover
-        """Subclasses must implement this.
-
-        As data is available to be read out of this object, it should
-        be placed into the _done_chunks list. Subclasses should not
-        rely on data remaining in _done_chunks forever, as it may be
-        reaped if the client is parsing data as it comes in.
-        """
-        raise NotImplementedError
-
-    def _close(self):
-        """Default implementation of close.
-
-        The default implementation assumes that the reader will mark
-        the response as finished on the _finished attribute once the
-        entire response body has been read. In the event that this is
-        not true, the subclass should override the implementation of
-        close (for example, close-is-end responses have to set
-        self._finished in the close handler.)
-        """
-        if not self._finished:
-            raise HTTPRemoteClosedError(
-                'server appears to have closed the socket mid-response')
-
-
-class AbstractSimpleReader(AbstractReader):
-    """Abstract base class for simple readers that require no response decoding.
-
-    Examples of such responses are Connection: Close (close-is-end)
-    and responses that specify a content length.
-    """
-    def _load(self, data):
-        if data:
-            assert not self._finished, (
-                'tried to add data (%r) to a closed reader!' % data)
-        logger.debug('%s read an additional %d data',
-                     self.name, len(data)) # pylint: disable=E1101
-        self._addchunk(data)
-
-
-class CloseIsEndReader(AbstractSimpleReader):
-    """Reader for responses that specify Connection: Close for length."""
-    name = 'close-is-end'
-
-    def _close(self):
-        logger.info('Marking close-is-end reader as closed.')
-        self._finished = True
-
-
-class ContentLengthReader(AbstractSimpleReader):
-    """Reader for responses that specify an exact content length."""
-    name = 'content-length'
-
-    def __init__(self, amount):
-        AbstractSimpleReader.__init__(self)
-        self._amount = amount
-        if amount == 0:
-            self._finished = True
-        self._amount_seen = 0
-
-    def _load(self, data):
-        AbstractSimpleReader._load(self, data)
-        self._amount_seen += len(data)
-        if self._amount_seen >= self._amount:
-            self._finished = True
-            logger.debug('content-length read complete')
-
-
-class ChunkedReader(AbstractReader):
-    """Reader for chunked transfer encoding responses."""
-    def __init__(self, eol):
-        AbstractReader.__init__(self)
-        self._eol = eol
-        self._leftover_skip_amt = 0
-        self._leftover_data = ''
-
-    def _load(self, data):
-        assert not self._finished, 'tried to add data to a closed reader!'
-        logger.debug('chunked read an additional %d data', len(data))
-        position = 0
-        if self._leftover_data:
-            logger.debug(
-                'chunked reader trying to finish block from leftover data')
-            # TODO: avoid this string concatenation if possible
-            data = self._leftover_data + data
-            position = self._leftover_skip_amt
-            self._leftover_data = ''
-            self._leftover_skip_amt = 0
-        datalen = len(data)
-        while position < datalen:
-            split = data.find(self._eol, position)
-            if split == -1:
-                self._leftover_data = data
-                self._leftover_skip_amt = position
-                return
-            amt = int(data[position:split], base=16)
-            block_start = split + len(self._eol)
-            # If the whole data chunk plus the eol trailer hasn't
-            # loaded, we'll wait for the next load.
-            if block_start + amt + len(self._eol) > len(data):
-                self._leftover_data = data
-                self._leftover_skip_amt = position
-                return
-            if amt == 0:
-                self._finished = True
-                logger.debug('closing chunked reader due to chunk of length 0')
-                return
-            self._addchunk(data[block_start:block_start + amt])
-            position = block_start + amt + len(self._eol)
-# no-check-code
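For readers who want the idea behind the removed ChunkedReader without its base classes, here is a minimal standalone sketch of incremental chunked-transfer decoding. It is an illustration, not the removed code: the class name `ChunkedDecoder` and its `feed` method are hypothetical, and it assumes (as the removed reader appears to) that chunk extensions and trailers do not occur.

```python
EOL = b'\r\n'


class ChunkedDecoder(object):
    """Incrementally decode a chunked-encoded body (hypothetical helper)."""

    def __init__(self):
        self.chunks = []       # fully decoded chunk payloads, in order
        self.finished = False  # set once the zero-length chunk is seen
        self._pending = b''    # bytes that do not yet form a whole chunk

    def feed(self, data):
        """Accept the next bytes from the socket and decode what is complete."""
        data = self._pending + data
        self._pending = b''
        pos = 0
        while pos < len(data):
            split = data.find(EOL, pos)
            if split == -1:
                break  # the chunk-size line has not fully arrived
            size = int(data[pos:split], 16)
            start = split + len(EOL)
            end = start + size + len(EOL)
            if end > len(data):
                break  # payload or its trailing EOL is still incomplete
            if size == 0:
                self.finished = True
                return
            self.chunks.append(data[start:start + size])
            pos = end
        # Keep the undecoded tail for the next feed() call.
        self._pending = data[pos:]
```

Feeding `b'4\r\nWi'` and then `b'ki\r\n5\r\npedia\r\n0\r\n\r\n'` leaves the two payloads `Wiki` and `pedia` in `chunks` and sets `finished`. Like the removed reader, the sketch re-concatenates leftover bytes on every call, which is simple but copies data.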
--- a/mercurial/httpconnection.py Sun Feb 25 23:34:58 2018 -0500
+++ b/mercurial/httpconnection.py Sun Feb 25 23:51:32 2018 -0500
@@ -10,15 +10,10 @@

 from __future__ import absolute_import

-import logging
 import os
-import socket

 from .i18n import _
 from . import (
-    httpclient,
-    sslutil,
-    urllibcompat,
     util,
 )

@@ -110,190 +105,3 @@
             if user and not bestuser:
                 auth['username'] = user
     return bestauth
-
-# Mercurial (at least until we can remove the old codepath) requires
-# that the http response object be sufficiently file-like, so we
-# provide a close() method here.
-class HTTPResponse(httpclient.HTTPResponse):
-    def close(self):
-        pass
-
-class HTTPConnection(httpclient.HTTPConnection):
-    response_class = HTTPResponse
-    def request(self, method, uri, body=None, headers=None):
-        if headers is None:
-            headers = {}
-        if isinstance(body, httpsendfile):
-            body.seek(0)
-        httpclient.HTTPConnection.request(self, method, uri, body=body,
-                                          headers=headers)
-
-
-_configuredlogging = False
-LOGFMT = '%(levelname)s:%(name)s:%(lineno)d:%(message)s'
-# Subclass BOTH of these because otherwise urllib2 "helpfully"
-# reinserts them since it notices we don't include any subclasses of
-# them.
-class http2handler(urlreq.httphandler, urlreq.httpshandler):
-    def __init__(self, ui, pwmgr):
-        global _configuredlogging
-        urlreq.abstracthttphandler.__init__(self)
-        self.ui = ui
-        self.pwmgr = pwmgr
-        self._connections = {}
-        # developer config: ui.http2debuglevel
-        loglevel = ui.config('ui', 'http2debuglevel')
-        if loglevel and not _configuredlogging:
-            _configuredlogging = True
-            logger = logging.getLogger('mercurial.httpclient')
-            logger.setLevel(getattr(logging, loglevel.upper()))
-            handler = logging.StreamHandler()
-            handler.setFormatter(logging.Formatter(LOGFMT))
-            logger.addHandler(handler)
-
-    def close_all(self):
-        """Close and remove all connection objects being kept for reuse."""
-        for openconns in self._connections.values():
-            for conn in openconns:
-                conn.close()
-        self._connections = {}
-
-    # shamelessly borrowed from urllib2.AbstractHTTPHandler
-    def do_open(self, http_class, req, use_ssl):
-        """Return an addinfourl object for the request, using http_class.
-
-        http_class must implement the HTTPConnection API from httplib.
-        The addinfourl return value is a file-like object. It also
-        has methods and attributes including:
-            - info(): return a mimetools.Message object for the headers
-            - geturl(): return the original request URL
-            - code: HTTP status code
-        """
-        # If using a proxy, the host returned by get_host() is
-        # actually the proxy. On Python 2.6.1, the real destination
-        # hostname is encoded in the URI in the urllib2 request
-        # object. On Python 2.6.5, it's stored in the _tunnel_host
-        # attribute which has no accessor.
-        tunhost = getattr(req, '_tunnel_host', None)
-        host = urllibcompat.gethost(req)
-        if tunhost:
-            proxyhost = host
-            host = tunhost
-        elif req.has_proxy():
-            proxyhost = urllibcompat.gethost(req)
-            host = urllibcompat.getselector(
-                req).split('://', 1)[1].split('/', 1)[0]
-        else:
-            proxyhost = None
-
-        if proxyhost:
-            if ':' in proxyhost:
-                # Note: this means we'll explode if we try and use an
-                # IPv6 http proxy. This isn't a regression, so we
-                # won't worry about it for now.
-                proxyhost, proxyport = proxyhost.rsplit(':', 1)
-            else:
-                proxyport = 3128 # squid default
-            proxy = (proxyhost, proxyport)
-        else:
-            proxy = None
-
-        if not host:
-            raise urlerr.urlerror('no host given')
-
-        connkey = use_ssl, host, proxy
-        allconns = self._connections.get(connkey, [])
-        conns = [c for c in allconns if not c.busy()]
-        if conns:
-            h = conns[0]
-        else:
-            if allconns:
-                self.ui.debug('all connections for %s busy, making a new '
-                              'one\n' % host)
-            timeout = None
-            if req.timeout is not socket._GLOBAL_DEFAULT_TIMEOUT:
-                timeout = req.timeout
-            h = http_class(host, timeout=timeout, proxy_hostport=proxy)
-            self._connections.setdefault(connkey, []).append(h)
-
-        headers = dict(req.headers)
-        headers.update(req.unredirected_hdrs)
-        headers = dict(
-            (name.title(), val) for name, val in headers.items())
-        try:
-            path = urllibcompat.getselector(req)
-            if '://' in path:
-                path = path.split('://', 1)[1].split('/', 1)[1]
-            if path[0] != '/':
-                path = '/' + path
-            h.request(req.get_method(), path, req.data, headers)
-            r = h.getresponse()
-        except socket.error as err: # XXX what error?
-            raise urlerr.urlerror(err)
-
-        # Pick apart the HTTPResponse object to get the addinfourl
-        # object initialized properly.
-        r.recv = r.read
-
-        resp = urlreq.addinfourl(r, r.headers, urllibcompat.getfullurl(req))
-        resp.code = r.status
-        resp.msg = r.reason
-        return resp
-
-    # httplib always uses the given host/port as the socket connect
-    # target, and then allows full URIs in the request path, which it
-    # then observes and treats as a signal to do proxying instead.
-    def http_open(self, req):
-        if urllibcompat.getfullurl(req).startswith('https'):
-            return self.https_open(req)
-        def makehttpcon(*args, **kwargs):
-            k2 = dict(kwargs)
-            k2[r'use_ssl'] = False
-            return HTTPConnection(*args, **k2)
-        return self.do_open(makehttpcon, req, False)
-
-    def https_open(self, req):
-        # urllibcompat.getfullurl(req) does not contain credentials and we may
-        # need them to match the certificates.
-        url = urllibcompat.getfullurl(req)
-        user, password = self.pwmgr.find_stored_password(url)
-        res = readauthforuri(self.ui, url, user)
-        if res:
-            group, auth = res
-            self.auth = auth
-            self.ui.debug("using auth.%s.* for authentication\n" % group)
-        else:
-            self.auth = None
-        return self.do_open(self._makesslconnection, req, True)
-
-    def _makesslconnection(self, host, port=443, *args, **kwargs):
-        keyfile = None
-        certfile = None
-
-        if args: # key_file
-            keyfile = args.pop(0)
-        if args: # cert_file
-            certfile = args.pop(0)
-
-        # if the user has specified different key/cert files in
-        # hgrc, we prefer these
-        if self.auth and 'key' in self.auth and 'cert' in self.auth:
-            keyfile = self.auth['key']
-            certfile = self.auth['cert']
-
-        # let host port take precedence
-        if ':' in host and '[' not in host or ']:' in host:
-            host, port = host.rsplit(':', 1)
-            port = int(port)
-            if '[' in host:
-                host = host[1:-1]
-
-        kwargs[r'keyfile'] = keyfile
-        kwargs[r'certfile'] = certfile
-
-        con = HTTPConnection(host, port, use_ssl=True,
-                             ssl_wrap_socket=sslutil.wrapsocket,
-                             ssl_validator=sslutil.validatesocket,
-                             ui=self.ui,
-                             **kwargs)
-        return con
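The "let host port take precedence" condition in the removed _makesslconnection is terse, so the following sketch isolates it as a standalone function. The name `split_hostport` and the `default_port` parameter are hypothetical; the condition itself is copied from the removed code, with parentheses added only to make the existing operator precedence explicit.

```python
def split_hostport(host, default_port=443):
    """Split 'name:port' or '[v6addr]:port' into (host, port).

    Hypothetical helper mirroring the removed _makesslconnection logic;
    a host without an explicit port keeps default_port.
    """
    port = default_port
    # Split only for 'name:port' (no brackets) or '[addr]:port'.
    if (':' in host and '[' not in host) or ']:' in host:
        host, port = host.rsplit(':', 1)
        port = int(port)
        if '[' in host:
            # Strip the brackets around an IPv6 literal.
            host = host[1:-1]
    return host, port
```

Under this logic `example.com:8443` and `[::1]:8443` both yield port 8443, while a bare `example.com` keeps the default. Two limitations carry over from the condition as written: a bracketed IPv6 literal without a port keeps its brackets, and an unbracketed IPv6 literal is not handled.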
--- a/mercurial/httppeer.py Sun Feb 25 23:34:58 2018 -0500
+++ b/mercurial/httppeer.py Sun Feb 25 23:51:32 2018 -0500
@@ -291,9 +291,6 @@
             size = data.length
         elif data is not None:
             size = len(data)
-        if size and self.ui.configbool('ui', 'usehttp2'):
-            headers[r'Expect'] = r'100-Continue'
-            headers[r'X-HgHttp2'] = r'1'
         if data is not None and r'Content-Type' not in headers:
             headers[r'Content-Type'] = r'application/mercurial-0.1'

--- a/mercurial/url.py Sun Feb 25 23:34:58 2018 -0500
+++ b/mercurial/url.py Sun Feb 25 23:51:32 2018 -0500
@@ -470,17 +470,9 @@
     construct an opener suitable for urllib2
     authinfo will be added to the password manager
     '''
-    # experimental config: ui.usehttp2
-    if ui.configbool('ui', 'usehttp2'):
-        handlers = [
-            httpconnectionmod.http2handler(
-                ui,
-                passwordmgr(ui, ui.httppasswordmgrdb))
-        ]
-    else:
-        handlers = [httphandler()]
-        if has_https:
-            handlers.append(httpshandler(ui))
+    handlers = [httphandler()]
+    if has_https:
+        handlers.append(httpshandler(ui))

     handlers.append(proxyhandler(ui))

--- a/setup.py Sun Feb 25 23:34:58 2018 -0500
+++ b/setup.py Sun Feb 25 23:51:32 2018 -0500
@@ -806,7 +806,6 @@
             'mercurial.cext',
             'mercurial.cffi',
             'mercurial.hgweb',
-            'mercurial.httpclient',
             'mercurial.pure',
             'mercurial.thirdparty',
             'mercurial.thirdparty.attr',
--- a/tests/test-check-code.t Sun Feb 25 23:34:58 2018 -0500
+++ b/tests/test-check-code.t Sun Feb 25 23:51:32 2018 -0500
@@ -13,8 +13,6 @@
   > -X mercurial/thirdparty \
   > | sed 's-\\-/-g' | "$check_code" --warnings --per-file=0 - || false
   Skipping i18n/polib.py it has no-che?k-code (glob)
-  Skipping mercurial/httpclient/__init__.py it has no-che?k-code (glob)
-  Skipping mercurial/httpclient/_readers.py it has no-che?k-code (glob)
   Skipping mercurial/statprof.py it has no-che?k-code (glob)
   Skipping tests/badserverext.py it has no-che?k-code (glob)

--- a/tests/test-commandserver.t Sun Feb 25 23:34:58 2018 -0500
+++ b/tests/test-commandserver.t Sun Feb 25 23:51:32 2018 -0500
@@ -211,7 +211,6 @@
   ui.slash=True
   ui.interactive=False
   ui.mergemarkers=detailed
-  ui.usehttp2=true (?)
   ui.foo=bar
   ui.nontty=true
   web.address=localhost
@@ -221,7 +220,6 @@
   ui.slash=True
   ui.interactive=False
   ui.mergemarkers=detailed
-  ui.usehttp2=true (?)
   ui.nontty=true

   $ rm -R foo