wireprotoserver: move all wire protocol handling logic out of hgweb
Previous patches from several days ago worked to isolate processing
of HTTP wire protocol requests to wireprotoserver. We still had a
little logic in hgweb. If feels like the right time to finish the
job.
This commit moves WSGI request servicing from hgweb to wireprotoserver.
The ugly dict holding the parsed request is no more. I think the new
code is cleaner.
As part of this, we now process wire protocol requests before the
block to obtain the "query" variable. This makes it clear that this
wonky "query" variable is not used by the wire protocol.
The wonkiest part about this code is the HTTP 404. I'm actually not
sure what all is going on here. It looks like the code is trying to
prevent URL with path components that specify a command from not
working. That part I grok. What I don't grok is why we need to send
a 404. I would think it would be OK to no-op and let another handler
try to service the request. But if we do this, we get some subrepo
test failures. So it looks like something is expecting the HTTP 404
and reacting to it in a specific way. It /might/ be possible to
change the behavior here. But it isn't something I'm comfortable
doing because I don't understand the problem space.
Differential Revision: https://phab.mercurial-scm.org/D2740
--- a/mercurial/hgweb/hgweb_mod.py Thu Mar 08 15:37:05 2018 -0800
+++ b/mercurial/hgweb/hgweb_mod.py Thu Mar 08 15:58:52 2018 -0800
@@ -318,25 +318,16 @@
if h[0] != 'Content-Security-Policy']
wsgireq.headers.append(('Content-Security-Policy', rctx.csp))
+ handled, res = wireprotoserver.handlewsgirequest(
+ rctx, wsgireq, req, self.check_perm)
+ if handled:
+ return res
+
if req.havepathinfo:
query = req.dispatchpath
else:
query = req.querystring.partition('&')[0].partition(';')[0]
- # Route it to a wire protocol handler if it looks like a wire protocol
- # request.
- protohandler = wireprotoserver.parsehttprequest(rctx, wsgireq, req,
- self.check_perm)
-
- if protohandler:
- try:
- if query:
- raise ErrorResponse(HTTP_NOT_FOUND)
-
- return protohandler['dispatch']()
- except ErrorResponse as inst:
- return protohandler['handleerror'](inst)
-
# translate user-visible url structure to internal structure
args = query.split('/', 2)
--- a/mercurial/wireprotoserver.py Thu Mar 08 15:37:05 2018 -0800
+++ b/mercurial/wireprotoserver.py Thu Mar 08 15:58:52 2018 -0800
@@ -150,24 +150,29 @@
def iscmd(cmd):
return cmd in wireproto.commands
-def parsehttprequest(rctx, wsgireq, req, checkperm):
- """Parse the HTTP request for a wire protocol request.
+def handlewsgirequest(rctx, wsgireq, req, checkperm):
+ """Possibly process a wire protocol request.
- If the current request appears to be a wire protocol request, this
- function returns a dict with details about that request, including
- an ``abstractprotocolserver`` instance suitable for handling the
- request. Otherwise, ``None`` is returned.
+ If the current request is a wire protocol request, the request is
+ processed by this function.
``wsgireq`` is a ``wsgirequest`` instance.
``req`` is a ``parsedrequest`` instance.
+
+ Returns a 2-tuple of (bool, response) where the 1st element indicates
+ whether the request was handled and the 2nd element is a return
+ value for a WSGI application (often a generator of bytes).
"""
+ # Avoid cycle involving hg module.
+ from .hgweb import common as hgwebcommon
+
repo = rctx.repo
# HTTP version 1 wire protocol requests are denoted by a "cmd" query
# string parameter. If it isn't present, this isn't a wire protocol
# request.
if 'cmd' not in req.querystringdict:
- return None
+ return False, None
cmd = req.querystringdict['cmd'][0]
@@ -179,17 +184,32 @@
# known wire protocol commands and it is less confusing for machine
# clients.
if not iscmd(cmd):
- return None
+ return False, None
+
+ # The "cmd" query string argument is only valid on the root path of the
+ # repo. e.g. ``/?cmd=foo``, ``/repo?cmd=foo``. URL paths within the repo
+ # like ``/blah?cmd=foo`` are not allowed. So don't recognize the request
+ # in this case. We send an HTTP 404 for backwards compatibility reasons.
+ if req.dispatchpath:
+ res = _handlehttperror(
+ hgwebcommon.ErrorResponse(hgwebcommon.HTTP_NOT_FOUND), wsgireq,
+ cmd)
+
+ return True, res
proto = httpv1protocolhandler(wsgireq, repo.ui,
lambda perm: checkperm(rctx, wsgireq, perm))
- return {
- 'cmd': cmd,
- 'proto': proto,
- 'dispatch': lambda: _callhttp(repo, wsgireq, proto, cmd),
- 'handleerror': lambda ex: _handlehttperror(ex, wsgireq, cmd),
- }
+ # The permissions checker should be the only thing that can raise an
+ # ErrorResponse. It is kind of a layer violation to catch an hgweb
+ # exception here. So consider refactoring into a exception type that
+ # is associated with the wire protocol.
+ try:
+ res = _callhttp(repo, wsgireq, proto, cmd)
+ except hgwebcommon.ErrorResponse as e:
+ res = _handlehttperror(e, wsgireq, cmd)
+
+ return True, res
def _httpresponsetype(ui, wsgireq, prefer_uncompressed):
"""Determine the appropriate response type and compression settings.