mercurial/hgweb/request.py
author Pierre-Yves David <pierre-yves.david@octobus.net>
Sun, 05 Apr 2020 13:12:05 +0200
changeset 44863 640d5b3bd060
parent 43117 8ff1ecfadcd1
child 44824 839328c5a728
permissions -rw-r--r--
nodemap: also use persistent nodemap for manifest

The manifest has a different usage pattern than the changelog. First, while lookups in the changelog are not guaranteed to match, lookups in the manifest nodemap come from the changelog and will exist in the manifest. In addition, looking up a manifest almost always results in unpacking a manifest, an operation that rarely comes cheap.

Nevertheless, using a persistent nodemap provides a significant gain for some operations. For our measurements, we use `hg cat --rev REV FILE` on our reference mozilla-try repository. On this repository the persistent nodemap cache is about 29 MB in size for a total store size of 11,988 MB.

File with large history (file: b2g/config/gaia.json, revision: 195a1146daa0)
    no optimisation:                        0.358s
    using mmap for index:                   0.297s (-0.061s)
    persistent nodemap for changelog only:  0.275s (-0.024s)
    persistent nodemap for manifest too:    0.258s (-0.017s)

File with small history (file: .hgignore, revision: 195a1146daa0)
    no optimisation:                        0.377s
    using mmap for index:                   0.296s (-0.061s)
    persistent nodemap for changelog only:  0.274s (-0.022s)
    persistent nodemap for manifest too:    0.257s (-0.017s)

Same file but using a revision (8ba995b74e18) with a smaller manifest (3944829 bytes vs 10 bytes)
    no optimisation:                        0.192s (-0.185s)
    using mmap for index:                   0.131s (-0.061s)
    persistent nodemap for changelog only:  0.106s (-0.025s)
    persistent nodemap for manifest too:    0.087s (-0.019s)

Differential Revision: https://phab.mercurial-scm.org/D8410
2391
d351a3be3371 Fixing up comment headers for split up code.
Eric Hopper <hopper@omnifarious.org>
parents: 2355
diff changeset
     1
# hgweb/request.py - An http request from either CGI or the standalone server.
131
c9d51742471c moving hgweb to mercurial subdir
jake@edge2.net
parents:
diff changeset
     2
#
238
3b92f8fe47ae hgweb.py: kill #! line, clean up copyright notice
mpm@selenic.com
parents: 222
diff changeset
     3
# Copyright 21 May 2005 - (c) 2005 Jake Edge <jake@edge2.net>
2859
345bac2bc4ec update copyrights.
Vadim Gelfer <vadim.gelfer@gmail.com>
parents: 2535
diff changeset
     4
# Copyright 2005, 2006 Matt Mackall <mpm@selenic.com>
131
c9d51742471c moving hgweb to mercurial subdir
jake@edge2.net
parents:
diff changeset
     5
#
8225
46293a0c7e9f updated license to be explicit about GPL version 2
Martin Geisler <mg@lazybytes.net>
parents: 7742
diff changeset
     6
# This software may be used and distributed according to the terms of the
10263
25e572394f5c Update license to GPLv2+
Matt Mackall <mpm@selenic.com>
parents: 10261
diff changeset
     7
# GNU General Public License version 2 or any later version.
131
c9d51742471c moving hgweb to mercurial subdir
jake@edge2.net
parents:
diff changeset
     8
27046
37fcfe52c68c hgweb: use absolute_import
Yuya Nishihara <yuya@tcha.org>
parents: 26846
diff changeset
     9
from __future__ import absolute_import
37fcfe52c68c hgweb: use absolute_import
Yuya Nishihara <yuya@tcha.org>
parents: 26846
diff changeset
    10
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40434
diff changeset
    11
# import wsgiref.validate
27046
37fcfe52c68c hgweb: use absolute_import
Yuya Nishihara <yuya@tcha.org>
parents: 26846
diff changeset
    12
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40434
diff changeset
    13
from ..thirdparty import attr
27046
37fcfe52c68c hgweb: use absolute_import
Yuya Nishihara <yuya@tcha.org>
parents: 26846
diff changeset
    14
from .. import (
36867
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36865
diff changeset
    15
    error,
34514
528b21b853aa request: coerce content-type to native str
Augie Fackler <augie@google.com>
parents: 34513
diff changeset
    16
    pycompat,
27046
37fcfe52c68c hgweb: use absolute_import
Yuya Nishihara <yuya@tcha.org>
parents: 26846
diff changeset
    17
    util,
37fcfe52c68c hgweb: use absolute_import
Yuya Nishihara <yuya@tcha.org>
parents: 26846
diff changeset
    18
)
138
c77a679e9cfa Revamped templated hgweb
mpm@selenic.com
parents: 137
diff changeset
    19
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40434
diff changeset
    20
36868
ec0af9c59270 hgweb: use a multidict for holding query string parameters
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36867
diff changeset
    21
class multidict(object):
ec0af9c59270 hgweb: use a multidict for holding query string parameters
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36867
diff changeset
    22
    """A dict like object that can store multiple values for a key.
ec0af9c59270 hgweb: use a multidict for holding query string parameters
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36867
diff changeset
    23
ec0af9c59270 hgweb: use a multidict for holding query string parameters
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36867
diff changeset
    24
    Used to store parsed request parameters.
ec0af9c59270 hgweb: use a multidict for holding query string parameters
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36867
diff changeset
    25
ec0af9c59270 hgweb: use a multidict for holding query string parameters
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36867
diff changeset
    26
    This is inspired by WebOb's class of the same name.
ec0af9c59270 hgweb: use a multidict for holding query string parameters
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36867
diff changeset
    27
    """
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40434
diff changeset
    28
36868
ec0af9c59270 hgweb: use a multidict for holding query string parameters
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36867
diff changeset
    29
    def __init__(self):
37000
44467a4d472f hgweb: refactor multirequest to be a dict of lists
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36917
diff changeset
    30
        self._items = {}
36868
ec0af9c59270 hgweb: use a multidict for holding query string parameters
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36867
diff changeset
    31
ec0af9c59270 hgweb: use a multidict for holding query string parameters
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36867
diff changeset
    32
    def __getitem__(self, key):
ec0af9c59270 hgweb: use a multidict for holding query string parameters
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36867
diff changeset
    33
        """Returns the last set value for a key."""
37000
44467a4d472f hgweb: refactor multirequest to be a dict of lists
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36917
diff changeset
    34
        return self._items[key][-1]
36868
ec0af9c59270 hgweb: use a multidict for holding query string parameters
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36867
diff changeset
    35
ec0af9c59270 hgweb: use a multidict for holding query string parameters
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36867
diff changeset
    36
    def __setitem__(self, key, value):
ec0af9c59270 hgweb: use a multidict for holding query string parameters
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36867
diff changeset
    37
        """Replace all values for a key with a new value."""
37000
44467a4d472f hgweb: refactor multirequest to be a dict of lists
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36917
diff changeset
    38
        self._items[key] = [value]
36868
ec0af9c59270 hgweb: use a multidict for holding query string parameters
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36867
diff changeset
    39
ec0af9c59270 hgweb: use a multidict for holding query string parameters
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36867
diff changeset
    40
    def __delitem__(self, key):
ec0af9c59270 hgweb: use a multidict for holding query string parameters
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36867
diff changeset
    41
        """Delete all values for a key."""
37000
44467a4d472f hgweb: refactor multirequest to be a dict of lists
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36917
diff changeset
    42
        del self._items[key]
36868
ec0af9c59270 hgweb: use a multidict for holding query string parameters
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36867
diff changeset
    43
ec0af9c59270 hgweb: use a multidict for holding query string parameters
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36867
diff changeset
    44
    def __contains__(self, key):
37000
44467a4d472f hgweb: refactor multirequest to be a dict of lists
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36917
diff changeset
    45
        return key in self._items
36868
ec0af9c59270 hgweb: use a multidict for holding query string parameters
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36867
diff changeset
    46
ec0af9c59270 hgweb: use a multidict for holding query string parameters
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36867
diff changeset
    47
    def __len__(self):
ec0af9c59270 hgweb: use a multidict for holding query string parameters
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36867
diff changeset
    48
        return len(self._items)
ec0af9c59270 hgweb: use a multidict for holding query string parameters
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36867
diff changeset
    49
ec0af9c59270 hgweb: use a multidict for holding query string parameters
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36867
diff changeset
    50
    def get(self, key, default=None):
ec0af9c59270 hgweb: use a multidict for holding query string parameters
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36867
diff changeset
    51
        try:
ec0af9c59270 hgweb: use a multidict for holding query string parameters
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36867
diff changeset
    52
            return self.__getitem__(key)
ec0af9c59270 hgweb: use a multidict for holding query string parameters
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36867
diff changeset
    53
        except KeyError:
ec0af9c59270 hgweb: use a multidict for holding query string parameters
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36867
diff changeset
    54
            return default
ec0af9c59270 hgweb: use a multidict for holding query string parameters
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36867
diff changeset
    55
ec0af9c59270 hgweb: use a multidict for holding query string parameters
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36867
diff changeset
    56
    def add(self, key, value):
ec0af9c59270 hgweb: use a multidict for holding query string parameters
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36867
diff changeset
    57
        """Add a new value for a key. Does not replace existing values."""
37000
44467a4d472f hgweb: refactor multirequest to be a dict of lists
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36917
diff changeset
    58
        self._items.setdefault(key, []).append(value)
36868
ec0af9c59270 hgweb: use a multidict for holding query string parameters
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36867
diff changeset
    59
ec0af9c59270 hgweb: use a multidict for holding query string parameters
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36867
diff changeset
    60
    def getall(self, key):
ec0af9c59270 hgweb: use a multidict for holding query string parameters
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36867
diff changeset
    61
        """Obtains all values for a key."""
37000
44467a4d472f hgweb: refactor multirequest to be a dict of lists
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36917
diff changeset
    62
        return self._items.get(key, [])
36868
ec0af9c59270 hgweb: use a multidict for holding query string parameters
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36867
diff changeset
    63
ec0af9c59270 hgweb: use a multidict for holding query string parameters
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36867
diff changeset
    64
    def getone(self, key):
ec0af9c59270 hgweb: use a multidict for holding query string parameters
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36867
diff changeset
    65
        """Obtain a single value for a key.
ec0af9c59270 hgweb: use a multidict for holding query string parameters
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36867
diff changeset
    66
ec0af9c59270 hgweb: use a multidict for holding query string parameters
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36867
diff changeset
    67
        Raises KeyError if key not defined or it has multiple values set.
ec0af9c59270 hgweb: use a multidict for holding query string parameters
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36867
diff changeset
    68
        """
37000
44467a4d472f hgweb: refactor multirequest to be a dict of lists
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36917
diff changeset
    69
        vals = self._items[key]
36868
ec0af9c59270 hgweb: use a multidict for holding query string parameters
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36867
diff changeset
    70
ec0af9c59270 hgweb: use a multidict for holding query string parameters
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36867
diff changeset
    71
        if len(vals) > 1:
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
    72
            raise KeyError(b'multiple values for %r' % key)
36868
ec0af9c59270 hgweb: use a multidict for holding query string parameters
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36867
diff changeset
    73
ec0af9c59270 hgweb: use a multidict for holding query string parameters
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36867
diff changeset
    74
        return vals[0]
ec0af9c59270 hgweb: use a multidict for holding query string parameters
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36867
diff changeset
    75
ec0af9c59270 hgweb: use a multidict for holding query string parameters
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36867
diff changeset
    76
    def asdictoflists(self):
43106
d783f945a701 py3: finish porting iteritems() to pycompat and remove source transformer
Gregory Szorc <gregory.szorc@gmail.com>
parents: 43077
diff changeset
    77
        return {k: list(v) for k, v in pycompat.iteritems(self._items)}
36868
ec0af9c59270 hgweb: use a multidict for holding query string parameters
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36867
diff changeset
    78
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40434
diff changeset
    79
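A minimal usage sketch of the ``multidict`` semantics defined above. The class is repeated here in condensed form only so the example runs without Mercurial installed; the keys and values (``b'rev'``, ``b'tip'``) are made up for illustration and are not taken from hgweb.

```python
class multidict(object):
    """Condensed standalone copy of the class above, for illustration only."""

    def __init__(self):
        self._items = {}

    def __getitem__(self, key):
        return self._items[key][-1]

    def __setitem__(self, key, value):
        self._items[key] = [value]

    def add(self, key, value):
        self._items.setdefault(key, []).append(value)

    def getall(self, key):
        return self._items.get(key, [])

    def getone(self, key):
        vals = self._items[key]
        if len(vals) > 1:
            raise KeyError('multiple values for %r' % key)
        return vals[0]


params = multidict()
params.add(b'rev', b'1')
params.add(b'rev', b'2')  # add() keeps the earlier value

last = params[b'rev']  # __getitem__ returns the last value set
everything = params.getall(b'rev')  # every value, in insertion order

try:
    params.getone(b'rev')  # ambiguous: two values are stored
    ambiguous = False
except KeyError:
    ambiguous = True

params[b'rev'] = b'tip'  # __setitem__ discards both earlier values
single = params.getone(b'rev')
```

The split between ``add()`` and ``__setitem__`` is what lets a query string such as ``?rev=1&rev=2`` keep both values while ordinary dict-style assignment still behaves as a replacement.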
36814
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
diff changeset
    80
@attr.s(frozen=True)
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
diff changeset
    81
class parsedrequest(object):
36863
da4e2f87167d hgweb: expose input stream on parsed WSGI request object
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36862
diff changeset
    82
    """Represents a parsed WSGI request.
da4e2f87167d hgweb: expose input stream on parsed WSGI request object
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36862
diff changeset
    83
da4e2f87167d hgweb: expose input stream on parsed WSGI request object
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36862
diff changeset
    84
    Contains both parsed parameters as well as a handle on the input stream.
da4e2f87167d hgweb: expose input stream on parsed WSGI request object
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36862
diff changeset
    85
    """
36814
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
diff changeset
    86
36854
16292bbda39c hgweb: store and use request method on parsed request
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36853
diff changeset
    87
    # Request method.
16292bbda39c hgweb: store and use request method on parsed request
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36853
diff changeset
    88
    method = attr.ib()
36814
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
diff changeset
    89
    # Full URL for this request.
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
diff changeset
    90
    url = attr.ib()
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
diff changeset
    91
    # URL without any path components. Just <proto>://<host><port>.
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
diff changeset
    92
    baseurl = attr.ib()
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
diff changeset
    93
    # Advertised URL. Like ``url`` and ``baseurl`` but uses SERVER_NAME instead
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
diff changeset
    94
    # of HTTP: Host header for hostname. This is likely what clients used.
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
diff changeset
    95
    advertisedurl = attr.ib()
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
diff changeset
    96
    advertisedbaseurl = attr.ib()
36873
a755fd3b7146 hgweb: expose URL scheme and REMOTE_* attributes
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36872
diff changeset
    97
    # URL scheme (part before ``://``). e.g. ``http`` or ``https``.
a755fd3b7146 hgweb: expose URL scheme and REMOTE_* attributes
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36872
diff changeset
    98
    urlscheme = attr.ib()
a755fd3b7146 hgweb: expose URL scheme and REMOTE_* attributes
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36872
diff changeset
    99
    # Value of REMOTE_USER, if set, or None.
a755fd3b7146 hgweb: expose URL scheme and REMOTE_* attributes
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36872
diff changeset
   100
    remoteuser = attr.ib()
a755fd3b7146 hgweb: expose URL scheme and REMOTE_* attributes
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36872
diff changeset
   101
    # Value of REMOTE_HOST, if set, or None.
a755fd3b7146 hgweb: expose URL scheme and REMOTE_* attributes
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36872
diff changeset
   102
    remotehost = attr.ib()
36905
e67a2e05fa8a hgweb: clarify that apppath begins with a forward slash
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36904
diff changeset
   103
    # Relative WSGI application path. If defined, will begin with a
e67a2e05fa8a hgweb: clarify that apppath begins with a forward slash
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36904
diff changeset
   104
    # ``/``.
36814
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
diff changeset
   105
    apppath = attr.ib()
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
diff changeset
   106
    # List of path parts to be used for dispatch.
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
diff changeset
   107
    dispatchparts = attr.ib()
36904
d0b0fedbfb53 hgweb: change how dispatch path is reported
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36903
diff changeset
   108
    # URL path component (no query string) used for dispatch. Can be
d0b0fedbfb53 hgweb: change how dispatch path is reported
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36903
diff changeset
   109
    # ``None`` to signal no path component given to the request, an
d0b0fedbfb53 hgweb: change how dispatch path is reported
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36903
diff changeset
   110
    # empty string to signal a request to the application's root URL,
d0b0fedbfb53 hgweb: change how dispatch path is reported
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36903
diff changeset
   111
    # or a string not beginning with ``/`` containing the requested
d0b0fedbfb53 hgweb: change how dispatch path is reported
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36903
diff changeset
   112
    # path under the application.
36814
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
diff changeset
   113
    dispatchpath = attr.ib()
36874
8ddb5c354906 hgweb: expose repo name on parsedrequest
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36873
diff changeset
   114
    # The name of the repository being accessed.
8ddb5c354906 hgweb: expose repo name on parsedrequest
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36873
diff changeset
   115
    reponame = attr.ib()
36814
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
diff changeset
   116
    # Raw query string (part after "?" in URL).
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
diff changeset
   117
    querystring = attr.ib()
36868
ec0af9c59270 hgweb: use a multidict for holding query string parameters
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36867
diff changeset
   118
    # multidict of query string parameters.
ec0af9c59270 hgweb: use a multidict for holding query string parameters
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36867
diff changeset
   119
    qsparams = attr.ib()
36822
f9078c6caeb6 hgweb: parse and store HTTP request headers
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36819
diff changeset
   120
    # wsgiref.headers.Headers instance. Operates like a dict with case
f9078c6caeb6 hgweb: parse and store HTTP request headers
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36819
diff changeset
   121
    # insensitive keys.
f9078c6caeb6 hgweb: parse and store HTTP request headers
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36819
diff changeset
   122
    headers = attr.ib()
36863
da4e2f87167d hgweb: expose input stream on parsed WSGI request object
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36862
diff changeset
   123
    # Request body input stream.
da4e2f87167d hgweb: expose input stream on parsed WSGI request object
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36862
diff changeset
   124
    bodyfh = attr.ib()
36915
84110a1d0f7d hgweb: store the raw WSGI environment dict
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36914
diff changeset
   125
    # WSGI environment dict, unmodified.
84110a1d0f7d hgweb: store the raw WSGI environment dict
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36914
diff changeset
   126
    rawenv = attr.ib()
36814
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
diff changeset
   127
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40434
diff changeset
   128
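``attr.s(frozen=True)`` makes ``parsedrequest`` instances read-only once constructed. The sketch below shows the same behaviour using the standard library's ``dataclasses`` as a stand-in for the vendored ``attr`` package, so it runs without Mercurial. ``minirequest`` and its three fields are a hypothetical reduction of ``parsedrequest`` and the URL is invented; neither is part of hgweb.

```python
import dataclasses


@dataclasses.dataclass(frozen=True)
class minirequest:
    # Hypothetical three-field reduction of parsedrequest.
    method: bytes
    url: bytes
    dispatchpath: object  # bytes, or None when no path component was given


req = minirequest(
    method=b'GET',
    url=b'http://example.com/repo/rev/tip',
    dispatchpath=b'rev/tip',
)

try:
    req.method = b'POST'  # assignment on a frozen instance is rejected
    mutated = True
except dataclasses.FrozenInstanceError:
    mutated = False

# A changed request has to be built as a new object instead.
req2 = dataclasses.replace(req, method=b'POST')
```

Freezing the parsed request means code that receives it can rely on the fields matching what was parsed from the environment; any variation is an explicit copy.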
37818
877185de62cf hgweb: reuse body file object when hgwebdir calls hgweb (issue5851)
Gregory Szorc <gregory.szorc@gmail.com>
parents: 37616
diff changeset
   129
def parserequestfromenv(env, reponame=None, altbaseurl=None, bodyfh=None):
36814
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
diff changeset
   130
    """Parse URL components from environment variables.
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
diff changeset
   131
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
diff changeset
   132
    WSGI defines request attributes via environment variables. This function
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
diff changeset
   133
    parses the environment variables into a data structure.
36903
d7fd203e36cc hgweb: refactor repository name URL parsing
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36902
diff changeset
   134
d7fd203e36cc hgweb: refactor repository name URL parsing
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36902
diff changeset
   135
    If ``reponame`` is defined, the leading path components matching that
d7fd203e36cc hgweb: refactor repository name URL parsing
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36902
diff changeset
   136
    string are effectively shifted from ``PATH_INFO`` to ``SCRIPT_NAME``.
d7fd203e36cc hgweb: refactor repository name URL parsing
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36902
diff changeset
   137
    This simulates the world view of a WSGI application that processes
d7fd203e36cc hgweb: refactor repository name URL parsing
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36902
diff changeset
   138
    requests from the base URL of a repo.
36906
219b23359f4c hgweb: support constructing URLs from an alternate base URL
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36905
diff changeset
   139
219b23359f4c hgweb: support constructing URLs from an alternate base URL
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36905
diff changeset
   140
    If ``altbaseurl`` (typically comes from ``web.baseurl`` config option)
219b23359f4c hgweb: support constructing URLs from an alternate base URL
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36905
diff changeset
   141
    is defined, it is used - instead of the WSGI environment variables - for
219b23359f4c hgweb: support constructing URLs from an alternate base URL
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36905
diff changeset
   142
    constructing URL components up to and including the WSGI application path.
219b23359f4c hgweb: support constructing URLs from an alternate base URL
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36905
diff changeset
   143
    For example, if the current WSGI application is at ``/repo`` and a request
219b23359f4c hgweb: support constructing URLs from an alternate base URL
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36905
diff changeset
   144
    is made to ``/rev/@`` with this argument set to
219b23359f4c hgweb: support constructing URLs from an alternate base URL
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36905
diff changeset
   145
    ``http://myserver:9000/prefix``, the URL and path components will resolve as
219b23359f4c hgweb: support constructing URLs from an alternate base URL
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36905
diff changeset
   146
    if the request were to ``http://myserver:9000/prefix/rev/@``. In other
219b23359f4c hgweb: support constructing URLs from an alternate base URL
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36905
diff changeset
   147
    words, ``wsgi.url_scheme``, ``SERVER_NAME``, ``SERVER_PORT``, and
219b23359f4c hgweb: support constructing URLs from an alternate base URL
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36905
diff changeset
   148
    ``SCRIPT_NAME`` are all effectively replaced by components from this URL.
37818
877185de62cf hgweb: reuse body file object when hgwebdir calls hgweb (issue5851)
Gregory Szorc <gregory.szorc@gmail.com>
parents: 37616
diff changeset
   149
877185de62cf hgweb: reuse body file object when hgwebdir calls hgweb (issue5851)
Gregory Szorc <gregory.szorc@gmail.com>
parents: 37616
diff changeset
   150
    ``bodyfh`` can be used to specify a file object to read the request body
877185de62cf hgweb: reuse body file object when hgwebdir calls hgweb (issue5851)
Gregory Szorc <gregory.szorc@gmail.com>
parents: 37616
diff changeset
   151
    from. If not defined, ``wsgi.input`` from the environment dict is used.
36814
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
diff changeset
   152
    """
36906
219b23359f4c hgweb: support constructing URLs from an alternate base URL
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36905
diff changeset
   153
    # PEP 3333 defines the WSGI spec and is a useful reference for this code.
36814
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
diff changeset
   154
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
diff changeset
   155
    # We first validate that the incoming object conforms with the WSGI spec.
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
diff changeset
   156
    # We only want to be dealing with spec-conforming WSGI implementations.
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
diff changeset
   157
    # TODO enable this once we fix internal violations.
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40434
diff changeset
   158
    # wsgiref.validate.check_environ(env)
36814
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
diff changeset
   159
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
diff changeset
   160
    # PEP 3333 states that environment keys and values are native strings
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
diff changeset
   161
    # (bytes on Python 2 and str on Python 3). The code points for the Unicode
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
diff changeset
   162
    # strings on Python 3 must be between U+0000 and U+00FF. We deal with bytes
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
diff changeset
   163
    # in Mercurial, so mass convert string keys and values to bytes.
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
diff changeset
   164
    if pycompat.ispy3:
43106
d783f945a701 py3: finish porting iteritems() to pycompat and remove source transformer
Gregory Szorc <gregory.szorc@gmail.com>
parents: 43077
diff changeset
   165
        env = {k.encode('latin-1'): v for k, v in pycompat.iteritems(env)}
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40434
diff changeset
   166
        env = {
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40434
diff changeset
   167
            k: v.encode('latin-1') if isinstance(v, str) else v
43106
d783f945a701 py3: finish porting iteritems() to pycompat and remove source transformer
Gregory Szorc <gregory.szorc@gmail.com>
parents: 43077
diff changeset
   168
            for k, v in pycompat.iteritems(env)
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40434
diff changeset
   169
        }
36814
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
diff changeset
   170
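The conversion above works because latin-1 maps code points U+0000 through U+00FF one-to-one onto byte values, so encoding a spec-conforming environ loses nothing. A standalone sketch, assuming Python 3 and a made-up ``env`` dict (the entries are illustrative, not captured from a real server); it folds the two passes into a single comprehension:

```python
import io

# Hypothetical WSGI environ: native str keys, mixed value types.
env = {
    'REQUEST_METHOD': 'GET',
    'PATH_INFO': '/caf\xe9',  # U+00E9 is inside the latin-1 range
    'wsgi.input': io.BytesIO(b''),
    'wsgi.multithread': False,
}

# Encode every key, and every str value; leave other objects untouched.
benv = {
    k.encode('latin-1'): v.encode('latin-1') if isinstance(v, str) else v
    for k, v in env.items()
}

# Decoding with the same codec recovers the original text.
roundtrip = benv[b'PATH_INFO'].decode('latin-1')
```

Non-string values such as the ``wsgi.input`` stream pass through as the same object, which is why the ``isinstance(v, str)`` guard is needed.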
37616
5e81cf9651c1 hgweb: fallback to checking wsgireq.env for REPO_NAME for 3rd party hosting
Matt Harbison <matt_harbison@yahoo.com>
parents: 37606
diff changeset
   171
    # Some hosting solutions are emulating hgwebdir, and dispatching directly
5e81cf9651c1 hgweb: fallback to checking wsgireq.env for REPO_NAME for 3rd party hosting
Matt Harbison <matt_harbison@yahoo.com>
parents: 37606
diff changeset
   172
    # to an hgweb instance using this environment variable.  This was always
5e81cf9651c1 hgweb: fallback to checking wsgireq.env for REPO_NAME for 3rd party hosting
Matt Harbison <matt_harbison@yahoo.com>
parents: 37606
diff changeset
   173
    # checked prior to d7fd203e36cc; keep doing so to avoid breaking them.
5e81cf9651c1 hgweb: fallback to checking wsgireq.env for REPO_NAME for 3rd party hosting
Matt Harbison <matt_harbison@yahoo.com>
parents: 37606
diff changeset
   174
    if not reponame:
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   175
        reponame = env.get(b'REPO_NAME')
37616
5e81cf9651c1 hgweb: fallback to checking wsgireq.env for REPO_NAME for 3rd party hosting
Matt Harbison <matt_harbison@yahoo.com>
parents: 37606
diff changeset
   176
36906
219b23359f4c hgweb: support constructing URLs from an alternate base URL
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36905
diff changeset
   177
    if altbaseurl:
219b23359f4c hgweb: support constructing URLs from an alternate base URL
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36905
diff changeset
   178
        altbaseurl = util.url(altbaseurl)
219b23359f4c hgweb: support constructing URLs from an alternate base URL
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36905
diff changeset
   179
36814
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
diff changeset
   180
    # https://www.python.org/dev/peps/pep-0333/#environ-variables defines
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
diff changeset
   181
    # the environment variables.
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
diff changeset
   182
    # https://www.python.org/dev/peps/pep-0333/#url-reconstruction defines
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
diff changeset
   183
    # how URLs are reconstructed.
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   184
    fullurl = env[b'wsgi.url_scheme'] + b'://'
36906
219b23359f4c hgweb: support constructing URLs from an alternate base URL
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36905
diff changeset
   185
219b23359f4c hgweb: support constructing URLs from an alternate base URL
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36905
diff changeset
   186
    if altbaseurl and altbaseurl.scheme:
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   187
        advertisedfullurl = altbaseurl.scheme + b'://'
36906
219b23359f4c hgweb: support constructing URLs from an alternate base URL
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36905
diff changeset
   188
    else:
219b23359f4c hgweb: support constructing URLs from an alternate base URL
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36905
diff changeset
   189
        advertisedfullurl = fullurl
36814
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
diff changeset
   190
36906
219b23359f4c hgweb: support constructing URLs from an alternate base URL
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36905
diff changeset
   191
    def addport(s, port):
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   192
        if s.startswith(b'https://'):
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   193
            if port != b'443':
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   194
                s += b':' + port
36814
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
diff changeset
   195
        else:
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   196
            if port != b'80':
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   197
                s += b':' + port
36814
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
diff changeset
   198
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
diff changeset
   199
        return s
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
diff changeset
   200
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   201
    if env.get(b'HTTP_HOST'):
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   202
        fullurl += env[b'HTTP_HOST']
36814
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
diff changeset
   203
    else:
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   204
        fullurl += env[b'SERVER_NAME']
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   205
        fullurl = addport(fullurl, env[b'SERVER_PORT'])
36906
219b23359f4c hgweb: support constructing URLs from an alternate base URL
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36905
diff changeset
   206
219b23359f4c hgweb: support constructing URLs from an alternate base URL
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36905
diff changeset
   207
    if altbaseurl and altbaseurl.host:
219b23359f4c hgweb: support constructing URLs from an alternate base URL
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36905
diff changeset
   208
        advertisedfullurl += altbaseurl.host
36814
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
diff changeset
   209
36906
219b23359f4c hgweb: support constructing URLs from an alternate base URL
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36905
diff changeset
   210
        if altbaseurl.port:
219b23359f4c hgweb: support constructing URLs from an alternate base URL
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36905
diff changeset
   211
            port = altbaseurl.port
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   212
        elif altbaseurl.scheme == b'http' and not altbaseurl.port:
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   213
            port = b'80'
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   214
        elif altbaseurl.scheme == b'https' and not altbaseurl.port:
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   215
            port = b'443'
36906
219b23359f4c hgweb: support constructing URLs from an alternate base URL
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36905
diff changeset
   216
        else:
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   217
            port = env[b'SERVER_PORT']
36906
219b23359f4c hgweb: support constructing URLs from an alternate base URL
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36905
diff changeset
   218
219b23359f4c hgweb: support constructing URLs from an alternate base URL
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36905
diff changeset
   219
        advertisedfullurl = addport(advertisedfullurl, port)
219b23359f4c hgweb: support constructing URLs from an alternate base URL
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36905
diff changeset
   220
    else:
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   221
        advertisedfullurl += env[b'SERVER_NAME']
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   222
        advertisedfullurl = addport(advertisedfullurl, env[b'SERVER_PORT'])
36814
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
diff changeset
   223
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
diff changeset
   224
    baseurl = fullurl
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
diff changeset
   225
    advertisedbaseurl = advertisedfullurl
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
diff changeset
   226
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   227
    fullurl += util.urlreq.quote(env.get(b'SCRIPT_NAME', b''))
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   228
    fullurl += util.urlreq.quote(env.get(b'PATH_INFO', b''))
36906
219b23359f4c hgweb: support constructing URLs from an alternate base URL
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36905
diff changeset
   229
219b23359f4c hgweb: support constructing URLs from an alternate base URL
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36905
diff changeset
   230
    if altbaseurl:
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   231
        path = altbaseurl.path or b''
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   232
        if path and not path.startswith(b'/'):
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   233
            path = b'/' + path
36906
219b23359f4c hgweb: support constructing URLs from an alternate base URL
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36905
diff changeset
   234
        advertisedfullurl += util.urlreq.quote(path)
219b23359f4c hgweb: support constructing URLs from an alternate base URL
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36905
diff changeset
   235
    else:
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   236
        advertisedfullurl += util.urlreq.quote(env.get(b'SCRIPT_NAME', b''))
36906
219b23359f4c hgweb: support constructing URLs from an alternate base URL
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36905
diff changeset
   237
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   238
    advertisedfullurl += util.urlreq.quote(env.get(b'PATH_INFO', b''))
36814
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
diff changeset
   239
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   240
    if env.get(b'QUERY_STRING'):
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   241
        fullurl += b'?' + env[b'QUERY_STRING']
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   242
        advertisedfullurl += b'?' + env[b'QUERY_STRING']
36814
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
diff changeset
   243
36903
d7fd203e36cc hgweb: refactor repository name URL parsing
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36902
diff changeset
   244
    # If ``reponame`` is defined, that must be a prefix on PATH_INFO
d7fd203e36cc hgweb: refactor repository name URL parsing
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36902
diff changeset
   245
    # that represents the repository being dispatched to. When computing
d7fd203e36cc hgweb: refactor repository name URL parsing
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36902
diff changeset
   246
    # the dispatch info, we ignore these leading path components.
36814
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
diff changeset
   247
36906
219b23359f4c hgweb: support constructing URLs from an alternate base URL
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36905
diff changeset
   248
    if altbaseurl:
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   249
        apppath = altbaseurl.path or b''
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   250
        if apppath and not apppath.startswith(b'/'):
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   251
            apppath = b'/' + apppath
36906
219b23359f4c hgweb: support constructing URLs from an alternate base URL
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36905
diff changeset
   252
    else:
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   253
        apppath = env.get(b'SCRIPT_NAME', b'')
36814
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
diff changeset
   254
36903
d7fd203e36cc hgweb: refactor repository name URL parsing
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36902
diff changeset
   255
    if reponame:
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   256
        repoprefix = b'/' + reponame.strip(b'/')
36816
0031e972ded2 hgweb: use the parsed application path directly
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36814
diff changeset
   257
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   258
        if not env.get(b'PATH_INFO'):
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   259
            raise error.ProgrammingError(b'reponame requires PATH_INFO')
36903
d7fd203e36cc hgweb: refactor repository name URL parsing
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36902
diff changeset
   260
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   261
        if not env[b'PATH_INFO'].startswith(repoprefix):
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40434
diff changeset
   262
            raise error.ProgrammingError(
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   263
                b'PATH_INFO does not begin with repo '
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   264
                b'name: %s (%s)' % (env[b'PATH_INFO'], reponame)
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40434
diff changeset
   265
            )
36814
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
diff changeset
   266
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   267
        dispatchpath = env[b'PATH_INFO'][len(repoprefix) :]
36814
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
diff changeset
   268
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   269
        if dispatchpath and not dispatchpath.startswith(b'/'):
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40434
diff changeset
   270
            raise error.ProgrammingError(
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   271
                b'reponame prefix of PATH_INFO does '
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   272
                b'not end at path delimiter: %s (%s)'
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   273
                % (env[b'PATH_INFO'], reponame)
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40434
diff changeset
   274
            )
36903
d7fd203e36cc hgweb: refactor repository name URL parsing
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36902
diff changeset
   275
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   276
        apppath = apppath.rstrip(b'/') + repoprefix
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   277
        dispatchparts = dispatchpath.strip(b'/').split(b'/')
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   278
        dispatchpath = b'/'.join(dispatchparts)
36904
d0b0fedbfb53 hgweb: change how dispatch path is reported
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36903
diff changeset
   279
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   280
    elif b'PATH_INFO' in env:
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   281
        if env[b'PATH_INFO'].strip(b'/'):
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   282
            dispatchparts = env[b'PATH_INFO'].strip(b'/').split(b'/')
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   283
            dispatchpath = b'/'.join(dispatchparts)
36904
d0b0fedbfb53 hgweb: change how dispatch path is reported
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36903
diff changeset
   284
        else:
d0b0fedbfb53 hgweb: change how dispatch path is reported
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36903
diff changeset
   285
            dispatchparts = []
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   286
            dispatchpath = b''
36814
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
diff changeset
   287
    else:
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
diff changeset
   288
        dispatchparts = []
36904
d0b0fedbfb53 hgweb: change how dispatch path is reported
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36903
diff changeset
   289
        dispatchpath = None
36814
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
diff changeset
   290
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   291
    querystring = env.get(b'QUERY_STRING', b'')
36814
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
diff changeset
   292
36817
3c15b84ab66c hgweb: teach WSGI parser about query strings
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36816
diff changeset
   293
    # We store the parameters in a multidict so that repeated keys keep all
3c15b84ab66c hgweb: teach WSGI parser about query strings
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36816
diff changeset
   294
    # of their values and lookup by key stays fast.
36868
ec0af9c59270 hgweb: use a multidict for holding query string parameters
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36867
diff changeset
   295
    qsparams = multidict()
ec0af9c59270 hgweb: use a multidict for holding query string parameters
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36867
diff changeset
   296
    for k, v in util.urlreq.parseqsl(querystring, keep_blank_values=True):
ec0af9c59270 hgweb: use a multidict for holding query string parameters
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36867
diff changeset
   297
        qsparams.add(k, v)
36817
3c15b84ab66c hgweb: teach WSGI parser about query strings
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36816
diff changeset
   298
36822
f9078c6caeb6 hgweb: parse and store HTTP request headers
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36819
diff changeset
   299
    # HTTP_* keys contain HTTP request headers. The Headers structure should
f9078c6caeb6 hgweb: parse and store HTTP request headers
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36819
diff changeset
   300
    # perform case normalization for us. We just rewrite underscore to dash
f9078c6caeb6 hgweb: parse and store HTTP request headers
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36819
diff changeset
   301
    # so keys match what likely went over the wire.
f9078c6caeb6 hgweb: parse and store HTTP request headers
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36819
diff changeset
   302
    headers = []
43106
d783f945a701 py3: finish porting iteritems() to pycompat and remove source transformer
Gregory Szorc <gregory.szorc@gmail.com>
parents: 43077
diff changeset
   303
    for k, v in pycompat.iteritems(env):
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   304
        if k.startswith(b'HTTP_'):
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   305
            headers.append((k[len(b'HTTP_') :].replace(b'_', b'-'), v))
36822
f9078c6caeb6 hgweb: parse and store HTTP request headers
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36819
diff changeset
   306
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40434
diff changeset
   307
    from . import wsgiheaders  # avoid cycle
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40434
diff changeset
   308
36822
f9078c6caeb6 hgweb: parse and store HTTP request headers
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36819
diff changeset
   309
    headers = wsgiheaders.Headers(headers)
f9078c6caeb6 hgweb: parse and store HTTP request headers
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36819
diff changeset
   310
36853
ed0456fde625 hgweb: handle CONTENT_LENGTH
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36822
diff changeset
   311
    # This is kind of a lie because the HTTP header wasn't explicitly
ed0456fde625 hgweb: handle CONTENT_LENGTH
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36822
diff changeset
   312
    # sent. But for all intents and purposes it should be OK to lie about
ed0456fde625 hgweb: handle CONTENT_LENGTH
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36822
diff changeset
   313
    # this, since a consumer will use either value to determine how many
ed0456fde625 hgweb: handle CONTENT_LENGTH
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36822
diff changeset
   314
    # bytes are available to read.
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   315
    if b'CONTENT_LENGTH' in env and b'HTTP_CONTENT_LENGTH' not in env:
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   316
        headers[b'Content-Length'] = env[b'CONTENT_LENGTH']
36853
ed0456fde625 hgweb: handle CONTENT_LENGTH
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36822
diff changeset
   317
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   318
    if b'CONTENT_TYPE' in env and b'HTTP_CONTENT_TYPE' not in env:
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   319
        headers[b'Content-Type'] = env[b'CONTENT_TYPE']
37052
55e901396005 hgweb: also set Content-Type header
Gregory Szorc <gregory.szorc@gmail.com>
parents: 37000
diff changeset
   320
37818
877185de62cf hgweb: reuse body file object when hgwebdir calls hgweb (issue5851)
Gregory Szorc <gregory.szorc@gmail.com>
parents: 37616
diff changeset
   321
    if bodyfh is None:
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   322
        bodyfh = env[b'wsgi.input']
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   323
        if b'Content-Length' in headers:
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40434
diff changeset
   324
            bodyfh = util.cappedreader(
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   325
                bodyfh, int(headers[b'Content-Length'] or b'0')
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40434
diff changeset
   326
            )
36863
da4e2f87167d hgweb: expose input stream on parsed WSGI request object
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36862
diff changeset
   327
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40434
diff changeset
   328
    return parsedrequest(
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   329
        method=env[b'REQUEST_METHOD'],
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40434
diff changeset
   330
        url=fullurl,
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40434
diff changeset
   331
        baseurl=baseurl,
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40434
diff changeset
   332
        advertisedurl=advertisedfullurl,
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40434
diff changeset
   333
        advertisedbaseurl=advertisedbaseurl,
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   334
        urlscheme=env[b'wsgi.url_scheme'],
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   335
        remoteuser=env.get(b'REMOTE_USER'),
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   336
        remotehost=env.get(b'REMOTE_HOST'),
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40434
diff changeset
   337
        apppath=apppath,
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40434
diff changeset
   338
        dispatchparts=dispatchparts,
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40434
diff changeset
   339
        dispatchpath=dispatchpath,
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40434
diff changeset
   340
        reponame=reponame,
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40434
diff changeset
   341
        querystring=querystring,
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40434
diff changeset
   342
        qsparams=qsparams,
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40434
diff changeset
   343
        headers=headers,
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40434
diff changeset
   344
        bodyfh=bodyfh,
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40434
diff changeset
   345
        rawenv=env,
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40434
diff changeset
   346
    )
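
The URL reconstruction performed by the function above can be sketched in isolation. This is a minimal illustration, not the Mercurial implementation: the names `add_port` and `reconstruct_base_url` are hypothetical, the environ is assumed to be already converted to bytes keys and values, and the alternate base URL (`altbaseurl`) handling is left out.

```python
def add_port(url, port):
    """Append ``:port`` unless it is the default port for the URL's scheme."""
    default = b'443' if url.startswith(b'https://') else b'80'
    if port != default:
        url += b':' + port
    return url


def reconstruct_base_url(env):
    """Rebuild ``scheme://host[:port]`` from a bytes-keyed WSGI environ."""
    url = env[b'wsgi.url_scheme'] + b'://'
    if env.get(b'HTTP_HOST'):
        # The Host header is used verbatim, so no port is appended here.
        return url + env[b'HTTP_HOST']
    # Fall back to SERVER_NAME/SERVER_PORT, eliding default ports.
    return add_port(url + env[b'SERVER_NAME'], env[b'SERVER_PORT'])
```

Eliding the default port keeps the reconstructed URL in the form a client would normally have typed, which follows the PEP 333 URL reconstruction recipe that the comments in the function point to.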
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40434
diff changeset
   347
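
The way the function above strips a repository name from ``PATH_INFO`` before computing the dispatch path can be sketched as follows. This is a simplified, hypothetical helper (`split_dispatch` is not a Mercurial function): it raises `ValueError` where the real code raises `error.ProgrammingError`, and it returns an empty list when nothing follows the repository prefix.

```python
def split_dispatch(path_info, reponame=None):
    """Return ``(dispatchparts, dispatchpath)`` for a bytes PATH_INFO.

    If ``reponame`` is given, it must be a leading prefix of ``path_info``
    that ends at a path delimiter; that prefix is ignored for dispatch.
    """
    if reponame:
        prefix = b'/' + reponame.strip(b'/')
        if not path_info.startswith(prefix):
            raise ValueError('PATH_INFO does not begin with repo name')
        rest = path_info[len(prefix):]
        # "/repository" must not match a repository named "repo".
        if rest and not rest.startswith(b'/'):
            raise ValueError('repo name does not end at a path delimiter')
        path_info = rest

    stripped = path_info.strip(b'/')
    parts = stripped.split(b'/') if stripped else []
    return parts, b'/'.join(parts)
```

For example, `split_dispatch(b'/repo/rev/tip', b'repo')` yields the parts `rev` and `tip`, while a request for `/repository` against a repository named `repo` is rejected because the prefix does not end at a slash.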
36814
69b2d0900cd7 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36736
diff changeset
   348
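
The header handling in the function above (turning ``HTTP_*`` environ keys into header names and filling in ``Content-Length`` and ``Content-Type`` from their CGI-style keys) can be illustrated with a small sketch. It is an approximation under stated assumptions: `extract_headers` is a hypothetical name, a plain dict stands in for `wsgiheaders.Headers`, and title-casing via `bytes.title()` stands in for whatever case normalization that structure performs.

```python
def extract_headers(env):
    """Collect request headers from a bytes-keyed WSGI environ."""
    headers = {}
    for key, value in env.items():
        if key.startswith(b'HTTP_'):
            # HTTP_USER_AGENT -> USER-AGENT -> User-Agent
            name = key[len(b'HTTP_'):].replace(b'_', b'-')
            headers[name.title()] = value

    # CGI exposes these two headers without the HTTP_ prefix. Report them
    # as headers unless an explicit HTTP_* variant is already present.
    if b'CONTENT_LENGTH' in env and b'HTTP_CONTENT_LENGTH' not in env:
        headers[b'Content-Length'] = env[b'CONTENT_LENGTH']
    if b'CONTENT_TYPE' in env and b'HTTP_CONTENT_TYPE' not in env:
        headers[b'Content-Type'] = env[b'CONTENT_TYPE']
    return headers
```

Keys that are neither ``HTTP_*`` nor one of the two CGI-style content keys, such as ``REQUEST_METHOD``, are ignored by this sketch.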
36881
16499427f6de hgweb: refactor fake file object proxy for archiving
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36874
diff changeset
   349
class offsettrackingwriter(object):
    """A file-like object that is append only and tracks write count.

    Instances are bound to a callable. This callable is called with data
    whenever a ``write()`` is attempted.

    Instances track the amount of written data so they can answer ``tell()``
    requests.

    The intent of this class is to wrap the ``write()`` function returned by
    a WSGI ``start_response()`` function. Since ``write()`` is a callable and
    not a file object, it doesn't implement other file object methods.
    """

    def __init__(self, writefn):
        self._write = writefn
        self._offset = 0

    def write(self, s):
        res = self._write(s)
        # Some Python objects don't report the number of bytes written.
        if res is None:
            self._offset += len(s)
        else:
            self._offset += res

    def flush(self):
        pass

    def tell(self):
        return self._offset
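
The offset tracking described in the class docstring can be exercised in isolation. The following is a minimal sketch, not part of hgweb: it repeats the class so the snippet runs on its own, and it uses `list.append` and `io.BytesIO.write` as hypothetical stand-ins for the `write()` callable that a WSGI `start_response()` would return.

```python
import io


class offsettrackingwriter(object):
    # Copy of the class from this file, repeated so the sketch is runnable.
    def __init__(self, writefn):
        self._write = writefn
        self._offset = 0

    def write(self, s):
        res = self._write(s)
        # Fall back to len(s) when the callable reports nothing.
        if res is None:
            self._offset += len(s)
        else:
            self._offset += res

    def flush(self):
        pass

    def tell(self):
        return self._offset


# Stand-in 1: list.append returns None, so the writer counts len(s) itself.
chunks = []
writer = offsettrackingwriter(chunks.append)
writer.write(b'hello ')
writer.write(b'world')
writer.flush()  # no-op, present only to satisfy file-like consumers

# Stand-in 2: BytesIO.write returns the byte count, which is used directly.
buf = io.BytesIO()
counted = offsettrackingwriter(buf.write)
counted.write(b'abc')

print(writer.tell(), counted.tell())
```

Both branches of `write()` end up with the same bookkeeping; the `None` branch exists because a callable such as the WSGI `write()` is not required to return a byte count.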
class wsgiresponse(object):
    """Represents a response to a WSGI request.

    A response consists of a status line, headers, and a body.

    Consumers must populate the ``status`` and ``headers`` fields and
    make a call to a ``setbody*()`` method before the response can be
    issued.

    When it is time to start sending the response over the wire,
    ``sendresponse()`` is called. It handles emitting the header portion
    of the response message. It then yields chunks of body data to be
    written to the peer. Typically, the WSGI application itself calls
    and returns the value from ``sendresponse()``.
    """

    def __init__(self, req, startresponse):
        """Create an empty response tied to a specific request.

        ``req`` is a ``parsedrequest``. ``startresponse`` is the
        ``start_response`` function passed to the WSGI application.
        """
        self._req = req
        self._startresponse = startresponse

        self.status = None
        from . import wsgiheaders  # avoid cycle

        self.headers = wsgiheaders.Headers([])

        self._bodybytes = None
        self._bodygen = None
        self._bodywillwrite = False
        self._started = False
        self._bodywritefn = None

    def _verifybody(self):
        if (
            self._bodybytes is not None
            or self._bodygen is not None
            or self._bodywillwrite
        ):
            raise error.ProgrammingError(b'cannot define body multiple times')

    def setbodybytes(self, b):
        """Define the response body as static bytes.

        The empty string signals that there is no response body.
        """
        self._verifybody()
        self._bodybytes = b
        self.headers[b'Content-Length'] = b'%d' % len(b)

    def setbodygen(self, gen):
        """Define the response body as a generator of bytes."""
        self._verifybody()
        self._bodygen = gen

    def setbodywillwrite(self):
        """Signal an intent to use write() to emit the response body.

        **This is the least preferred way to send a body.**

        It is preferred for WSGI applications to emit a generator of chunks
        constituting the response body. However, some consumers can't emit
        data this way. So, WSGI provides a way to obtain a ``write(data)``
        function that can be used to synchronously perform an unbuffered
        write.

        Calling this function signals an intent to produce the body in this
        manner.
        """
        self._verifybody()
        self._bodywillwrite = True

    def sendresponse(self):
        """Send the generated response to the client.

        Before this is called, ``status`` must be set and one of
        ``setbodybytes()``, ``setbodygen()``, or ``setbodywillwrite()`` must
        be called.

        Calling this method multiple times is not allowed.
        """
        if self._started:
            raise error.ProgrammingError(
                b'sendresponse() called multiple times'
            )

        self._started = True

        if not self.status:
            raise error.ProgrammingError(b'status line not defined')

        if (
            self._bodybytes is None
            and self._bodygen is None
            and not self._bodywillwrite
        ):
            raise error.ProgrammingError(b'response body not defined')

        # RFC 7232 Section 4.1 states that a 304 MUST generate one of
        # {Cache-Control, Content-Location, Date, ETag, Expires, Vary}
        # and SHOULD NOT generate other headers unless they could be used
        # to guide cache updates. Furthermore, RFC 7230 Section 3.3.2
        # states that no response body can be issued. Content-Length can
        # be sent. But if it is present, it should be the size of the response
        # that wasn't transferred.
        if self.status.startswith(b'304 '):
            # setbodybytes('') will set C-L to 0. This doesn't conform with the
            # spec. So remove it.
            if self.headers.get(b'Content-Length') == b'0':
                del self.headers[b'Content-Length']

            # Strictly speaking, this is too strict. But until it causes
            # problems, let's be strict.
            badheaders = {
                k
                for k in self.headers.keys()
                if k.lower()
                not in (
                    b'date',
                    b'etag',
                    b'expires',
                    b'cache-control',
                    b'content-location',
                    b'content-security-policy',
                    b'vary',
                )
            }
            if badheaders:
                raise error.ProgrammingError(
                    b'illegal header on 304 response: %s'
                    % b', '.join(sorted(badheaders))
                )

            if self._bodygen is not None or self._bodywillwrite:
                raise error.ProgrammingError(
                    b"must use setbodybytes('') with 304 responses"
                )

        # Various HTTP clients (notably httplib) won't read the HTTP response
        # until the HTTP request has been sent in full. If servers (us) send a
        # response before the HTTP request has been fully sent, the connection
        # may deadlock because neither end is reading.
        #
        # We work around this by "draining" the request data before
        # sending any response in some conditions.
        drain = False
        close = False

        # If the client sent Expect: 100-continue, we assume it is smart enough
        # to deal with the server sending a response before reading the request.
        # (httplib doesn't do this.)
        if self._req.headers.get(b'Expect', b'').lower() == b'100-continue':
            pass
        # Only tend to request methods that have bodies. Strictly speaking,
        # we should sniff for a body. But this is fine for our existing
        # WSGI applications.
        elif self._req.method not in (b'POST', b'PUT'):
            pass
        else:
            # If we don't know how much data to read, there's no guarantee
            # that we can drain the request responsibly. The WSGI
            # specification only says that servers *should* ensure the
            # input stream doesn't overrun the actual request. So there's
            # no guarantee that reading until EOF won't corrupt the stream
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36865
diff changeset
   548
            # state.
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36865
diff changeset
   549
            if not isinstance(self._req.bodyfh, util.cappedreader):
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36865
diff changeset
   550
                close = True
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36865
diff changeset
   551
            else:
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36865
diff changeset
   552
                # We /could/ only drain certain HTTP response codes. But 200 and
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36865
diff changeset
   553
                # non-200 wire protocol responses both require draining. Since
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36865
diff changeset
   554
                # we have a capped reader in place for all situations where we
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36865
diff changeset
   555
                # drain, it is safe to read from that stream. We'll either do
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36865
diff changeset
   556
                # a drain or no-op if we're already at EOF.
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36865
diff changeset
   557
                drain = True
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36865
diff changeset
   558
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36865
diff changeset
   559
        if close:
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   560
            self.headers[b'Connection'] = b'Close'
36867
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36865
diff changeset
   561
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36865
diff changeset
   562
        if drain:
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36865
diff changeset
   563
            assert isinstance(self._req.bodyfh, util.cappedreader)
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36865
diff changeset
   564
            while True:
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36865
diff changeset
   565
                chunk = self._req.bodyfh.read(32768)
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36865
diff changeset
   566
                if not chunk:
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36865
diff changeset
   567
                    break
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36865
diff changeset
   568
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40434
diff changeset
   569
        strheaders = [
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40434
diff changeset
   570
            (pycompat.strurl(k), pycompat.strurl(v))
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40434
diff changeset
   571
            for k, v in self.headers.items()
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40434
diff changeset
   572
        ]
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40434
diff changeset
   573
        write = self._startresponse(pycompat.sysstr(self.status), strheaders)
36882
97f44b0720e2 hgweb: port archive command to modern response API
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36881
diff changeset
   574
36867
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36865
diff changeset
   575
        if self._bodybytes:
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36865
diff changeset
   576
            yield self._bodybytes
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36865
diff changeset
   577
        elif self._bodygen:
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36865
diff changeset
   578
            for chunk in self._bodygen:
40434
6107d4549fcc hgweb: cast bytearray to bytes
Gregory Szorc <gregory.szorc@gmail.com>
parents: 37828
diff changeset
   579
                # PEP-3333 says that output must be bytes. And some WSGI
6107d4549fcc hgweb: cast bytearray to bytes
Gregory Szorc <gregory.szorc@gmail.com>
parents: 37828
diff changeset
   580
                # implementations enforce this. We cast bytes-like types here
6107d4549fcc hgweb: cast bytearray to bytes
Gregory Szorc <gregory.szorc@gmail.com>
parents: 37828
diff changeset
   581
                # for convenience.
6107d4549fcc hgweb: cast bytearray to bytes
Gregory Szorc <gregory.szorc@gmail.com>
parents: 37828
diff changeset
   582
                if isinstance(chunk, bytearray):
6107d4549fcc hgweb: cast bytearray to bytes
Gregory Szorc <gregory.szorc@gmail.com>
parents: 37828
diff changeset
   583
                    chunk = bytes(chunk)
6107d4549fcc hgweb: cast bytearray to bytes
Gregory Szorc <gregory.szorc@gmail.com>
parents: 37828
diff changeset
   584
36867
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36865
diff changeset
   585
                yield chunk
36882
97f44b0720e2 hgweb: port archive command to modern response API
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36881
diff changeset
   586
        elif self._bodywillwrite:
97f44b0720e2 hgweb: port archive command to modern response API
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36881
diff changeset
   587
            self._bodywritefn = write
36867
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36865
diff changeset
   588
        else:
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   589
            raise error.ProgrammingError(b'do not know how to send body')
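The drain/close decision made at the top of ``sendresponse()`` can be sketched in isolation. This is a minimal sketch, not the hgweb implementation: ``CappedReader``, ``drain_policy`` and ``drain_body`` are hypothetical names, ``CappedReader`` is only a stand-in for ``util.cappedreader``, and a plain dict replaces the real header object (so the header-name lookup here is case-sensitive).

```python
import io


class CappedReader:
    """Hypothetical stand-in for util.cappedreader: reads at most `limit` bytes."""

    def __init__(self, fh, limit):
        self._fh = fh
        self._left = limit

    def read(self, n=-1):
        if self._left <= 0:
            return b''
        if n < 0 or n > self._left:
            n = self._left
        data = self._fh.read(n)
        self._left -= len(data)
        return data


def drain_policy(headers, method, bodyfh):
    """Return (drain, close), mirroring the branches in the code above."""
    # A client that sent Expect: 100-continue is assumed to cope with an
    # early response, so nothing needs to be drained or closed.
    if headers.get(b'Expect', b'').lower() == b'100-continue':
        return False, False
    # Only methods that carry a request body are considered.
    if method not in (b'POST', b'PUT'):
        return False, False
    # Without a capped reader the body length is unknown, so reading to
    # EOF is unsafe; ask for the connection to be closed instead.
    if not isinstance(bodyfh, CappedReader):
        return False, True
    return True, False


def drain_body(bodyfh, chunksize=32768):
    """Read and discard the remaining body; return the number of bytes read."""
    total = 0
    while True:
        chunk = bodyfh.read(chunksize)
        if not chunk:
            break
        total += len(chunk)
    return total
```

With a capped reader in place, a second call to ``drain_body()`` is a no-op because the reader is already at its limit, which matches the "drain or no-op if we're already at EOF" remark in the comments.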
36867
a88d68dc3ee8 hgweb: create dedicated type for WSGI responses
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36865
diff changeset
   590
36882
97f44b0720e2 hgweb: port archive command to modern response API
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36881
diff changeset
   591
    def getbodyfile(self):
97f44b0720e2 hgweb: port archive command to modern response API
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36881
diff changeset
   592
        """Obtain a file-like object representing the response body.
97f44b0720e2 hgweb: port archive command to modern response API
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36881
diff changeset
   593
97f44b0720e2 hgweb: port archive command to modern response API
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36881
diff changeset
   594
        For this to work, you must call ``setbodywillwrite()`` and then
97f44b0720e2 hgweb: port archive command to modern response API
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36881
diff changeset
   595
        ``sendresponse()`` first. ``sendresponse()`` is a generator and the
97f44b0720e2 hgweb: port archive command to modern response API
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36881
diff changeset
   596
        function won't run to completion unless the generator is advanced. The
97f44b0720e2 hgweb: port archive command to modern response API
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36881
diff changeset
   597
        generator yields no items. The easiest way to consume it is with
97f44b0720e2 hgweb: port archive command to modern response API
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36881
diff changeset
   598
        ``list(res.sendresponse())``, which should resolve to an empty list -
97f44b0720e2 hgweb: port archive command to modern response API
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36881
diff changeset
   599
        ``[]``.
97f44b0720e2 hgweb: port archive command to modern response API
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36881
diff changeset
   600
        """
97f44b0720e2 hgweb: port archive command to modern response API
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36881
diff changeset
   601
        if not self._bodywillwrite:
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   602
            raise error.ProgrammingError(b'must call setbodywillwrite() first')
36882
97f44b0720e2 hgweb: port archive command to modern response API
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36881
diff changeset
   603
97f44b0720e2 hgweb: port archive command to modern response API
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36881
diff changeset
   604
        if not self._started:
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40434
diff changeset
   605
            raise error.ProgrammingError(
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   606
                b'must call sendresponse() first; did '
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   607
                b'you remember to consume it since it '
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   608
                b'is a generator?'
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40434
diff changeset
   609
            )
36882
97f44b0720e2 hgweb: port archive command to modern response API
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36881
diff changeset
   610
97f44b0720e2 hgweb: port archive command to modern response API
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36881
diff changeset
   611
        assert self._bodywritefn
97f44b0720e2 hgweb: port archive command to modern response API
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36881
diff changeset
   612
        return offsettrackingwriter(self._bodywritefn)
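The calling order that the ``getbodyfile()`` docstring requires can be illustrated with a small model. This is a sketch under stated assumptions rather than the real class: ``SketchResponse`` and ``OffsetTrackingWriter`` are hypothetical stand-ins, ``RuntimeError`` replaces ``error.ProgrammingError``, and the status and headers are hard-coded.

```python
class OffsetTrackingWriter:
    """Hypothetical stand-in: wraps a write callable and counts bytes written."""

    def __init__(self, writefn):
        self._write = writefn
        self._offset = 0

    def write(self, data):
        self._write(data)
        self._offset += len(data)

    def tell(self):
        return self._offset


class SketchResponse:
    """Hypothetical, stripped-down model of the write-based response path."""

    def __init__(self, start_response):
        self._startresponse = start_response
        self._bodywillwrite = False
        self._bodywritefn = None
        self._started = False

    def setbodywillwrite(self):
        self._bodywillwrite = True

    def sendresponse(self):
        # Generator function: nothing below runs until the caller
        # advances the generator, e.g. with list(res.sendresponse()).
        self._started = True
        self._bodywritefn = self._startresponse('200 OK', [])
        yield from ()  # yields no items

    def getbodyfile(self):
        if not self._bodywillwrite:
            raise RuntimeError('must call setbodywillwrite() first')
        if not self._started:
            raise RuntimeError('must consume sendresponse() first')
        return OffsetTrackingWriter(self._bodywritefn)


# Usage: a fake start_response whose write callable appends to a list.
written = []
res = SketchResponse(lambda status, headers: written.append)
res.setbodywillwrite()
assert list(res.sendresponse()) == []  # consuming the generator starts the response
fh = res.getbodyfile()
fh.write(b'abc')
fh.write(b'de')
```

Creating the generator without advancing it leaves ``_started`` unset, which is the mistake the error message in ``getbodyfile()`` is written to catch.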
97f44b0720e2 hgweb: port archive command to modern response API
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36881
diff changeset
   613
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40434
diff changeset
   614
5566
d74fc8dec2b4 Less indirection in the WSGI web interface. This simplifies some code, and makes it more compliant with WSGI.
Dirkjan Ochtman <dirkjan@ochtman.nl>
parents: 5563
diff changeset
   615
def wsgiapplication(app_maker):
5887
41a3fce17625 hgweb: return iterable, add deprecation note
Dirkjan Ochtman <dirkjan@ochtman.nl>
parents: 5886
diff changeset
   616
    '''For compatibility with old CGI scripts. A plain hgweb() or hgwebdir()
41a3fce17625 hgweb: return iterable, add deprecation note
Dirkjan Ochtman <dirkjan@ochtman.nl>
parents: 5886
diff changeset
   617
    can and should now be used as a WSGI application.'''
5760
0145f9afb0e7 Removed tabs and trailing whitespace in python files
Thomas Arendsen Hein <thomas@intevation.de>
parents: 5566
diff changeset
   618
    application = app_maker()
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40434
diff changeset
   619
5760
0145f9afb0e7 Removed tabs and trailing whitespace in python files
Thomas Arendsen Hein <thomas@intevation.de>
parents: 5566
diff changeset
   620
    def run_wsgi(env, respond):
5887
41a3fce17625 hgweb: return iterable, add deprecation note
Dirkjan Ochtman <dirkjan@ochtman.nl>
parents: 5886
diff changeset
   621
        return application(env, respond)
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 40434
diff changeset
   622
5760
0145f9afb0e7 Removed tabs and trailing whitespace in python files
Thomas Arendsen Hein <thomas@intevation.de>
parents: 5566
diff changeset
   623
    return run_wsgi
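A short sketch of how the compatibility wrapper behaves. The wrapper is copied from above so the example runs on its own; ``make_app`` and ``fake_start_response`` are hypothetical and merely stand in for a factory returning ``hgweb()`` or ``hgwebdir()`` and for the server's ``start_response`` callable.

```python
def wsgiapplication(app_maker):
    # Copied from the function above: the factory is called once, up front.
    application = app_maker()

    def run_wsgi(env, respond):
        return application(env, respond)

    return run_wsgi


def make_app():
    """Hypothetical factory standing in for one that returns hgweb()."""

    def app(env, start_response):
        start_response('200 OK', [('Content-Type', 'text/plain')])
        return [b'hello\n']

    return app


statuses = []


def fake_start_response(status, headers):
    # Records the status line instead of talking to a real server.
    statuses.append(status)


wrapped = wsgiapplication(make_app)
body = wrapped({}, fake_start_response)
```

Because the wrapper only forwards ``env`` and ``respond``, passing the application object to the WSGI server directly is equivalent, which is what the docstring recommends.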