Thu, 08 Mar 2018 11:21:46 -0800 hgweb: teach WSGI parser about query strings
Gregory Szorc <gregory.szorc@gmail.com> [Thu, 08 Mar 2018 11:21:46 -0800] rev 36817
hgweb: teach WSGI parser about query strings Currently, req.form uses cgi.parse() to populate form data. Depending on the request, form data can come from POST multipart/form-data, application/x-www-form-urlencoded, or the URL query string. Putting all these things into one data structure makes it difficult to reason about how exactly parameters got to the request. It can lead to wonkiness such as pulling parameters from both the URL and POST data. This commit teaches our WSGI request parser about argument data in query strings. We populate fields containing the query string data and only the query string data so it can't be confused with POST data. Differential Revision: https://phab.mercurial-scm.org/D2737
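A minimal sketch of the idea, assuming a plain WSGI environ dict: read parameters from QUERY_STRING only and never touch the request body, so URL arguments cannot be mixed with POST data. The helper name `parse_query_only` is hypothetical and is not the name used in hgweb; it works on str for brevity.

```python
from urllib.parse import parse_qsl


def parse_query_only(environ):
    """Return the raw query string and its decoded (key, value) pairs.

    Only QUERY_STRING is consulted; the request body is never read.
    """
    raw = environ.get("QUERY_STRING", "")
    # keep_blank_values preserves parameters such as "?style=".
    pairs = parse_qsl(raw, keep_blank_values=True)
    return raw, pairs


# A POST whose URL also carries parameters: only the URL ones are returned.
environ = {
    "REQUEST_METHOD": "POST",
    "CONTENT_TYPE": "application/x-www-form-urlencoded",
    "QUERY_STRING": "cmd=listkeys&namespace=bookmarks",
}
raw, pairs = parse_query_only(environ)
```

Keeping the pairs as an ordered list, rather than one merged dict, preserves repeated keys and makes the origin of each value unambiguous.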
Thu, 08 Mar 2018 15:08:20 -0800 hgweb: use the parsed application path directly
Gregory Szorc <gregory.szorc@gmail.com> [Thu, 08 Mar 2018 15:08:20 -0800] rev 36816
hgweb: use the parsed application path directly Previously, we assigned a custom system string with a trailing slash to wsgirequest.url. The addition of the trailing slash felt arbitrary and seems to go against how things typically work in WSGI. We also want our URLs to be bytes, not system strings. And, assigning a custom attribute to wsgirequest felt wrong. This commit fixes all those things by removing the trailing slash from the app path, changing consumers to use that variable and to use it without a trailing slash, and removing the custom attribute from wsgirequest. We preserve the trailing slash on {url}. Also, makebreadcrumb strips the trailing slash. So no change to it was needed. Differential Revision: https://phab.mercurial-scm.org/D2736
Thu, 08 Mar 2018 12:59:25 -0800 hgweb: use computed base URL from parsed request
Gregory Szorc <gregory.szorc@gmail.com> [Thu, 08 Mar 2018 12:59:25 -0800] rev 36815
hgweb: use computed base URL from parsed request Let's not reinvent URL construction in a function that runs the templater. Differential Revision: https://phab.mercurial-scm.org/D2735
Sat, 10 Mar 2018 10:20:51 -0800 hgweb: parse WSGI request into a data structure
Gregory Szorc <gregory.szorc@gmail.com> [Sat, 10 Mar 2018 10:20:51 -0800] rev 36814
hgweb: parse WSGI request into a data structure Currently, our WSGI applications (hgweb_mod and hgwebdir_mod) process the raw WSGI request instance themselves. This means they have to talk in terms of system strings. And they need to know details about what's in the WSGI request. And in the case of hgweb_mod, it is doing some very funky things with URL parsing to impact dispatching. The code is difficult to read and maintain. This commit introduces parsing of the WSGI request into a higher-level and easier-to-reason-about data structure. To prove it works, we hook it up to hgweb_mod and use it for populating the relative URL on the request instance. We hold off on using it in more places because the logic in hgweb_mod is crazy and I don't want to involve those changes with review of the parsing code. The URL construction code has variations that use the HTTP: Host header (the canonical WSGI way of reconstructing the URL) and with the use of SERVER_NAME. We need to differentiate because hgweb is currently using SERVER_NAME for URL construction. Differential Revision: https://phab.mercurial-scm.org/D2734
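The two URL construction variants described above might look roughly like the sketch below. The class and field names (`ParsedRequest`, `base_url`, `advertised_base_url`, `app_path`, `dispatch_path`) are illustrative assumptions, not the names used in hgweb, and the sketch uses str where the commit series aims for bytes.

```python
from dataclasses import dataclass
from urllib.parse import quote


@dataclass(frozen=True)
class ParsedRequest:
    base_url: str             # scheme + host taken from the Host header
    advertised_base_url: str  # scheme + host taken from SERVER_NAME
    app_path: str             # quoted SCRIPT_NAME
    dispatch_path: str        # PATH_INFO without its leading slash


def _origin(environ, use_host_header):
    """Build "scheme://host[:port]" from the WSGI environment."""
    scheme = environ["wsgi.url_scheme"]
    if use_host_header and environ.get("HTTP_HOST"):
        # The Host header already includes a non-default port, if any.
        return scheme + "://" + environ["HTTP_HOST"]
    host = environ["SERVER_NAME"]
    port = environ["SERVER_PORT"]
    default_port = "443" if scheme == "https" else "80"
    if port != default_port:
        host += ":" + port
    return scheme + "://" + host


def parse_request(environ):
    return ParsedRequest(
        base_url=_origin(environ, use_host_header=True),
        advertised_base_url=_origin(environ, use_host_header=False),
        app_path=quote(environ.get("SCRIPT_NAME", "")),
        dispatch_path=environ.get("PATH_INFO", "").lstrip("/"),
    )


req = parse_request({
    "wsgi.url_scheme": "http",
    "HTTP_HOST": "hg.example.org",
    "SERVER_NAME": "build-host",
    "SERVER_PORT": "8000",
    "SCRIPT_NAME": "/repo",
    "PATH_INFO": "/rev/tip",
})
```

Storing both origins lets callers choose explicitly between the client-supplied host and the server-configured one instead of rebuilding the URL ad hoc.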
Thu, 08 Mar 2018 15:14:32 -0800 hgweb: always use "?" when writing session vars
Gregory Szorc <gregory.szorc@gmail.com> [Thu, 08 Mar 2018 15:14:32 -0800] rev 36813
hgweb: always use "?" when writing session vars This code resolves a string to insert in URLs as part of a query string. Essentially, it resolves the {sessionvars} template keyword, which is used by hgweb templates to build a URL as a string. The whole approach here feels wrong because there's no way of knowing when this code runs how the final URL will look. There could be additional URL fragments added before this template keyword that add a query string component. Furthermore, I don't think there's *any* way for req.url to have a query string. That's because the code that populates this variable only takes SCRIPT_NAME and REPO_NAME into account. The "?" character it is searching for would only be added if some code attempted to add QUERY_STRING to the URL. Hacking the code up to raise if "?" is present in the URL yields a clean test suite run. I'm not sure if we broke this code or if it has always been broken. Anyway, this commit removes support for emitting "&" as the first character in {sessionvars} and makes it always emit "?", which is what it was always doing before AFAICT. Differential Revision: https://phab.mercurial-scm.org/D2733
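As a rough illustration of the resulting behavior, the sketch below yields one mapping per session variable, with "?" as the separator for the first item and "&" for the rest. The generator name and the dict keys are assumptions made for this example, not the actual hgweb implementation.

```python
def sessionvars(variables):
    """Yield template items whose first separator is always "?"."""
    separator = "?"
    for name, value in sorted(variables.items()):
        yield {"name": name, "value": value, "separator": separator}
        # Every item after the first is joined with "&".
        separator = "&"


items = list(sessionvars({"style": "paper", "revcount": "30"}))
query = "".join(
    "%s%s=%s" % (i["separator"], i["name"], i["value"]) for i in items
)
```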
Thu, 08 Mar 2018 15:15:59 -0800 hgweb: rename req to wsgireq
Gregory Szorc <gregory.szorc@gmail.com> [Thu, 08 Mar 2018 15:15:59 -0800] rev 36812
hgweb: rename req to wsgireq We will soon introduce a parsed WSGI request object so we don't have to concern ourselves with low-level WSGI matters. Prepare for multiple request objects by renaming the existing one so it is clear it deals with WSGI. We also remove a symbol import to avoid even more naming confusion. # no-check-commit because of some new foo_bar naming that's required Differential Revision: https://phab.mercurial-scm.org/D2732
Thu, 08 Mar 2018 09:44:27 -0800 hgweb: validate WSGI environment dict
Gregory Szorc <gregory.szorc@gmail.com> [Thu, 08 Mar 2018 09:44:27 -0800] rev 36811
hgweb: validate WSGI environment dict The wsgiref.validate module contains useful functions for validating that various WSGI data structures are proper. This commit adds validation of the environment dict to our built-in HTTP server, which turns an HTTP request into an environment dict. The check discovered that we weren't always setting QUERY_STRING, which would cause the cgi module to fall back to sys.argv. So we change things to always set QUERY_STRING. The check passes on Python 2 and 3. Differential Revision: https://phab.mercurial-scm.org/D2731
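A small sketch of this kind of check, using the standard library directly. It assumes that `check_environ` is importable from `wsgiref.validate` (it is a module-level helper there, though not part of the module's documented public interface) and that a missing QUERY_STRING is reported as a warning rather than a hard failure. The helper `environ_warnings` is hypothetical.

```python
import warnings
from wsgiref.util import setup_testing_defaults
from wsgiref.validate import check_environ


def environ_warnings(environ):
    """Run wsgiref's environment check and return any warning messages."""
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        check_environ(environ)  # raises AssertionError on hard violations
    return [str(w.message) for w in caught]


environ = {}
setup_testing_defaults(environ)    # fills in a minimal valid environment
environ.pop("QUERY_STRING", None)  # simulate the bug: key is missing
before = environ_warnings(environ)

environ["QUERY_STRING"] = ""       # the fix: always set it, even if empty
after = environ_warnings(environ)
```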
Thu, 08 Mar 2018 09:26:51 -0800 hgweb: ensure all wsgi environment values are str
Gregory Szorc <gregory.szorc@gmail.com> [Thu, 08 Mar 2018 09:26:51 -0800] rev 36810
hgweb: ensure all wsgi environment values are str Previously, we had a few entries that were bytes on Python 3. PEP-0333 states that all entries must be the native str type (bytes on Python 2, str on Python 3). This required a number of changes to hgweb_mod to unbreak things on Python 3. I suspect there still may be some regressions. I'm going to introduce a data structure that represents a parsed WSGI request in upcoming commits. This will hold bytes and will allow us to stop using raw literals throughout the WSGI code. Differential Revision: https://phab.mercurial-scm.org/D2730
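One way to enforce this on Python 3 is sketched below: any bytes value under a CGI-style key is decoded as latin-1, the encoding that the WSGI specification assumes for native strings. The function names are hypothetical, and the sketch targets Python 3 only.

```python
def to_native_str(value):
    """Decode bytes to str via latin-1; leave other values untouched."""
    if isinstance(value, bytes):
        return value.decode("latin-1")
    return value


def normalize_environ(environ):
    """Return a copy whose CGI-style values are native strings.

    Keys containing "." (such as wsgi.input) hold objects, not strings,
    so they are passed through unchanged.
    """
    return {
        key: value if "." in key else to_native_str(value)
        for key, value in environ.items()
    }


fixed = normalize_environ({
    "PATH_INFO": b"/repo",
    "REQUEST_METHOD": "GET",
    "wsgi.version": (1, 0),
})
```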
Wed, 07 Mar 2018 16:18:52 -0800 wireproto: formalize permissions checking as part of protocol interface
Gregory Szorc <gregory.szorc@gmail.com> [Wed, 07 Mar 2018 16:18:52 -0800] rev 36809
wireproto: formalize permissions checking as part of protocol interface Per the inline comment desiring to formalize permissions checking in the protocol interface, we do that. I'm not convinced this is the best way to go about things. I would love for there to e.g. be a better exception for denoting permissions problems. But it does feel strictly better than sniffing attributes on the proto instance. Differential Revision: https://phab.mercurial-scm.org/D2719
Wed, 07 Mar 2018 16:02:24 -0800 wireproto: declare permissions requirements in @wireprotocommand (API)
Gregory Szorc <gregory.szorc@gmail.com> [Wed, 07 Mar 2018 16:02:24 -0800] rev 36808
wireproto: declare permissions requirements in @wireprotocommand (API) With the security patches from 4.5.2 merged into default, we now have a per-command attribute defining what permissions are needed to run that command. We now have a richer @wireprotocommand that can be extended to record additional command metadata. So we port the permissions mechanism to be based on @wireprotocommand. .. api:: hgweb_mod.perms and wireproto.permissions have been removed. Wire protocol commands should declare their required permissions in the @wireprotocommand decorator. Differential Revision: https://phab.mercurial-scm.org/D2718
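The general pattern of declaring a permission in the registering decorator can be sketched as follows. This is a simplified, self-contained stand-in: the registry, the decorator signature, and the `dispatch` helper are assumptions made for illustration and do not reproduce Mercurial's actual `@wireprotocommand` API.

```python
COMMANDS = {}  # hypothetical registry: command name -> metadata


def wireprotocommand(name, args="", permission="push"):
    """Register a command together with the permission it requires."""
    if permission not in ("push", "pull"):
        raise ValueError("invalid permission: %r" % permission)

    def register(func):
        COMMANDS[name] = {"func": func, "args": args, "permission": permission}
        return func

    return register


@wireprotocommand("heads", permission="pull")
def heads(repo):
    return "heads of %s" % repo


@wireprotocommand("unbundle", args="heads", permission="push")
def unbundle(repo, heads):
    return "unbundled into %s" % repo


def dispatch(name, allowed, *cmdargs):
    """Check that the command exists and is permitted, then call it."""
    entry = COMMANDS.get(name)
    if entry is None:
        raise KeyError("unknown command: %s" % name)
    if entry["permission"] not in allowed:
        raise PermissionError("%s requires %s" % (name, entry["permission"]))
    return entry["func"](*cmdargs)
```

For example, `dispatch("heads", {"pull"}, "repo")` runs the command, while `dispatch("unbundle", {"pull"}, "repo", "x")` raises `PermissionError` because the caller lacks the push permission.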
Tue, 06 Mar 2018 15:08:33 -0800 wireprotoserver: check permissions in main dispatch function
Gregory Szorc <gregory.szorc@gmail.com> [Tue, 06 Mar 2018 15:08:33 -0800] rev 36807
wireprotoserver: check permissions in main dispatch function The permissions checking code merged from stable is out of place in the refactored hgweb_mod module. This commit moves the main call to wireprotoserver. We still have some lingering code in hgweb_mod. This will get addressed later. Differential Revision: https://phab.mercurial-scm.org/D2717
Tue, 06 Mar 2018 15:02:53 -0800 wireprotoserver: check if command available before calling it
Gregory Szorc <gregory.szorc@gmail.com> [Tue, 06 Mar 2018 15:02:53 -0800] rev 36806
wireprotoserver: check if command available before calling it The previous behavior was just plain wrong. I have no clue how it landed. My guess is a merge conflict resolution gone wrong on my end a few weeks ago. Differential Revision: https://phab.mercurial-scm.org/D2716
Tue, 06 Mar 2018 02:43:17 -0600 py3: drop encoding.strio()
Yuya Nishihara <yuya@tcha.org> [Tue, 06 Mar 2018 02:43:17 -0600] rev 36805
py3: drop encoding.strio() Its buffered nature makes TextIOWrapper unsuitable for temporarily wrapping bytes I/O.
Tue, 06 Mar 2018 02:42:37 -0600 ui: adjust Windows workaround to new _readline() code
Yuya Nishihara <yuya@tcha.org> [Tue, 06 Mar 2018 02:42:37 -0600] rev 36804
ui: adjust Windows workaround to new _readline() code It's only needed when rawinput() is called. Also made it Py3 compatible.