view doc/docchecker @ 36855:2cdf47e14c30

hgweb: refactor the request draining code

The previous code for draining was only invoked in a few places in the wire protocol. Behavior wasn't consistent. Furthermore, it was difficult to reason about.

With us converting the input stream to a capped reader, it is now safe to always drain the input stream when its size is known because we can never overrun the input and read into the next HTTP request. The only question is "should we?"

This commit changes the draining code so every request is examined. Draining now kicks in for a few requests where it wouldn't before. But I think the code is sufficiently restricted so the behavior is safe. Possibly the most dangerous part of this code is the issuing of Connection: close for POST and PUT requests that don't have a Content-Length. I don't think there are any such uses in our WSGI application, so this should be safe.

In the near future, I plan to significantly refactor the WSGI response handling. I anticipate this code evolving a bit. So any minor regressions around draining or connection closing behavior might be fixed as a result of that work.

All tests pass with this change. That scares me a bit because it means we are lacking low-level tests for the HTTP protocol.

Differential Revision: https://phab.mercurial-scm.org/D2769
author Gregory Szorc <gregory.szorc@gmail.com>
date Sat, 10 Mar 2018 11:03:45 -0800
parents c9ab5a0bc7c5
children 9bfbb9fc5871

#!/usr/bin/env python
#
# docchecker - look for problematic markup
#
# Copyright 2016 timeless <timeless@mozdev.org> and others
#
# This software may be used and distributed according to the terms of the
# GNU General Public License version 2 or any later version.

from __future__ import absolute_import, print_function

import re
import sys

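# splits a non-blank line into (leading whitespace, remaining text)
leadingline = re.compile(r'(^\s*)(\S.*)$')

# each entry is (regular expression, warning printed when it matches)
checks = [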
  (r""":hg:`[^`]*'[^`]*`""",
    """warning: please avoid nesting ' in :hg:`...`"""),
  (r'\w:hg:`',
    'warning: please have a space before :hg:'),
  (r"""(?:[^a-z][^'.])hg ([^,;"`]*'(?!hg)){2}""",
    '''warning: please use " instead of ' for hg ... "..."'''),
]

def check(line):
    """print line, followed by the warning of every check it triggers"""
    messages = []
    for match, msg in checks:
        if re.search(match, line):
            messages.append(msg)
    if messages:
        print(line)
        for msg in messages:
            print(msg)
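
# Illustration only, not part of the original script: a minimal sketch of
# a pure variant of check() that returns the warnings instead of printing
# them, which makes individual rules easier to exercise in isolation.
# The name matching_messages and its rules parameter are hypothetical;
# rules is assumed to have the same (pattern, message) shape as checks.
def matching_messages(line, rules):
    # keep the message of every rule whose pattern occurs in line
    return [msg for pattern, msg in rules if re.search(pattern, line)]

# Usage, assuming the checks list defined above:
#   matching_messages('use:hg:`status`', checks)
# is expected to include 'warning: please have a space before :hg:'.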

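def work(file):
    """unwrap the lines of file into paragraphs and check each one"""
    # llead: indentation of the paragraph being collected
    # lline: text collected for that paragraph so far
    (llead, lline) = ('', '')

    for line in file:
        # consecutive lines with identical indentation are joined into
        # one logical line; a blank line or a change of indentation
        # ends the current paragraph
        match = leadingline.match(line)
        if not match:
            # blank (or whitespace-only) line: flush what was collected
            check(lline)
            (llead, lline) = ('', '')
            continue

        lead, line = match.group(1), match.group(2)
        if lead == llead:
            if lline != '':
                lline += ' ' + line
            else:
                lline = line
        else:
            # indentation changed: flush and start a new paragraph
            check(lline)
            (llead, lline) = (lead, line)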
    check(lline)
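
# Illustration only, not part of the original script: a minimal sketch of
# the unwrapping rule used by work(), written as a generator so the joined
# paragraphs can be inspected directly. The name unwrap is hypothetical.
# Unlike work(), which also passes empty strings to check(), this sketch
# skips empty paragraphs.
def unwrap(lines):
    pattern = re.compile(r'(^\s*)(\S.*)$')  # same pattern as leadingline
    llead, lline = '', ''
    for line in lines:
        match = pattern.match(line)
        if not match:
            # blank line: emit the paragraph collected so far
            if lline:
                yield lline
            llead, lline = '', ''
            continue
        lead, text = match.group(1), match.group(2)
        if lead == llead:
            # same indentation: continue the current paragraph
            lline = lline + ' ' + text if lline else text
        else:
            # different indentation: emit and start a new paragraph
            if lline:
                yield lline
            llead, lline = lead, text
    if lline:
        yield lline

# Usage:
#   list(unwrap(['foo\n', 'bar\n', '\n', '  baz\n']))
#   returns ['foo bar', 'baz']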

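def main():
    """check every file named on the command line"""
    for f in sys.argv[1:]:
        try:
            with open(f) as file:
                work(file)
        # catch Exception rather than BaseException so that Ctrl-C
        # (KeyboardInterrupt) and SystemExit still stop the script
        except Exception as e:
            print("failed to process %s: %s" % (f, e))

if __name__ == '__main__':
    main()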