doc/docchecker
author Arseniy Alekseyev <aalekseyev@janestreet.com>
Thu, 16 Feb 2023 19:00:56 +0000
changeset 50163 11661326b410
parent 48875 6000f5b25c9b
permissions -rwxr-xr-x
rhg: in path_encode, split Dest into VecDest and MeasureDest Two separate types make the write semantics easier to understand because we can consider the two sinks separately. Having two independent compiled functions for size measurement and for actual encoding seems likely to improve performance, too. (and maybe we should get rid of measurement altogether) Getting rid of [Dest] also removes the ugly option rewrapping code, which is good.
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
45830
c102b704edb5 global: use python3 in shebangs
Gregory Szorc <gregory.szorc@gmail.com>
parents: 43691
diff changeset
     1
#!/usr/bin/env python3
27730
ed86fb2a4187 docchecker: introduce a way to check for poor markup
timeless <timeless@mozdev.org>
parents:
diff changeset
     2
#
ed86fb2a4187 docchecker: introduce a way to check for poor markup
timeless <timeless@mozdev.org>
parents:
diff changeset
     3
# docchecker - look for problematic markup
ed86fb2a4187 docchecker: introduce a way to check for poor markup
timeless <timeless@mozdev.org>
parents:
diff changeset
     4
#
ed86fb2a4187 docchecker: introduce a way to check for poor markup
timeless <timeless@mozdev.org>
parents:
diff changeset
     5
# Copyright 2016 timeless <timeless@mozdev.org> and others
ed86fb2a4187 docchecker: introduce a way to check for poor markup
timeless <timeless@mozdev.org>
parents:
diff changeset
     6
#
ed86fb2a4187 docchecker: introduce a way to check for poor markup
timeless <timeless@mozdev.org>
parents:
diff changeset
     7
# This software may be used and distributed according to the terms of the
ed86fb2a4187 docchecker: introduce a way to check for poor markup
timeless <timeless@mozdev.org>
parents:
diff changeset
     8
# GNU General Public License version 2 or any later version.
29168
8f2805ce93d9 py3: make doc/docchecker use absolute_import
Pulkit Goyal <7895pulkit@gmail.com>
parents: 28811
diff changeset
     9
8f2805ce93d9 py3: make doc/docchecker use absolute_import
Pulkit Goyal <7895pulkit@gmail.com>
parents: 28811
diff changeset
    10
41006
9bfbb9fc5871 py3: byteify docchecker
Matt Harbison <matt_harbison@yahoo.com>
parents: 29169
diff changeset
    11
import os
29168
8f2805ce93d9 py3: make doc/docchecker use absolute_import
Pulkit Goyal <7895pulkit@gmail.com>
parents: 28811
diff changeset
    12
import re
27730
ed86fb2a4187 docchecker: introduce a way to check for poor markup
timeless <timeless@mozdev.org>
parents:
diff changeset
    13
import sys
ed86fb2a4187 docchecker: introduce a way to check for poor markup
timeless <timeless@mozdev.org>
parents:
diff changeset
    14
41006
9bfbb9fc5871 py3: byteify docchecker
Matt Harbison <matt_harbison@yahoo.com>
parents: 29169
diff changeset
    15
try:
9bfbb9fc5871 py3: byteify docchecker
Matt Harbison <matt_harbison@yahoo.com>
parents: 29169
diff changeset
    16
    import msvcrt
43691
47ef023d0165 black: blacken scripts
Gregory Szorc <gregory.szorc@gmail.com>
parents: 41006
diff changeset
    17
41006
9bfbb9fc5871 py3: byteify docchecker
Matt Harbison <matt_harbison@yahoo.com>
parents: 29169
diff changeset
    18
    msvcrt.setmode(sys.stdout.fileno(), os.O_BINARY)
9bfbb9fc5871 py3: byteify docchecker
Matt Harbison <matt_harbison@yahoo.com>
parents: 29169
diff changeset
    19
    msvcrt.setmode(sys.stderr.fileno(), os.O_BINARY)
9bfbb9fc5871 py3: byteify docchecker
Matt Harbison <matt_harbison@yahoo.com>
parents: 29169
diff changeset
    20
except ImportError:
9bfbb9fc5871 py3: byteify docchecker
Matt Harbison <matt_harbison@yahoo.com>
parents: 29169
diff changeset
    21
    pass
9bfbb9fc5871 py3: byteify docchecker
Matt Harbison <matt_harbison@yahoo.com>
parents: 29169
diff changeset
    22
9bfbb9fc5871 py3: byteify docchecker
Matt Harbison <matt_harbison@yahoo.com>
parents: 29169
diff changeset
    23
stdout = getattr(sys.stdout, 'buffer', sys.stdout)
9bfbb9fc5871 py3: byteify docchecker
Matt Harbison <matt_harbison@yahoo.com>
parents: 29169
diff changeset
    24
9bfbb9fc5871 py3: byteify docchecker
Matt Harbison <matt_harbison@yahoo.com>
parents: 29169
diff changeset
    25
leadingline = re.compile(br'(^\s*)(\S.*)$')
28810
9934362978e1 docchecker: report context line at most once
timeless <timeless@mozdev.org>
parents: 28049
diff changeset
    26
9934362978e1 docchecker: report context line at most once
timeless <timeless@mozdev.org>
parents: 28049
diff changeset
    27
checks = [
43691
47ef023d0165 black: blacken scripts
Gregory Szorc <gregory.szorc@gmail.com>
parents: 41006
diff changeset
    28
    (
47ef023d0165 black: blacken scripts
Gregory Szorc <gregory.szorc@gmail.com>
parents: 41006
diff changeset
    29
        br""":hg:`[^`]*'[^`]*`""",
47ef023d0165 black: blacken scripts
Gregory Szorc <gregory.szorc@gmail.com>
parents: 41006
diff changeset
    30
        b"""warning: please avoid nesting ' in :hg:`...`""",
47ef023d0165 black: blacken scripts
Gregory Szorc <gregory.szorc@gmail.com>
parents: 41006
diff changeset
    31
    ),
47ef023d0165 black: blacken scripts
Gregory Szorc <gregory.szorc@gmail.com>
parents: 41006
diff changeset
    32
    (br'\w:hg:`', b'warning: please have a space before :hg:'),
47ef023d0165 black: blacken scripts
Gregory Szorc <gregory.szorc@gmail.com>
parents: 41006
diff changeset
    33
    (
47ef023d0165 black: blacken scripts
Gregory Szorc <gregory.szorc@gmail.com>
parents: 41006
diff changeset
    34
        br"""(?:[^a-z][^'.])hg ([^,;"`]*'(?!hg)){2}""",
47ef023d0165 black: blacken scripts
Gregory Szorc <gregory.szorc@gmail.com>
parents: 41006
diff changeset
    35
        b'''warning: please use " instead of ' for hg ... "..."''',
47ef023d0165 black: blacken scripts
Gregory Szorc <gregory.szorc@gmail.com>
parents: 41006
diff changeset
    36
    ),
28810
9934362978e1 docchecker: report context line at most once
timeless <timeless@mozdev.org>
parents: 28049
diff changeset
    37
]
27730
ed86fb2a4187 docchecker: introduce a way to check for poor markup
timeless <timeless@mozdev.org>
parents:
diff changeset
    38
43691
47ef023d0165 black: blacken scripts
Gregory Szorc <gregory.szorc@gmail.com>
parents: 41006
diff changeset
    39
27730
ed86fb2a4187 docchecker: introduce a way to check for poor markup
timeless <timeless@mozdev.org>
parents:
diff changeset
    40
def check(line):
28810
9934362978e1 docchecker: report context line at most once
timeless <timeless@mozdev.org>
parents: 28049
diff changeset
    41
    messages = []
9934362978e1 docchecker: report context line at most once
timeless <timeless@mozdev.org>
parents: 28049
diff changeset
    42
    for match, msg in checks:
9934362978e1 docchecker: report context line at most once
timeless <timeless@mozdev.org>
parents: 28049
diff changeset
    43
        if re.search(match, line):
9934362978e1 docchecker: report context line at most once
timeless <timeless@mozdev.org>
parents: 28049
diff changeset
    44
            messages.append(msg)
9934362978e1 docchecker: report context line at most once
timeless <timeless@mozdev.org>
parents: 28049
diff changeset
    45
    if messages:
41006
9bfbb9fc5871 py3: byteify docchecker
Matt Harbison <matt_harbison@yahoo.com>
parents: 29169
diff changeset
    46
        stdout.write(b'%s\n' % line)
28810
9934362978e1 docchecker: report context line at most once
timeless <timeless@mozdev.org>
parents: 28049
diff changeset
    47
        for msg in messages:
41006
9bfbb9fc5871 py3: byteify docchecker
Matt Harbison <matt_harbison@yahoo.com>
parents: 29169
diff changeset
    48
            stdout.write(b'%s\n' % msg)
27730
ed86fb2a4187 docchecker: introduce a way to check for poor markup
timeless <timeless@mozdev.org>
parents:
diff changeset
    49
43691
47ef023d0165 black: blacken scripts
Gregory Szorc <gregory.szorc@gmail.com>
parents: 41006
diff changeset
    50
27730
ed86fb2a4187 docchecker: introduce a way to check for poor markup
timeless <timeless@mozdev.org>
parents:
diff changeset
    51
def work(file):
41006
9bfbb9fc5871 py3: byteify docchecker
Matt Harbison <matt_harbison@yahoo.com>
parents: 29169
diff changeset
    52
    (llead, lline) = (b'', b'')
27730
ed86fb2a4187 docchecker: introduce a way to check for poor markup
timeless <timeless@mozdev.org>
parents:
diff changeset
    53
28049
c00f67c15c5a docchecker: use indentation of 4 spaces
FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
parents: 28048
diff changeset
    54
    for line in file:
c00f67c15c5a docchecker: use indentation of 4 spaces
FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
parents: 28048
diff changeset
    55
        # this section unwraps lines
c00f67c15c5a docchecker: use indentation of 4 spaces
FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
parents: 28048
diff changeset
    56
        match = leadingline.match(line)
c00f67c15c5a docchecker: use indentation of 4 spaces
FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
parents: 28048
diff changeset
    57
        if not match:
c00f67c15c5a docchecker: use indentation of 4 spaces
FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
parents: 28048
diff changeset
    58
            check(lline)
41006
9bfbb9fc5871 py3: byteify docchecker
Matt Harbison <matt_harbison@yahoo.com>
parents: 29169
diff changeset
    59
            (llead, lline) = (b'', b'')
28049
c00f67c15c5a docchecker: use indentation of 4 spaces
FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
parents: 28048
diff changeset
    60
            continue
27730
ed86fb2a4187 docchecker: introduce a way to check for poor markup
timeless <timeless@mozdev.org>
parents:
diff changeset
    61
28049
c00f67c15c5a docchecker: use indentation of 4 spaces
FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
parents: 28048
diff changeset
    62
        lead, line = match.group(1), match.group(2)
43691
47ef023d0165 black: blacken scripts
Gregory Szorc <gregory.szorc@gmail.com>
parents: 41006
diff changeset
    63
        if lead == llead:
47ef023d0165 black: blacken scripts
Gregory Szorc <gregory.szorc@gmail.com>
parents: 41006
diff changeset
    64
            if lline != b'':
41006
9bfbb9fc5871 py3: byteify docchecker
Matt Harbison <matt_harbison@yahoo.com>
parents: 29169
diff changeset
    65
                lline += b' ' + line
28049
c00f67c15c5a docchecker: use indentation of 4 spaces
FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
parents: 28048
diff changeset
    66
            else:
c00f67c15c5a docchecker: use indentation of 4 spaces
FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
parents: 28048
diff changeset
    67
                lline = line
c00f67c15c5a docchecker: use indentation of 4 spaces
FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
parents: 28048
diff changeset
    68
        else:
c00f67c15c5a docchecker: use indentation of 4 spaces
FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
parents: 28048
diff changeset
    69
            check(lline)
c00f67c15c5a docchecker: use indentation of 4 spaces
FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
parents: 28048
diff changeset
    70
            (llead, lline) = (lead, line)
c00f67c15c5a docchecker: use indentation of 4 spaces
FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
parents: 28048
diff changeset
    71
    check(lline)
27730
ed86fb2a4187 docchecker: introduce a way to check for poor markup
timeless <timeless@mozdev.org>
parents:
diff changeset
    72
43691
47ef023d0165 black: blacken scripts
Gregory Szorc <gregory.szorc@gmail.com>
parents: 41006
diff changeset
    73
27730
ed86fb2a4187 docchecker: introduce a way to check for poor markup
timeless <timeless@mozdev.org>
parents:
diff changeset
    74
def main():
28049
c00f67c15c5a docchecker: use indentation of 4 spaces
FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
parents: 28048
diff changeset
    75
    for f in sys.argv[1:]:
c00f67c15c5a docchecker: use indentation of 4 spaces
FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
parents: 28048
diff changeset
    76
        try:
41006
9bfbb9fc5871 py3: byteify docchecker
Matt Harbison <matt_harbison@yahoo.com>
parents: 29169
diff changeset
    77
            with open(f, 'rb') as file:
28049
c00f67c15c5a docchecker: use indentation of 4 spaces
FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
parents: 28048
diff changeset
    78
                work(file)
c00f67c15c5a docchecker: use indentation of 4 spaces
FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
parents: 28048
diff changeset
    79
        except BaseException as e:
41006
9bfbb9fc5871 py3: byteify docchecker
Matt Harbison <matt_harbison@yahoo.com>
parents: 29169
diff changeset
    80
            sys.stdout.write(r"failed to process %s: %s\n" % (f, e))
27730
ed86fb2a4187 docchecker: introduce a way to check for poor markup
timeless <timeless@mozdev.org>
parents:
diff changeset
    81
43691
47ef023d0165 black: blacken scripts
Gregory Szorc <gregory.szorc@gmail.com>
parents: 41006
diff changeset
    82
27730
ed86fb2a4187 docchecker: introduce a way to check for poor markup
timeless <timeless@mozdev.org>
parents:
diff changeset
    83
main()