doc/docchecker
author Augie Fackler <augie@google.com>
Mon, 30 Jul 2018 22:50:00 -0400
changeset 39238 1ddb296e0dee
parent 29169 c9ab5a0bc7c5
child 41016 9bfbb9fc5871
permissions -rwxr-xr-x
fastannotate: initial import from Facebook's hg-experimental I made as few changes as I could to get the tests to pass, but this was a bit involved due to some churn in the blame code since someone last gave fastannotate any TLC. There's still follow-up work here to rip out support for old versions of hg and to integrate the protocol with modern standards. Some performance numbers (all on my 2016 MacBook Pro with a 2.6Ghz i7): Mercurial mercurial/manifest.py traditional blame time: real 1.050 secs (user 0.990+0.000 sys 0.060+0.000) build cache time: real 5.900 secs (user 5.720+0.000 sys 0.110+0.000) fastannotate time: real 0.120 secs (user 0.100+0.000 sys 0.020+0.000) Mercurial mercurial/localrepo.py traditional blame time: real 3.330 secs (user 3.220+0.000 sys 0.070+0.000) build cache time: real 30.610 secs (user 30.190+0.000 sys 0.230+0.000) fastannotate time: real 0.180 secs (user 0.160+0.000 sys 0.020+0.000) mozilla-central dom/ipc/ContentParent.cpp traditional blame time: real 7.640 secs (user 7.210+0.000 sys 0.380+0.000) build cache time: real 98.650 secs (user 97.000+0.000 sys 0.950+0.000) fastannotate time: real 1.580 secs (user 1.340+0.000 sys 0.240+0.000) mozilla-central dom/base/nsDocument.cpp traditional blame time: real 17.110 secs (user 16.490+0.000 sys 0.500+0.000) build cache time: real 399.750 secs (user 394.520+0.000 sys 2.610+0.000) fastannotate time: real 1.780 secs (user 1.530+0.000 sys 0.240+0.000) So building the cache is expensive (but might be faster with xdiff enabled), but the blame results are *way* faster. Differential Revision: https://phab.mercurial-scm.org/D3994
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
27730
ed86fb2a4187 docchecker: introduce a way to check for poor markup
timeless <timeless@mozdev.org>
parents:
diff changeset
     1
#!/usr/bin/env python
ed86fb2a4187 docchecker: introduce a way to check for poor markup
timeless <timeless@mozdev.org>
parents:
diff changeset
     2
#
ed86fb2a4187 docchecker: introduce a way to check for poor markup
timeless <timeless@mozdev.org>
parents:
diff changeset
     3
# docchecker - look for problematic markup
ed86fb2a4187 docchecker: introduce a way to check for poor markup
timeless <timeless@mozdev.org>
parents:
diff changeset
     4
#
ed86fb2a4187 docchecker: introduce a way to check for poor markup
timeless <timeless@mozdev.org>
parents:
diff changeset
     5
# Copyright 2016 timeless <timeless@mozdev.org> and others
ed86fb2a4187 docchecker: introduce a way to check for poor markup
timeless <timeless@mozdev.org>
parents:
diff changeset
     6
#
ed86fb2a4187 docchecker: introduce a way to check for poor markup
timeless <timeless@mozdev.org>
parents:
diff changeset
     7
# This software may be used and distributed according to the terms of the
ed86fb2a4187 docchecker: introduce a way to check for poor markup
timeless <timeless@mozdev.org>
parents:
diff changeset
     8
# GNU General Public License version 2 or any later version.
29168
8f2805ce93d9 py3: make doc/docchecker use absolute_import
Pulkit Goyal <7895pulkit@gmail.com>
parents: 28811
diff changeset
     9
29169
c9ab5a0bc7c5 py3: make doc/docchecker use print_function
Pulkit Goyal <7895pulkit@gmail.com>
parents: 29168
diff changeset
    10
from __future__ import absolute_import, print_function
29168
8f2805ce93d9 py3: make doc/docchecker use absolute_import
Pulkit Goyal <7895pulkit@gmail.com>
parents: 28811
diff changeset
    11
8f2805ce93d9 py3: make doc/docchecker use absolute_import
Pulkit Goyal <7895pulkit@gmail.com>
parents: 28811
diff changeset
    12
import re
27730
ed86fb2a4187 docchecker: introduce a way to check for poor markup
timeless <timeless@mozdev.org>
parents:
diff changeset
    13
import sys
ed86fb2a4187 docchecker: introduce a way to check for poor markup
timeless <timeless@mozdev.org>
parents:
diff changeset
    14
ed86fb2a4187 docchecker: introduce a way to check for poor markup
timeless <timeless@mozdev.org>
parents:
diff changeset
    15
leadingline = re.compile(r'(^\s*)(\S.*)$')
28810
9934362978e1 docchecker: report context line at most once
timeless <timeless@mozdev.org>
parents: 28049
diff changeset
    16
9934362978e1 docchecker: report context line at most once
timeless <timeless@mozdev.org>
parents: 28049
diff changeset
    17
checks = [
9934362978e1 docchecker: report context line at most once
timeless <timeless@mozdev.org>
parents: 28049
diff changeset
    18
  (r""":hg:`[^`]*'[^`]*`""",
9934362978e1 docchecker: report context line at most once
timeless <timeless@mozdev.org>
parents: 28049
diff changeset
    19
    """warning: please avoid nesting ' in :hg:`...`"""),
9934362978e1 docchecker: report context line at most once
timeless <timeless@mozdev.org>
parents: 28049
diff changeset
    20
  (r'\w:hg:`',
9934362978e1 docchecker: report context line at most once
timeless <timeless@mozdev.org>
parents: 28049
diff changeset
    21
    'warning: please have a space before :hg:'),
28811
1a623585a658 docchecker: try to reject single quotes
timeless <timeless@mozdev.org>
parents: 28810
diff changeset
    22
  (r"""(?:[^a-z][^'.])hg ([^,;"`]*'(?!hg)){2}""",
1a623585a658 docchecker: try to reject single quotes
timeless <timeless@mozdev.org>
parents: 28810
diff changeset
    23
    '''warning: please use " instead of ' for hg ... "..."'''),
28810
9934362978e1 docchecker: report context line at most once
timeless <timeless@mozdev.org>
parents: 28049
diff changeset
    24
]
27730
ed86fb2a4187 docchecker: introduce a way to check for poor markup
timeless <timeless@mozdev.org>
parents:
diff changeset
    25
ed86fb2a4187 docchecker: introduce a way to check for poor markup
timeless <timeless@mozdev.org>
parents:
diff changeset
    26
def check(line):
28810
9934362978e1 docchecker: report context line at most once
timeless <timeless@mozdev.org>
parents: 28049
diff changeset
    27
    messages = []
9934362978e1 docchecker: report context line at most once
timeless <timeless@mozdev.org>
parents: 28049
diff changeset
    28
    for match, msg in checks:
9934362978e1 docchecker: report context line at most once
timeless <timeless@mozdev.org>
parents: 28049
diff changeset
    29
        if re.search(match, line):
9934362978e1 docchecker: report context line at most once
timeless <timeless@mozdev.org>
parents: 28049
diff changeset
    30
            messages.append(msg)
9934362978e1 docchecker: report context line at most once
timeless <timeless@mozdev.org>
parents: 28049
diff changeset
    31
    if messages:
28049
c00f67c15c5a docchecker: use indentation of 4 spaces
FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
parents: 28048
diff changeset
    32
        print(line)
28810
9934362978e1 docchecker: report context line at most once
timeless <timeless@mozdev.org>
parents: 28049
diff changeset
    33
        for msg in messages:
9934362978e1 docchecker: report context line at most once
timeless <timeless@mozdev.org>
parents: 28049
diff changeset
    34
            print(msg)
27730
ed86fb2a4187 docchecker: introduce a way to check for poor markup
timeless <timeless@mozdev.org>
parents:
diff changeset
    35
ed86fb2a4187 docchecker: introduce a way to check for poor markup
timeless <timeless@mozdev.org>
parents:
diff changeset
    36
def work(file):
28049
c00f67c15c5a docchecker: use indentation of 4 spaces
FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
parents: 28048
diff changeset
    37
    (llead, lline) = ('', '')
27730
ed86fb2a4187 docchecker: introduce a way to check for poor markup
timeless <timeless@mozdev.org>
parents:
diff changeset
    38
28049
c00f67c15c5a docchecker: use indentation of 4 spaces
FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
parents: 28048
diff changeset
    39
    for line in file:
c00f67c15c5a docchecker: use indentation of 4 spaces
FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
parents: 28048
diff changeset
    40
        # this section unwraps lines
c00f67c15c5a docchecker: use indentation of 4 spaces
FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
parents: 28048
diff changeset
    41
        match = leadingline.match(line)
c00f67c15c5a docchecker: use indentation of 4 spaces
FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
parents: 28048
diff changeset
    42
        if not match:
c00f67c15c5a docchecker: use indentation of 4 spaces
FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
parents: 28048
diff changeset
    43
            check(lline)
c00f67c15c5a docchecker: use indentation of 4 spaces
FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
parents: 28048
diff changeset
    44
            (llead, lline) = ('', '')
c00f67c15c5a docchecker: use indentation of 4 spaces
FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
parents: 28048
diff changeset
    45
            continue
27730
ed86fb2a4187 docchecker: introduce a way to check for poor markup
timeless <timeless@mozdev.org>
parents:
diff changeset
    46
28049
c00f67c15c5a docchecker: use indentation of 4 spaces
FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
parents: 28048
diff changeset
    47
        lead, line = match.group(1), match.group(2)
c00f67c15c5a docchecker: use indentation of 4 spaces
FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
parents: 28048
diff changeset
    48
        if (lead == llead):
c00f67c15c5a docchecker: use indentation of 4 spaces
FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
parents: 28048
diff changeset
    49
            if (lline != ''):
c00f67c15c5a docchecker: use indentation of 4 spaces
FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
parents: 28048
diff changeset
    50
                lline += ' ' + line
c00f67c15c5a docchecker: use indentation of 4 spaces
FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
parents: 28048
diff changeset
    51
            else:
c00f67c15c5a docchecker: use indentation of 4 spaces
FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
parents: 28048
diff changeset
    52
                lline = line
c00f67c15c5a docchecker: use indentation of 4 spaces
FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
parents: 28048
diff changeset
    53
        else:
c00f67c15c5a docchecker: use indentation of 4 spaces
FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
parents: 28048
diff changeset
    54
            check(lline)
c00f67c15c5a docchecker: use indentation of 4 spaces
FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
parents: 28048
diff changeset
    55
            (llead, lline) = (lead, line)
c00f67c15c5a docchecker: use indentation of 4 spaces
FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
parents: 28048
diff changeset
    56
    check(lline)
27730
ed86fb2a4187 docchecker: introduce a way to check for poor markup
timeless <timeless@mozdev.org>
parents:
diff changeset
    57
ed86fb2a4187 docchecker: introduce a way to check for poor markup
timeless <timeless@mozdev.org>
parents:
diff changeset
    58
def main():
28049
c00f67c15c5a docchecker: use indentation of 4 spaces
FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
parents: 28048
diff changeset
    59
    for f in sys.argv[1:]:
c00f67c15c5a docchecker: use indentation of 4 spaces
FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
parents: 28048
diff changeset
    60
        try:
c00f67c15c5a docchecker: use indentation of 4 spaces
FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
parents: 28048
diff changeset
    61
            with open(f) as file:
c00f67c15c5a docchecker: use indentation of 4 spaces
FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
parents: 28048
diff changeset
    62
                work(file)
c00f67c15c5a docchecker: use indentation of 4 spaces
FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
parents: 28048
diff changeset
    63
        except BaseException as e:
c00f67c15c5a docchecker: use indentation of 4 spaces
FUJIWARA Katsunori <foozy@lares.dti.ne.jp>
parents: 28048
diff changeset
    64
            print("failed to process %s: %s" % (f, e))
27730
ed86fb2a4187 docchecker: introduce a way to check for poor markup
timeless <timeless@mozdev.org>
parents:
diff changeset
    65
ed86fb2a4187 docchecker: introduce a way to check for poor markup
timeless <timeless@mozdev.org>
parents:
diff changeset
    66
main()