doc/docchecker
author Jun Wu <quark@fb.com>
Sat, 18 Feb 2017 00:39:31 -0800
changeset 31037 17b5cda5a84a
parent 29169 c9ab5a0bc7c5
child 41016 9bfbb9fc5871
permissions -rwxr-xr-x
revset: use phasecache.getrevset This is part of a refactoring that moves some phase query optimization from revset.py to phases.py. See the previous patch for motivation. This patch changes revset code to use phasecache.getrevset so it no longer accesses the private field: _phasecache._phasesets directly. For performance impact, this patch was tested using the following query, on my hg-committed repo: for i in 'public()' 'not public()' 'draft()' 'not draft()'; do echo $i; hg perfrevset "$i"; hg perfrevset "$i" --hidden; done For the CPython implementation, most operations are unchanged (within +/- 1%), while "not public()" and "draft()" is noticeably faster on an unfiltered repo. It may be because the new code avoids a set copy if filteredrevs is empty. revset | public() | not public() | draft() | not draft() hidden | yes | no | yes | no | yes | no | yes | no ------------------------------------------------------------------ before | 19006 | 17352 | 239 | 286 | 180 | 228 | 7690 | 5745 after | 19137 | 17231 | 240 | 207 | 182 | 150 | 7687 | 5658 delta | | -38% | | -52% | (timed in microseconds) For the pure Python implementation, some operations are faster while "not draft()" is noticeably slower: revset | public() | not public() | draft() | not draft() hidden | yes | no | yes | no | yes | no | yes | no ------------------------------------------------------------------------ before | 18852 | 17183 | 17758 | 15921 | 17505 | 15973 | 41521 | 39822 after | 18924 | 17380 | 17558 | 14545 | 16727 | 13593 | 48356 | 43992 delta | | -9% | -5% | -15% | +16% | +10% That may be the different performance characters of generatorset vs. filteredset. The "not draft()" query could be optimized in this case where both "public" and "secret" are passed to "getrevsets" so it won't iterate the whole repo twice.

#!/usr/bin/env python
#
# docchecker - look for problematic markup
#
# Copyright 2016 timeless <timeless@mozdev.org> and others
#
# This software may be used and distributed according to the terms of the
# GNU General Public License version 2 or any later version.

from __future__ import absolute_import, print_function

import re
import sys

leadingline = re.compile(r'(^\s*)(\S.*)$')

checks = [
  (r""":hg:`[^`]*'[^`]*`""",
    """warning: please avoid nesting ' in :hg:`...`"""),
  (r'\w:hg:`',
    'warning: please have a space before :hg:'),
  (r"""(?:[^a-z][^'.])hg ([^,;"`]*'(?!hg)){2}""",
    '''warning: please use " instead of ' for hg ... "..."'''),
]

def check(line):
    messages = []
    for match, msg in checks:
        if re.search(match, line):
            messages.append(msg)
    if messages:
        print(line)
        for msg in messages:
            print(msg)

def work(file):
    (llead, lline) = ('', '')

    for line in file:
        # this section unwraps lines
        match = leadingline.match(line)
        if not match:
            check(lline)
            (llead, lline) = ('', '')
            continue

        lead, line = match.group(1), match.group(2)
        if (lead == llead):
            if (lline != ''):
                lline += ' ' + line
            else:
                lline = line
        else:
            check(lline)
            (llead, lline) = (lead, line)
    check(lline)

def main():
    for f in sys.argv[1:]:
        try:
            with open(f) as file:
                work(file)
        except BaseException as e:
            print("failed to process %s: %s" % (f, e))

main()