view contrib/check-config.py @ 38732:be4984261611

merge: mark file gets as not thread safe (issue5933) In default installs, this has the effect of disabling the thread-based worker on Windows when manifesting files in the working directory. My measurements have shown that with revlog-based repositories, Mercurial spends a lot of CPU time in revlog code resolving file data. This ends up incurring a lot of context switching across threads and slows down `hg update` operations when going from an empty working directory to the tip of the repo. On mozilla-unified (246,351 files) on an i7-6700K (4+4 CPUs): before: 487s wall after: 360s wall (equivalent to worker.enabled=false) cpus=2: 379s wall Even with only 2 threads, the thread pool is still slower. The introduction of the thread-based worker (02b36e860e0b) states that it resulted in a "~50%" speedup for `hg sparse --enable-profile` and `hg sparse --disable-profile`. This disagrees with my measurement above. I theorize a few reasons for this: 1) Removal of files from the working directory is I/O - not CPU - bound and should benefit from a thread pool (unless I/O is insanely fast and the GIL release is near instantaneous). So tests like `hg sparse --enable-profile` may exercise deletion throughput and aren't good benchmarks for worker tasks that are CPU heavy. 2) The patch was authored by someone at Facebook. The results were likely measured against a repository using remotefilelog. And I believe that revision retrieval during working directory updates with remotefilelog will often use a remote store, thus being I/O and not CPU bound. This probably resulted in an overstated performance gain. Since there appears to be a need to enable the thread-based worker with some stores, I've made the flagging of file gets as thread safe configurable. I've made it experimental because I don't want to formalize a boolean flag for this option and because this attribute is best captured against the store implementation. But we don't have a proper store API for this yet. I'd rather cross this bridge later. It is possible there are revlog-based repositories that do benefit from a thread-based worker. I didn't do very comprehensive testing. If there are, we may want to devise a more proper algorithm for whether to use the thread-based worker, including possibly config options to limit the number of threads to use. But until I see evidence that justifies complexity, simplicity wins. Differential Revision: https://phab.mercurial-scm.org/D3963
author Gregory Szorc <gregory.szorc@gmail.com>
date Wed, 18 Jul 2018 09:49:34 -0700
parents 2912bed9b0c7
children fe28267d5223
line wrap: on
line source

#!/usr/bin/env python
#
# check-config - a config flag documentation checker for Mercurial
#
# Copyright 2015 Matt Mackall <mpm@selenic.com>
#
# This software may be used and distributed according to the terms of the
# GNU General Public License version 2 or any later version.

from __future__ import absolute_import, print_function
import re
import sys

foundopts = {}
documented = {}
allowinconsistent = set()

configre = re.compile(br'''
    # Function call
    ui\.config(?P<ctype>|int|bool|list)\(
        # First argument.
        ['"](?P<section>\S+)['"],\s*
        # Second argument
        ['"](?P<option>\S+)['"](,\s+
        (?:default=)?(?P<default>\S+?))?
    \)''', re.VERBOSE | re.MULTILINE)

configwithre = re.compile(b'''
    ui\.config(?P<ctype>with)\(
        # First argument is callback function. This doesn't parse robustly
        # if it is e.g. a function call.
        [^,]+,\s*
        ['"](?P<section>\S+)['"],\s*
        ['"](?P<option>\S+)['"](,\s+
        (?:default=)?(?P<default>\S+?))?
    \)''', re.VERBOSE | re.MULTILINE)

configpartialre = (br"""ui\.config""")

ignorere = re.compile(br'''
    \#\s(?P<reason>internal|experimental|deprecated|developer|inconsistent)\s
    config:\s(?P<config>\S+\.\S+)$
    ''', re.VERBOSE | re.MULTILINE)

def main(args):
    for f in args:
        sect = b''
        prevname = b''
        confsect = b''
        carryover = b''
        linenum = 0
        for l in open(f, 'rb'):
            linenum += 1

            # check topic-like bits
            m = re.match(b'\s*``(\S+)``', l)
            if m:
                prevname = m.group(1)
            if re.match(b'^\s*-+$', l):
                sect = prevname
                prevname = b''

            if sect and prevname:
                name = sect + b'.' + prevname
                documented[name] = 1

            # check docstring bits
            m = re.match(br'^\s+\[(\S+)\]', l)
            if m:
                confsect = m.group(1)
                continue
            m = re.match(br'^\s+(?:#\s*)?(\S+) = ', l)
            if m:
                name = confsect + b'.' + m.group(1)
                documented[name] = 1

            # like the bugzilla extension
            m = re.match(br'^\s*(\S+\.\S+)$', l)
            if m:
                documented[m.group(1)] = 1

            # like convert
            m = re.match(br'^\s*:(\S+\.\S+):\s+', l)
            if m:
                documented[m.group(1)] = 1

            # quoted in help or docstrings
            m = re.match(br'.*?``(\S+\.\S+)``', l)
            if m:
                documented[m.group(1)] = 1

            # look for ignore markers
            m = ignorere.search(l)
            if m:
                if m.group('reason') == 'inconsistent':
                    allowinconsistent.add(m.group('config'))
                else:
                    documented[m.group('config')] = 1

            # look for code-like bits
            line = carryover + l
            m = configre.search(line) or configwithre.search(line)
            if m:
                ctype = m.group('ctype')
                if not ctype:
                    ctype = 'str'
                name = m.group('section') + "." + m.group('option')
                default = m.group('default')
                if default in (None, 'False', 'None', '0', '[]', '""', "''"):
                    default = ''
                if re.match(b'[a-z.]+$', default):
                    default = '<variable>'
                if (name in foundopts and (ctype, default) != foundopts[name]
                    and name not in allowinconsistent):
                    print(l.rstrip())
                    print("conflict on %s: %r != %r" % (name, (ctype, default),
                                                        foundopts[name]))
                    print("at %s:%d:" % (f, linenum))
                foundopts[name] = (ctype, default)
                carryover = ''
            else:
                m = re.search(configpartialre, line)
                if m:
                    carryover = line
                else:
                    carryover = ''

    for name in sorted(foundopts):
        if name not in documented:
            if not (name.startswith("devel.") or
                    name.startswith("experimental.") or
                    name.startswith("debug.")):
                ctype, default = foundopts[name]
                if default:
                    default = ' [%s]' % default
                print("undocumented: %s (%s)%s" % (name, ctype, default))

if __name__ == "__main__":
    if len(sys.argv) > 1:
        sys.exit(main(sys.argv[1:]))
    else:
        sys.exit(main([l.rstrip() for l in sys.stdin]))