view contrib/check-config.py @ 38732:be4984261611
merge: mark file gets as not thread safe (issue5933)
In default installs, this has the effect of disabling the thread-based
worker on Windows when manifesting files in the working directory. My
measurements have shown that with revlog-based repositories, Mercurial
spends a lot of CPU time in revlog code resolving file data. This ends
up incurring a lot of context switching across threads and slows down
`hg update` operations when going from an empty working directory to
the tip of the repo.
On mozilla-unified (246,351 files) on an i7-6700K (4+4 CPUs):
before: 487s wall
after: 360s wall (equivalent to worker.enabled=false)
cpus=2: 379s wall
Even with only 2 threads, the thread pool is still slower.
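The relative cost of the thread pool can be worked out from the three wall-clock figures quoted above. The short calculation below uses only those numbers; the helper name `percent_slower` is introduced here for illustration.

```python
# Wall-clock times quoted in the commit message, in seconds.
before = 487.0       # thread-based worker enabled (default install)
after = 360.0        # file gets marked as not thread safe
two_threads = 379.0  # thread pool limited to 2 threads


def percent_slower(t, baseline):
    """Return how much slower t is than baseline, as a percentage."""
    return (t - baseline) / baseline * 100.0


print("default pool vs. no threads: %.1f%% slower"
      % percent_slower(before, after))
print("2-thread pool vs. no threads: %.1f%% slower"
      % percent_slower(two_threads, after))
```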
The introduction of the thread-based worker (02b36e860e0b) states that
it resulted in a "~50%" speedup for `hg sparse --enable-profile` and
`hg sparse --disable-profile`. This disagrees with my measurement
above. I theorize a few reasons for this:
1) Removal of files from the working directory is I/O - not CPU - bound
and should benefit from a thread pool (unless I/O is insanely fast
and the GIL release is near instantaneous). So tests like `hg sparse
--enable-profile` may exercise deletion throughput and aren't good
benchmarks for worker tasks that are CPU heavy.
2) The patch was authored by someone at Facebook. The results were
likely measured against a repository using remotefilelog. And I
believe that revision retrieval during working directory updates with
remotefilelog will often use a remote store, thus being I/O and not
CPU bound. This probably resulted in an overstated performance gain.
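The distinction drawn in both points, between I/O-bound work that releases the GIL and CPU-bound work that holds it, can be reproduced with a small experiment. The following is an illustrative sketch in plain Python 3, not a benchmark of Mercurial: `cpu_task`, `io_task`, and `timed` are hypothetical helpers, and `time.sleep` merely stands in for real I/O. Actual timings depend on the machine, so none are asserted here.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def cpu_task(n):
    """Pure-Python arithmetic: holds the GIL while it runs."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def io_task(delay):
    """Stand-in for I/O: time.sleep releases the GIL while waiting."""
    time.sleep(delay)
    return delay


def timed(func, args, threads):
    """Run func over args serially (threads=0) or in a thread pool."""
    start = time.perf_counter()
    if threads:
        with ThreadPoolExecutor(max_workers=threads) as pool:
            results = list(pool.map(func, args))
    else:
        results = [func(a) for a in args]
    return results, time.perf_counter() - start


if __name__ == "__main__":
    for label, func, args in [
        ("cpu", cpu_task, [200000] * 8),
        ("io", io_task, [0.05] * 8),
    ]:
        _, serial = timed(func, args, 0)
        _, pooled = timed(func, args, 4)
        print("%s-bound: serial %.3fs, 4 threads %.3fs"
              % (label, serial, pooled))
```

On a typical CPython build one would expect the I/O-style tasks to finish faster in the pool while the CPU-style tasks gain little or nothing, which is the pattern the two explanations above rely on.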
Since there appears to be a need to enable the thread-based worker with
some stores, I've made the flagging of file gets as thread safe
configurable. I've made it experimental because I don't want to formalize
a boolean flag for this option and because this attribute is best
captured against the store implementation. But we don't have a proper
store API for this yet. I'd rather cross this bridge later.
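The idea of gating the thread pool on a configurable "thread safe" flag can be sketched as follows. This is a minimal, self-contained illustration in Python 3 and not Mercurial's worker API: `run_tasks`, the `config` dictionary, and the key `file-gets-thread-safe` are all hypothetical names chosen for the example.

```python
from concurrent.futures import ThreadPoolExecutor


def run_tasks(func, items, threadsafe, threads=4):
    """Apply func to each item, using threads only if the caller opts in."""
    if not threadsafe:
        # Serial fallback: the effect of flagging a task as not thread safe
        # when no other parallel backend is available.
        return [func(item) for item in items]
    with ThreadPoolExecutor(max_workers=threads) as pool:
        return list(pool.map(func, items))


# Hypothetical config mapping; ideally the store implementation would
# advertise this property itself instead of relying on a boolean option.
config = {"file-gets-thread-safe": False}

sizes = run_tasks(len, ["a", "bb", "ccc"], config["file-gets-thread-safe"])
print(sizes)
```

Keeping the decision in one boolean argument means a later store API could supply the value without changing the call site.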
It is possible there are revlog-based repositories that do benefit from
a thread-based worker. I didn't do very comprehensive testing. If there
are, we may want to devise a more robust algorithm for deciding whether to use
the thread-based worker, including possibly config options to limit the
number of threads to use. But until I see evidence that justifies
complexity, simplicity wins.
Differential Revision: https://phab.mercurial-scm.org/D3963
author | Gregory Szorc <gregory.szorc@gmail.com> |
---|---|
date | Wed, 18 Jul 2018 09:49:34 -0700 |
parents | 2912bed9b0c7 |
children | fe28267d5223 |
#!/usr/bin/env python
#
# check-config - a config flag documentation checker for Mercurial
#
# Copyright 2015 Matt Mackall <mpm@selenic.com>
#
# This software may be used and distributed according to the terms of the
# GNU General Public License version 2 or any later version.

from __future__ import absolute_import, print_function
import re
import sys

foundopts = {}
documented = {}
allowinconsistent = set()

configre = re.compile(br'''
    # Function call
    ui\.config(?P<ctype>|int|bool|list)\(
        # First argument.
        ['"](?P<section>\S+)['"],\s*
        # Second argument
        ['"](?P<option>\S+)['"](,\s+
        (?:default=)?(?P<default>\S+?))?
    \)''', re.VERBOSE | re.MULTILINE)

configwithre = re.compile(b'''
    ui\.config(?P<ctype>with)\(
        # First argument is callback function. This doesn't parse robustly
        # if it is e.g. a function call.
        [^,]+,\s*
        ['"](?P<section>\S+)['"],\s*
        ['"](?P<option>\S+)['"](,\s+
        (?:default=)?(?P<default>\S+?))?
    \)''', re.VERBOSE | re.MULTILINE)

configpartialre = (br"""ui\.config""")

ignorere = re.compile(br'''
    \#\s(?P<reason>internal|experimental|deprecated|developer|inconsistent)\s
    config:\s(?P<config>\S+\.\S+)$
    ''', re.VERBOSE | re.MULTILINE)

def main(args):
    for f in args:
        sect = b''
        prevname = b''
        confsect = b''
        carryover = b''
        linenum = 0
        for l in open(f, 'rb'):
            linenum += 1

            # check topic-like bits
            m = re.match(b'\s*``(\S+)``', l)
            if m:
                prevname = m.group(1)
            if re.match(b'^\s*-+$', l):
                sect = prevname
                prevname = b''

            if sect and prevname:
                name = sect + b'.' + prevname
                documented[name] = 1

            # check docstring bits
            m = re.match(br'^\s+\[(\S+)\]', l)
            if m:
                confsect = m.group(1)
                continue
            m = re.match(br'^\s+(?:#\s*)?(\S+) = ', l)
            if m:
                name = confsect + b'.' + m.group(1)
                documented[name] = 1

            # like the bugzilla extension
            m = re.match(br'^\s*(\S+\.\S+)$', l)
            if m:
                documented[m.group(1)] = 1

            # like convert
            m = re.match(br'^\s*:(\S+\.\S+):\s+', l)
            if m:
                documented[m.group(1)] = 1

            # quoted in help or docstrings
            m = re.match(br'.*?``(\S+\.\S+)``', l)
            if m:
                documented[m.group(1)] = 1

            # look for ignore markers
            m = ignorere.search(l)
            if m:
                if m.group('reason') == 'inconsistent':
                    allowinconsistent.add(m.group('config'))
                else:
                    documented[m.group('config')] = 1

            # look for code-like bits
            line = carryover + l
            m = configre.search(line) or configwithre.search(line)
            if m:
                ctype = m.group('ctype')
                if not ctype:
                    ctype = 'str'
                name = m.group('section') + "." + m.group('option')
                default = m.group('default')
                if default in (None, 'False', 'None', '0', '[]', '""', "''"):
                    default = ''
                if re.match(b'[a-z.]+$', default):
                    default = '<variable>'
                if (name in foundopts and (ctype, default) != foundopts[name]
                    and name not in allowinconsistent):
                    print(l.rstrip())
                    print("conflict on %s: %r != %r" % (name, (ctype, default),
                                                       foundopts[name]))
                    print("at %s:%d:" % (f, linenum))
                foundopts[name] = (ctype, default)
                carryover = ''
            else:
                m = re.search(configpartialre, line)
                if m:
                    carryover = line
                else:
                    carryover = ''

    for name in sorted(foundopts):
        if name not in documented:
            if not (name.startswith("devel.") or
                    name.startswith("experimental.") or
                    name.startswith("debug.")):
                ctype, default = foundopts[name]
                if default:
                    default = ' [%s]' % default
                print("undocumented: %s (%s)%s" % (name, ctype, default))

if __name__ == "__main__":
    if len(sys.argv) > 1:
        sys.exit(main(sys.argv[1:]))
    else:
        sys.exit(main([l.rstrip() for l in sys.stdin]))
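
To see what the `configre` pattern in the script extracts from a line of source, the sketch below applies the same verbose regular expression to a few sample byte strings under Python 3. The helper `parse` and the sample lines are assumptions made for this illustration; they are not part of check-config.py.

```python
import re

# Same pattern as configre in the script above, reproduced so that this
# example is self-contained.
configre = re.compile(br'''
    # Function call
    ui\.config(?P<ctype>|int|bool|list)\(
        # First argument.
        ['"](?P<section>\S+)['"],\s*
        # Second argument
        ['"](?P<option>\S+)['"](,\s+
        (?:default=)?(?P<default>\S+?))?
    \)''', re.VERBOSE | re.MULTILINE)


def parse(line):
    """Hypothetical helper: return (name, ctype, default) or None."""
    m = configre.search(line)
    if not m:
        return None
    # An empty ctype means a plain ui.config() call, i.e. a string option.
    ctype = m.group('ctype') or b'str'
    name = m.group('section') + b'.' + m.group('option')
    return name, ctype, m.group('default')


samples = [
    b"if ui.configbool('ui', 'interactive', default=True):",
    b"path = ui.config('paths', 'default')",
    b"print('no config access here')",
]
for sample in samples:
    print(parse(sample))
```

A call with no explicit default yields `None` for the default, which the script then normalises to an empty string before comparing entries in `foundopts`.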