revset: use phasecache.getrevset
This is part of a refactoring that moves some phase query optimization from
revset.py to phases.py. See the previous patch for motivation.
This patch changes revset code to use phasecache.getrevset so it no longer
accesses the private field: _phasecache._phasesets directly.
For performance impact, this patch was tested using the following query, on
my hg-committed repo:
for i in 'public()' 'not public()' 'draft()' 'not draft()'; do
echo $i;
hg perfrevset "$i";
hg perfrevset "$i" --hidden;
done
For the CPython implementation, most operations are unchanged (within
+/- 1%), while "not public()" and "draft()" is noticeably faster on an
unfiltered repo. It may be because the new code avoids a set copy if
filteredrevs is empty.
revset | public() | not public() | draft() | not draft()
hidden | yes | no | yes | no | yes | no | yes | no
------------------------------------------------------------------
before | 19006 | 17352 | 239 | 286 | 180 | 228 | 7690 | 5745
after | 19137 | 17231 | 240 | 207 | 182 | 150 | 7687 | 5658
delta | | -38% | | -52% |
(timed in microseconds)
For the pure Python implementation, some operations are faster while "not
draft()" is noticeably slower:
revset | public() | not public() | draft() | not draft()
hidden | yes | no | yes | no | yes | no | yes | no
------------------------------------------------------------------------
before | 18852 | 17183 | 17758 | 15921 | 17505 | 15973 | 41521 | 39822
after | 18924 | 17380 | 17558 | 14545 | 16727 | 13593 | 48356 | 43992
delta | | -9% | -5% | -15% | +16% | +10%
That may be the different performance characters of generatorset vs.
filteredset. The "not draft()" query could be optimized in this case where
both "public" and "secret" are passed to "getrevsets" so it won't iterate
the whole repo twice.
#!/usr/bin/env python
#
# docchecker - look for problematic markup
#
# Copyright 2016 timeless <timeless@mozdev.org> and others
#
# This software may be used and distributed according to the terms of the
# GNU General Public License version 2 or any later version.
from __future__ import absolute_import, print_function
import re
import sys
leadingline = re.compile(r'(^\s*)(\S.*)$')
checks = [
(r""":hg:`[^`]*'[^`]*`""",
"""warning: please avoid nesting ' in :hg:`...`"""),
(r'\w:hg:`',
'warning: please have a space before :hg:'),
(r"""(?:[^a-z][^'.])hg ([^,;"`]*'(?!hg)){2}""",
'''warning: please use " instead of ' for hg ... "..."'''),
]
def check(line):
messages = []
for match, msg in checks:
if re.search(match, line):
messages.append(msg)
if messages:
print(line)
for msg in messages:
print(msg)
def work(file):
(llead, lline) = ('', '')
for line in file:
# this section unwraps lines
match = leadingline.match(line)
if not match:
check(lline)
(llead, lline) = ('', '')
continue
lead, line = match.group(1), match.group(2)
if (lead == llead):
if (lline != ''):
lline += ' ' + line
else:
lline = line
else:
check(lline)
(llead, lline) = (lead, line)
check(lline)
def main():
for f in sys.argv[1:]:
try:
with open(f) as file:
work(file)
except BaseException as e:
print("failed to process %s: %s" % (f, e))
main()