view tests/test-demandimport.py @ 38732:be4984261611

merge: mark file gets as not thread safe (issue5933)

In default installs, this has the effect of disabling the thread-based worker on Windows when manifesting files in the working directory.

My measurements have shown that with revlog-based repositories, Mercurial spends a lot of CPU time in revlog code resolving file data. This ends up incurring a lot of context switching across threads and slows down `hg update` operations when going from an empty working directory to the tip of the repo.

On mozilla-unified (246,351 files) on an i7-6700K (4+4 CPUs):

before: 487s wall
after:  360s wall (equivalent to worker.enabled=false)
cpus=2: 379s wall

Even with only 2 threads, the thread pool is still slower.

The introduction of the thread-based worker (02b36e860e0b) states that it resulted in a "~50%" speedup for `hg sparse --enable-profile` and `hg sparse --disable-profile`. This disagrees with my measurement above. I theorize a few reasons for this:

1) Removal of files from the working directory is I/O - not CPU - bound and should benefit from a thread pool (unless I/O is insanely fast and the GIL release is near instantaneous). So tests like `hg sparse --enable-profile` may exercise deletion throughput and aren't good benchmarks for worker tasks that are CPU heavy.

2) The patch was authored by someone at Facebook. The results were likely measured against a repository using remotefilelog. And I believe that revision retrieval during working directory updates with remotefilelog will often use a remote store, thus being I/O and not CPU bound. This probably resulted in an overstated performance gain.

Since there appears to be a need to enable the thread-based worker with some stores, I've made the flagging of file gets as thread safe configurable. I've made it experimental because I don't want to formalize a boolean flag for this option and because this attribute is best captured against the store implementation. But we don't have a proper store API for this yet. I'd rather cross this bridge later.

It is possible there are revlog-based repositories that do benefit from a thread-based worker. I didn't do very comprehensive testing. If there are, we may want to devise a more proper algorithm for whether to use the thread-based worker, including possibly config options to limit the number of threads to use. But until I see evidence that justifies complexity, simplicity wins.

Differential Revision: https://phab.mercurial-scm.org/D3963
author Gregory Szorc <gregory.szorc@gmail.com>
date Wed, 18 Jul 2018 09:49:34 -0700
parents 1d0610fdd63b
children dffd6a301570

from __future__ import absolute_import, print_function

from mercurial import demandimport
demandimport.enable()

import os
import subprocess
import sys

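# Only run if demandimport is allowed. hghave exits non-zero when the
# feature is unavailable; exit status 80 tells the test runner to skip.
if subprocess.call(['python', '%s/hghave' % os.environ['TESTDIR'],
                    'demandimport']):
    sys.exit(80)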

if os.name != 'nt':
    try:
        import distutils.msvc9compiler
        print('distutils.msvc9compiler needs to be an immediate '
              'importerror on non-windows platforms')
        distutils.msvc9compiler
    except ImportError:
        pass

import re

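# Keep a direct reference to re.sub: the name 're' is rebound later in
# this test.
rsub = re.sub
def f(obj):
    # Return repr(obj) with memory addresses, file paths and '<...>'
    # pseudo-names masked so the printed output is stable across runs.
    l = repr(obj)
    l = rsub("0x[0-9a-fA-F]+", "0x?", l)
    l = rsub("from '.*'", "from '?'", l)
    l = rsub("'<[a-z]*>'", "'<whatever>'", l)
    return l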

demandimport.disable()
os.environ['HGDEMANDIMPORT'] = 'disable'
# this enable call should not actually enable demandimport!
demandimport.enable()
from mercurial import node
print("node =", f(node))
# now enable it for real
del os.environ['HGDEMANDIMPORT']
demandimport.enable()

# Test access to special attributes through demandmod proxy
from mercurial import error as errorproxy
print("errorproxy =", f(errorproxy))
print("errorproxy.__doc__ = %r"
      % (' '.join(errorproxy.__doc__.split()[:3]) + ' ...'))
print("errorproxy.__name__ = %r" % errorproxy.__name__)
# __name__ must be accessible via __dict__ so the relative imports can be
# resolved
print("errorproxy.__dict__['__name__'] = %r" % errorproxy.__dict__['__name__'])
print("errorproxy =", f(errorproxy))

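# Deliberate re-import: with demandimport enabled, this binds 'os' to a
# lazy proxy whose state is printed before and after first use.
import os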

print("os =", f(os))
print("os.system =", f(os.system))
print("os =", f(os))

from mercurial.utils import procutil

print("procutil =", f(procutil))
print("procutil.system =", f(procutil.system))
print("procutil =", f(procutil))
print("procutil.system =", f(procutil.system))

from mercurial import hgweb
print("hgweb =", f(hgweb))
print("hgweb_mod =", f(hgweb.hgweb_mod))
print("hgweb =", f(hgweb))

import re as fred
print("fred =", f(fred))

import re as remod
print("remod =", f(remod))

import sys as re
print("re =", f(re))

print("fred =", f(fred))
print("fred.sub =", f(fred.sub))
print("fred =", f(fred))

remod.escape  # use remod
print("remod =", f(remod))

print("re =", f(re))
print("re.stderr =", f(re.stderr))
print("re =", f(re))

import contextlib
print("contextlib =", f(contextlib))
try:
    from contextlib import unknownattr
    print('no demandmod should be created for attribute of non-package '
          'module:\ncontextlib.unknownattr =', f(unknownattr))
except ImportError as inst:
    print('contextlib.unknownattr = ImportError: %s'
          % rsub(r"'", '', str(inst)))

from mercurial import util

# Unlike the import statement, the __import__() function should not raise
# ImportError even if fromlist contains an unknown item
# (see Python/import.c:import_module_level() and ensure_fromlist())
contextlibimp = __import__('contextlib', globals(), locals(), ['unknownattr'])
print("__import__('contextlib', ..., ['unknownattr']) =", f(contextlibimp))
print("hasattr(contextlibimp, 'unknownattr') =",
      util.safehasattr(contextlibimp, 'unknownattr'))
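
The fromlist behaviour checked above can also be observed without demandimport. The following is a minimal sketch, assuming a stock CPython interpreter; the helper name fromlist_import is hypothetical and is not part of this test or of Mercurial.

```python
import contextlib


def fromlist_import(name, attr):
    # Hypothetical helper: call the __import__() function directly with a
    # fromlist that names an attribute which may not exist.
    return __import__(name, globals(), locals(), [attr])


# The function form returns the module and leaves the missing name unset.
mod = fromlist_import('contextlib', 'unknownattr')
has_attr = hasattr(mod, 'unknownattr')

# The statement form fails because it must bind the missing name.
try:
    from contextlib import unknownattr
    statement_raised = False
except ImportError:
    statement_raised = True

print("module returned:", mod.__name__)
print("has unknownattr:", has_attr)
print("import statement raised ImportError:", statement_raised)
```

This mirrors the distinction the test relies on: the import statement fails on a missing name, while the function call simply returns the module.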