view hgext/commitextras.py @ 38732:be4984261611

merge: mark file gets as not thread safe (issue5933)

In default installs, this has the effect of disabling the thread-based worker on Windows when manifesting files in the working directory.

My measurements have shown that with revlog-based repositories, Mercurial spends a lot of CPU time in revlog code resolving file data. This ends up incurring a lot of context switching across threads and slows down `hg update` operations when going from an empty working directory to the tip of the repo.

On mozilla-unified (246,351 files) on an i7-6700K (4+4 CPUs):

before: 487s wall
after: 360s wall (equivalent to worker.enabled=false)
cpus=2: 379s wall

Even with only 2 threads, the thread pool is still slower.

The introduction of the thread-based worker (02b36e860e0b) states that it resulted in a "~50%" speedup for `hg sparse --enable-profile` and `hg sparse --disable-profile`. This disagrees with my measurement above. I theorize a few reasons for this:

1) Removal of files from the working directory is I/O - not CPU - bound and should benefit from a thread pool (unless I/O is insanely fast and the GIL release is near instantaneous). So tests like `hg sparse --enable-profile` may exercise deletion throughput and aren't good benchmarks for worker tasks that are CPU heavy.

2) The patch was authored by someone at Facebook. The results were likely measured against a repository using remotefilelog. And I believe that revision retrieval during working directory updates with remotefilelog will often use a remote store, thus being I/O and not CPU bound. This probably resulted in an overstated performance gain.

Since there appears to be a need to enable the thread-based worker with some stores, I've made the flagging of file gets as thread safe configurable. I've made it experimental because I don't want to formalize a boolean flag for this option and because this attribute is best captured against the store implementation. But we don't have a proper store API for this yet. I'd rather cross this bridge later.

It is possible there are revlog-based repositories that do benefit from a thread-based worker. I didn't do very comprehensive testing. If there are, we may want to devise a more proper algorithm for whether to use the thread-based worker, including possibly config options to limit the number of threads to use. But until I see evidence that justifies complexity, simplicity wins.

Differential Revision: https://phab.mercurial-scm.org/D3963
author Gregory Szorc <gregory.szorc@gmail.com>
date Wed, 18 Jul 2018 09:49:34 -0700
parents 75c76cee1b1b
children 1cb7c9777852

# commitextras.py
#
# Copyright 2013 Facebook, Inc.
#
# This software may be used and distributed according to the terms of the
# GNU General Public License version 2 or any later version.

'''adds a new flag extras to commit (ADVANCED)'''

from __future__ import absolute_import

import re

from mercurial.i18n import _
from mercurial import (
    commands,
    error,
    extensions,
    registrar,
)

cmdtable = {}
command = registrar.command(cmdtable)
testedwith = 'ships-with-hg-core'

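# Extra keys reserved for internal use by Mercurial and its extensions;
# --extra refuses to set them.
usedinternally = {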
    'amend_source',
    'branch',
    'close',
    'histedit_source',
    'topic',
    'rebase_source',
    'intermediate-source',
    '__touch-noise__',
    'source',
    'transplant_source',
}

def extsetup(ui):
    entry = extensions.wrapcommand(commands.table, 'commit', _commit)
    options = entry[1]
    options.append(('', 'extra', [],
        _('set a changeset\'s extra values'), _("KEY=VALUE")))

def _commit(orig, ui, repo, *pats, **opts):
    origcommit = repo.commit
    try:
        def _wrappedcommit(*innerpats, **inneropts):
            extras = opts.get(r'extra')
            if extras:
                for raw in extras:
                    if '=' not in raw:
                        msg = _("unable to parse '%s', should follow "
                                "KEY=VALUE format")
                        raise error.Abort(msg % raw)
                    k, v = raw.split('=', 1)
                    if not k:
                        msg = _("unable to parse '%s', keys can't be empty")
                        raise error.Abort(msg % raw)
                    if re.search(r'[^\w-]', k):
                        msg = _("keys can only contain ascii letters, digits,"
                                " '_' and '-'")
                        raise error.Abort(msg)
                    if k in usedinternally:
                        msg = _("key '%s' is used internally, can't be set "
                                "manually")
                        raise error.Abort(msg % k)
                    inneropts[r'extra'][k] = v
            return origcommit(*innerpats, **inneropts)

        # This __dict__ logic is needed because the normal
        # extensions.wrapfunction doesn't seem to work.
        repo.__dict__[r'commit'] = _wrappedcommit
        return orig(ui, repo, *pats, **opts)
    finally:
        del repo.__dict__[r'commit']
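
The KEY=VALUE checks performed inside `_wrappedcommit` can be tried in isolation, without a Mercurial install. The following is a minimal sketch rather than part of the extension: the names `parse_extras` and `RESERVED_KEYS` are hypothetical, `ValueError` stands in for `error.Abort`, and the `re.ASCII` flag is an assumption added so that `\w` matches only the ASCII characters named in the error message when run under Python 3.

```python
import re

# Copied from the usedinternally set in the listing above.
RESERVED_KEYS = {
    'amend_source',
    'branch',
    'close',
    'histedit_source',
    'topic',
    'rebase_source',
    'intermediate-source',
    '__touch-noise__',
    'source',
    'transplant_source',
}

# Matches any character that is not an ASCII letter, digit, '_' or '-'.
_INVALID_KEY_CHAR = re.compile(r'[^\w-]', re.ASCII)


def parse_extras(raw_items, reserved=RESERVED_KEYS):
    """Validate KEY=VALUE strings and return them as a dict.

    Hypothetical helper mirroring the checks in _wrappedcommit;
    raises ValueError where the extension raises error.Abort.
    """
    extras = {}
    for raw in raw_items:
        if '=' not in raw:
            raise ValueError(
                "unable to parse '%s', should follow KEY=VALUE format" % raw)
        # Split on the first '=' only, so values may contain '='.
        key, value = raw.split('=', 1)
        if not key:
            raise ValueError(
                "unable to parse '%s', keys can't be empty" % raw)
        if _INVALID_KEY_CHAR.search(key):
            raise ValueError(
                "keys can only contain ascii letters, digits, '_' and '-'")
        if key in reserved:
            raise ValueError(
                "key '%s' is used internally, can't be set manually" % key)
        extras[key] = value
    return extras
```

With the extension enabled, the same rules apply to each `--extra KEY=VALUE` argument passed to `hg commit`: the value may contain further `=` characters, while an empty key, a key with other characters, or a reserved key aborts the commit.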