view tests/sshprotoext.py @ 38732:be4984261611
merge: mark file gets as not thread safe (issue5933)
In default installs, this has the effect of disabling the thread-based
worker on Windows when manifesting files in the working directory. My
measurements have shown that with revlog-based repositories, Mercurial
spends a lot of CPU time in revlog code resolving file data. This ends
up incurring a lot of context switching across threads and slows down
`hg update` operations when going from an empty working directory to
the tip of the repo.
On mozilla-unified (246,351 files) on an i7-6700K (4+4 CPUs):
before: 487s wall
after: 360s wall (equivalent to worker.enabled=false)
cpus=2: 379s wall
Even with only 2 threads, the thread pool is still slower.
The introduction of the thread-based worker (02b36e860e0b) states that
it resulted in a "~50%" speedup for `hg sparse --enable-profile` and
`hg sparse --disable-profile`. This disagrees with my measurement
above. I theorize a few reasons for this:
1) Removal of files from the working directory is I/O - not CPU - bound
and should benefit from a thread pool (unless I/O is insanely fast
and the GIL release is near instantaneous). So tests like `hg sparse
--enable-profile` may exercise deletion throughput and aren't good
benchmarks for worker tasks that are CPU heavy.
2) The patch was authored by someone at Facebook. The results were
likely measured against a repository using remotefilelog. And I
believe that revision retrieval during working directory updates with
remotefilelog will often use a remote store, thus being I/O and not
CPU bound. This probably resulted in an overstated performance gain.
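The I/O-bound versus CPU-bound distinction in the two points above can be illustrated with a small, self-contained sketch. It is not Mercurial's worker code: `cpu_task`, `io_task`, and `timed` are hypothetical names, the loop is only a stand-in for revlog resolution, and `time.sleep` is only a stand-in for file or network I/O. It assumes CPython with the GIL enabled. No timings are claimed here, since they depend on the machine.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def cpu_task(n):
    """Hypothetical stand-in for CPU-heavy work (holds the GIL)."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def io_task(delay):
    """Hypothetical stand-in for I/O-bound work; sleeping releases the GIL."""
    time.sleep(delay)
    return delay


def timed(func, args, workers):
    """Run func over args serially (workers == 1) or in a thread pool.

    Returns the list of results and the elapsed wall time in seconds.
    """
    start = time.perf_counter()
    if workers == 1:
        results = [func(a) for a in args]
    else:
        with ThreadPoolExecutor(max_workers=workers) as pool:
            results = list(pool.map(func, args))
    return results, time.perf_counter() - start


if __name__ == "__main__":
    workloads = (
        ("cpu", cpu_task, [200000] * 8),
        ("io", io_task, [0.05] * 8),
    )
    for name, func, args in workloads:
        for workers in (1, 4):
            _, elapsed = timed(func, args, workers)
            print("%s workers=%d: %.3fs" % (name, workers, elapsed))
```

Running the script prints wall times for both workloads with one and four workers, which gives a rough way to see whether a given kind of task benefits from threads on a particular machine.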
Since there appears to be a need to enable the thread-based worker with
some stores, I've made the flagging of file gets as thread safe
configurable. I've made it experimental because I don't want to formalize
a boolean flag for this option and because this attribute is best
captured against the store implementation. But we don't have a proper
store API for this yet. I'd rather cross this bridge later.
It is possible there are revlog-based repositories that do benefit from
a thread-based worker. I didn't do very comprehensive testing. If there
are, we may want to devise a more proper algorithm for whether to use
the thread-based worker, including possibly config options to limit the
number of threads to use. But until I see evidence that justifies
complexity, simplicity wins.
Differential Revision: https://phab.mercurial-scm.org/D3963
author | Gregory Szorc <gregory.szorc@gmail.com> |
---|---|
date | Wed, 18 Jul 2018 09:49:34 -0700 |
parents | b4d85bc122bd |
children | 2372284d9457 |
# sshprotoext.py - Extension to test behavior of SSH protocol
#
# Copyright 2018 Gregory Szorc <gregory.szorc@gmail.com>
#
# This software may be used and distributed according to the terms of the
# GNU General Public License version 2 or any later version.

# This extension replaces the SSH server started via `hg serve --stdio`.
# The server behaves differently depending on environment variables.

from __future__ import absolute_import

from mercurial import (
    error,
    extensions,
    registrar,
    sshpeer,
    wireprotoserver,
    wireprotov1server,
)

configtable = {}
configitem = registrar.configitem(configtable)

configitem(b'sshpeer', b'mode', default=None)
configitem(b'sshpeer', b'handshake-mode', default=None)

class bannerserver(wireprotoserver.sshserver):
    """Server that sends a banner to stdout."""
    def serve_forever(self):
        for i in range(10):
            self._fout.write(b'banner: line %d\n' % i)

        super(bannerserver, self).serve_forever()

class prehelloserver(wireprotoserver.sshserver):
    """Tests behavior when connecting to <0.9.1 servers.

    The ``hello`` wire protocol command was introduced in Mercurial
    0.9.1. Modern clients send the ``hello`` command when connecting
    to SSH servers. This mock server tests behavior of the handshake
    when ``hello`` is not supported.
    """
    def serve_forever(self):
        l = self._fin.readline()
        assert l == b'hello\n'
        # Respond to unknown commands with an empty reply.
        wireprotoserver._sshv1respondbytes(self._fout, b'')
        l = self._fin.readline()
        assert l == b'between\n'
        proto = wireprotoserver.sshv1protocolhandler(self._ui, self._fin,
                                                     self._fout)
        rsp = wireprotov1server.dispatch(self._repo, proto, b'between')
        wireprotoserver._sshv1respondbytes(self._fout, rsp.data)

        super(prehelloserver, self).serve_forever()

def performhandshake(orig, ui, stdin, stdout, stderr):
    """Wrapped version of sshpeer._performhandshake to send extra commands."""
    mode = ui.config(b'sshpeer', b'handshake-mode')
    if mode == b'pre-no-args':
        ui.debug(b'sending no-args command\n')
        stdin.write(b'no-args\n')
        stdin.flush()
        return orig(ui, stdin, stdout, stderr)
    elif mode == b'pre-multiple-no-args':
        ui.debug(b'sending unknown1 command\n')
        stdin.write(b'unknown1\n')
        ui.debug(b'sending unknown2 command\n')
        stdin.write(b'unknown2\n')
        ui.debug(b'sending unknown3 command\n')
        stdin.write(b'unknown3\n')
        stdin.flush()
        return orig(ui, stdin, stdout, stderr)
    else:
        raise error.ProgrammingError(b'unknown HANDSHAKECOMMANDMODE: %s' %
                                     mode)

def extsetup(ui):
    # It's easier for tests to define the server behavior via environment
    # variables than config options. This is because `hg serve --stdio`
    # has to be invoked with a certain form for security reasons and
    # `dummyssh` can't just add `--config` flags to the command line.
    servermode = ui.environ.get(b'SSHSERVERMODE')

    if servermode == b'banner':
        wireprotoserver.sshserver = bannerserver
    elif servermode == b'no-hello':
        wireprotoserver.sshserver = prehelloserver
    elif servermode:
        raise error.ProgrammingError(b'unknown server mode: %s' % servermode)

    peermode = ui.config(b'sshpeer', b'mode')

    if peermode == b'extra-handshake-commands':
        extensions.wrapfunction(sshpeer, '_performhandshake', performhandshake)
    elif peermode:
        raise error.ProgrammingError(b'unknown peer mode: %s' % peermode)
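
The "empty reply" that `prehelloserver` sends for the unsupported `hello` command can be sketched without importing Mercurial. The sketch below assumes that version 1 replies are framed as a decimal length, a newline, and then the payload. The helpers `respond_bytes`, `read_reply`, and `mock_prehello_exchange` are hypothetical names written for illustration and are not part of `wireprotoserver`.

```python
import io


def respond_bytes(fout, value):
    """Write a length-prefixed reply: decimal length, newline, payload.

    Assumption: this mirrors the framing of SSH protocol version 1 replies.
    """
    fout.write(b'%d\n' % len(value))
    fout.write(value)
    fout.flush()


def read_reply(fin):
    """Read one reply that was framed by respond_bytes()."""
    length = int(fin.readline().strip())
    return fin.read(length)


def mock_prehello_exchange(client_bytes):
    """Replay the first step of the pre-hello mock server.

    The client is expected to open with ``hello``; a server that does not
    know the command answers with an empty (zero-length) reply.
    """
    fin = io.BytesIO(client_bytes)
    fout = io.BytesIO()
    command = fin.readline()
    assert command == b'hello\n'
    respond_bytes(fout, b'')
    return fout.getvalue()


if __name__ == "__main__":
    # Show the raw bytes the mock server would write for the hello command.
    print(mock_prehello_exchange(b'hello\nbetween\n'))
```

Under that framing assumption, a client that reads a zero-length reply to `hello` learns that the server did not recognize the command, which is the situation the mock server is meant to reproduce.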