view tests/test-tools.t @ 38732:be4984261611
merge: mark file gets as not thread safe (issue5933)

In default installs, this has the effect of disabling the thread-based
worker on Windows when manifesting files in the working directory. My
measurements have shown that with revlog-based repositories, Mercurial
spends a lot of CPU time in revlog code resolving file data. This ends
up incurring a lot of context switching across threads and slows down
`hg update` operations when going from an empty working directory to
the tip of the repo.

On mozilla-unified (246,351 files) on an i7-6700K (4+4 CPUs):

  before: 487s wall
  after: 360s wall (equivalent to worker.enabled=false)
  cpus=2: 379s wall

Even with only 2 threads, the thread pool is still slower.
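As a quick check on the figures above, the relative wall-time changes can be worked out directly. This is only arithmetic on the three measurements quoted in this message; the helper name `pct_less` is made up for illustration.

```python
# Wall-clock times quoted above, in seconds.
before = 487    # thread-based worker with the default thread count
after = 360     # file gets marked as not thread safe
two_cpus = 379  # thread-based worker limited to 2 threads


def pct_less(old, new):
    """Return how much less wall time `new` takes than `old`, in percent."""
    return (old - new) / old * 100.0


print("after vs before:  %.1f%% less wall time" % pct_less(before, after))
print("cpus=2 vs before: %.1f%% less wall time" % pct_less(before, two_cpus))
# A negative value here means cpus=2 is slower than the "after" run.
print("cpus=2 vs after:  %.1f%% less wall time" % pct_less(after, two_cpus))
```

The last line comes out negative, which is the same observation as the sentence above: two threads still lose to no thread pool at all.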
The introduction of the thread-based worker (02b36e860e0b) states that
it resulted in a "~50%" speedup for `hg sparse --enable-profile` and
`hg sparse --disable-profile`. This disagrees with my measurement
above. I theorize a few reasons for this:

1) Removal of files from the working directory is I/O - not CPU - bound
   and should benefit from a thread pool (unless I/O is insanely fast
   and the GIL release is near instantaneous). So tests like `hg sparse
   --enable-profile` may exercise deletion throughput and aren't good
   benchmarks for worker tasks that are CPU heavy.

2) The patch was authored by someone at Facebook. The results were
   likely measured against a repository using remotefilelog. And I
   believe that revision retrieval during working directory updates with
   remotefilelog will often use a remote store, thus being I/O and not
   CPU bound. This probably resulted in an overstated performance gain.
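The distinction drawn in the two points above can be sketched with Python's standard library. This is a toy illustration, not Mercurial's worker code: `cpu_task` stands in for CPU-heavy revlog resolution and `io_task` for a blocking operation such as a file deletion or a remote fetch. Both names, the `timed` helper, and the workload sizes are assumptions made for the example.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def cpu_task(n):
    """Pure-Python arithmetic; the thread holds the GIL while computing."""
    total = 0
    for i in range(n):
        total += i * i
    return total


def io_task(delay):
    """time.sleep() releases the GIL, much like a blocking system call."""
    time.sleep(delay)
    return delay


def timed(fn, items, threads):
    """Run fn over items serially (threads=0) or on a thread pool.

    Returns (results, elapsed_seconds).
    """
    start = time.perf_counter()
    if threads:
        with ThreadPoolExecutor(max_workers=threads) as pool:
            results = list(pool.map(fn, items))
    else:
        results = [fn(item) for item in items]
    return results, time.perf_counter() - start


if __name__ == "__main__":
    workloads = [
        ("cpu", cpu_task, [200000] * 8),
        ("io", io_task, [0.05] * 8),
    ]
    for name, fn, items in workloads:
        _, serial = timed(fn, items, 0)
        _, pooled = timed(fn, items, 4)
        print("%s: serial %.3fs, 4 threads %.3fs" % (name, serial, pooled))
```

On a stock CPython build the `cpu` row typically shows little or no gain from the pool, while the `io` row finishes much faster with threads. Exact numbers depend on the machine and interpreter.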
Since there appears to be a need to enable the thread-based worker with
some stores, I've made the flagging of file gets as thread safe
configurable. I've made it experimental because I don't want to formalize
a boolean flag for this option and because this attribute is best
captured against the store implementation. But we don't have a proper
store API for this yet. I'd rather cross this bridge later.
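One way to picture the effect of such a flag is the decision sketch below. It is not the code from this change: `pick_worker`, its parameters, and the returned strings are hypothetical. The assumption that non-Windows platforms use a process-based worker that ignores the flag is also not stated in this message.

```python
def pick_worker(platform, threadsafe, worker_enabled=True):
    """Return "process", "thread", or "serial" for a batch of tasks.

    Hypothetical decision logic for illustration only.
    """
    if not worker_enabled:
        # Equivalent in spirit to worker.enabled=false: no parallelism.
        return "serial"
    if platform != "windows":
        # Assumption: a process-based worker is available here, so the
        # thread safety of the tasks does not matter.
        return "process"
    # On Windows only the thread-based worker is considered. Tasks that
    # are flagged as not thread safe fall back to running serially.
    return "thread" if threadsafe else "serial"


# File gets flagged as not thread safe: Windows falls back to serial.
print(pick_worker("windows", threadsafe=False))
# Flag flipped via configuration, e.g. for a store that is I/O bound.
print(pick_worker("windows", threadsafe=True))
```

Keeping the decision in one boolean mirrors the "simplicity wins" stance of this message. A per-store attribute would replace the `threadsafe` argument once a store API exists.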
It is possible there are revlog-based repositories that do benefit from
a thread-based worker. I didn't do very comprehensive testing. If there
are, we may want to devise a more proper algorithm for whether to use
the thread-based worker, including possibly config options to limit the
number of threads to use. But until I see evidence that justifies
complexity, simplicity wins.

Differential Revision: https://phab.mercurial-scm.org/D3963
author | Gregory Szorc <gregory.szorc@gmail.com> |
---|---|
date | Wed, 18 Jul 2018 09:49:34 -0700 |
parents | c1f7037c2ded |
children | 5abc47d4ca6b |
Tests of the file helper tool

  $ f -h
  ?sage: f [options] [filenames] (glob)
  
  ?ptions: (glob)
    -h, --help            show this help message and exit
    -t, --type            show file type (file or directory)
    -m, --mode            show file mode
    -l, --links           show number of links
    -s, --size            show size of file
    -n NEWER, --newer=NEWER
                          check if file is newer (or same)
    -r, --recurse         recurse into directories
    -S, --sha1            show sha1 hash of the content
    --sha256              show sha256 hash of the content
    -M, --md5             show md5 hash of the content
    -D, --dump            dump file content
    -H, --hexdump         hexdump file content
    -B BYTES, --bytes=BYTES
                          number of characters to dump
    -L LINES, --lines=LINES
                          number of lines to dump
    -q, --quiet           no default output

  $ mkdir dir
  $ cd dir

  $ f --size
  size=0

  $ echo hello | f --md5 --size
  size=6, md5=b1946ac92492d2347c6235b4d2611184

  $ f foo
  foo: file not found

  $ echo foo > foo
  $ f foo
  foo:

  $ f --sha1 foo
  foo: sha1=f1d2d2f924e986ac86fdf7b36c94bcdf32beec15

  $ f --sha256 foo
  foo: sha256=b5bb9d8014a0f9b1d61e21e796d78dccdf1352f23cd32812f4850b878ae4944c

#if symlink
  $ f foo --mode
  foo: mode=644
#endif

#if no-windows
  $ $PYTHON $TESTDIR/seq.py 10 > bar
#else
Convert CRLF -> LF for consistency
  $ $PYTHON $TESTDIR/seq.py 10 | sed "s/$//" > bar
#endif

#if unix-permissions symlink
  $ chmod +x bar
  $ f bar --newer foo --mode --type --size --dump --links --bytes 7
  bar: file, size=21, mode=755, links=1, newer than foo
  >>>
  1
  2
  3
  4
  <<< no trailing newline
#endif

#if unix-permissions
  $ ln bar baz
  $ f bar -n baz -l --hexdump -t --sha1 --lines=9 -B 20
  bar: file, links=2, newer than baz, sha1=612ca68d0305c821750a
  0000: 31 0a 32 0a 33 0a 34 0a 35 0a 36 0a 37 0a 38 0a |1.2.3.4.5.6.7.8.|
  0010: 39 0a                                           |9.|
  $ rm baz
#endif

#if unix-permissions symlink
  $ ln -s yadda l
  $ f . --recurse -MStmsB4
  .: directory with 3 files, mode=755
  ./bar: file, size=21, mode=755, md5=3b03, sha1=612c
  ./foo: file, size=4, mode=644, md5=d3b0, sha1=f1d2
  ./l: link, size=5, md5=2faa, sha1=af93
#endif

  $ f --quiet bar -DL 3
  1
  2
  3
  $ cd ..

Yadda is a symlink
  $ f -qr dir -HB 17
  dir: directory with 3 files (symlink !)
  dir: directory with 2 files (no-symlink !)
  dir/bar:
  0000: 31 0a 32 0a 33 0a 34 0a 35 0a 36 0a 37 0a 38 0a |1.2.3.4.5.6.7.8.|
  0010: 39                                              |9|
  dir/foo:
  0000: 66 6f 6f 0a                                     |foo.|
  dir/l: (symlink !)
  0000: 79 61 64 64 61                                  |yadda| (symlink !)
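The sizes and digests expected by the test can be cross-checked independently of the `f` helper. The sketch below uses only Python's `hashlib`. It assumes that `echo` appends a single LF and that `seq.py 10` emits the integers 1 through 10, one per LF-terminated line; the second assumption is inferred from the `size=21` and hexdump output, not taken from the script itself.

```python
import hashlib

hello = b"hello\n"  # what `echo hello` writes to the pipe
foo = b"foo\n"      # what `echo foo > foo` writes to the file
# Assumption: seq.py 10 prints 1..10, one number per LF-terminated line.
bar = b"".join(b"%d\n" % i for i in range(1, 11))

print("size=%d, md5=%s" % (len(hello), hashlib.md5(hello).hexdigest()))
print("foo: sha1=%s" % hashlib.sha1(foo).hexdigest())
print("foo: sha256=%s" % hashlib.sha256(foo).hexdigest())
print("bar: size=%d" % len(bar))
# The first 7 bytes correspond to the `--dump --bytes 7` output.
print(repr(bar[:7]))
```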