perf: add threading capability to perfbdiff
Since we release the GIL during diffing, it is interesting to see how a
thread pool performs on that workload. We add a new `--threads` argument to
the command. Synchronizing the thread pool is a bit complex because we want
to reuse the same pool from one run to the next.
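
To illustrate the idea of a pool that is reused across runs, here is a
minimal sketch. It is not the code from perfbdiff: the names `start_pool`,
`run_once` and `stop_pool` are hypothetical, and `zlib.compress` is only a
stand-in for the diff call. Workers stay blocked on the queue between runs,
so each timed run only pays for enqueueing work and waiting for completion.

```python
import queue
import threading
import zlib


def _worker(q, results):
    """Process items until a None sentinel arrives."""
    while True:
        item = q.get()
        try:
            if item is None:
                # Sentinel: leave the loop so the thread can be joined.
                return
            # Placeholder workload standing in for the real diff call.
            results.append(len(zlib.compress(item)))
        finally:
            # Mark every item (including the sentinel) as handled so that
            # q.join() in the main thread can return.
            q.task_done()


def start_pool(threads):
    """Start `threads` workers sharing one queue and one result list."""
    q = queue.Queue()
    results = []
    workers = [
        threading.Thread(target=_worker, args=(q, results))
        for _ in range(threads)
    ]
    for w in workers:
        w.start()
    return q, results, workers


def run_once(q, items):
    """One benchmark run: enqueue all items, then wait for completion."""
    for item in items:
        q.put(item)
    q.join()


def stop_pool(q, workers):
    """Send one sentinel per worker and wait for all threads to exit."""
    for _ in workers:
        q.put(None)
    for w in workers:
        w.join()
```

A caller would invoke `start_pool` once, call `run_once` for each timed
iteration, and call `stop_pool` at the end.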
On my computer (i7 with 4 cores + hyperthreading), I get the following data for
about 12000 revisions:
threads  wall       comb   wall gain  comb overhead
none     31.596715  31.59  0.00%      0.00%
1        31.621228  31.62  -0.08%     0.09%
2        16.406202  32.8   48.08%     3.83%
3        11.598334  34.76  63.29%     10.03%
4        9.205421   36.77  70.87%     16.40%
5        8.517604   42.51  73.04%     34.57%
6        7.94645    47.58  74.85%     50.62%
7        7.434972   51.92  76.47%     64.36%
8        7.070638   55.34  77.62%     75.18%
Compared to the feature disabled (threads=0, the "none" row), the threading
code adds negligible overhead with a single thread (threads=1), and the
wall-time gain is already 48% with two threads.
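
The two percentage columns appear to be derived from the "none" baseline.
The sketch below reproduces them under that assumption, taking "wall gain" as
the relative reduction in wall time and "comb overhead" as the relative
increase in "comb" time (assumed here to be combined user and system CPU
time). The function names are made up for illustration.

```python
# Baseline values from the "none" row of the table.
BASE_WALL = 31.596715
BASE_COMB = 31.59


def wall_gain(wall):
    """Percentage of wall time saved relative to the baseline."""
    return (1 - wall / BASE_WALL) * 100


def comb_overhead(comb):
    """Percentage of extra combined time relative to the baseline."""
    return (comb / BASE_COMB - 1) * 100


# Row for threads=2.
print("%.2f%% %.2f%%" % (wall_gain(16.406202), comb_overhead(32.8)))
# Row for threads=8.
print("%.2f%% %.2f%%" % (wall_gain(7.070638), comb_overhead(55.34)))
```

Read this way, going from 4 to 8 threads buys about 7 more points of wall
gain while the combined overhead grows from about 16% to about 75%.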
create verbosemmap.py
$ cat << EOF > verbosemmap.py
> # extension to make util.mmapread verbose
>
> from __future__ import absolute_import
>
> from mercurial import (
>     extensions,
>     pycompat,
>     util,
> )
>
> def extsetup(ui):
>     def mmapread(orig, fp):
>         ui.write(b"mmapping %s\n" % pycompat.bytestr(fp.name))
>         ui.flush()
>         return orig(fp)
>
>     extensions.wrapfunction(util, 'mmapread', mmapread)
> EOF
setting up base repo
$ hg init a
$ cd a
$ touch a
$ hg add a
$ hg commit -qm base
$ for i in `$TESTDIR/seq.py 1 100` ; do
>   echo $i > a
>   hg commit -qm $i
> done
set up verbosemmap extension
$ cat << EOF >> $HGRCPATH
> [extensions]
> verbosemmap=$TESTTMP/verbosemmap.py
> EOF
mmap index which is now more than 4k long
$ hg log -l 5 -T '{rev}\n' --config experimental.mmapindexthreshold=4k
mmapping $TESTTMP/a/.hg/store/00changelog.i
100
99
98
97
96
do not mmap index which is still less than 32k
$ hg log -l 5 -T '{rev}\n' --config experimental.mmapindexthreshold=32k
100
99
98
97
96
$ cd ..
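
The behaviour exercised above (mmap the index only once it reaches a size
threshold) can be sketched in plain Python as follows. This is an
illustration, not Mercurial's implementation: `read_index` is a hypothetical
helper, and the `>=` comparison is an assumption about how the threshold is
applied.

```python
import mmap
import os


def read_index(path, threshold=None):
    """Return the file content, memory-mapped when it is large enough.

    `threshold` is a size in bytes; None disables mmap entirely.
    """
    with open(path, "rb") as fp:
        size = os.fstat(fp.fileno()).st_size
        # An empty file cannot be mapped, so always read it normally.
        if threshold is not None and size > 0 and size >= threshold:
            # Length 0 maps the whole file; the mapping stays valid after
            # the file object is closed.
            return mmap.mmap(fp.fileno(), 0, access=mmap.ACCESS_READ)
        return fp.read()
```

With a 5000-byte file, a 4096-byte threshold would take the mmap branch and
a 32768-byte threshold would fall back to a plain read.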