view tests/test-lfs-bundle.t @ 38732:be4984261611
merge: mark file gets as not thread safe (issue5933)
In default installs, this has the effect of disabling the thread-based
worker on Windows when manifesting files in the working directory. My
measurements have shown that with revlog-based repositories, Mercurial
spends a lot of CPU time in revlog code resolving file data. This ends
up incurring a lot of context switching across threads and slows down
`hg update` operations when going from an empty working directory to
the tip of the repo.
On mozilla-unified (246,351 files) on an i7-6700K (4+4 CPUs):
  before: 487s wall
  after:  360s wall (equivalent to worker.enabled=false)
  cpus=2: 379s wall
Even with only 2 threads, the thread pool is still slower.
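A rough way to take this kind of wall-clock measurement is sketched below. It is not the script used for the numbers above. `wall_seconds` is a hypothetical helper, `$REPO` is a placeholder for a local clone, and mapping "cpus=2" to `worker.numcpus=2` is an assumption about how that run was configured.

```shell
# Sketch only: wall_seconds is a hypothetical helper, not part of Mercurial.
# It runs the given command, discards its output, and prints the elapsed
# wall-clock time in whole seconds.
wall_seconds() {
  start=$(date +%s)
  "$@" >/dev/null 2>&1
  end=$(date +%s)
  echo $((end - start))
}

# Intended use (assumes $REPO points at a local clone; not run here):
#   hg -R "$REPO" update -q null     # empty the working directory first
#   wall_seconds hg -R "$REPO" update -q tip
#   hg -R "$REPO" update -q null
#   wall_seconds hg -R "$REPO" --config worker.enabled=false update -q tip
#   hg -R "$REPO" update -q null
#   wall_seconds hg -R "$REPO" --config worker.numcpus=2 update -q tip
```

Resetting to `null` before each run keeps every measurement on the same "empty working directory to tip" path. Whole-second resolution is coarse, but the differences reported here are on the order of a minute or more.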
The introduction of the thread-based worker (02b36e860e0b) states that
it resulted in a "~50%" speedup for `hg sparse --enable-profile` and
`hg sparse --disable-profile`. This disagrees with my measurement
above. I theorize a few reasons for this:
1) Removal of files from the working directory is I/O - not CPU - bound
and should benefit from a thread pool (unless I/O is insanely fast
and the GIL release is near instantaneous). So tests like `hg sparse
--enable-profile` may exercise deletion throughput and aren't good
benchmarks for worker tasks that are CPU heavy.
2) The patch was authored by someone at Facebook. The results were
likely measured against a repository using remotefilelog. And I
believe that revision retrieval during working directory updates with
remotefilelog will often use a remote store, thus being I/O and not
CPU bound. This probably resulted in an overstated performance gain.
Since there appears to be a need to enable the thread-based worker with
some stores, I've made the flagging of file gets as thread safe
configurable. I've made it experimental because I don't want to formalize
a boolean flag for this option and because this attribute is best
captured against the store implementation. But we don't have a proper
store API for this yet. I'd rather cross this bridge later.
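For stores that do benefit from threads, opting back in might look like the sketch below. The message above does not name the option, so `experimental.worker.wdir-get-thread-safe` is an assumption that should be checked against `mercurial/configitems.py` at this revision. `enable_threadsafe_gets` is a hypothetical helper.

```shell
# Sketch only: enable_threadsafe_gets is a hypothetical helper.
# The option name below is an assumption; confirm it against
# mercurial/configitems.py for your Mercurial version.
enable_threadsafe_gets() {
  hgrc=$1
  # Append the experimental setting to the given hgrc file.
  cat >> "$hgrc" <<EOF
[experimental]
worker.wdir-get-thread-safe = true
EOF
}

# Intended use, from the root of a repository:
#   enable_threadsafe_gets .hg/hgrc
```

Because the setting is experimental, it may be renamed or removed once a store-level API exists, so it is safer to scope it to a single repository than to set it globally.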
It is possible there are revlog-based repositories that do benefit from
a thread-based worker. I didn't do very comprehensive testing. If there
are, we may want to devise a better heuristic for deciding whether to use
the thread-based worker, including possibly config options to limit the
number of threads to use. But until I see evidence that justifies
complexity, simplicity wins.
Differential Revision: https://phab.mercurial-scm.org/D3963
author | Gregory Szorc <gregory.szorc@gmail.com> |
---|---|
date | Wed, 18 Jul 2018 09:49:34 -0700 |
parents | 556984ae0005 |
children | ca82929e433d |
In this test, we want to test LFS bundle application on both LFS and non-LFS
repos.

To make it more interesting, the file revisions will contain hg filelog
metadata ('\1\n'). The bundle will have 1 file revision overlapping with the
destination repo.

#  rev      1          2         3
#  repo:    yes        yes       no
#  bundle:  no (base)  yes       yes (deltabase: 2 if possible)

It is interesting because rev 2 could have been stored as LFS in the repo, and
non-LFS in the bundle; or vice-versa.

Init

  $ cat >> $HGRCPATH << EOF
  > [extensions]
  > lfs=
  > drawdag=$TESTDIR/drawdag.py
  > [lfs]
  > url=file:$TESTTMP/lfs-remote
  > EOF

Helper functions

  $ commitxy() {
  > hg debugdrawdag "$@" <<'EOS'
  >  Y  # Y/X=\1\nAAAA\nE\nF
  >  |  # Y/Y=\1\nAAAA\nG\nH
  >  X  # X/X=\1\nAAAA\nC\n
  >     # X/Y=\1\nAAAA\nD\n
  > EOS
  > }

  $ commitz() {
  > hg debugdrawdag "$@" <<'EOS'
  >  Z  # Z/X=\1\nAAAA\nI\n
  >  |  # Z/Y=\1\nAAAA\nJ\n
  >  |  # Z/Z=\1\nZ
  >  Y
  > EOS
  > }

  $ enablelfs() {
  >   cat >> .hg/hgrc <<EOF
  > [lfs]
  > track=all()
  > EOF
  > }

Generate bundles

  $ for i in normal lfs; do
  >   NAME=src-$i
  >   hg init $TESTTMP/$NAME
  >   cd $TESTTMP/$NAME
  >   [ $i = lfs ] && enablelfs
  >   commitxy
  >   commitz
  >   hg bundle -q --base X -r Y+Z $TESTTMP/$NAME.bundle
  >   SRCNAMES="$SRCNAMES $NAME"
  > done

Prepare destination repos

  $ for i in normal lfs; do
  >   NAME=dst-$i
  >   hg init $TESTTMP/$NAME
  >   cd $TESTTMP/$NAME
  >   [ $i = lfs ] && enablelfs
  >   commitxy
  >   DSTNAMES="$DSTNAMES $NAME"
  > done

Apply bundles

  $ for i in $SRCNAMES; do
  >   for j in $DSTNAMES; do
  >     echo ---- Applying $i.bundle to $j ----
  >     cp -R $TESTTMP/$j $TESTTMP/tmp-$i-$j
  >     cd $TESTTMP/tmp-$i-$j
  >     if hg unbundle $TESTTMP/$i.bundle -q 2>/dev/null; then
  >       hg verify -q && echo OK
  >     else
  >       echo CRASHED
  >     fi
  >   done
  > done
  ---- Applying src-normal.bundle to dst-normal ----
  OK
  ---- Applying src-normal.bundle to dst-lfs ----
  OK
  ---- Applying src-lfs.bundle to dst-normal ----
  OK
  ---- Applying src-lfs.bundle to dst-lfs ----
  OK