view tests/test-cache-abuse.t @ 38732:be4984261611

merge: mark file gets as not thread safe (issue5933) In default installs, this has the effect of disabling the thread-based worker on Windows when manifesting files in the working directory. My measurements have shown that with revlog-based repositories, Mercurial spends a lot of CPU time in revlog code resolving file data. This ends up incurring a lot of context switching across threads and slows down `hg update` operations when going from an empty working directory to the tip of the repo. On mozilla-unified (246,351 files) on an i7-6700K (4+4 CPUs): before: 487s wall after: 360s wall (equivalent to worker.enabled=false) cpus=2: 379s wall Even with only 2 threads, the thread pool is still slower. The introduction of the thread-based worker (02b36e860e0b) states that it resulted in a "~50%" speedup for `hg sparse --enable-profile` and `hg sparse --disable-profile`. This disagrees with my measurement above. I theorize a few reasons for this: 1) Removal of files from the working directory is I/O - not CPU - bound and should benefit from a thread pool (unless I/O is insanely fast and the GIL release is near instantaneous). So tests like `hg sparse --enable-profile` may exercise deletion throughput and aren't good benchmarks for worker tasks that are CPU heavy. 2) The patch was authored by someone at Facebook. The results were likely measured against a repository using remotefilelog. And I believe that revision retrieval during working directory updates with remotefilelog will often use a remote store, thus being I/O and not CPU bound. This probably resulted in an overstated performance gain. Since there appears to be a need to enable the thread-based worker with some stores, I've made the flagging of file gets as thread safe configurable. I've made it experimental because I don't want to formalize a boolean flag for this option and because this attribute is best captured against the store implementation. But we don't have a proper store API for this yet. I'd rather cross this bridge later. It is possible there are revlog-based repositories that do benefit from a thread-based worker. I didn't do very comprehensive testing. If there are, we may want to devise a more proper algorithm for whether to use the thread-based worker, including possibly config options to limit the number of threads to use. But until I see evidence that justifies complexity, simplicity wins. Differential Revision: https://phab.mercurial-scm.org/D3963
author Gregory Szorc <gregory.szorc@gmail.com>
date Wed, 18 Jul 2018 09:49:34 -0700
parents 1a09dad8b85a
children 34a46d48d24e
line wrap: on
line source

Enable obsolete markers

  $ cat >> $HGRCPATH << EOF
  > [experimental]
  > evolution.createmarkers=True
  > [phases]
  > publish=False
  > EOF

Build a repo with some cacheable bits:

  $ hg init a
  $ cd a

  $ echo a > a
  $ hg ci -qAm0
  $ hg tag t1
  $ hg book -i bk1

  $ hg branch -q b2
  $ hg ci -Am1
  $ hg tag t2

  $ echo dumb > dumb
  $ hg ci -qAmdumb
  $ hg debugobsolete b1174d11b69e63cb0c5726621a43c859f0858d7f
  obsoleted 1 changesets

  $ hg phase -pr t1
  $ hg phase -fsr t2

Make a helper function to check cache damage invariants:

- command output shouldn't change
- cache should be present after first use
- corruption/repair should be silent (no exceptions or warnings)
- cache should survive deletion, overwrite, and append
- unreadable / unwriteable caches should be ignored
- cache should be rebuilt after corruption

  $ damage() {
  >  CMD=$1
  >  CACHE=.hg/cache/$2
  >  CLEAN=$3
  >  hg $CMD > before
  >  test -f $CACHE || echo "not present"
  >  echo bad > $CACHE
  >  test -z "$CLEAN" || $CLEAN
  >  hg $CMD > after
  >  "$RUNTESTDIR/pdiff" before after || echo "*** overwrite corruption"
  >  echo corruption >> $CACHE
  >  test -z "$CLEAN" || $CLEAN
  >  hg $CMD > after
  >  "$RUNTESTDIR/pdiff" before after || echo "*** append corruption"
  >  rm $CACHE
  >  mkdir $CACHE
  >  test -z "$CLEAN" || $CLEAN
  >  hg $CMD > after
  >  "$RUNTESTDIR/pdiff" before after || echo "*** read-only corruption"
  >  test -d $CACHE || echo "*** directory clobbered"
  >  rmdir $CACHE
  >  test -z "$CLEAN" || $CLEAN
  >  hg $CMD > after
  >  "$RUNTESTDIR/pdiff" before after || echo "*** missing corruption"
  >  test -f $CACHE || echo "not rebuilt"
  > }

Beat up tags caches:

  $ damage "tags --hidden" tags2
  $ damage tags tags2-visible
  $ damage "tag -f t3" hgtagsfnodes1
  1 new orphan changesets
  1 new orphan changesets
  1 new orphan changesets
  1 new orphan changesets
  1 new orphan changesets

Beat up branch caches:

  $ damage branches branch2-base "rm .hg/cache/branch2-[vs]*"
  $ damage branches branch2-served "rm .hg/cache/branch2-[bv]*"
  $ damage branches branch2-visible
  $ damage "log -r branch(.)" rbc-names-v1
  $ damage "log -r branch(default)" rbc-names-v1
  $ damage "log -r branch(b2)" rbc-revs-v1

We currently can't detect an rbc cache with unknown names:

  $ damage "log -qr branch(b2)" rbc-names-v1
  --- before	* (glob)
  +++ after	* (glob)
  @@ -1,8 +?,0 @@ (glob)
  -2:5fb7d38b9dc4
  -3:60b597ffdafa
  -4:b1174d11b69e
  -5:6354685872c0
  -6:5ebc725f1bef
  -7:7b76eec2f273
  -8:ef3428d9d644
  -9:ba7a936bc03c
  *** append corruption