Mercurial > hg
view tests/test-contrib-check-commit.t @ 38732:be4984261611
merge: mark file gets as not thread safe (issue5933)
In default installs, this has the effect of disabling the thread-based
worker on Windows when manifesting files in the working directory. My
measurements have shown that with revlog-based repositories, Mercurial
spends a lot of CPU time in revlog code resolving file data. This ends
up incurring a lot of context switching across threads and slows down
`hg update` operations when going from an empty working directory to
the tip of the repo.
On mozilla-unified (246,351 files) on an i7-6700K (4+4 CPUs):
before: 487s wall
after: 360s wall (equivalent to worker.enabled=false)
cpus=2: 379s wall
Even with only 2 threads, the thread pool is still slower.
The introduction of the thread-based worker (02b36e860e0b) states that
it resulted in a "~50%" speedup for `hg sparse --enable-profile` and
`hg sparse --disable-profile`. This disagrees with my measurement
above. I theorize a few reasons for this:
1) Removal of files from the working directory is I/O - not CPU - bound
and should benefit from a thread pool (unless I/O is insanely fast
and the GIL release is near instantaneous). So tests like `hg sparse
--enable-profile` may exercise deletion throughput and aren't good
benchmarks for worker tasks that are CPU heavy.
2) The patch was authored by someone at Facebook. The results were
likely measured against a repository using remotefilelog. And I
believe that revision retrieval during working directory updates with
remotefilelog will often use a remote store, thus being I/O and not
CPU bound. This probably resulted in an overstated performance gain.
Since there appears to be a need to enable the thread-based worker with
some stores, I've made the flagging of file gets as thread safe
configurable. I've made it experimental because I don't want to formalize
a boolean flag for this option and because this attribute is best
captured against the store implementation. But we don't have a proper
store API for this yet. I'd rather cross this bridge later.
It is possible there are revlog-based repositories that do benefit from
a thread-based worker. I didn't do very comprehensive testing. If there
are, we may want to devise a more proper algorithm for whether to use
the thread-based worker, including possibly config options to limit the
number of threads to use. But until I see evidence that justifies
complexity, simplicity wins.
Differential Revision: https://phab.mercurial-scm.org/D3963
author | Gregory Szorc <gregory.szorc@gmail.com> |
---|---|
date | Wed, 18 Jul 2018 09:49:34 -0700 |
parents | 2fb3ae89e4e1 |
children | 1bd3e922de18 |
line wrap: on
line source
Test the 'check-commit' script ============================== A fine patch: $ cat > patch-with-long-header.diff << EOF > # HG changeset patch > # User timeless <timeless@mozdev.org> > # Date 1448911706 0 > # Mon Nov 30 19:28:26 2015 +0000 > # Node ID c41cb6d2b7dbd62b1033727f8606b8c09fc4aa88 > # Parent 42aa0e570eaa364a622bc4443b0bcb79b1100a58 > # ClownJoke This is a veryly long header that should not be warned about because its not the description > bundle2: use Oxford comma (issue123) (BC) > > diff --git a/hgext/transplant.py b/hgext/transplant.py > --- a/hgext/transplant.py > +++ b/hgext/transplant.py > @@ -599,7 +599,7 @@ > return > if not (opts.get('source') or revs or > opts.get('merge') or opts.get('branch')): > - raise error.Abort(_('no source URL, branch revision or revision ' > + raise error.Abort(_('no source URL, branch revision, or revision ' > 'list provided')) > if opts.get('all'): > > + def blahblah(x): > + pass > EOF $ cat patch-with-long-header.diff | $TESTDIR/../contrib/check-commit This would normally be against the rules, but it's okay because that's what tagging and signing looks like: $ cat > creates-a-tag.diff << EOF > # HG changeset patch > # User Augie Fackler <raf@durin42.com> > # Date 1484787778 18000 > # Wed Jan 18 20:02:58 2017 -0500 > # Branch stable > # Node ID c177635e4acf52923bc3aa9f72a5b1ad1197b173 > # Parent a1dd2c0c479e0550040542e392e87bc91262517e > Added tag 4.1-rc for changeset a1dd2c0c479e > > diff --git a/.hgtags b/.hgtags > --- a/.hgtags > +++ b/.hgtags > @@ -150,3 +150,4 @@ 438173c415874f6ac653efc1099dec9c9150e90f > eab27446995210c334c3d06f1a659e3b9b5da769 4.0 > b3b1ae98f6a0e14c1e1ba806a6c18e193b6dae5c 4.0.1 > e69874dc1f4e142746ff3df91e678a09c6fc208c 4.0.2 > +a1dd2c0c479e0550040542e392e87bc91262517e 4.1-rc > EOF $ $TESTDIR/../contrib/check-commit < creates-a-tag.diff A patch with lots of errors: $ cat > patch-with-long-header.diff << EOF > # HG changeset patch > # User timeless > # Date 1448911706 0 > # Mon Nov 30 19:28:26 2015 +0000 > # Node ID c41cb6d2b7dbd62b1033727f8606b8c09fc4aa88 > # Parent 42aa0e570eaa364a622bc4443b0bcb79b1100a58 > # ClownJoke This is a veryly long header that should not be warned about because its not the description > transplant/foo: this summary is way too long use Oxford comma (bc) (bug123) (issue 244) > > diff --git a/hgext/transplant.py b/hgext/transplant.py > --- a/hgext/transplant.py > +++ b/hgext/transplant.py > @@ -599,7 +599,7 @@ > return > if not (opts.get('source') or revs or > opts.get('merge') or opts.get('branch')): > - raise error.Abort(_('no source URL, branch revision or revision ' > + raise error.Abort(_('no source URL, branch revision, or revision ' > 'list provided')) > if opts.get('all'): > EOF $ cat patch-with-long-header.diff | $TESTDIR/../contrib/check-commit 1: username is not an email address # User timeless 7: summary keyword should be most user-relevant one-word command or topic transplant/foo: this summary is way too long use Oxford comma (bc) (bug123) (issue 244) 7: (BC) needs to be uppercase transplant/foo: this summary is way too long use Oxford comma (bc) (bug123) (issue 244) 7: use (issueDDDD) instead of bug transplant/foo: this summary is way too long use Oxford comma (bc) (bug123) (issue 244) 7: no space allowed between issue and number transplant/foo: this summary is way too long use Oxford comma (bc) (bug123) (issue 244) 7: summary line too long (limit is 78) transplant/foo: this summary is way too long use Oxford comma (bc) (bug123) (issue 244) [1] A patch with other errors: $ cat > patch-with-long-header.diff << EOF > # HG changeset patch > # User timeless > # Date 1448911706 0 > # Mon Nov 30 19:28:26 2015 +0000 > # Node ID c41cb6d2b7dbd62b1033727f8606b8c09fc4aa88 > # Parent 42aa0e570eaa364a622bc4443b0bcb79b1100a58 > # ClownJoke This is a veryly long header that should not be warned about because its not the description > This has no topic and ends with a period. > > diff --git a/hgext/transplant.py b/hgext/transplant.py > --- a/hgext/transplant.py > +++ b/hgext/transplant.py > @@ -599,7 +599,7 @@ > if opts.get('all'): > > > + > + some = otherjunk > + > + > + def blah_blah(x): > + pass > + > > EOF $ cat patch-with-long-header.diff | $TESTDIR/../contrib/check-commit 1: username is not an email address # User timeless 7: don't capitalize summary lines This has no topic and ends with a period. 7: summary line doesn't start with 'topic: ' This has no topic and ends with a period. 7: don't add trailing period on summary line This has no topic and ends with a period. 19: adds double empty line + 20: adds a function with foo_bar naming + def blah_blah(x): 23: adds double empty line + [1]