Gregory Szorc <gregory.szorc@gmail.com> [Tue, 02 Oct 2018 17:28:54 -0700] rev 40055
revlog: move loading of index data into own method
This will allow us to "reload" a revlog instance from a rewritten
index file, which will be used in a subsequent commit.
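The shape of this refactoring can be sketched as follows. This is a toy illustration, not the revlog code itself: the `ToyLog` class, the `rewriteindex()` helper, and the one-entry-per-line index format are all assumptions made for the example. Only the idea of a dedicated index-loading method comes from the commit.

```python
class ToyLog:
    """Toy stand-in for a revlog; every name here is hypothetical."""

    def __init__(self, indexpath):
        self.indexpath = indexpath
        self._loadindex()

    def _loadindex(self):
        # Parse the index file into memory. Keeping this out of __init__
        # lets callers re-run it after the file is rewritten on disk.
        with open(self.indexpath, "r", encoding="ascii") as fh:
            self.index = [line.strip() for line in fh if line.strip()]


def rewriteindex(log, entries):
    """Rewrite the index file, then reload the existing instance."""
    with open(log.indexpath, "w", encoding="ascii") as fh:
        fh.write("\n".join(entries) + "\n")
    log._loadindex()
```

With the loading logic in its own method, an existing instance can pick up a rewritten index without being reconstructed.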
Differential Revision: https://phab.mercurial-scm.org/D4868
Gregory Szorc <gregory.szorc@gmail.com> [Wed, 03 Oct 2018 10:57:35 -0700] rev 40054
revlog: clear revision cache on hash verification failure
The revision cache is populated after raw revision fulltext is
retrieved but before hash verification. If hash verification
fails, the revision cache will be populated and subsequent
operations to retrieve the invalid fulltext may return the cached
fulltext instead of raising.
This commit changes hash verification so it will invalidate the
revision cache if the cached node fails hash verification. The
side-effect is that subsequent operations to request the revision
text - even the raw revision text - will always fail.
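A minimal sketch of the behavior described above, under stated assumptions: `ToyStore`, `IntegrityError`, and `addraw()` are hypothetical names, and the hash is a plain SHA-1 of the text rather than the real node computation. Only the ordering (cache populated before verification, cache cleared on failure) is taken from this commit message.

```python
import hashlib


class IntegrityError(Exception):
    """Raised when stored text does not hash to its node (hypothetical)."""


class ToyStore:
    """Toy stand-in for a revlog; not the actual implementation."""

    def __init__(self):
        self._data = {}              # node -> stored fulltext
        self._revisioncache = None   # (node, fulltext) of the last read

    @staticmethod
    def hashtext(text):
        # Simplified: the real node hash also covers the parents.
        return hashlib.sha1(text).hexdigest()

    def addraw(self, node, text):
        """Write data without any hash check (test-only backdoor)."""
        self._data[node] = text

    def checkhash(self, text, node):
        if self.hashtext(text) != node:
            # Invalidate the cache if it holds the node that just failed,
            # so later lookups cannot return the bad fulltext.
            if self._revisioncache and self._revisioncache[0] == node:
                self._revisioncache = None
            raise IntegrityError("integrity check failed on %s" % node)

    def revision(self, node):
        if self._revisioncache and self._revisioncache[0] == node:
            return self._revisioncache[1]
        text = self._data[node]
        self._revisioncache = (node, text)  # cached before verification
        self.checkhash(text, node)
        return text
```

Without the invalidation in `checkhash()`, a second call to `revision()` for the same bad node would return the cached text instead of raising.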
The new behavior is consistent and is definitely less wrong. There
is an open question of whether revision(raw=True) should validate
hashes. But I'm going to punt on this problem. We can always change
behavior later. And to be honest, I'm not sure we should expose
raw=True on the storage interface at all. Another day...
Differential Revision: https://phab.mercurial-scm.org/D4867
Augie Fackler <augie@google.com> [Thu, 06 Sep 2018 02:36:25 -0400] rev 40053
fuzz: new fuzzer for cext/manifest.c
This is a bit messy, because lazymanifest is tightly coupled to the
CPython API for performance reasons. As a result, we have to build a
whole Python without pymalloc (so ASAN can help us out) and link
against that. Then we have to use an embedded Python interpreter. We
could manually drive the lazymanifest in C from that point, but
experimentally just using PyEval_EvalCode isn't really any slower so
we may as well do that and write the innermost guts of the fuzzer in
Python.
Leak detection is currently disabled for this fuzzer because there are
a few global-lifetime things in our extensions that we more or less
intentionally leak and I didn't want to take the detour to work around
that for now.
This should not be pushed to our repo until
https://github.com/google/oss-fuzz/pull/1853 is merged, as this
depends on having the Python tarball around.
Differential Revision: https://phab.mercurial-scm.org/D4879
Gregory Szorc <gregory.szorc@gmail.com> [Wed, 03 Oct 2018 10:32:21 -0700] rev 40052
revlog: rename _cache to _revisioncache
"cache" is generic and revlog instances have multiple caches. Let's
be descriptive about what this is a cache for.
Differential Revision: https://phab.mercurial-scm.org/D4866
Gregory Szorc <gregory.szorc@gmail.com> [Wed, 03 Oct 2018 10:56:48 -0700] rev 40051
testing: add file storage integration for bad hashes and censoring
In order to implement these tests, we need a backdoor to write data
into storage backends while bypassing normal checks. We invent a
callable to do that.
As part of writing the tests, I found a bug with censorrevision()
pretty quickly! After calling censorrevision(), attempting to
access revision data for an affected node raises a cryptic error
related to malformed compression. This appears to be due to the
revlog not adjusting delta chains as part of censoring.
I also found a bug with regard to hash verification and revision
fulltext caching. Essentially, we cache the fulltext before hash
verification. If we look up the fulltext after a failed hash
verification, we don't get a hash verification exception. Furthermore,
the behavior of revision(raw=True) can be inconsistent depending on
the order of operations.
I'll be fixing both these bugs in subsequent commits.
Differential Revision: https://phab.mercurial-scm.org/D4865
Gregory Szorc <gregory.szorc@gmail.com> [Wed, 03 Oct 2018 10:03:41 -0700] rev 40050
testing: add file storage tests for getstrippoint() and strip()
Differential Revision: https://phab.mercurial-scm.org/D4864
Gregory Szorc <gregory.szorc@gmail.com> [Wed, 03 Oct 2018 10:04:04 -0700] rev 40049
wireprotov2: always advertise raw repo requirements
I'm pretty sure my original thinking behind making it conditional
on stream clone support was that the behavior mirrored wire protocol
version 1.
I don't see a compelling reason for us to not advertise the server's
storage requirements. The proper way to advertise stream clone support
in wireprotov2 would be to not advertise the command(s) required to
perform stream clone or to advertise a separate capability denoting
stream clone support.
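As a rough sketch of the approach described above: the `capabilities()` function and the dictionary keys below are illustrative assumptions, not the actual wireprotov2 server code. The point is only that storage requirements are advertised unconditionally, while support for a feature is signalled by which commands appear.

```python
def capabilities(requirements, commands):
    """Build a toy capabilities dict; key names are illustrative only."""
    return {
        # A server that cannot perform an operation simply omits the
        # command(s) needed for it.
        "commands": sorted(commands),
        # Always advertised, regardless of stream clone support.
        "rawrepoformats": sorted(requirements),
    }
```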
Stream clone isn't yet implemented on wireprotov2, so we can cross
this bridge later.
Differential Revision: https://phab.mercurial-scm.org/D4863
Gregory Szorc <gregory.szorc@gmail.com> [Wed, 03 Oct 2018 09:48:22 -0700] rev 40048
tests: don't be as verbose in wireprotov2 tests
I don't think that printing low-level I/O and frames is beneficial to
testing command-level functionality. Protocol-level testing, yes. But
command-level functionality shouldn't care about low-level details in
most cases. This output makes tests more verbose and harder to read.
It also makes them harder to maintain, as you need to glob over various
dynamic width fields.
Let's remove these low-level details from many of the wireprotov2
tests.
Differential Revision: https://phab.mercurial-scm.org/D4861