Pulkit Goyal <pulkit@yandex-team.ru> [Fri, 05 Oct 2018 22:52:24 +0300] rev 40076
narrow: move adding of narrow server capabilities to core
We use the experimental.narrow config option introduced in one of the previous
patch and move the functionality of adding narrow server capabilities to core.
Differential Revision: https://phab.mercurial-scm.org/D4891
Pulkit Goyal <pulkit@yandex-team.ru> [Fri, 05 Oct 2018 22:31:12 +0300] rev 40075
wireprotoserver: move narrow capabilities to wireprototypes.py
This is done because wireprotoserver import wireprotov1server, so you cannot
import wireprotoserver in wireprotov1server to use the capabilities constants.
Differential Revision: https://phab.mercurial-scm.org/D4890
Pulkit Goyal <pulkit@yandex-team.ru> [Fri, 05 Oct 2018 22:19:19 +0300] rev 40074
narrow: introduce a config option to check if narrow is enabled or not
This patch introduces a new config option experimental.narrow which is set to
False by default and set to True by the narrow extension.
While moving narrow related logic into core, we need to know at places whether
narrow extension is enabled or not. Checking the list of extension enabled is
one solution but once narrow is inbuilt, we will definitely want a config option
to check whether narrow is turned on or not.
So this patch introduces a config option, which will evolve to the main point to
turn narrow capability on and off once all the narrow is in core.
Differential Revision: https://phab.mercurial-scm.org/D4889
Pulkit Goyal <pulkit@yandex-team.ru> [Fri, 05 Oct 2018 20:24:07 +0300] rev 40073
narrow: move the code to generate a widening bundle2 to core
This is a part of moving more narrow related bits to core.
Differential Revision: https://phab.mercurial-scm.org/D4888
Pulkit Goyal <pulkit@yandex-team.ru> [Tue, 02 Oct 2018 17:09:56 +0300] rev 40072
narrow: start returning bundle2 from widen_bundle()
Differential Revision: https://phab.mercurial-scm.org/D4838
Pulkit Goyal <pulkit@yandex-team.ru> [Fri, 28 Sep 2018 23:42:31 +0300] rev 40071
narrow: the first version of narrow_widen wireprotocol command
This patch introduces a wireprotocol command narrow_widen() which will be used
to widen a narrow copy using `hg tracked` command provided by narrow extension.
The wireprotocol command takes the old and new includes and excludes, common
heads, changegroup version, known revs, and a boolean ellipses and generates a
bundle2 of the required data and send it. The clients receives the bundle2
and applies that.
A bundle2 instead of changegroup because in future we might want to add more
things to send while widening. Thanks for martinvonz for the suggestion.
I am not sure whether we need changegroup version as an argument to the command
as I *think* narrow needs changegroup3 already.
The tests shows that we don't exchange phase data now while widening which is
nice. Also we don't check for pushkeys, rbc-cache, bookmarks etc.
This does not support ellipses cases for now but will be supported in future
patches. Since we send bundle2, it won't be hard to plug the ellipses logic in
here.
The existing code for widening a non-ellipses case is also dropped in this
patch.
Differential Revision: https://phab.mercurial-scm.org/D4813
Yuya Nishihara <yuya@tcha.org> [Fri, 05 Oct 2018 21:43:57 +0900] rev 40070
remotenames: abort if literal revset pattern matches nothing
This is the convention of the other namespace revsets such as tag(). Let's
make the remote variants do the same.
Yuya Nishihara <yuya@tcha.org> [Fri, 05 Oct 2018 21:39:41 +0900] rev 40069
remotenames: remove unneeded sorted() from revset implementation
The order is constrained by the subset.
Yuya Nishihara <yuya@tcha.org> [Fri, 05 Oct 2018 21:36:48 +0900] rev 40068
remotenames: don't call a set of nodes as "revs"
Yuya Nishihara <yuya@tcha.org> [Fri, 05 Oct 2018 21:30:55 +0900] rev 40067
remotenames: use util.always instead of handcrafted lambda
Yuya Nishihara <yuya@tcha.org> [Fri, 05 Oct 2018 21:29:21 +0900] rev 40066
remotenames: inline _parseargs() into _revsetutil()
The _parseargs() function gets quite simple, and the 0/1 loop can be rewritten
as "if".
Martin von Zweigbergk <martinvonz@google.com> [Thu, 04 Oct 2018 16:27:40 -0700] rev 40065
repo: create changectx in a single place in localrepo.__getitem__
Differential Revision: https://phab.mercurial-scm.org/D4885
Martin von Zweigbergk <martinvonz@google.com> [Thu, 04 Oct 2018 16:06:36 -0700] rev 40064
repo: remove the last few "pass" statements in localrepo.__getitem__
In case of IndexError or LookupError, we used "pass" statements and
fell through to the end of localrepo.__getitem__. I find the pass
statements easy to miss. Consistently raising and catching exceptions
seems easier to follow.
Oh -- and I didn't plan this before I wrote the above -- that probably
also lets us reuse the "return context.changectx(self, rev, node)" in
a later patch.
Differential Revision: https://phab.mercurial-scm.org/D4884
Martin von Zweigbergk <martinvonz@google.com> [Thu, 04 Oct 2018 10:38:55 -0700] rev 40063
filectx: correct docstring about "changeid"
The changeid argument must be a revnum (basefile.rev() is defined as
"return self._changeid"), so fix the lie in the docstring. It seems to
have been incorrect for at least 10 years (I didn't check further
back).
Differential Revision: https://phab.mercurial-scm.org/D4881
Martin von Zweigbergk <martinvonz@google.com> [Thu, 04 Oct 2018 10:30:05 -0700] rev 40062
context: drop incorrect and superfluous docstring
It's been incorrect at least since 8b86acc7aa64 (context: drop support
for looking up context by ambiguous changeid (API), 2018-04-28).
Differential Revision: https://phab.mercurial-scm.org/D4880
Augie Fackler <raf@durin42.com> [Thu, 04 Oct 2018 21:35:12 -0400] rev 40061
remotenames: follow-up on D3639 to make revset funcs take only one arg
Per the review discussion on D3639, we want this to just take one
argument. That ended up simplifying the code, so I'm sharing this as a
follow-up to that revision rather than editing in-flight.
Pulkit Goyal <7895pulkit@gmail.com> [Thu, 12 Jul 2018 03:12:09 +0530] rev 40060
remotenames: add names argument to remotenames revset
This patch adds names argument to the revsets provided by the remotenames
extension. The revsets are remotenames(), remotebranches() and
remotebookmarks(). names can be a single names, list of names or can be empty too
which means it's an optional argument.
If names is/are passed, changesets which have those remotenames will be
returned.
If names are not passed, changesets from all the remotenames are shown.
Passing an invalid remotename does not throw error.
The name argument also supports pattern matching.
Tests are added for the argument in tests/test-logexchange.t
Differential Revision: https://phab.mercurial-scm.org/D3639
Boris Feld <boris.feld@octobus.net> [Fri, 07 Sep 2018 11:43:48 -0400] rev 40059
copies: add time information to the debug information
Boris Feld <boris.feld@octobus.net> [Fri, 07 Sep 2018 11:16:06 -0400] rev 40058
copies: add a devel debug mode to trace what copy tracing does
Mercurial can spend a lot of time finding renames between two commits. Having
more information about that process help to understand what makes it slow in
an individual instance. (eg: many files vs 1 file, etc...)
Gregory Szorc <gregory.szorc@gmail.com> [Tue, 02 Oct 2018 17:34:34 -0700] rev 40057
revlog: rewrite censoring logic
I was able to corrupt a revlog relatively easily with the existing
censoring code. The underlying problem is that the existing code
doesn't fully take delta chains into account. When copying revisions
that occur after the censored revision, the delta base can refer
to a censored revision. Then at read time, things blow up due to the
revision data not being a compressed delta.
This commit rewrites the revlog censoring code to take a higher-level
approach. We now create a new revlog instance pointing at temp files.
We iterate through each revision in the source revlog and insert
those revisions into the new revlog, replacing the censored revision's
data along the way.
The new implementation isn't as efficient as the old one. This is
because it will fully engage delta computation on insertion. But I
don't think it matters.
The new implementation is a bit hacky because it attempts to reload
the revlog instance with a new revlog index/data file. This is fragile.
But this is needed because the index (which could be backed by C) would
have a cached copy of the old, possibly changed data and that could
lead to problems accessing index or revision data later.
One benefit of the new approach is that we integrate with the
transaction. The old revlog is backed up and if the transaction is
rolled back, the original revlog is restored.
As part of this, we had to teach the transaction about the store
vfs. I'm not super keen about this. But this was the easiest way
to hook things up to the transaction. We /could/ just ignore the
transaction like we were doing before. But any file mutation should
be governed by transaction semantics, including undo during rollback.
Differential Revision: https://phab.mercurial-scm.org/D4869
Gregory Szorc <gregory.szorc@gmail.com> [Tue, 02 Oct 2018 17:28:54 -0700] rev 40056
revlog: move loading of index data into own method
This will allow us to "reload" a revlog instance from a rewritten
index file, which will be used in a subsequent commit.
Differential Revision: https://phab.mercurial-scm.org/D4868
Gregory Szorc <gregory.szorc@gmail.com> [Wed, 03 Oct 2018 10:57:35 -0700] rev 40055
revlog: clear revision cache on hash verification failure
The revision cache is populated after raw revision fulltext is
retrieved but before hash verification. If hash verification
fails, the revision cache will be populated and subsequent
operations to retrieve the invalid fulltext may return the cached
fulltext instead of raising.
This commit changes hash verification so it will invalidate the
revision cache if the cached node fails hash verification. The
side-effect is that subsequent operations to request the revision
text - even the raw revision text - will always fail.
The new behavior is consistent and is definitely less wrong. There
is an open question of whether revision(raw=True) should validate
hashes. But I'm going to punt on this problem. We can always change
behavior later. And to be honest, I'm not sure we should expose
raw=True on the storage interface at all. Another day...
Differential Revision: https://phab.mercurial-scm.org/D4867
Augie Fackler <augie@google.com> [Thu, 06 Sep 2018 02:36:25 -0400] rev 40054
fuzz: new fuzzer for cext/manifest.c
This is a bit messy, because lazymanifest is tightly coupled to the
cpython API for performance reasons. As a result, we have to build a
whole Python without pymalloc (so ASAN can help us out) and link
against that. Then we have to use an embedded Python interpreter. We
could manually drive the lazymanifest in C from that point, but
experimentally just using PyEval_EvalCode isn't really any slower so
we may as well do that and write the innermost guts of the fuzzer in
Python.
Leak detection is currently disabled for this fuzzer because there are
a few global-lifetime things in our extensions that we more or less
intentionally leak and I didn't want to take the detour to work around
that for now.
This should not be pushed to our repo until
https://github.com/google/oss-fuzz/pull/1853 is merged, as this
depends on having the Python tarball around.
Differential Revision: https://phab.mercurial-scm.org/D4879
Gregory Szorc <gregory.szorc@gmail.com> [Wed, 03 Oct 2018 10:32:21 -0700] rev 40053
revlog: rename _cache to _revisioncache
"cache" is generic and revlog instances have multiple caches. Let's
be descriptive about what this is a cache for.
Differential Revision: https://phab.mercurial-scm.org/D4866
Gregory Szorc <gregory.szorc@gmail.com> [Wed, 03 Oct 2018 10:56:48 -0700] rev 40052
testing: add file storage integration for bad hashes and censoring
In order to implement these tests, we need a backdoor to write data
into storage backends while bypassing normal checks. We invent a
callable to do that.
As part of writing the tests, I found a bug with censorrevision()
pretty quickly! After calling censorrevision(), attempting to
access revision data for an affected node raises a cryptic error
related to malformed compression. This appears to be due to the
revlog not adjusting delta chains as part of censoring.
I also found a bug with regards to hash verification and revision
fulltext caching. Essentially, we cache the fulltext before hash
verification. If we look up the fulltext after a failed hash
verification, we don't get a hash verification exception. Furthermore,
the behavior of revision(raw=True) can be inconsistent depending on
the order of operations.
I'll be fixing both these bugs in subsequent commits.
Differential Revision: https://phab.mercurial-scm.org/D4865
Gregory Szorc <gregory.szorc@gmail.com> [Wed, 03 Oct 2018 10:03:41 -0700] rev 40051
testing: add file storage tests for getstrippoint() and strip()
Differential Revision: https://phab.mercurial-scm.org/D4864
Gregory Szorc <gregory.szorc@gmail.com> [Wed, 03 Oct 2018 10:04:04 -0700] rev 40050
wireprotov2: always advertise raw repo requirements
I'm pretty sure my original thinking behind making it conditional
on stream clone support was that the behavior mirrored wire protocol
version 1.
I don't see a compelling reason for us to not advertise the server's
storage requirements. The proper way to advertise stream clone support
in wireprotov2 would be to not advertise the command(s) required to
perform stream clone or to advertise a separate capability denoting
stream clone support.
Stream clone isn't yet implemented on wireprotov2, so we can cross
this bridge later.
Differential Revision: https://phab.mercurial-scm.org/D4863
Gregory Szorc <gregory.szorc@gmail.com> [Wed, 03 Oct 2018 09:48:22 -0700] rev 40049
tests: don't be as verbose in wireprotov2 tests
I don't think that printing low-level I/O and frames is beneficial to
testing command-level functionality. Protocol-level testing, yes. But
command-level functionality shouldn't care about low-level details in
most cases. This output makes tests more verbose and harder to read.
It also makes them harder to maintain, as you need to glob over various
dynamic width fields.
Let's remove these low-level details from many of the wireprotov2
tests.
Differential Revision: https://phab.mercurial-scm.org/D4861
Gregory Szorc <gregory.szorc@gmail.com> [Wed, 03 Oct 2018 12:57:01 -0700] rev 40048
repository: define and use revision flag constants
Revlogs have a per-revision 2 byte field holding integer flags that
define how revision data should be interpreted. For historical reasons,
these integer values are sent verbatim on the wire protocol as part of
changegroup data.
From a semantic standpoint, the flags that go out over the wire are
different from the flags stored internally by revlogs. Failure to
establish this semantic distinction creates unwanted strong coupling
between revlog's internals and the wire protocol.
This commit establishes new constants on the repository module that
define the revision flags used by the wire protocol (and by some
internal storage APIs, sadly). The changegroups internals documentation
has been updated to document them explicitly. Various references
throughout the repo now use the repository constants instead of the
revlog constants. This is done to make it clear that we're operating
on generic revision data and this isn't tied to revlogs.
Differential Revision: https://phab.mercurial-scm.org/D4860
Boris Feld <boris.feld@octobus.net> [Thu, 04 Oct 2018 01:22:25 +0200] rev 40047
context: reverse conditional branch order in introrev
Positive logic will be simpler to follow. It will help to clarify coming
refactoring.