changegroup: port to emitrevisions() (
issue5976)
We now have a unified API for emitting revision data from a storage
backend. It handles sorting nodes and the complicated delta versus
revision decisions for us.
This commit ports changegroup to that API.
There should be no behavior changes for changegroups not using
ellipsis. And lack of test changes seems to confirm that.
There are some changes for ellipsis mode, however.
Before, when sending an ellipsis revision, we would always send a
fulltext revision (as opposed to a delta). There was a TODO tracking
this open item.
One of the things the emitrevisions() API does for us is figure out
whether we can safely emit a delta. So, it is now possible for
ellipsis revisions to be sent as deltas! (It does this by not
assuming parent/ancestor revisions are available and tracking which
revisions have been sent out.)
Because we eliminated the list of revision delta request objects,
performance has improved substantially:
$ hg perfchangegroupchangelog
before: ! wall 24.348077 comb 24.330000 user 24.140000 sys 0.190000 (best of 3)
after: ! wall 18.245911 comb 18.240000 user 18.100000 sys 0.140000 (best of 3)
That's a lot of overhead for creating a few hundred thousand Python
objects!
This is still a little slower than 4.7. Probably due to
23d582ca
introducing a type for the revision/delta results. There is
potentially room to optimize. But at some point we need to abstract
storage in order to support alternate storage backends. Unfortunately
that means using a Python data structure to represent results. And
unfortunately there is overhead with every new Python object created.
Differential Revision: https://phab.mercurial-scm.org/D4725
wireprotov2server: port to emitrevisions()
We now have a proper storage API to request data on multiple
revisions. We can drop it into wire protocol version 2 with
minimal effort.
The new API handles pretty much everything we were doing manually to
build up the delta request. So we were able to delete a lot of code.
As a bonus, wireprotov2 code is no longer accessing some low-level
storage APIs. This includes the assumption that a node has an
associated numeric revision number! This should make it drastically
simpler to implement a server that doesn't have the concept of
revision numbers.
Differential Revision: https://phab.mercurial-scm.org/D4724
tests: use more complex file storage test
The previous test was attempting to to test delta storage behavior.
It didn't do a very good job at it because there was a good chance
a delta wasn't being used in storage.
Let's switch the test to yield a delta in storage so an upcoming
change to delegate delta logic to storage has the desired effect.
Differential Revision: https://phab.mercurial-scm.org/D4723
revlog: new API to emit revision data
I recently refactored changegroup generation code to make it more
storage agnostic. I made significant progress. But there is still
a bit of work to be done. Specifically:
* Changegroup code is looking at low-level storage attributes to
influence sorting. Sorting should be done at the storage layer.
* The linknode lookup and sorting code for ellipsis is very
complicated.
* Linknodes are just generally wonky because e.g. file storage doesn't
know how to translate a linkrev to a changelog node.
* We regressed performance when introducing the request-response
objects.
Having thought about this problem a bit, I think I've come up with
a better interface for emitting revision deltas.
This commit defines and implements that interface. See the docstring
in repository.py for more info.
This API adds 3 notable features over the previous one.
First, it defers node ordering to the storage implementation in
the common case but allows overriding as necessary. We have a
facility for requesting an exact ordering (used in ellipsis
mode). We have another facility for storage order (used for changelog).
Second, we have an argument specifying assumptions about parents
revisions. This can be used to force a fulltext revision when we
don't know the receiver has a parent revision to delta against.
Third, we can control whether revision data is emitted. This makes
the API suitable as a generic "index data retrieval" API as well
as for producing revision deltas - possibly in the same operation!
The new API is much simpler: we no longer need a complicated "request"
object to encapsulate the delta generation request. I'm optimistic
this will restore performance loss associated with
emitrevisiondeltas().
Storage unit tests for the new API have been implemented.
Future commits will port existing consumers of emitrevisiondeltas()
to the new API then remove emitrevisiondeltas().
Differential Revision: https://phab.mercurial-scm.org/D4722