Fri, 21 Sep 2018 18:47:04 -0700 changegroup: port to emitrevisions() (issue5976)
Gregory Szorc <gregory.szorc@gmail.com> [Fri, 21 Sep 2018 18:47:04 -0700] rev 39865
changegroup: port to emitrevisions() (issue5976) We now have a unified API for emitting revision data from a storage backend. It handles sorting nodes and the complicated delta versus revision decisions for us. This commit ports changegroup to that API. There should be no behavior changes for changegroups not using ellipsis. And lack of test changes seems to confirm that. There are some changes for ellipsis mode, however. Before, when sending an ellipsis revision, we would always send a fulltext revision (as opposed to a delta). There was a TODO tracking this open item. One of the things the emitrevisions() API does for us is figure out whether we can safely emit a delta. So, it is now possible for ellipsis revisions to be sent as deltas! (It does this by not assuming parent/ancestor revisions are available and tracking which revisions have been sent out.) Because we eliminated the list of revision delta request objects, performance has improved substantially: $ hg perfchangegroupchangelog before: ! wall 24.348077 comb 24.330000 user 24.140000 sys 0.190000 (best of 3) after: ! wall 18.245911 comb 18.240000 user 18.100000 sys 0.140000 (best of 3) That's a lot of overhead for creating a few hundred thousand Python objects! This is still a little slower than 4.7. Probably due to 23d582ca introducing a type for the revision/delta results. There is potentially room to optimize. But at some point we need to abstract storage in order to support alternate storage backends. Unfortunately that means using a Python data structure to represent results. And unfortunately there is overhead with every new Python object created. Differential Revision: https://phab.mercurial-scm.org/D4725
Mon, 24 Sep 2018 09:48:02 -0700 wireprotov2server: port to emitrevisions()
Gregory Szorc <gregory.szorc@gmail.com> [Mon, 24 Sep 2018 09:48:02 -0700] rev 39864
wireprotov2server: port to emitrevisions() We now have a proper storage API to request data on multiple revisions. We can drop it into wire protocol version 2 with minimal effort. The new API handles pretty much everything we were doing manually to build up the delta request. So we were able to delete a lot of code. As a bonus, wireprotov2 code is no longer accessing some low-level storage APIs. This includes the assumption that a node has an associated numeric revision number! This should make it drastically simpler to implement a server that doesn't have the concept of revision numbers. Differential Revision: https://phab.mercurial-scm.org/D4724
Fri, 21 Sep 2018 14:54:59 -0700 tests: use more complex file storage test
Gregory Szorc <gregory.szorc@gmail.com> [Fri, 21 Sep 2018 14:54:59 -0700] rev 39863
tests: use more complex file storage test The previous test was attempting to to test delta storage behavior. It didn't do a very good job at it because there was a good chance a delta wasn't being used in storage. Let's switch the test to yield a delta in storage so an upcoming change to delegate delta logic to storage has the desired effect. Differential Revision: https://phab.mercurial-scm.org/D4723
Fri, 21 Sep 2018 14:28:21 -0700 revlog: new API to emit revision data
Gregory Szorc <gregory.szorc@gmail.com> [Fri, 21 Sep 2018 14:28:21 -0700] rev 39862
revlog: new API to emit revision data I recently refactored changegroup generation code to make it more storage agnostic. I made significant progress. But there is still a bit of work to be done. Specifically: * Changegroup code is looking at low-level storage attributes to influence sorting. Sorting should be done at the storage layer. * The linknode lookup and sorting code for ellipsis is very complicated. * Linknodes are just generally wonky because e.g. file storage doesn't know how to translate a linkrev to a changelog node. * We regressed performance when introducing the request-response objects. Having thought about this problem a bit, I think I've come up with a better interface for emitting revision deltas. This commit defines and implements that interface. See the docstring in repository.py for more info. This API adds 3 notable features over the previous one. First, it defers node ordering to the storage implementation in the common case but allows overriding as necessary. We have a facility for requesting an exact ordering (used in ellipsis mode). We have another facility for storage order (used for changelog). Second, we have an argument specifying assumptions about parents revisions. This can be used to force a fulltext revision when we don't know the receiver has a parent revision to delta against. Third, we can control whether revision data is emitted. This makes the API suitable as a generic "index data retrieval" API as well as for producing revision deltas - possibly in the same operation! The new API is much simpler: we no longer need a complicated "request" object to encapsulate the delta generation request. I'm optimistic this will restore performance loss associated with emitrevisiondeltas(). Storage unit tests for the new API have been implemented. Future commits will port existing consumers of emitrevisiondeltas() to the new API then remove emitrevisiondeltas(). Differential Revision: https://phab.mercurial-scm.org/D4722
(0) -30000 -10000 -3000 -1000 -300 -100 -30 -10 -4 +4 +10 +30 +100 +300 +1000 +3000 +10000 tip