Wed, 05 Sep 2018 09:09:52 -0700 wireprotov2: define and implement "manifestdata" command
Gregory Szorc <gregory.szorc@gmail.com> [Wed, 05 Sep 2018 09:09:52 -0700] rev 39637
wireprotov2: define and implement "manifestdata" command The added command can be used for obtaining manifest data. Given a manifest path and set of manifest nodes, data about manifests can be retrieved. Unlike changeset data, we wish to emit deltas to describe manifest revisions. So the command uses the relatively new API for building delta requests and emitting them. The code calls into deltaparent(), which I'm not very keen on. There's still work to be done in delta generation land so implementation details of storage (e.g. exactly one delta is stored/available) don't creep into higher levels. But we can worry about this later (there is already a TODO on imanifeststorage tracking this). On the subject of parent deltas, the server assumes parent revisions exist on the receiving end. This is obviously wrong for shallow clone. I've added TODOs to add a mechanism to the command to allow clients to specify desired behavior. This shouldn't be too difficult to implement. Another big change is that the client must explicitly request manifest nodes to retrieve. This is a major departure from "getbundle," where the server derives relevant manifests as it iterates changesets and sends them automatically. As implemented, the client must transmit each requested node to the server. At 20 bytes per node, we're looking at 2 MB per 100,000 nodes, plus wire encoding overhead. This isn't ideal for clients with limited upload bandwidth. I plan to address this in the future by allowing alternate mechanisms for defining the revisions to retrieve. One idea is to define a range of changeset revisions whose manifest revisions to retrieve (similar to how "changesetdata" works). We almost certainly want an API to look up an individual manifest by node. And that's where I've chosen to start with the implementation.
Again, a theme of this early exchangev2 work is I want to start by building primitives for accessing raw repository data first and see how far we can get with those before we need more complexity. Differential Revision: https://phab.mercurial-scm.org/D4488
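The upload cost quoted in the "manifestdata" message above can be checked with a short calculation. This is only a sketch: the helper name and the per_node_overhead parameter are made up for illustration, and the 1-byte overhead in the second call assumes each node travels as a definite-length CBOR byte string, which the commit message does not spell out.

```python
NODE_SIZE = 20  # bytes in a binary SHA-1 node, as stated in the commit message


def request_payload_bytes(node_count, per_node_overhead=0):
    """Estimate the bytes a client uploads to name node_count manifest nodes.

    per_node_overhead is a hypothetical knob for wire-encoding overhead; it is
    not a parameter of any Mercurial API.
    """
    return node_count * (NODE_SIZE + per_node_overhead)


# Raw node bytes only: 20 bytes * 100,000 nodes.
print(request_payload_bytes(100_000))

# Assumption: a 20-byte CBOR byte string carries a 1-byte header.
print(request_payload_bytes(100_000, per_node_overhead=1))
```

The first call reproduces the 2 MB figure; the second shows that, under the stated assumption, framing adds roughly 5% on top.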
Wed, 22 Aug 2018 14:51:11 -0700 wireprotov2: add TODOs around extending changesetdata fields
Gregory Szorc <gregory.szorc@gmail.com> [Wed, 22 Aug 2018 14:51:11 -0700] rev 39636
wireprotov2: add TODOs around extending changesetdata fields Extensions will inevitably want to extend the set of changeset data/fields that can be requested. We'll need to implement support for extending this in the future. Add some TODOs to track that. Differential Revision: https://phab.mercurial-scm.org/D4487
Wed, 29 Aug 2018 17:03:19 -0700 exchangev2: fetch and apply bookmarks
Gregory Szorc <gregory.szorc@gmail.com> [Wed, 29 Aug 2018 17:03:19 -0700] rev 39635
exchangev2: fetch and apply bookmarks This is pretty similar to phases data. We collect bookmarks data as we process records. Then at the end we make a call to the bookmarks subsystem to reflect the remote's bookmarks. Like phases, the code for handling bookmarks is vastly simpler than the previous wire protocol code because the server always transfers the full set of bookmarks when bookmarks are requested. We don't have to keep track of whether we requested bookmarks or not. Differential Revision: https://phab.mercurial-scm.org/D4486
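The collect-then-apply flow described in the bookmarks message above might look roughly like the following. It is a minimal sketch, not the code from the commit: the record layout (dicts with "node" and "bookmarks" keys) and the function name are assumptions, and the final call into the bookmarks subsystem is left out.

```python
def collect_remote_bookmarks(records):
    """Gather bookmark -> node pairs while iterating changeset records.

    Each record is assumed to be a dict with a "node" key and an optional
    "bookmarks" key listing bookmark names; both names are hypothetical.
    """
    remote = {}
    for record in records:
        for name in record.get("bookmarks", ()):
            remote[name] = record["node"]
    return remote


records = [
    {"node": b"\x01" * 20},
    {"node": b"\x02" * 20, "bookmarks": ["stable", "release"]},
]
remote_bookmarks = collect_remote_bookmarks(records)
# A real client would now hand remote_bookmarks to the bookmarks subsystem
# in a single call, once all records have been processed.
```

Because the server always sends the full bookmark set, the client needs no extra state about what it asked for; the collected mapping is the remote's complete view.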
Thu, 23 Aug 2018 18:14:19 -0700 wireprotov2: add bookmarks to "changesetdata" command
Gregory Szorc <gregory.szorc@gmail.com> [Thu, 23 Aug 2018 18:14:19 -0700] rev 39634
wireprotov2: add bookmarks to "changesetdata" command Like we did for phases, we want to emit bookmarks data attached to each changeset. The approach here is very similar to phases: we emit bookmarks data inline with requested revision data. But we emit records for nodes that weren't requested as well so consumers have access to the full set of defined bookmarks. Differential Revision: https://phab.mercurial-scm.org/D4485
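To illustrate the emission rule in the message above (bookmarks inline with requested revisions, plus extra records for bookmarked nodes that were not requested), here is a small sketch. The function name and record keys are assumptions for illustration and are not taken from the wire protocol implementation.

```python
def emit_changeset_records(requested_nodes, bookmarks):
    """Yield one record per requested node, then one per leftover bookmarked node.

    bookmarks maps bookmark name -> node. The record keys "node" and
    "bookmarks" are hypothetical field names.
    """
    by_node = {}
    for name, node in sorted(bookmarks.items()):
        by_node.setdefault(node, []).append(name)

    # Bookmarks travel inline with the revisions the client asked for.
    for node in requested_nodes:
        record = {"node": node}
        names = by_node.pop(node, None)
        if names:
            record["bookmarks"] = names
        yield record

    # Nodes that were not requested still get a record, so the consumer
    # sees the full set of defined bookmarks.
    for node, names in sorted(by_node.items()):
        yield {"node": node, "bookmarks": names}


emitted = list(emit_changeset_records([b"a", b"b"], {"main": b"b", "old": b"z"}))
```

In this example the node b"z" was never requested, yet it appears as a trailing record carrying only its bookmark.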
Wed, 12 Sep 2018 10:01:58 -0700 exchangev2: fetch and apply phases data
Gregory Szorc <gregory.szorc@gmail.com> [Wed, 12 Sep 2018 10:01:58 -0700] rev 39633
exchangev2: fetch and apply phases data Now that the server supports emitting phases data, we can request it and apply it on the client. Because we may receive phases-only updates from the server, we no longer conditionally perform the "changesetdata" command depending on whether there are revisions to fetch. In the previous wire protocol, this case would result in us falling back to performing "listkeys" commands to look up phases, bookmarks, etc. But since "changesetdata" is smart enough to handle metadata-only fetches, we can keep things consistent. It's worth noting that because of the unified approach to changeset data retrieval, phase handling code in wire proto v2 exchange is drastically simpler. Contrast with all the code in exchange.py dealing with all the variations for obtaining phases data. Differential Revision: https://phab.mercurial-scm.org/D4484
Tue, 28 Aug 2018 18:19:23 -0700 wireprotov2: add phases to "changesetdata" command
Gregory Szorc <gregory.szorc@gmail.com> [Tue, 28 Aug 2018 18:19:23 -0700] rev 39632
wireprotov2: add phases to "changesetdata" command This commit teaches the "changesetdata" wire protocol command to emit the phase state for each changeset. This is a different approach from existing phase transfer in a few ways. Previously, if there are no new revisions (or we're not using bundle2), we perform a "listkeys" request to retrieve phase heads. And when revision data is being transferred with bundle2, phases data is encoded in a standalone bundle2 part. In both cases, phases data is logically decoupled from the changeset data and is encountered/applied after changeset revision data is received. The new wire protocol purposefully tries to more tightly associate changeset metadata (phases, bookmarks, obsolescence markers, etc.) with the changeset revision and index data itself, rather than have it live as a separate entity that must be fetched and processed separately. I reckon that one reason we didn't do this before was that it was difficult to add new data types/fields without breaking existing consumers. By using CBOR maps to transfer changeset data and putting clients in control of what fields are requested / present in those maps, we can easily add additional changeset data while maintaining backwards compatibility. I believe this to be a superior approach to the problem. That being said, for performance reasons, we may need to resort to alternative mechanisms for transferring data like phases. But for now, I think giving the wire protocol the ability to transfer changeset metadata next to the changeset itself is a powerful feature because it is a raw, changeset-centric data API. And if you build simple APIs for accessing the fundamental units of repository data, you enable client-side experimentation (partial clone, etc.). If it turns out that we need specialized APIs or mechanisms for transferring data like phases, we can build in those APIs later. For now, I'd like to see how far we can get on simple APIs.
It's worth noting that when phase data is being requested, the server will also emit changeset records for nodes in the bases specified by the "noderange" argument. This is to ensure that phase-only updates for nodes the client has are available to the client, even if no new changesets will be transferred. Differential Revision: https://phab.mercurial-scm.org/D4483
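The backwards-compatibility argument in the message above (clients choose which fields appear in each map) can be sketched as follows. Plain Python dicts stand in for CBOR maps here, and the function and field names are assumptions for illustration, not the command's actual implementation.

```python
def build_record(node, available, fields):
    """Build a changeset record containing only the fields the client requested.

    available maps hypothetical field names to values the server could send;
    fields is the set of names the client asked for.
    """
    record = {"node": node}
    for field in fields:
        if field in available:
            record[field] = available[field]
    return record


server_data = {"parents": (b"p1", b"p2"), "phase": "draft"}

# An older client that only knows about "parents" never sees "phase".
old_client = build_record(b"n", server_data, {"parents"})

# A newer client opts in to the additional field.
new_client = build_record(b"n", server_data, {"parents", "phase"})
```

Since a field is only present when requested, the server can grow new fields without changing what existing clients receive.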
Wed, 12 Sep 2018 10:01:36 -0700 exchangev2: fetch changeset revisions
Gregory Szorc <gregory.szorc@gmail.com> [Wed, 12 Sep 2018 10:01:36 -0700] rev 39631
exchangev2: fetch changeset revisions All Mercurial repository data is derived from changesets: you can't do anything unless you have changesets. Therefore, it makes sense for changesets to be the first piece of data that we transfer as part of pull. To do this, we call our new "changesetdata" command, requesting parents and revision data. This gives us all the data that a changegroup delta group would give us. We simply normalize this data into what addgroup() expects and call that API on the changelog to bulk insert revisions into the changelog. Code in this commit is heavily borrowed from changegroup.cg1unpacker.apply(). Differential Revision: https://phab.mercurial-scm.org/D4482