view mercurial/help/internals/wireprotocolv2.txt @ 39648:c1aacb0d76ff
wireprotov2: add phases to "changesetdata" command

This commit teaches the "changesetdata" wire protocol command
to emit the phase state for each changeset.

This is a different approach from existing phase transfer in a
few ways. With the existing mechanism, if there are no new revisions
(or we're not using bundle2), we perform a "listkeys" request to
retrieve phase heads. And when revision data is being transferred
with bundle2, phases data is encoded in a standalone bundle2 part.
In both cases, phases data is logically decoupled from the changeset
data and is encountered/applied after changeset revision data
is received.

The new wire protocol purposefully tries to more tightly associate
changeset metadata (phases, bookmarks, obsolescence markers, etc)
with the changeset revision and index data itself, rather than
have it live as a separate entity that must be fetched and
processed separately. I reckon that one reason we didn't do this
before was that it was difficult to add new data types/fields without
breaking existing consumers. By using CBOR maps to transfer
changeset data and putting clients in control of what fields are
requested / present in those maps, we can easily add additional
changeset data while maintaining backwards compatibility. I believe
this to be a superior approach to the problem.

That being said, for performance reasons, we may need to resort
to alternative mechanisms for transferring data like phases. But
for now, I think giving the wire protocol the ability to transfer
changeset metadata next to the changeset itself is a powerful feature
because it is a raw, changeset-centric data API. And if you build
simple APIs for accessing the fundamental units of repository data,
you enable client-side experimentation (partial clone, etc). If it
turns out that we need specialized APIs or mechanisms for transferring
data like phases, we can build those APIs later. For now, I'd
like to see how far we can get on simple APIs.

It's worth noting that when phase data is being requested, the
server will also emit changeset records for nodes in the bases
specified by the "noderange" argument. This is to ensure that
phase-only updates for nodes the client has are available to the
client, even if no new changesets will be transferred.

Differential Revision: https://phab.mercurial-scm.org/D4483
author | Gregory Szorc <gregory.szorc@gmail.com> |
---|---|
date | Tue, 28 Aug 2018 18:19:23 -0700 |
parents | 9c2c77c73f23 |
children | 9dffa99f9158 |
**Experimental and under active development**

This section documents the wire protocol commands exposed to transports
using the frame-based protocol. The set of commands exposed through
these transports is distinct from the set of commands exposed to legacy
transports.

The frame-based protocol uses CBOR to encode command execution requests.
All command arguments must be mapped to a specific CBOR data type or to
a set of CBOR data types.

The response to many commands is also CBOR. There is no common response
format: each command defines its own response format.

TODOs
=====

* Add "node namespace" support to each command. In order to support
  SHA-1 hash transition, we want servers to be able to expose different
  "node namespaces" for the same data. Every command operating on nodes
  should specify which "node namespace" it is operating on and responses
  should encode the "node namespace" accordingly.

Commands
========

The sections below detail all commands available to wire protocol version
2.

branchmap
---------

Obtain heads in named branches.

Receives no arguments.

The response is a map with bytestring keys defining the branch name.
Values are arrays of bytestrings defining raw changeset nodes.

capabilities
------------

Obtain the server's capabilities.

Receives no arguments.

This command is typically called only as part of the handshake during
initial connection establishment.

The response is a map with bytestring keys defining server information.

The defined keys are:

commands
   A map defining available wire protocol commands on this server.

   Keys in the map are the names of commands that can be invoked. Values
   are maps defining information about that command. The bytestring keys
   are:

      args
         A map of argument names and their expected types.

         Types are defined as a representative value for the expected type.
         e.g. an argument expecting a boolean type will have its value
         set to true. An integer type will have its value set to 42. The
         actual values are arbitrary and may not have meaning.
      permissions
         An array of permissions required to execute this command.

compression
   An array of maps defining available compression format support.

   The array is sorted from most preferred to least preferred.

   Each entry has the following bytestring keys:

      name
         Name of the compression engine. e.g. ``zstd`` or ``zlib``.

framingmediatypes
   An array of bytestrings defining the supported framing protocol
   media types. Servers will not accept media types not in this list.

rawrepoformats
   An array of storage formats the repository is using. This set of
   requirements can be used to determine whether a client can read a
   *raw* copy of file data available.

changesetdata
-------------

Obtain various data related to changesets.

The command accepts the following arguments:

noderange
   (array of arrays of bytestrings) An array of 2 elements, each being an
   array of node bytestrings. The first array denotes the changelog
   revisions that are already known to the client. The second array denotes
   the changelog revision DAG heads to fetch. The argument essentially
   defines a DAG range bounded by root and head nodes to fetch.

   The roots array may be empty. The heads array must be defined.

nodes
   (array of bytestrings) Changelog revisions to request explicitly.

fields
   (set of bytestrings) Which data associated with changelog revisions to
   fetch. The following values are recognized:

   parents
      Parent revisions.

   phase
      The phase state of a revision.

   revision
      The raw revision data for the changelog entry. The hash of this data
      will match the revision's node value.

The server resolves the set of revisions relevant to the request by taking
the union of the ``noderange`` and ``nodes`` arguments. At least one of these
arguments must be defined.

The response bytestream starts with a CBOR map describing the data that
follows. This map has the following bytestring keys:

totalitems
   (unsigned integer) Total number of changelog revisions whose data is
   being transferred.
   This maps to the set of revisions in the requested node range, not the
   total number of records that follow (see below for why).

Following the map header is a series of 0 or more CBOR values. If values
are present, the first value will always be a map describing a single
changeset revision. If revision data is requested, the raw revision data
(encoded as a CBOR bytestring) will follow the map describing it. Otherwise,
another CBOR map describing the next changeset revision will occur.

Each map has the following bytestring keys:

node
   (bytestring) The node value for this revision. This is the SHA-1 hash of
   the raw revision data.

parents (optional)
   (array of bytestrings) The nodes representing the parent revisions of this
   revision. Only present if ``parents`` data is being requested.

phase (optional)
   (bytestring) The phase that a revision is in. Recognized values are
   ``secret``, ``draft``, and ``public``. Only present if ``phase`` data
   is being requested.

revisionsize (optional)
   (unsigned integer) Indicates the size of raw revision data that follows
   this map.

   The following data contains a serialized form of the changeset data,
   including the author, date, commit message, set of changed files,
   manifest node, and other metadata.

   Only present if ``revision`` data was requested and the data follows this
   map.

If nodes are requested via ``noderange``, they will be emitted in DAG order,
parents always before children.

If nodes are requested via ``nodes``, they will be emitted in requested order.

Nodes from ``nodes`` are emitted before nodes from ``noderange``.

The set of changeset revisions emitted may not match the exact set of
changesets requested. Furthermore, the set of keys present on each
map may vary. This is to facilitate emitting changeset updates as well
as new revisions.
For example, if the request wants ``phase`` and ``revision`` data,
the response may contain entries for each changeset in the common nodes
set with the ``phase`` key and without the ``revision`` key in order
to reflect a phase-only update.

TODO support different revision selection mechanisms (e.g. non-public,
specific revisions)
TODO support different hash "namespaces" for revisions (e.g. sha-1 versus
other)
TODO support emitting bookmarks data
TODO support emitting obsolescence data
TODO support filtering based on relevant paths (narrow clone)
TODO support depth limiting
TODO support hgtagsfnodes cache / tags data
TODO support branch heads cache

heads
-----

Obtain DAG heads in the repository.

The command accepts the following arguments:

publiconly (optional)
   (boolean) If set, operate on the DAG for public phase changesets only.
   Non-public (i.e. draft) phase DAG heads will not be returned.

The response is a CBOR array of bytestrings defining changeset nodes
of DAG heads. The array can be empty if the repository is empty or no
changesets satisfied the request.

TODO consider exposing phase of heads in response

known
-----

Determine whether a series of changeset nodes is known to the server.

The command accepts the following arguments:

nodes
   (array of bytestrings) List of changeset nodes whose presence to
   query.

The response is a bytestring where each byte contains a 0 or 1 for the
corresponding requested node at the same index.

TODO use a bit array for even more compact response

listkeys
--------

List values in a specified ``pushkey`` namespace.

The command receives the following arguments:

namespace
   (bytestring) Pushkey namespace to query.

The response is a map with bytestring keys and values.

TODO consider using binary to represent nodes in certain pushkey namespaces.

lookup
------

Try to resolve a value to a changeset revision.

Unlike ``known`` which operates on changeset nodes, lookup operates on
node fragments and other names that a user may use.
The command receives the following arguments:

key
   (bytestring) Value to try to resolve.

On success, returns a bytestring containing the resolved node.

pushkey
-------

Set a value using the ``pushkey`` protocol.

The command receives the following arguments:

namespace
   (bytestring) Pushkey namespace to operate on.
key
   (bytestring) The pushkey key to set.
old
   (bytestring) Old value for this key.
new
   (bytestring) New value for this key.

TODO consider using binary to represent nodes in certain pushkey namespaces.
TODO better define response type and meaning.
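
As an illustration of the ``changesetdata`` response layout described earlier (a header map, then one map per changeset, each optionally followed by a raw revision bytestring), here is a minimal Python sketch of how a client might pair the values up. It assumes the CBOR stream has already been decoded into Python objects. The function name ``pair_changeset_records`` and the sample data are hypothetical and are not part of Mercurial.

```python
def pair_changeset_records(values):
    """Group an already-decoded ``changesetdata`` response into records.

    ``values`` is an iterable of Python objects in wire order: the header
    map first, then one map per changeset, each optionally followed by a
    bytestring holding the raw revision data. (Hypothetical helper.)
    """
    it = iter(values)
    header = next(it)
    records = []
    for meta in it:
        if not isinstance(meta, dict):
            raise ValueError('expected a changeset map, got %r' % type(meta))
        data = None
        # ``revisionsize`` is only present when raw revision data follows.
        if b'revisionsize' in meta:
            data = next(it, None)
            if not isinstance(data, bytes) or len(data) != meta[b'revisionsize']:
                raise ValueError('missing or mis-sized revision data')
        records.append((meta, data))
    return header[b'totalitems'], records


# Made-up sample: one phase-only update plus one new changeset.
sample = [
    {b'totalitems': 1},
    {b'node': b'\x11' * 20, b'phase': b'public'},
    {b'node': b'\x22' * 20, b'phase': b'draft', b'revisionsize': 4},
    b'data',
]
total, records = pair_changeset_records(sample)
```

In this made-up sample the header reports one item while two records follow. This mirrors the phase-only update case: the extra record carries only ``node`` and ``phase`` and has no revision data attached.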