changeset 35975:40d94ea51402
internals: refactor wire protocol documentation
Upcoming work will introduce a new version of the HTTP and SSH
transports. The differences will be significant enough to consider
them new transports. So, we now attach a version number to each
transport.
In addition, having the handshake documented after the transport
and in a single shared section made it harder to follow the flow
of the connection. The handshake documentation is now moved to the
protocol section it describes. We now have a generic section about
the purpose of the handshake, which was rewritten significantly.
Differential Revision: https://phab.mercurial-scm.org/D2060
author | Gregory Szorc <gregory.szorc@gmail.com> |
---|---|
date | Tue, 06 Feb 2018 10:51:15 -0800 |
parents | 9ba1d0c724e2 |
children | 48a3a9283f09 |
files | mercurial/help/internals/wireprotocol.txt |
diffstat | 1 files changed, 132 insertions(+), 71 deletions(-) |
--- a/mercurial/help/internals/wireprotocol.txt	Mon Feb 05 18:04:40 2018 +0100
+++ b/mercurial/help/internals/wireprotocol.txt	Tue Feb 06 10:51:15 2018 -0800
@@ -10,11 +10,43 @@
 The protocol is synchronous and does not support multiplexing (concurrent
 commands).
 
-Transport Protocols
-===================
+Handshake
+=========
+
+It is required or common for clients to perform a *handshake* when connecting
+to a server. The handshake serves the following purposes:
+
+* Negotiating protocol/transport level options
+* Allows the client to learn about server capabilities to influence
+  future requests
+* Ensures the underlying transport channel is in a *clean* state
 
-HTTP Transport
---------------
+An important goal of the handshake is to allow clients to use more modern
+wire protocol features. By default, clients must assume they are talking
+to an old version of Mercurial server (possibly even the very first
+implementation). So, clients should not attempt to call or utilize modern
+wire protocol features until they have confirmation that the server
+supports them. The handshake implementation is designed to allow both
+ends to utilize the latest set of features and capabilities with as
+few round trips as possible.
+
+The handshake mechanism varies by transport and protocol and is documented
+in the sections below.
+
+HTTP Protocol
+=============
+
+Handshake
+---------
+
+The client sends a ``capabilities`` command request (``?cmd=capabilities``)
+as soon as HTTP requests may be issued.
+
+The server responds with a capabilities string, which the client parses to
+learn about the server's abilities.
+
+HTTP Version 1 Transport
+------------------------
 
 Commands are issued as HTTP/1.0 or HTTP/1.1 requests. Commands are
 sent to the base URL of the repository with the command name sent in
@@ -112,11 +144,86 @@
 ``application/mercurial-0.*`` media type and the HTTP response is typically
 using *chunked transfer* (``Transfer-Encoding: chunked``).
 
-SSH Transport
-=============
+SSH Protocol
+============
+
+Handshake
+---------
+
+For all clients, the handshake consists of the client sending 1 or more
+commands to the server using version 1 of the transport. Servers respond
+to commands they know how to respond to and send an empty response (``0\n``)
+for unknown commands (per standard behavior of version 1 of the transport).
+Clients then typically look for a response to the newest sent command to
+determine which transport version to use and what the available features for
+the connection and server are.
+
+Preceding any response from client-issued commands, the server may print
+non-protocol output. It is common for SSH servers to print banners, message
+of the day announcements, etc when clients connect. It is assumed that any
+such *banner* output will precede any Mercurial server output. So clients
+must be prepared to handle server output on initial connect that isn't
+in response to any client-issued command and doesn't conform to Mercurial's
+wire protocol. This *banner* output should only be on stdout. However,
+some servers may send output on stderr.
+
+Pre 0.9.1 clients issue a ``between`` command with the ``pairs`` argument
+having the value
+``0000000000000000000000000000000000000000-0000000000000000000000000000000000000000``.
+
+The ``between`` command has been supported since the original Mercurial
+SSH server. Requesting the empty range will return a ``\n`` string response,
+which will be encoded as ``1\n\n`` (value length of ``1`` followed by a newline
+followed by the value, which happens to be a newline).
+
+For pre 0.9.1 clients and all servers, the exchange looks like::
+
+  c: between\n
+  c: pairs 81\n
+  c: 0000000000000000000000000000000000000000-0000000000000000000000000000000000000000
+  s: 1\n
+  s: \n
 
-The SSH transport is a custom text-based protocol suitable for use over any
-bi-directional stream transport. It is most commonly used with SSH.
+0.9.1+ clients send a ``hello`` command (with no arguments) before the
+``between`` command. The response to this command allows clients to
+discover server capabilities and settings.
+
+An example exchange between 0.9.1+ clients and a ``hello`` aware server looks
+like::
+
+  c: hello\n
+  c: between\n
+  c: pairs 81\n
+  c: 0000000000000000000000000000000000000000-0000000000000000000000000000000000000000
+  s: 324\n
+  s: capabilities: lookup changegroupsubset branchmap pushkey known getbundle ...\n
+  s: 1\n
+  s: \n
+
+And a similar scenario but with servers sending a banner on connect::
+
+  c: hello\n
+  c: between\n
+  c: pairs 81\n
+  c: 0000000000000000000000000000000000000000-0000000000000000000000000000000000000000
+  s: welcome to the server\n
+  s: if you find any issues, email someone@somewhere.com\n
+  s: 324\n
+  s: capabilities: lookup changegroupsubset branchmap pushkey known getbundle ...\n
+  s: 1\n
+  s: \n
+
+Note that output from the ``hello`` command is terminated by a ``\n``. This is
+part of the response payload and not part of the wire protocol adding a newline
+after responses. In other words, the length of the response contains the
+trailing ``\n``.
+
+SSH Version 1 Transport
+-----------------------
+
+The SSH transport (version 1) is a custom text-based protocol suitable for
+use over any bi-directional stream transport. It is most commonly used with
+SSH.
 
 A SSH transport server can be started with ``hg serve --stdio``. The stdin,
 stderr, and stdout file descriptors of the started process are used to exchange
@@ -463,53 +570,6 @@
 reflects the priority/preference of that type, where the first value is the
 most preferred type.
 
-Handshake Protocol
-==================
-
-While not explicitly required, it is common for clients to perform a
-*handshake* when connecting to a server. The handshake accomplishes 2 things:
-
-* Obtaining capabilities and other server features
-* Flushing extra server output (e.g. SSH servers may print extra text
-  when connecting that may confuse the wire protocol)
-
-This isn't a traditional *handshake* as far as network protocols go because
-there is no persistent state as a result of the handshake: the handshake is
-simply the issuing of commands and commands are stateless.
-
-The canonical clients perform a capabilities lookup at connection establishment
-time. This is because clients must assume a server only supports the features
-of the original Mercurial server implementation until proven otherwise (from
-advertised capabilities). Nearly every server running today supports features
-that weren't present in the original Mercurial server implementation. Rather
-than wait for a client to perform functionality that needs to consult
-capabilities, it issues the lookup at connection start to avoid any delay later.
-
-For HTTP servers, the client sends a ``capabilities`` command request as
-soon as the connection is established. The server responds with a capabilities
-string, which the client parses.
-
-For SSH servers, the client sends the ``hello`` command (no arguments)
-and a ``between`` command with the ``pairs`` argument having the value
-``0000000000000000000000000000000000000000-0000000000000000000000000000000000000000``.
-
-The ``between`` command has been supported since the original Mercurial
-server. Requesting the empty range will return a ``\n`` string response,
-which will be encoded as ``1\n\n`` (value length of ``1`` followed by a newline
-followed by the value, which happens to be a newline).
-
-The ``hello`` command was later introduced. Servers supporting it will issue
-a response to that command before sending the ``1\n\n`` response to the
-``between`` command. Servers not supporting ``hello`` will send an empty
-response (``0\n``).
-
-In addition to the expected output from the ``hello`` and ``between`` commands,
-servers may also send other output, such as *message of the day (MOTD)*
-announcements. Clients assume servers will send this output before the
-Mercurial server replies to the client-issued commands. So any server output
-not conforming to the expected command responses is assumed to be not related
-to Mercurial and can be ignored.
-
 Content Negotiation
 ===================
 
@@ -519,8 +579,8 @@
 well-defined response type and only certain commands needed to support
 functionality like compression.
 
-Currently, only the HTTP transport supports content negotiation at the protocol
-layer.
+Currently, only the HTTP version 1 transport supports content negotiation
+at the protocol layer.
 
 HTTP requests advertise supported response formats via the ``X-HgProto-<N>``
 request header, where ``<N>`` is an integer starting at 1 allowing the logical
@@ -739,7 +799,7 @@
    Boolean indicating whether phases data is requested.
 
 The return type on success is a ``stream`` where the value is bundle.
-On the HTTP transport, the response is zlib compressed.
+On the HTTP version 1 transport, the response is zlib compressed.
 
 If an error occurs, a generic error response can be sent.
 
@@ -842,13 +902,14 @@
 
 The return type is a ``string``. The value depends on the transport protocol.
 
-The SSH transport sends a string encoded integer followed by a newline
-(``\n``) which indicates operation result. The server may send additional
-output on the ``stderr`` stream that should be displayed to the user.
+The SSH version 1 transport sends a string encoded integer followed by a
+newline (``\n``) which indicates operation result. The server may send
+additional output on the ``stderr`` stream that should be displayed to the
+user.
 
-The HTTP transport sends a string encoded integer followed by a newline
-followed by additional server output that should be displayed to the user.
-This may include output from hooks, etc.
+The HTTP version 1 transport sends a string encoded integer followed by a
+newline followed by additional server output that should be displayed to
+the user. This may include output from hooks, etc.
 
 The integer result varies by namespace. ``0`` means an error has occurred
 and there should be additional output to display to the user.
@@ -912,18 +973,18 @@
 
 The encoding of the ``push response`` type varies by transport.
 
-For the SSH transport, this type is composed of 2 ``string`` responses: an
-empty response (``0\n``) followed by the integer result value. e.g.
-``1\n2``. So the full response might be ``0\n1\n2``.
+For the SSH version 1 transport, this type is composed of 2 ``string``
+responses: an empty response (``0\n``) followed by the integer result value.
+e.g. ``1\n2``. So the full response might be ``0\n1\n2``.
 
-For the HTTP transport, the response is a ``string`` type composed of an
-integer result value followed by a newline (``\n``) followed by string
+For the HTTP version 1 transport, the response is a ``string`` type composed
+of an integer result value followed by a newline (``\n``) followed by string
 content holding server output that should be displayed on the client (output
 hooks, etc).
 
 In some cases, the server may respond with a ``bundle2`` bundle. In this
-case, the response type is ``stream``. For the HTTP transport, the response
-is zlib compressed.
+case, the response type is ``stream``. For the HTTP version 1 transport, the
+response is zlib compressed.
 
 The server may also respond with a generic error type, which contains a string
 indicating the failure.
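
As a reading aid for the SSH version 1 handshake text added in this changeset, here is a minimal client-side sketch in Python. It is an illustration written for this page and not Mercurial's own client code: the names `NULL_RANGE`, `build_handshake`, and `parse_handshake` are hypothetical, and the `max_lines` limit is an assumed safeguard. The sketch sends `hello` followed by `between`, then scans the server output line by line until it sees the `1\n` / `\n` reply to `between`. It treats everything before that as either banner text or the `hello` response, and for simplicity it ignores the length line instead of validating it.

```python
import io

# The "empty range" argument from the handshake text: 40 zeros, a dash, 40 zeros.
NULL_RANGE = b"0" * 40 + b"-" + b"0" * 40


def build_handshake() -> bytes:
    """Bytes a 0.9.1+ style client would send: ``hello`` and then ``between``."""
    return b"hello\nbetween\npairs %d\n" % len(NULL_RANGE) + NULL_RANGE


def parse_handshake(stream, max_lines=500):
    """Scan server output up to the reply to ``between``.

    Returns the set of advertised capabilities, or None when the server sent
    no ``capabilities:`` line (for example, it answered ``hello`` with ``0\\n``).
    Banner lines and the length line are skipped (assumption: no validation).
    """
    caps = None
    prev = b""
    for _ in range(max_lines):
        line = stream.readline()
        if not line:
            break  # EOF before the sentinel was seen
        if prev == b"1\n" and line == b"\n":
            return caps  # ``1\n\n`` is the reply to the empty ``between`` range
        if line.startswith(b"capabilities:"):
            caps = set(line[len(b"capabilities:"):].split())
        prev = line
    raise ValueError("no response to the between command")


if __name__ == "__main__":
    payload = b"capabilities: lookup branchmap\n"
    # The advertised length counts the trailing newline of the payload.
    server = b"welcome to the server\n" + b"%d\n" % len(payload) + payload + b"1\n\n"
    print(build_handshake())
    print(parse_handshake(io.BytesIO(server)))
```

Scanning for the `between` reply as a sentinel is what lets the sketch tolerate banner output of unknown length, at the cost of not checking the advertised response length against the payload.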