Wed, 14 Mar 2018 16:51:34 -0700 wireproto: add request IDs to frames
Gregory Szorc <gregory.szorc@gmail.com> [Wed, 14 Mar 2018 16:51:34 -0700] rev 37060
wireproto: add request IDs to frames One of my primary goals with the new wire protocol is to make operations faster and enable both client and server-side operations to scale to multiple CPU cores. One of the ways we make server interactions faster is by reducing the number of round trips to that server. With the existing wire protocol, the "batch" command facilitates executing multiple commands from a single request payload. The way it works is the requests for multiple commands are serialized. The server executes those commands sequentially then serializes all their results. As an optimization for reducing round trips, this is very effective. The technical implementation, however, is pretty bad and suffers from a number of deficiencies. For example, it creates a new place where authorization to run a command must be checked. (The lack of this checking in older Mercurial releases was CVE-2018-1000132.) The principles behind the "batch" command are sound. However, the execution is not. Therefore, I want to ditch "batch" in the new wire protocol and have protocol level support for issuing multiple requests in a single round trip. This commit introduces support in the frame-based wire protocol to facilitate this. We do this by adding a "request ID" to each frame. If a server sees frames associated with different "request IDs," it handles them as separate requests. All of this happening possibly as part of the same message from client to server (the same request body in the case of HTTP). We /could/ model the exchange the way pipelined HTTP requests do, where the server processes requests in order they are issued and received. But this artifically constrains scalability. A better model is to allow multi-requests to be executed concurrently and for responses to be sent and handled concurrently. So the specification explicitly allows this. There is some work to be done around specifying dependencies between multi-requests. We take the easy road for now and punt on this problem, declaring that if order is important, clients must not issue the request until responses to dependent requests have been received. This commit focuses on the boilerplate of implementing the request ID. The server reactor still can't manage multiple, in-flight request IDs. This will be addressed in a subsequent commit. Because the wire semantics have changed, we bump the version of the media type. Differential Revision: https://phab.mercurial-scm.org/D2869
Wed, 14 Mar 2018 14:01:16 -0700 wireproto: buffer output frames when in half duplex mode
Gregory Szorc <gregory.szorc@gmail.com> [Wed, 14 Mar 2018 14:01:16 -0700] rev 37059
wireproto: buffer output frames when in half duplex mode Previously, when told that a response was ready, the server reactor would instruct the caller to send frames immediately. This was OK as an initial implementation. But it would not work for half-duplex connections where the sender can't receive until all data has been transmitted - such as httplib based clients. In this commit, we teach the reactor that output frames should be buffered until end of input is seen. This required a new event to inform the reactor of end of input. The result from that event will instruct the consumer to send all buffered frames. The HTTP server is buffered by default. This change effectively hides the complexity of buffering within the reactor so that transports need not be concerned about it. This helps keep the transports "dumb" and will make implementing multiple requests-responses per atomic exchange (like an HTTP request) much simpler. Differential Revision: https://phab.mercurial-scm.org/D2860
Wed, 14 Mar 2018 13:57:52 -0700 wireproto: define and implement responses in framing protocol
Gregory Szorc <gregory.szorc@gmail.com> [Wed, 14 Mar 2018 13:57:52 -0700] rev 37058
wireproto: define and implement responses in framing protocol Previously, we only had client-side frame types defined. This commit defines and implements basic support for server-side frame types. We introduce two frame types - one for representing the raw bytes result of a command and another for representing error results. The types are quite primitive and behavior will expand over time. But you have to start somewhere. Our server reactor gains methods to react to an intent to send a response. Again, following the "sans I/O" pattern, the reactor doesn't actually send the data. Instead, it gives the caller a generator to frames that it can send out over the wire. Differential Revision: https://phab.mercurial-scm.org/D2858
Wed, 14 Mar 2018 13:32:31 -0700 wireproto: implement basic command dispatching for HTTPv2
Gregory Szorc <gregory.szorc@gmail.com> [Wed, 14 Mar 2018 13:32:31 -0700] rev 37057
wireproto: implement basic command dispatching for HTTPv2 Now that we can ingest frames and decode them to requests to run commands, we are able to actually run those commands. So this commit starts to implement that. There are numerous shortcomings. We can't operate on commands with "*" arguments. We can only emit bytesresponse results. We don't yet issue a response in the unified framing protocol. But it's a start. Differential Revision: https://phab.mercurial-scm.org/D2857
Wed, 14 Mar 2018 08:18:15 -0700 wireproto: nominally don't expose "batch" to version 2 wire transports
Gregory Szorc <gregory.szorc@gmail.com> [Wed, 14 Mar 2018 08:18:15 -0700] rev 37056
wireproto: nominally don't expose "batch" to version 2 wire transports The unified frame-based protocol will (eventually) support multiple requests per client transmission. This means that the [very hacky] "batch" command has no purpose existing in this protocol. This commit marks the command as applying to v1 transports only. But because SSHv2 == SSHv1 currently, we had to hack it back in for the SSHv2 transport. Bleh. Tests changed because the capabilities string changed. The order of tokens in the string is not important. Differential Revision: https://phab.mercurial-scm.org/D2856
Wed, 14 Mar 2018 15:25:06 -0700 wireproto: implement basic frame reading and processing
Gregory Szorc <gregory.szorc@gmail.com> [Wed, 14 Mar 2018 15:25:06 -0700] rev 37055
wireproto: implement basic frame reading and processing We just implemented support for writing frames. Now let's implement support for reading them. The bulk of the new code is for a class that maintains the state of a server. Essentially, you construct an instance, feed frames to it, and it tells you what you should do next. The design is inspired by the "sans I/O" movement and the reactor pattern. We don't want to perform I/O or any major blocking event during frame ingestion because this arbitrarily limits ways that server pieces can be implemented. For example, it makes it much harder to swap in an alternate implementation based on asyncio or do crazy things like have requests dispatch to other processes. We do still implement readframe() which does I/O. But it is decoupled from the server reactor. And important parsing of frame headers is a standalone function. So I/O is only needed to obtain frame data. Because testing server-side ingest is useful and difficult on running servers, we create a new "debugreflect" endpoint that will echo back to the client what was received and how it was interpreted. This could be useful for a server admin, someone implementing a client. But immediately, it is useful for testing: we're able to demonstrate that frames are parsed correctly and turned into requests to run commands without having to implement command dispatch on the server! In addition, we implement Python level unit tests for the reactor. This is vastly more efficient than sending requests to the "debugreflect" endpoint and vastly more powerful for advanced testing. Differential Revision: https://phab.mercurial-scm.org/D2852
(0) -30000 -10000 -3000 -1000 -300 -100 -30 -10 -6 +6 +10 +30 +100 +300 +1000 +3000 +10000 tip