Mercurial > hg
view mercurial/help/internals/cbor.txt @ 40178:46a40bce3ae0
wireprotov2: define and implement "filesdata" command
Previously, the only way to access file revision data was the
"filedata" command. This command is useful to have. But, it only
allowed resolving revision data for a single file. This meant that
clients needed to send 1 command for each tracked path they were
seeking data on. Furthermore, those commands would need to enumerate
the exact file nodes they wanted data for.
This approach meant that clients were sending a lot of data to
remotes in order to request file data. e.g. if there were 1M
file revisions, we'd need at least 20,000,000 bytes just to encode
file nodes! Many clients on the internet don't have that kind of
upload capacity.
In order to limit the amount of data that clients must send, we'll
need more efficient ways to request repository data.
This commit defines and implements a new "filesdata" command. This
command allows the retrieval of data for multiple files by specifying
changeset revisions and optional file patterns. The command figures
out what file revisions are "relevant" and sends them in bulk.
The logic around choosing which file revisions to send in the case of
haveparents not being set is overly simple and will over-send files. We
will need more smarts here eventually. (Specifically, the client will
need to tell the server which revisions it knows about.) This work
is deferred until a later time.
Differential Revision: https://phab.mercurial-scm.org/D4981
author | Gregory Szorc <gregory.szorc@gmail.com> |
---|---|
date | Wed, 03 Oct 2018 12:54:39 -0700 |
parents | 2fe21c65777e |
children |
line wrap: on
line source
Mercurial uses Concise Binary Object Representation (CBOR) (RFC 7049) for various data formats. This document describes the subset of CBOR that Mercurial uses and gives recommendations for appropriate use of CBOR within Mercurial. Type Limitations ================ Major types 0 and 1 (unsigned integers and negative integers) MUST be fully supported. Major type 2 (byte strings) MUST be fully supported. However, there are limitations around the use of indefinite-length byte strings. (See below.) Major type 3 (text strings) are NOT supported. Major type 4 (arrays) MUST be supported. However, values are limited to the set of types described in the "Container Types" section below. And indefinite-length arrays are NOT supported. Major type 5 (maps) MUST be supported. However, key values are limited to the set of types described in the "Container Types" section below. And indefinite-length maps are NOT supported. Major type 6 (semantic tagging of major types) can be used with the following semantic tag values: 258 Mathematical finite set. Suitable for representing Python's ``set`` type. All other semantic tag values are not allowed. Major type 7 (simple data types) can be used with the following type values: 20 False 21 True 22 Null 31 Break stop code (for indefinite-length items). All other simple data type values (including every value requiring the 1 byte extension) are disallowed. Indefinite-Length Byte Strings ============================== Indefinite-length byte strings (major type 2) are allowed. However, they MUST NOT occur inside a container type (such as an array or map). i.e. they can only occur as the "top-most" element in a stream of values. Encoders and decoders SHOULD *stream* indefinite-length byte strings. i.e. an encoder or decoder SHOULD NOT buffer the entirety of a long byte string value when indefinite-length byte strings are being used if it can be avoided. Mercurial MAY use extremely long indefinite-length byte strings and buffering the source or destination value COULD lead to memory exhaustion. Chunks in an indefinite-length byte string SHOULD NOT exceed 2^20 bytes. Container Types =============== Mercurial may use the array (major type 4), map (major type 5), and set (semantic tag 258 plus major type 4 array) container types. An array may contain any supported type as values. A map MUST only use the following types as keys: * unsigned integers (major type 0) * negative integers (major type 1) * byte strings (major type 2) (but not indefinite-length byte strings) * false (simple type 20) * true (simple type 21) * null (simple type 22) A map MUST only use the following types as values: * all types supported as map keys * arrays * maps * sets A set may only use the following types as values: * all types supported as map keys It is recommended that keys in maps and values in sets and arrays all be of a uniform type. Avoiding Large Byte Strings =========================== The use of large byte strings is discouraged, especially in scenarios where the total size of the byte string may by unbound for some inputs (e.g. when representing the content of a tracked file). It is highly recommended to use indefinite-length byte strings for these purposes. Since indefinite-length byte strings cannot be nested within an outer container (such as an array or map), to associate a large byte string with another data structure, it is recommended to use an array or map followed immediately by an indefinite-length byte string. For example, instead of the following map:: { "key1": "value1", "key2": "value2", "long_value": "some very large value...", } Use a map followed by a byte string: { "key1": "value1", "key2": "value2", "value_follows": True, } <BEGIN INDEFINITE-LENGTH BYTE STRING> "some very large value" "..." <END INDEFINITE-LENGTH BYTE STRING>