mercurial/help/internals/cbor.txt
author Gregory Szorc <gregory.szorc@gmail.com>
Wed, 03 Oct 2018 12:54:39 -0700
changeset 40178 46a40bce3ae0
parent 39409 2fe21c65777e
permissions -rw-r--r--
39409
2fe21c65777e internals: document CBOR utilization
Gregory Szorc <gregory.szorc@gmail.com>

Mercurial uses Concise Binary Object Representation (CBOR)
(RFC 7049) for various data formats.

This document describes the subset of CBOR that Mercurial uses and
gives recommendations for appropriate use of CBOR within Mercurial.

Type Limitations
================

Major types 0 and 1 (unsigned integers and negative integers) MUST be
fully supported.
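As an illustration of what full support for major types 0 and 1 involves,
the following is a minimal sketch of an integer encoder that follows the
RFC 7049 encoding rules. The names ``encode_head`` and ``encode_int`` are
hypothetical and are not taken from Mercurial's code.

```python
import struct


def encode_head(major, argument):
    """Return the initial byte for ``major`` plus the shortest argument form."""
    if not 0 <= argument < 1 << 64:
        raise ValueError('argument does not fit in 64 bits')
    if argument < 24:
        # Small arguments are stored directly in the low 5 bits.
        return struct.pack('>B', major << 5 | argument)
    # Additional information 24..27 selects a 1, 2, 4 or 8 byte argument.
    for info, fmt, limit in ((24, '>BB', 1 << 8), (25, '>BH', 1 << 16),
                             (26, '>BL', 1 << 32), (27, '>BQ', 1 << 64)):
        if argument < limit:
            return struct.pack(fmt, major << 5 | info, argument)


def encode_int(value):
    """Encode a Python int as major type 0 (>= 0) or major type 1 (< 0)."""
    if value >= 0:
        return encode_head(0, value)
    # Major type 1 stores -1 - value, so -1 is encoded with argument 0.
    return encode_head(1, -1 - value)
```

Under this sketch, ``encode_int(1000)`` yields the bytes ``19 03 e8`` and
``encode_int(-1000)`` yields ``39 03 e7``, matching the examples in
RFC 7049 Appendix A.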

Major type 2 (byte strings) MUST be fully supported. However, there
are limitations around the use of indefinite-length byte strings.
(See below.)

Major type 3 (text strings) is NOT supported.

Major type 4 (arrays) MUST be supported. However, values are limited
to the set of types described in the "Container Types" section below.
Indefinite-length arrays are NOT supported.

Major type 5 (maps) MUST be supported. However, keys and values are
limited to the sets of types described in the "Container Types" section
below. Indefinite-length maps are NOT supported.

Major type 6 (semantic tagging of major types) can be used with the
following semantic tag values:

258
   Mathematical finite set. Suitable for representing Python's
   ``set`` type.

All other semantic tag values are not allowed.
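To make the tag 258 representation concrete, here is a small sketch that
encodes a Python ``set`` of byte strings as semantic tag 258 followed by a
definite-length array. The helper names are hypothetical, and sorting the
members is only a choice made here to get deterministic output; this
document does not require it.

```python
import struct


def encode_head(major, argument):
    """Return the initial byte for ``major`` plus the shortest argument form."""
    if argument < 24:
        return struct.pack('>B', major << 5 | argument)
    for info, fmt, limit in ((24, '>BB', 1 << 8), (25, '>BH', 1 << 16),
                             (26, '>BL', 1 << 32), (27, '>BQ', 1 << 64)):
        if argument < limit:
            return struct.pack(fmt, major << 5 | info, argument)
    raise ValueError('argument does not fit in 64 bits')


def encode_bytes(value):
    """Encode a definite-length byte string (major type 2)."""
    return encode_head(2, len(value)) + value


def encode_set(values):
    """Encode a set of byte strings as tag 258 wrapping an array."""
    items = sorted(values)  # assumption: sort only for reproducible output
    return (encode_head(6, 258)            # semantic tag 258
            + encode_head(4, len(items))   # definite-length array header
            + b''.join(encode_bytes(v) for v in items))
```

With this sketch, ``encode_set({b'a', b'b'})`` starts with the three bytes
``d9 01 02`` (tag 258), followed by ``82`` (an array of two items) and the
two encoded byte strings.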

Major type 7 (simple data types) can be used with the following
type values:

20
   False
21
   True
22
   Null
31
   Break stop code (for indefinite-length items).

All other simple data type values (including every value requiring the
1 byte extension) are disallowed.
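The restriction on major type 7 can be sketched as a check on the initial
byte of an item. The function and constant names below are hypothetical.
The sketch also rejects the floating-point forms (additional information
25 to 27); that is an assumption based on reading "all other values" as
covering them.

```python
# Allowed simple values mapped to their Python equivalents.
ALLOWED_SIMPLE = {20: False, 21: True, 22: None}

# Sentinel returned for the break stop code (value 31).
BREAK = object()


def decode_simple(initial_byte):
    """Decode a major type 7 initial byte or raise ValueError."""
    major = initial_byte >> 5
    info = initial_byte & 0x1f
    if major != 7:
        raise ValueError('not a major type 7 item')
    if info == 31:
        return BREAK
    if info in ALLOWED_SIMPLE:
        return ALLOWED_SIMPLE[info]
    # Everything else is refused, including 24 (the 1 byte extension).
    raise ValueError('simple value %d is not allowed' % info)
```

For example, the initial bytes ``0xf4``, ``0xf5`` and ``0xf6`` decode to
``False``, ``True`` and ``None``, while ``0xf8`` raises ``ValueError``.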

Indefinite-Length Byte Strings
==============================

Indefinite-length byte strings (major type 2) are allowed. However,
they MUST NOT occur inside a container type (such as an array or map).
That is, they can only occur as the "top-most" element in a stream of
values.

Encoders and decoders SHOULD *stream* indefinite-length byte strings.
That is, an encoder or decoder SHOULD NOT buffer the entirety of a long
byte string value when indefinite-length byte strings are being used
if it can be avoided. Mercurial MAY use extremely long indefinite-length
byte strings, and buffering the source or destination value could lead to
memory exhaustion.

Chunks in an indefinite-length byte string SHOULD NOT exceed 2^20
bytes.
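The streaming recommendation can be sketched as a generator that reads
from a file-like object and emits one bounded chunk at a time, so the
whole value never needs to be held in memory. The names
``stream_indefinite_bytes`` and ``MAX_CHUNK_SIZE`` and the default chunk
size of 65536 are assumptions made for this example.

```python
import io
import struct

MAX_CHUNK_SIZE = 2 ** 20  # recommended upper bound for a single chunk


def encode_bytes_head(length):
    """Encode the header of a definite-length byte string (major type 2)."""
    if length < 24:
        return struct.pack('>B', 0x40 | length)
    if length < 1 << 8:
        return struct.pack('>BB', 0x58, length)
    if length < 1 << 16:
        return struct.pack('>BH', 0x59, length)
    return struct.pack('>BL', 0x5a, length)  # covers chunks up to 2^20


def stream_indefinite_bytes(fh, chunk_size=65536):
    """Yield the pieces of an indefinite-length byte string read from fh."""
    if not 0 < chunk_size <= MAX_CHUNK_SIZE:
        raise ValueError('chunk_size must be between 1 and 2^20')
    yield b'\x5f'  # begin indefinite-length byte string
    while True:
        chunk = fh.read(chunk_size)
        if not chunk:
            break
        # Each chunk is an ordinary definite-length byte string.
        yield encode_bytes_head(len(chunk))
        yield chunk
    yield b'\xff'  # break stop code


# Usage: a tiny chunk size makes the chunk boundaries visible.
encoded = b''.join(stream_indefinite_bytes(io.BytesIO(b'abcdef'), chunk_size=4))
```

In real use the yielded pieces would be written to the output as they are
produced rather than joined into one buffer.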

Container Types
===============

Mercurial may use the array (major type 4), map (major type 5), and
set (semantic tag 258 plus major type 4 array) container types.

An array may contain any supported type as values.

A map MUST only use the following types as keys:

* unsigned integers (major type 0)
* negative integers (major type 1)
* byte strings (major type 2) (but not indefinite-length byte strings)
* false (simple type 20)
* true (simple type 21)
* null (simple type 22)

A map MUST only use the following types as values:

* all types supported as map keys
* arrays
* maps
* sets

A set may only use the following types as values:

* all types supported as map keys

It is recommended that keys in maps and values in sets and arrays all
be of a uniform type.
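The container rules above can be expressed as a recursive check over
Python values before they are encoded. This is only a sketch: the function
names are hypothetical, and the mapping from Python types to CBOR types
(``int`` to major types 0 and 1, ``bytes`` to major type 2, ``list`` and
``tuple`` to arrays, ``dict`` to maps, ``set`` and ``frozenset`` to tag 258)
is an assumption made for the example.

```python
def is_key_type(value):
    """Return True if value may be a map key or a set member."""
    if value is None or isinstance(value, bytes):
        return True
    # bool is a subclass of int, so True and False are accepted here too.
    if isinstance(value, int):
        return -(1 << 64) <= value < (1 << 64)
    return False


def is_supported(value):
    """Return True if value only uses types allowed by the rules above."""
    if is_key_type(value):
        return True
    if isinstance(value, (list, tuple)):
        return all(is_supported(v) for v in value)
    if isinstance(value, (set, frozenset)):
        # Set members are limited to the map key types.
        return all(is_key_type(v) for v in value)
    if isinstance(value, dict):
        return all(is_key_type(k) and is_supported(v)
                   for k, v in value.items())
    # Text strings, floats and everything else are rejected.
    return False
```

For instance, ``is_supported({b'key': [1, -2, None]})`` is true, while a
map with a text string key such as ``{'key': 1}`` is rejected.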

Avoiding Large Byte Strings
===========================

The use of large byte strings is discouraged, especially in scenarios where
the total size of the byte string may be unbounded for some inputs (e.g. when
representing the content of a tracked file). It is highly recommended to use
indefinite-length byte strings for these purposes.

Since indefinite-length byte strings cannot be nested within an outer
container (such as an array or map), to associate a large byte string
with another data structure, it is recommended to use an array or
map followed immediately by an indefinite-length byte string. For example,
instead of the following map::

   {
      "key1": "value1",
      "key2": "value2",
      "long_value": "some very large value...",
   }

use a map followed by a byte string::

   {
      "key1": "value1",
      "key2": "value2",
      "value_follows": True,
   }
   <BEGIN INDEFINITE-LENGTH BYTE STRING>
   "some very large value"
   "..."
   <END INDEFINITE-LENGTH BYTE STRING>
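The map-then-byte-string layout shown above could be produced along the
following lines. This is a sketch under stated assumptions: the function
names are hypothetical, map keys are encoded as byte strings (text strings
are not supported), and only byte string and boolean map values are
handled to keep the example short.

```python
import struct


def encode_head(major, argument):
    """Return the initial byte for ``major`` plus the shortest argument form."""
    if argument < 24:
        return struct.pack('>B', major << 5 | argument)
    for info, fmt, limit in ((24, '>BB', 1 << 8), (25, '>BH', 1 << 16),
                             (26, '>BL', 1 << 32), (27, '>BQ', 1 << 64)):
        if argument < limit:
            return struct.pack(fmt, major << 5 | info, argument)
    raise ValueError('argument does not fit in 64 bits')


def encode_item(value):
    """Encode a boolean or a definite-length byte string."""
    if value is True:
        return b'\xf5'
    if value is False:
        return b'\xf4'
    if isinstance(value, bytes):
        return encode_head(2, len(value)) + value
    raise TypeError('unsupported value type: %r' % type(value))


def encode_map_then_bytes(meta, chunks):
    """Yield a definite-length map, then an indefinite-length byte string."""
    yield encode_head(5, len(meta))
    for key, value in meta.items():
        yield encode_item(key)
        yield encode_item(value)
    # The large value follows the map as a separate top-level item.
    yield b'\x5f'
    for chunk in chunks:
        yield encode_item(chunk)
    yield b'\xff'
```

A caller would pass something like
``{b'key1': b'value1', b'value_follows': True}`` as ``meta`` and an iterator
of bounded chunks as ``chunks``, writing each yielded piece to the output
as it arrives.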