mercurial/help/internals/cbor.txt
author Georges Racinet <georges.racinet@octobus.net>
Tue, 21 May 2019 20:07:20 +0200
changeset 42357 5b795108dd17
parent 39409 2fe21c65777e
permissions -rw-r--r--
39409
2fe21c65777e internals: document CBOR utilization
Gregory Szorc <gregory.szorc@gmail.com>
Mercurial uses Concise Binary Object Representation (CBOR)
(RFC 7049) for various data formats.

This document describes the subset of CBOR that Mercurial uses and
gives recommendations for appropriate use of CBOR within Mercurial.

Type Limitations
================

Major types 0 and 1 (unsigned integers and negative integers) MUST be
fully supported.
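
As an illustration of what full support for major types 0 and 1 involves,
here is a minimal sketch of the integer encoding defined by RFC 7049. The
names ``encode_head`` and ``encode_int`` are hypothetical and are not taken
from Mercurial's own code.

```python
import struct


def encode_head(major, arg):
    """Encode a major type (0-7) and an unsigned argument as an item head."""
    initial = major << 5
    if arg < 24:
        # Small arguments live in the low 5 bits of the initial byte.
        return struct.pack('>B', initial | arg)
    # Otherwise additional information 24..27 selects a 1, 2, 4 or 8 byte
    # big-endian argument that follows the initial byte.
    for info, fmt, limit in ((24, '>BB', 1 << 8), (25, '>BH', 1 << 16),
                             (26, '>BL', 1 << 32), (27, '>BQ', 1 << 64)):
        if arg < limit:
            return struct.pack(fmt, initial | info, arg)
    raise ValueError('argument does not fit in 64 bits')


def encode_int(value):
    """Encode an int as major type 0 (value >= 0) or major type 1 (value < 0)."""
    if value >= 0:
        return encode_head(0, value)
    # A negative integer n is stored as the unsigned argument -1 - n.
    return encode_head(1, -1 - value)
```

Under this scheme, the representable range is -2^64 through 2^64 - 1; the
sketch raises ``ValueError`` for anything outside it.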

Major type 2 (byte strings) MUST be fully supported. However, there
are limitations around the use of indefinite-length byte strings.
(See "Indefinite-Length Byte Strings" below.)

Major type 3 (text strings) is NOT supported.

Major type 4 (arrays) MUST be supported. However, values are limited
to the set of types described in the "Container Types" section below,
and indefinite-length arrays are NOT supported.

Major type 5 (maps) MUST be supported. However, keys and values are
limited to the sets of types described in the "Container Types" section
below, and indefinite-length maps are NOT supported.

Major type 6 (semantic tagging of major types) can be used with the
following semantic tag values:

258
   Mathematical finite set. Suitable for representing Python's
   ``set`` type.

All other semantic tag values are not allowed.

Major type 7 (simple data types) can be used with the following
type values:

20
   False
21
   True
22
   Null
31
   Break stop code (for indefinite-length items).

All other simple data type values (including every value requiring the
1-byte extension) are disallowed.
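
The restrictions in this section can be checked from the head of each
encoded item. The following is a minimal sketch of such a check, assuming
Python 3 and the RFC 7049 head layout. ``check_head`` and ``is_allowed`` are
hypothetical names, and the sketch does not check the container rules.

```python
import struct

ALLOWED_TAGS = {258}               # mathematical finite set
ALLOWED_SIMPLE = {20, 21, 22, 31}  # false, true, null, break

# Additional information 24..27 -> struct format of the argument that follows.
ARG_FORMATS = {24: '>B', 25: '>H', 26: '>L', 27: '>Q'}


def check_head(data):
    """Return (major, info, argument) for the item head at the start of data.

    Raises ValueError if the head is outside the subset described above.
    """
    initial = data[0]  # indexing bytes yields an int on Python 3
    major, info = initial >> 5, initial & 0x1F
    if major == 7:
        # This also rejects the 1-byte extension (24) and floats (25..27).
        if info not in ALLOWED_SIMPLE:
            raise ValueError('major type 7 value %d is disallowed' % info)
        return major, info, None
    if info < 24:
        arg = info
    elif info in ARG_FORMATS:
        arg = struct.unpack_from(ARG_FORMATS[info], data, 1)[0]
    elif info == 31:
        arg = None  # indefinite length
    else:
        raise ValueError('additional information %d is reserved' % info)
    if major == 3:
        raise ValueError('text strings are not supported')
    if info == 31 and major != 2:
        raise ValueError('only byte strings may be indefinite-length')
    if major == 6 and arg not in ALLOWED_TAGS:
        raise ValueError('semantic tag %d is not allowed' % arg)
    return major, info, arg


def is_allowed(data):
    """Boolean convenience wrapper around check_head()."""
    try:
        check_head(data)
    except ValueError:
        return False
    return True
```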

Indefinite-Length Byte Strings
==============================

Indefinite-length byte strings (major type 2) are allowed. However,
they MUST NOT occur inside a container type (such as an array or map).
That is, they can only occur as the "top-most" element in a stream of
values.

Encoders and decoders SHOULD *stream* indefinite-length byte strings.
That is, an encoder or decoder SHOULD NOT buffer the entirety of a long
byte string value when indefinite-length byte strings are being used
if it can be avoided. Mercurial MAY use extremely long indefinite-length
byte strings, and buffering the source or destination value could lead
to memory exhaustion.

Chunks in an indefinite-length byte string SHOULD NOT exceed 2^20
bytes (1 MiB).
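
One way to follow the streaming and chunk-size recommendations is a
generator that reads from a file object and emits one chunk at a time. This
is a minimal sketch under that assumption. ``stream_indefinite_bytes`` and
``bytes_head`` are hypothetical names, and treating the 2^20 limit as a hard
error is a choice of this sketch rather than a requirement.

```python
import io
import struct

MAX_CHUNK = 2 ** 20  # recommended upper bound on chunk size, in bytes


def bytes_head(length):
    """Head of a definite-length byte string (major type 2) of ``length`` bytes."""
    if length < 24:
        return struct.pack('>B', 0x40 | length)
    if length < 1 << 8:
        return struct.pack('>BB', 0x58, length)
    if length < 1 << 16:
        return struct.pack('>BH', 0x59, length)
    return struct.pack('>BL', 0x5A, length)  # enough for chunks <= MAX_CHUNK


def stream_indefinite_bytes(fh, chunk_size=MAX_CHUNK):
    """Yield the encoding of fh's content as an indefinite-length byte string.

    At most ``chunk_size`` bytes of payload are held in memory at once.
    """
    if not 0 < chunk_size <= MAX_CHUNK:
        raise ValueError('chunk_size must be between 1 and 2^20')
    yield b'\x5f'  # begin indefinite-length byte string
    while True:
        chunk = fh.read(chunk_size)
        if not chunk:
            break
        yield bytes_head(len(chunk))
        yield chunk
    yield b'\xff'  # break stop code


# Usage: encode an in-memory stream in 5-byte chunks.
encoded = b''.join(stream_indefinite_bytes(io.BytesIO(b'hello world'), 5))
```

Yielding the head and the chunk separately avoids copying each chunk into a
new buffer before it is written out.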

Container Types
===============

Mercurial may use the array (major type 4), map (major type 5), and
set (semantic tag 258 plus major type 4 array) container types.

An array may contain any supported type as values.

A map MUST only use the following types as keys:

* unsigned integers (major type 0)
* negative integers (major type 1)
* byte strings (major type 2) (but not indefinite-length byte strings)
* false (simple type 20)
* true (simple type 21)
* null (simple type 22)

A map MUST only use the following types as values:

* all types supported as map keys
* arrays
* maps
* sets

A set may only use the following types as values:

* all types supported as map keys

It is recommended that keys in maps and values in sets and arrays all
be of a uniform type.
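
The container rules above can be expressed as a recursive check over Python
values before encoding. The sketch below assumes one possible mapping:
``int`` to major types 0 and 1, ``bytes`` to major type 2, ``bool`` and
``None`` to the simple values, and ``list``, ``dict`` and ``set`` to arrays,
maps and sets. All function names are hypothetical.

```python
# Python types that map onto the CBOR types allowed as map keys.
KEY_TYPES = (int, bytes, bool, type(None))


def check_key(value):
    """Raise unless ``value`` may be a map key (or a set member)."""
    if not isinstance(value, KEY_TYPES):
        raise TypeError('unsupported key type: %s' % type(value).__name__)
    # Major types 0 and 1 together cover -2^64 through 2^64 - 1.
    if isinstance(value, int) and not -(1 << 64) <= value < (1 << 64):
        raise ValueError('integer out of range')


def check_value(value):
    """Raise unless ``value`` may appear as an array or map value."""
    if isinstance(value, list):
        for item in value:
            check_value(item)
    elif isinstance(value, dict):
        for key, item in value.items():
            check_key(key)
            check_value(item)
    elif isinstance(value, (set, frozenset)):
        # Set members are limited to the types allowed as map keys.
        for item in value:
            check_key(item)
    else:
        check_key(value)


def is_encodable(value):
    """Boolean convenience wrapper around check_value()."""
    try:
        check_value(value)
    except (TypeError, ValueError):
        return False
    return True
```

Python ``str`` values are rejected because text strings (major type 3) are
not supported, and floats are rejected because they are not among the
allowed major type 7 values.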

Avoiding Large Byte Strings
===========================

The use of large byte strings is discouraged, especially in scenarios where
the total size of the byte string may be unbounded for some inputs (e.g. when
representing the content of a tracked file). It is highly recommended to use
indefinite-length byte strings for these purposes.

Since indefinite-length byte strings cannot be nested within an outer
container (such as an array or map), to associate a large byte string
with another data structure, it is recommended to use an array or
map followed immediately by an indefinite-length byte string. For example,
instead of the following map::

   {
      "key1": "value1",
      "key2": "value2",
      "long_value": "some very large value...",
   }

use a map followed by a byte string::

   {
      "key1": "value1",
      "key2": "value2",
      "value_follows": True,
   }
   <BEGIN INDEFINITE-LENGTH BYTE STRING>
   "some very large value"
   "..."
   <END INDEFINITE-LENGTH BYTE STRING>
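
The recommended layout can be sketched as a generator that emits the map
first and then streams the large value. This is an illustration only, with
hypothetical names. It encodes the quoted strings of the example as byte
strings, since text strings are not supported, and it sorts the keys purely
to make the output deterministic. Callers are assumed to pass chunks of at
most 2^20 bytes.

```python
import struct


def encode_head(major, arg):
    """Encode a major type and an unsigned argument as a CBOR item head."""
    initial = major << 5
    if arg < 24:
        return struct.pack('>B', initial | arg)
    for info, fmt, limit in ((24, '>BB', 1 << 8), (25, '>BH', 1 << 16),
                             (26, '>BL', 1 << 32), (27, '>BQ', 1 << 64)):
        if arg < limit:
            return struct.pack(fmt, initial | info, arg)
    raise ValueError('argument does not fit in 64 bits')


def encode_bytes(value):
    """Encode a definite-length byte string (major type 2)."""
    return encode_head(2, len(value)) + value


def encode_map_then_bytes(meta, chunks):
    """Yield a map with bytes keys and values, then a streamed byte string.

    ``meta`` maps bytes to bytes; ``chunks`` is an iterable of bytes.
    """
    fields = dict(meta)
    fields[b'value_follows'] = True
    out = [encode_head(5, len(fields))]  # map head: number of key/value pairs
    for key in sorted(fields):
        value = fields[key]
        out.append(encode_bytes(key))
        out.append(b'\xf5' if value is True else encode_bytes(value))
    yield b''.join(out)
    yield b'\x5f'  # begin indefinite-length byte string
    for chunk in chunks:
        if chunk:
            yield encode_bytes(chunk)
    yield b'\xff'  # break stop code


# Usage: the example from the text.
data = b''.join(encode_map_then_bytes(
    {b'key1': b'value1', b'key2': b'value2'},
    [b'some very large value', b'...']))
```

Because the byte string follows the map rather than sitting inside it, a
decoder can read the small map in full and then stream the large value.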