Mercurial uses Concise Binary Object Representation (CBOR)
(RFC 7049) for various data formats.

This document describes the subset of CBOR that Mercurial uses and
gives recommendations for appropriate use of CBOR within Mercurial.

Type Limitations
================

Major types 0 and 1 (unsigned integers and negative integers) MUST be
fully supported.

Major type 2 (byte strings) MUST be fully supported. However, there
are limitations around the use of indefinite-length byte strings.
(See the "Indefinite-Length Byte Strings" section below.)

Major type 3 (text strings) is NOT supported.

Major type 4 (arrays) MUST be supported. However, values are limited
to the set of types described in the "Container Types" section below,
and indefinite-length arrays are NOT supported.

Major type 5 (maps) MUST be supported. However, keys and values are
limited to the sets of types described in the "Container Types" section
below, and indefinite-length maps are NOT supported.

Major type 6 (semantic tagging of major types) can be used with the
following semantic tag values:

258
   Mathematical finite set. Suitable for representing Python's
   ``set`` type.

All other semantic tag values are not allowed.

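As an illustration of tag 258, the following minimal Python sketch encodes a
set of integers as the tag followed by a definite-length array. The helper
names ``encode_head``, ``encode_int`` and ``encode_int_set`` are hypothetical
and are not part of Mercurial's API. Sorting the members is an assumption made
here only so that the output is deterministic.

```python
import struct


def encode_head(major, value):
    """Return the initial byte(s) for ``major`` carrying unsigned ``value``."""
    if value < 24:
        return struct.pack(">B", major << 5 | value)
    # Larger arguments use additional information 24..27 followed by a
    # 1, 2, 4 or 8 byte big-endian integer.
    for info, fmt, limit in (
        (24, ">BB", 1 << 8),
        (25, ">BH", 1 << 16),
        (26, ">BL", 1 << 32),
        (27, ">BQ", 1 << 64),
    ):
        if value < limit:
            return struct.pack(fmt, major << 5 | info, value)
    raise ValueError("value does not fit in 64 bits")


def encode_int(value):
    """Encode an integer as major type 0 (unsigned) or 1 (negative)."""
    if value >= 0:
        return encode_head(0, value)
    # Major type 1 stores -1 - n, so -1 is encoded with argument 0.
    return encode_head(1, -1 - value)


def encode_int_set(values):
    """Encode a set of integers as tag 258 wrapping a major type 4 array."""
    items = sorted(values)  # assumption: sorted only for stable output
    body = b"".join(encode_int(item) for item in items)
    return encode_head(6, 258) + encode_head(4, len(items)) + body
```

With these helpers, ``encode_int_set({3, 1, 2})`` produces the tag head
``d9 01 02``, the array head ``83`` and the three members ``01 02 03``.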
Major type 7 (simple data types) can be used with the following
type values:

20
   False
21
   True
22
   Null
31
   Break stop code (for indefinite-length items).

All other simple data type values (including every value requiring the
1 byte extension) are disallowed.

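The limitations in this section can be partly checked from the initial byte
of an encoded item alone. The sketch below is one possible way a decoder
might reject unsupported items early. The name ``initial_byte_allowed`` is
hypothetical, and the check is deliberately incomplete: the tag number of a
major type 6 item and the nesting rules for indefinite-length byte strings
still have to be verified separately.

```python
ALLOWED_SIMPLE = {20, 21, 22, 31}  # false, true, null, break stop code


def initial_byte_allowed(byte):
    """Return True if ``byte`` may start an item in the subset above."""
    # The top 3 bits hold the major type, the low 5 bits the additional
    # information.
    major, info = byte >> 5, byte & 0x1F
    if major == 7:
        return info in ALLOWED_SIMPLE
    if major == 3:
        return False  # text strings are not supported
    if info <= 27:
        # Integer argument or definite length. For major type 6 the tag
        # number itself must still be compared against 258 by the caller.
        return True
    # Additional information 31 marks an indefinite-length item, which is
    # only permitted for byte strings. Values 28..30 are reserved.
    return major == 2 and info == 31
```

For example, ``0x5f`` (start of an indefinite-length byte string) is accepted,
while ``0x9f`` (start of an indefinite-length array) and ``0x60`` (empty text
string) are rejected.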
Indefinite-Length Byte Strings
==============================

Indefinite-length byte strings (major type 2) are allowed. However,
they MUST NOT occur inside a container type (such as an array or map).
That is, they can only occur as the "top-most" element in a stream of
values.

Encoders and decoders SHOULD *stream* indefinite-length byte strings.
That is, an encoder or decoder SHOULD NOT buffer the entirety of a long
byte string value when indefinite-length byte strings are being used
if it can be avoided. Mercurial MAY use extremely long indefinite-length
byte strings, and buffering the source or destination value COULD lead to
memory exhaustion.

Chunks in an indefinite-length byte string SHOULD NOT exceed 2^20
bytes (1 MiB).

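The streaming recommendation can be sketched as a Python generator that reads
from a file object and emits one chunk at a time, so that at most one chunk
is held in memory. The names ``bytes_head``, ``stream_indefinite_bytes`` and
``MAX_CHUNK`` are hypothetical and only illustrate the idea. Treating a
chunk size above 2^20 as an error is an assumption of this sketch, since the
text above only says SHOULD NOT.

```python
import io
import struct

MAX_CHUNK = 2 ** 20  # recommended upper bound on chunk size, in bytes


def bytes_head(length):
    """Return the head of a definite-length byte string (major type 2)."""
    if length < 24:
        return struct.pack(">B", 0x40 | length)
    if length < 1 << 8:
        return struct.pack(">BB", 0x58, length)
    if length < 1 << 16:
        return struct.pack(">BH", 0x59, length)
    # Chunks never exceed 2^20 bytes here, so a 4 byte length suffices.
    return struct.pack(">BL", 0x5A, length)


def stream_indefinite_bytes(fh, chunk_size=MAX_CHUNK):
    """Yield the encoding of ``fh``'s content as an indefinite-length string."""
    if not 0 < chunk_size <= MAX_CHUNK:
        raise ValueError("chunk_size must be between 1 and 2^20")
    yield b"\x5f"  # major type 2, additional information 31
    while True:
        chunk = fh.read(chunk_size)
        if not chunk:
            break
        # Each chunk is itself a definite-length byte string.
        yield bytes_head(len(chunk))
        yield chunk
    yield b"\xff"  # break stop code terminates the string


if __name__ == "__main__":
    source = io.BytesIO(b"hello world")
    print(b"".join(stream_indefinite_bytes(source, chunk_size=4)).hex())
```

A caller would normally write each yielded piece straight to its output
rather than joining the pieces as the demonstration at the bottom does.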
Container Types
===============

Mercurial may use the array (major type 4), map (major type 5), and
set (semantic tag 258 plus major type 4 array) container types.

An array may contain any supported type as values.

A map MUST only use the following types as keys:

* unsigned integers (major type 0)
* negative integers (major type 1)
* byte strings (major type 2) (but not indefinite-length byte strings)
* false (simple type 20)
* true (simple type 21)
* null (simple type 22)

A map MUST only use the following types as values:

* all types supported as map keys
* arrays
* maps
* sets

A set may only use the following types as values:

* all types supported as map keys

It is recommended that keys in maps and values in sets and arrays all
be of a uniform type.

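The container rules above can be expressed as a small recursive check over
Python values before they are handed to an encoder. This is a sketch under
stated assumptions: ``is_key_type`` and ``check_value`` are hypothetical
names, Python ``list``, ``dict`` and ``set``/``frozenset`` stand in for CBOR
arrays, maps and sets, and integer range limits are not checked.

```python
def is_key_type(value):
    """Return True for the types allowed as map keys and set members."""
    # bool is a subclass of int, so True and False are covered by int.
    return value is None or isinstance(value, (int, bytes))


def check_value(value):
    """Raise TypeError if ``value`` falls outside the container rules."""
    if is_key_type(value):
        return
    if isinstance(value, list):
        # Arrays may hold any supported type, including nested containers.
        for item in value:
            check_value(item)
    elif isinstance(value, dict):
        for key, item in value.items():
            if not is_key_type(key):
                raise TypeError("unsupported map key: %r" % (key,))
            check_value(item)
    elif isinstance(value, (set, frozenset)):
        # Set members are restricted to the map key types.
        for item in value:
            if not is_key_type(item):
                raise TypeError("unsupported set member: %r" % (item,))
    else:
        raise TypeError("unsupported type: %s" % type(value).__name__)
```

A value such as ``{b"a": [1, {b"b": {1, 2}}]}`` passes the check, while a
``str`` key or a ``float`` value raises ``TypeError``.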
Avoiding Large Byte Strings
===========================

The use of large byte strings is discouraged, especially in scenarios where
the total size of the byte string may be unbounded for some inputs (e.g. when
representing the content of a tracked file). It is highly recommended to use
indefinite-length byte strings for these purposes.

Since indefinite-length byte strings cannot be nested within an outer
container (such as an array or map), to associate a large byte string
with another data structure, it is recommended to use an array or
map followed immediately by an indefinite-length byte string. For example,
instead of the following map::

   {
      "key1": "value1",
      "key2": "value2",
      "long_value": "some very large value...",
   }

Use a map followed by a byte string::

   {
      "key1": "value1",
      "key2": "value2",
      "value_follows": True,
   }
   <BEGIN INDEFINITE-LENGTH BYTE STRING>
      "some very large value"
      "..."
   <END INDEFINITE-LENGTH BYTE STRING>
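
The pattern above could be produced by an encoder along the following lines.
This is a minimal sketch rather than Mercurial's implementation: the function
names are hypothetical, keys and values are written as byte strings because
text strings are not supported, and the caller is assumed to supply chunks of
at most 2^20 bytes.

```python
import struct


def head(major, length):
    """Return a CBOR head for ``major`` with a length below 2^32."""
    if length < 24:
        return struct.pack(">B", major << 5 | length)
    if length < 1 << 8:
        return struct.pack(">BB", major << 5 | 24, length)
    if length < 1 << 16:
        return struct.pack(">BH", major << 5 | 25, length)
    return struct.pack(">BL", major << 5 | 26, length)


def encode_small_map(mapping):
    """Encode a dict of bytes keys mapped to bytes or bool values."""
    parts = [head(5, len(mapping))]
    for key, value in mapping.items():
        parts.append(head(2, len(key)) + key)
        if isinstance(value, bool):
            parts.append(b"\xf5" if value else b"\xf4")
        else:
            parts.append(head(2, len(value)) + value)
    return b"".join(parts)


def encode_with_trailing_value(meta, chunks):
    """Yield a small map, then the large value as a top-level byte string."""
    meta = dict(meta)
    meta[b"value_follows"] = True  # tells the reader a byte string follows
    yield encode_small_map(meta)
    yield b"\x5f"  # begin indefinite-length byte string
    for chunk in chunks:
        yield head(2, len(chunk))
        yield chunk
    yield b"\xff"  # end indefinite-length byte string
```

A decoder reading this stream would first decode the map, see that
``value_follows`` is true, and then consume the chunks of the following byte
string one at a time.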