annotate mercurial/helptext/internals/cbor.txt @ 52032:09a54892b7ee
mergestate: reduce the number of attribute lookups
This code is called a lot during updates, this is a very small but also very
easy thing to do.
author | Raphaël Gomès <rgomes@octobus.net> |
---|---|
date | Wed, 21 Aug 2024 09:48:14 +0200 |
parents | 2e017696181f |
Every line below was last changed in 39409:2fe21c65777e, "internals: document CBOR utilization", by Gregory Szorc <gregory.szorc@gmail.com>.

Mercurial uses Concise Binary Object Representation (CBOR)
(RFC 7049) for various data formats.

This document describes the subset of CBOR that Mercurial uses and
gives recommendations for appropriate use of CBOR within Mercurial.

Type Limitations
================

Major types 0 and 1 (unsigned integers and negative integers) MUST be
fully supported.

Major type 2 (byte strings) MUST be fully supported. However, there
are limitations around the use of indefinite-length byte strings.
(See below.)

Major type 3 (text strings) is NOT supported.

Major type 4 (arrays) MUST be supported. However, values are limited
to the set of types described in the "Container Types" section below.
Indefinite-length arrays are NOT supported.

Major type 5 (maps) MUST be supported. However, keys and values are limited
to the set of types described in the "Container Types" section below.
Indefinite-length maps are NOT supported.

Major type 6 (semantic tagging of major types) can be used with the
following semantic tag values:

258
   Mathematical finite set. Suitable for representing Python's
   ``set`` type.

All other semantic tag values are not allowed.
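
To make the tag 258 layout concrete, here is a minimal sketch of how a set
of byte strings could be serialized: the tag header, then a definite-length
array, then the members. The helper names ``encode_head`` and
``encode_bytes_set`` are hypothetical and are not Mercurial's own API, and
sorting the members is an assumption made only to get reproducible output.

```python
import struct


def encode_head(major, value):
    """Encode a CBOR initial byte plus its argument (hypothetical helper)."""
    if value < 24:
        return bytes([(major << 5) | value])
    if value < 2 ** 8:
        return bytes([(major << 5) | 24, value])
    if value < 2 ** 16:
        return bytes([(major << 5) | 25]) + struct.pack(">H", value)
    if value < 2 ** 32:
        return bytes([(major << 5) | 26]) + struct.pack(">I", value)
    return bytes([(major << 5) | 27]) + struct.pack(">Q", value)


def encode_bytes_set(items):
    """Encode a set of byte strings as tag 258 wrapping an array."""
    # Sorting is not required by CBOR; it only makes the output stable.
    members = sorted(items)
    out = encode_head(6, 258)            # major type 6: semantic tag 258
    out += encode_head(4, len(members))  # major type 4: definite-length array
    for member in members:
        out += encode_head(2, len(member)) + member  # major type 2: bytes
    return out


print(encode_bytes_set({b"a", b"b"}).hex())
```

The tag itself occupies three bytes (``d9 01 02``) because 258 does not fit
in a one-byte argument.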

Major type 7 (simple data types) can be used with the following
type values:

20
   False
21
   True
22
   Null
31
   Break stop code (for indefinite-length items).

All other simple data type values (including every value requiring the
1 byte extension) are disallowed.
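
The restrictions in this section can largely be checked from the first byte
of each encoded item. The sketch below is one possible reading of the rules,
not Mercurial's decoder: ``classify_initial_byte`` is a hypothetical name,
and the check that the only tag value is 258 is left out because it needs
the bytes that follow the initial byte. Floating point values share major
type 7 with the simple values, so under this reading they are rejected too.

```python
ALLOWED_SIMPLE = {20, 21, 22}  # false, true, null
BREAK = 31                     # stop code for indefinite-length items


def classify_initial_byte(byte):
    """Return (major, info) if the initial byte fits the subset above."""
    major, info = byte >> 5, byte & 0x1F
    if info in (28, 29, 30):
        raise ValueError("reserved additional information value")
    if major == 3:
        raise ValueError("text strings (major type 3) are not supported")
    if major in (4, 5) and info == 31:
        raise ValueError("indefinite-length arrays and maps are not supported")
    if major in (0, 1, 6) and info == 31:
        raise ValueError("integers and tags have no indefinite-length form")
    if major == 7 and info not in ALLOWED_SIMPLE and info != BREAK:
        raise ValueError("simple value %d is not allowed" % info)
    return major, info


print(classify_initial_byte(0xF5))  # true
print(classify_initial_byte(0x5F))  # start of an indefinite-length byte string
```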

Indefinite-Length Byte Strings
==============================

Indefinite-length byte strings (major type 2) are allowed. However,
they MUST NOT occur inside a container type (such as an array or map).
That is, they can only occur as the "top-most" element in a stream of
values.

Encoders and decoders SHOULD *stream* indefinite-length byte strings.
That is, an encoder or decoder SHOULD NOT buffer the entirety of a long
byte string value when indefinite-length byte strings are being used
if it can be avoided. Mercurial MAY use extremely long indefinite-length
byte strings, and buffering the source or destination value could lead to
memory exhaustion.

Chunks in an indefinite-length byte string SHOULD NOT exceed 2^20
bytes.
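
A streaming encoder along these lines can be sketched as a generator that
reads a bounded chunk at a time and never holds the whole value in memory.
This is an illustration under assumptions rather than Mercurial's own
encoder: the names ``stream_indefinite_bytes``, ``encode_bytes_header`` and
``MAX_CHUNK`` are hypothetical, and the source is assumed to be a binary
file-like object with a ``read(n)`` method.

```python
import io
import struct

MAX_CHUNK = 2 ** 20  # recommended upper bound for one chunk, in bytes


def encode_bytes_header(length):
    """Header for a definite-length byte string (major type 2)."""
    if length < 24:
        return bytes([0x40 | length])
    if length < 2 ** 8:
        return bytes([0x58, length])
    if length < 2 ** 16:
        return b"\x59" + struct.pack(">H", length)
    if length < 2 ** 32:
        return b"\x5a" + struct.pack(">I", length)
    return b"\x5b" + struct.pack(">Q", length)


def stream_indefinite_bytes(source, chunk_size=MAX_CHUNK):
    """Yield the encoding of ``source`` piece by piece."""
    if not 0 < chunk_size <= MAX_CHUNK:
        raise ValueError("chunk_size must be between 1 and 2^20")
    yield b"\x5f"  # begin indefinite-length byte string
    while True:
        chunk = source.read(chunk_size)
        if not chunk:
            break
        # Each chunk is an ordinary definite-length byte string.
        yield encode_bytes_header(len(chunk))
        yield chunk
    yield b"\xff"  # break stop code


for piece in stream_indefinite_bytes(io.BytesIO(b"hello world"), chunk_size=4):
    print(piece)
```

A caller would write each yielded piece to the output as it arrives, so peak
memory stays proportional to the chunk size rather than to the value.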

Container Types
===============

Mercurial may use the array (major type 4), map (major type 5), and
set (semantic tag 258 plus major type 4 array) container types.

An array may contain any supported type as values.

A map MUST only use the following types as keys:

* unsigned integers (major type 0)
* negative integers (major type 1)
* byte strings (major type 2) (but not indefinite-length byte strings)
* false (simple type 20)
* true (simple type 21)
* null (simple type 22)

A map MUST only use the following types as values:

* all types supported as map keys
* arrays
* maps
* sets

A set may only use the following types as values:

* all types supported as map keys

It is recommended that keys in maps and values in sets and arrays all
be of a uniform type.
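
The container rules above can be expressed as a small recursive check on
Python values before they are encoded. The sketch assumes a mapping from
Python types to CBOR types that this document does not spell out: ``int``
for major types 0 and 1, ``bytes`` for definite-length byte strings,
``list`` and ``tuple`` for arrays, ``dict`` for maps, and ``set`` and
``frozenset`` for tag 258. The function names are hypothetical.

```python
def is_valid_key(value):
    """True if ``value`` may be a map key or a set member."""
    if value is None or isinstance(value, (bool, bytes)):
        return True
    if isinstance(value, int):
        # Major types 0 and 1 together cover -2^64 .. 2^64 - 1.
        return -(2 ** 64) <= value < 2 ** 64
    return False


def is_valid_value(value):
    """True if ``value`` may appear as an array element or map value."""
    if is_valid_key(value):
        return True
    if isinstance(value, (list, tuple)):
        return all(is_valid_value(v) for v in value)
    if isinstance(value, dict):
        return all(
            is_valid_key(k) and is_valid_value(v) for k, v in value.items()
        )
    if isinstance(value, (set, frozenset)):
        return all(is_valid_key(v) for v in value)
    # str (text strings), float and everything else are outside the subset.
    return False


print(is_valid_value({b"key": [1, -2, None, {b"x"}]}))
print(is_valid_value({"key": 1}))
```

The check does not enforce the uniform-type recommendation, since that is
advice rather than a requirement.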

Avoiding Large Byte Strings
===========================

The use of large byte strings is discouraged, especially in scenarios where
the total size of the byte string may be unbounded for some inputs (e.g. when
representing the content of a tracked file). It is highly recommended to use
indefinite-length byte strings for these purposes.

Since indefinite-length byte strings cannot be nested within an outer
container (such as an array or map), to associate a large byte string
with another data structure, it is recommended to use an array or
map followed immediately by an indefinite-length byte string. For example,
instead of the following map::

   {
      "key1": "value1",
      "key2": "value2",
      "long_value": "some very large value...",
   }

Use a map followed by a byte string::

   {
      "key1": "value1",
      "key2": "value2",
      "value_follows": True,
   }
   <BEGIN INDEFINITE-LENGTH BYTE STRING>
   "some very large value"
   "..."
   <END INDEFINITE-LENGTH BYTE STRING>
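
On the wire, the recommended layout is two top-level items in a row: a small
definite-length map, then the indefinite-length byte string. The sketch below
shows one way to produce that sequence. It is an illustration only:
``encode_record`` and ``head`` are hypothetical names, keys and values are
``bytes`` because text strings are outside the supported subset, and the
``value_follows`` key is taken from the example above rather than from any
fixed Mercurial format.

```python
import struct


def head(major, value):
    """Encode a CBOR initial byte plus its argument (hypothetical helper)."""
    if value < 24:
        return bytes([(major << 5) | value])
    if value < 2 ** 8:
        return bytes([(major << 5) | 24, value])
    if value < 2 ** 16:
        return bytes([(major << 5) | 25]) + struct.pack(">H", value)
    if value < 2 ** 32:
        return bytes([(major << 5) | 26]) + struct.pack(">I", value)
    return bytes([(major << 5) | 27]) + struct.pack(">Q", value)


def encode_record(metadata, chunks):
    """Yield a map of bytes to bytes, then the chunks as one streamed value."""
    # Item 1: the map, with one extra entry announcing the value that follows.
    yield head(5, len(metadata) + 1)
    for key, value in metadata.items():
        yield head(2, len(key)) + key
        yield head(2, len(value)) + value
    yield head(2, len(b"value_follows")) + b"value_follows"
    yield b"\xf5"  # simple value 21: true
    # Item 2: the indefinite-length byte string, outside any container.
    yield b"\x5f"
    for chunk in chunks:
        if chunk:
            yield head(2, len(chunk)) + chunk
    yield b"\xff"  # break stop code


pieces = encode_record({b"key1": b"value1"}, [b"some very large value", b"..."])
print(b"".join(pieces).hex())
```

Joining the pieces is done here only to print them; a real producer would
write each piece out as it is generated so the large value is never buffered.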