annotate mercurial/helptext/internals/cbor.txt @ 52032:09a54892b7ee

mergestate: reduce the number of attribute lookups This code is called a lot during updates, this is a very small but also very easy thing to do.
author Raphaël Gomès <rgomes@octobus.net>
date Wed, 21 Aug 2024 09:48:14 +0200
parents 2e017696181f
children
rev   line source
39409
2fe21c65777e internals: document CBOR utilization
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
1 Mercurial uses Concise Binary Object Representation (CBOR)
2 (RFC 7049) for various data formats.
3
4 This document describes the subset of CBOR that Mercurial uses and
5 gives recommendations for appropriate use of CBOR within Mercurial.
6
7 Type Limitations
8 ================
9
10 Major types 0 and 1 (unsigned integers and negative integers) MUST be
11 fully supported.
12
13 Major type 2 (byte strings) MUST be fully supported. However, there
14 are limitations around the use of indefinite-length byte strings.
15 (See below.)
16
17 Major type 3 (text strings) is NOT supported.
18
19 Major type 4 (arrays) MUST be supported. However, values are limited
20 to the set of types described in the "Container Types" section below.
21 Indefinite-length arrays are NOT supported.
22
23 Major type 5 (maps) MUST be supported. However, keys and values are
24 limited to the sets of types described in the "Container Types" section
25 below. Indefinite-length maps are NOT supported.
26
27 Major type 6 (semantic tagging of major types) can be used with the
28 following semantic tag values:
29
30 258
31    Mathematical finite set. Suitable for representing Python's
32    ``set`` type.
33
34 All other semantic tag values are not allowed.
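As an illustration of the tag 258 wrapping described above, the following is a minimal Python sketch that encodes a set of byte strings. The helper names (``encode_head``, ``encode_bytes_set``) are hypothetical and are not part of Mercurial. Sorting the members is an assumption made only to obtain deterministic output; it is not required by this document.

```python
def encode_head(major, value):
    """Encode a CBOR item head: a major type plus an unsigned argument."""
    if value < 24:
        # Small arguments are stored directly in the initial byte.
        return bytes([(major << 5) | value])
    for info, size in ((24, 1), (25, 2), (26, 4), (27, 8)):
        if value < 1 << (8 * size):
            return bytes([(major << 5) | info]) + value.to_bytes(size, "big")
    raise ValueError("argument does not fit in 64 bits")


def encode_bytes_set(items):
    """Encode a set of byte strings as tag 258 wrapping a definite-length array."""
    out = [encode_head(6, 258), encode_head(4, len(items))]
    # Sorting is an assumption of this sketch, used for reproducible output.
    for item in sorted(items):
        out.append(encode_head(2, len(item)) + item)
    return b"".join(out)
```

Tag 258 does not fit in the initial byte, so its head occupies three bytes (``d9 01 02``) before the array head.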
35
36 Major type 7 (simple data types) can be used with the following
37 type values:
38
39 20
40    False
41 21
42    True
43 22
44    Null
45 31
46    Break stop code (for indefinite-length items).
47
48 All other simple data type values (including every value requiring the
49 1-byte extension) are disallowed.
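The type limitations above can be summarized as a check on the head of each encoded item. The following Python sketch is one possible reading of those rules, not Mercurial's decoder; ``check_head`` and the two constant names are hypothetical. It inspects only the head of a single item and does not validate container contents.

```python
ALLOWED_SIMPLE = {20, 21, 22, 31}  # false, true, null, break
ALLOWED_TAGS = {258}               # mathematical finite set


def check_head(data):
    """Return (major type, argument) for the item head at the start of data.

    Raises ValueError if the head falls outside the subset described above.
    The argument is None for an indefinite-length byte string.
    """
    if not data:
        raise ValueError("empty input")
    major, info = data[0] >> 5, data[0] & 0x1F
    if major == 7:
        # Only a few simple values are allowed; floats and the
        # 1-byte extension (info 24) are rejected here.
        if info not in ALLOWED_SIMPLE:
            raise ValueError("simple value %d is not allowed" % info)
        return major, info
    if info == 31:
        if major != 2:
            raise ValueError("indefinite length is only allowed for byte strings")
        return major, None
    if info < 24:
        value = info
    elif info <= 27:
        size = 1 << (info - 24)  # 1, 2, 4 or 8 argument bytes
        if len(data) < 1 + size:
            raise ValueError("truncated head")
        value = int.from_bytes(data[1:1 + size], "big")
    else:
        raise ValueError("reserved additional information %d" % info)
    if major == 3:
        raise ValueError("text strings are not supported")
    if major == 6 and value not in ALLOWED_TAGS:
        raise ValueError("semantic tag %d is not allowed" % value)
    return major, value
```

For example, ``check_head(b"\xd9\x01\x02")`` accepts the tag 258 head, while ``check_head(b"\x9f")`` (an indefinite-length array) raises ``ValueError``.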
50
51 Indefinite-Length Byte Strings
52 ==============================
53
54 Indefinite-length byte strings (major type 2) are allowed. However,
55 they MUST NOT occur inside a container type (such as an array or map).
56 That is, they can only occur as the "top-most" element in a stream of
57 values.
58
59 Encoders and decoders SHOULD *stream* indefinite-length byte strings.
60 That is, an encoder or decoder SHOULD NOT buffer the entirety of a long
61 byte string value when indefinite-length byte strings are being used
62 if it can be avoided. Mercurial MAY use extremely long indefinite-length
63 byte strings, and buffering the source or destination value could lead to
64 memory exhaustion.
65
66 Chunks in an indefinite-length byte string SHOULD NOT exceed 2^20
67 bytes.
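The streaming and chunk-size recommendations above might be implemented along the following lines. This is a minimal Python sketch written independently of Mercurial's own CBOR code; ``stream_bytestring``, ``encode_head`` and ``MAX_CHUNK`` are hypothetical names, and ``source`` is assumed to be any binary file-like object with a ``read(n)`` method.

```python
import io

MAX_CHUNK = 2 ** 20  # recommended upper bound on chunk size, in bytes


def encode_head(major, value):
    """Encode a CBOR item head: a major type plus an unsigned argument."""
    if value < 24:
        return bytes([(major << 5) | value])
    for info, size in ((24, 1), (25, 2), (26, 4), (27, 8)):
        if value < 1 << (8 * size):
            return bytes([(major << 5) | info]) + value.to_bytes(size, "big")
    raise ValueError("argument does not fit in 64 bits")


def stream_bytestring(source, chunk_size=MAX_CHUNK):
    """Yield an indefinite-length byte string piece by piece.

    Only one chunk is held in memory at a time, so the whole value is
    never buffered.
    """
    if not 0 < chunk_size <= MAX_CHUNK:
        raise ValueError("chunk_size must be between 1 and 2**20")
    yield b"\x5f"  # major type 2, indefinite length
    while True:
        chunk = source.read(chunk_size)
        if not chunk:
            break
        yield encode_head(2, len(chunk))  # each chunk is a definite-length byte string
        yield chunk
    yield b"\xff"  # break stop code


# Tiny demonstration with an in-memory source and an artificially small chunk size.
encoded = b"".join(stream_bytestring(io.BytesIO(b"abcdef"), chunk_size=4))
```

In real use the generator's output would be written to a file or socket as it is produced rather than joined in memory.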
68
69 Container Types
70 ===============
71
72 Mercurial may use the array (major type 4), map (major type 5), and
73 set (semantic tag 258 plus major type 4 array) container types.
74
75 An array may contain any supported type as values.
76
77 A map MUST only use the following types as keys:
78
79 * unsigned integers (major type 0)
80 * negative integers (major type 1)
81 * byte strings (major type 2) (but not indefinite-length byte strings)
82 * false (simple type 20)
83 * true (simple type 21)
84 * null (simple type 22)
85
86 A map MUST only use the following types as values:
87
88 * all types supported as map keys
89 * arrays
90 * maps
91 * sets
92
93 A set may only use the following types as values:
94
95 * all types supported as map keys
96
97 It is recommended that keys in maps and values in sets and arrays all
98 be of a uniform type.
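The container rules above can be expressed as a recursive check over Python values before they are encoded. The sketch below assumes a mapping of ``list`` to array, ``dict`` to map, ``set`` or ``frozenset`` to set, and ``bytes`` to definite-length byte strings; that mapping and the names ``is_key_type`` and ``check_value`` are illustrative assumptions, not Mercurial's API.

```python
INT_MIN = -(2 ** 64)     # smallest value encodable as major type 1
INT_MAX = 2 ** 64 - 1    # largest value encodable as major type 0


def is_key_type(value):
    """Return True for types usable as map keys and set members."""
    if value is None or isinstance(value, (bool, bytes)):
        return True
    return isinstance(value, int) and INT_MIN <= value <= INT_MAX


def check_value(value):
    """Raise TypeError if value cannot be encoded under the rules above."""
    if is_key_type(value):
        return
    if isinstance(value, list):
        # Arrays may hold any supported type.
        for item in value:
            check_value(item)
    elif isinstance(value, dict):
        for key, item in value.items():
            if not is_key_type(key):
                raise TypeError("unsupported map key: %r" % (key,))
            check_value(item)
    elif isinstance(value, (set, frozenset)):
        # Sets may only hold the types allowed as map keys.
        for item in value:
            if not is_key_type(item):
                raise TypeError("unsupported set member: %r" % (item,))
    else:
        raise TypeError("unsupported type: %s" % type(value).__name__)
```

Text strings (``str``) and floats are rejected because they fall outside the supported subset. The uniform-type recommendation is not enforced by this sketch.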
99
100 Avoiding Large Byte Strings
101 ===========================
102
103 The use of large byte strings is discouraged, especially in scenarios where
104 the total size of the byte string may be unbounded for some inputs (e.g. when
105 representing the content of a tracked file). It is highly recommended to use
106 indefinite-length byte strings for these purposes.
107
108 Since indefinite-length byte strings cannot be nested within an outer
109 container (such as an array or map), to associate a large byte string
110 with another data structure, it is recommended to use an array or
111 map followed immediately by an indefinite-length byte string. For example,
112 instead of the following map::
113
114    {
115       "key1": "value1",
116       "key2": "value2",
117       "long_value": "some very large value...",
118    }
119
120 Use a map followed by a byte string::
121
122    {
123       "key1": "value1",
124       "key2": "value2",
125       "value_follows": True,
126    }
127    <BEGIN INDEFINITE-LENGTH BYTE STRING>
128    "some very large value"
129    "..."
130    <END INDEFINITE-LENGTH BYTE STRING>
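The "map followed by a byte string" layout can be sketched in Python as follows. This is an illustrative encoder only: ``encode_head``, ``encode_small_map`` and ``encode_with_trailing_value`` are hypothetical names, map keys are assumed to be byte strings, and map values are assumed to be byte strings or booleans.

```python
import io


def encode_head(major, value):
    """Encode a CBOR item head: a major type plus an unsigned argument."""
    if value < 24:
        return bytes([(major << 5) | value])
    for info, size in ((24, 1), (25, 2), (26, 4), (27, 8)):
        if value < 1 << (8 * size):
            return bytes([(major << 5) | info]) + value.to_bytes(size, "big")
    raise ValueError("argument does not fit in 64 bits")


def encode_small_map(mapping):
    """Encode a definite-length map of bytes keys to bytes or bool values."""
    out = [encode_head(5, len(mapping))]
    for key, value in mapping.items():
        out.append(encode_head(2, len(key)) + key)
        if isinstance(value, bool):
            out.append(b"\xf5" if value else b"\xf4")  # simple values 21 and 20
        else:
            out.append(encode_head(2, len(value)) + value)
    return b"".join(out)


def encode_with_trailing_value(mapping, source, chunk_size=2 ** 20):
    """Yield the small map, then the large value as a top-level
    indefinite-length byte string read from source in chunks."""
    yield encode_small_map(mapping)
    yield b"\x5f"  # begin indefinite-length byte string
    for chunk in iter(lambda: source.read(chunk_size), b""):
        yield encode_head(2, len(chunk))
        yield chunk
    yield b"\xff"  # break stop code


# Demonstration with an artificially small chunk size.
metadata = {b"key1": b"value1", b"value_follows": True}
stream = b"".join(
    encode_with_trailing_value(
        metadata, io.BytesIO(b"some very large value"), chunk_size=16
    )
)
```

A decoder reading this stream would decode the map first and, on seeing ``value_follows`` set to true, treat the next top-level item as the associated value.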