comparison hgext/clonebundles.py @ 26762:26f622859288
clonebundles: rewrite documentation
There are a lot of considerations server operators need to know before
deploying clone bundles. They should be documented. So I rewrote the
extension docs to contain this information.
author | Gregory Szorc <gregory.szorc@gmail.com> |
---|---|
date | Sat, 17 Oct 2015 11:23:54 -0700 |
parents | 23c0da28c034 |
children | e5a1df51bb25 |
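
The manifest format documented in this change (one bundle per line: a URL followed by optional, space-delimited, URI-encoded key=value pairs) can be illustrated with a small parser. This is a minimal sketch in modern Python 3, not Mercurial's own implementation. The function names `parse_manifest` and `sort_by_preference`, the `cdn.example.com` URLs, and the simplified "first matching preference wins" ordering are all assumptions made for illustration.

```python
from urllib.parse import unquote


def parse_manifest(text):
    """Parse manifest text into a list of (url, attrs) tuples.

    Hypothetical helper: follows the documented line format only.
    """
    entries = []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        url, attrs = fields[0], {}
        for pair in fields[1:]:
            # Split on the first "=" only; keys and values are URI encoded.
            key, _, value = pair.partition("=")
            attrs[unquote(key)] = unquote(value)
        entries.append((url, attrs))
    return entries


def sort_by_preference(entries, prefers):
    """Stable-sort entries so those matching earlier preferences come first.

    Simplified assumption: an entry is ranked by the first (key, value)
    preference it matches; entries matching none keep manifest order last.
    """
    def rank(entry):
        _url, attrs = entry
        for i, (key, value) in enumerate(prefers):
            if attrs.get(key) == value:
                return i
        return len(prefers)

    return sorted(entries, key=rank)


# Example manifest content; the URLs are placeholders.
MANIFEST = (
    "https://cdn.example.com/full.bz2.hg BUNDLESPEC=bzip2-v1\n"
    "https://cdn.example.com/full.gz.hg BUNDLESPEC=gzip-v1 REQUIRESNI=true\n"
    "https://cdn.example.com/packed.hg "
    "BUNDLESPEC=none-packed1;requirements%3Drevlogv1\n"
)
```

Calling `sort_by_preference(parse_manifest(MANIFEST), [("BUNDLESPEC", "gzip-v1"), ("BUNDLESPEC", "bzip2-v1")])` would place the gzip entry first, mirroring the intent of the `experimental.clonebundleprefers` option described in the docstring.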
comparison
26761:8270ee357dd9 | 26762:26f622859288 |
---|---|
1 # This software may be used and distributed according to the terms of the | 1 # This software may be used and distributed according to the terms of the |
2 # GNU General Public License version 2 or any later version. | 2 # GNU General Public License version 2 or any later version. |
3 | 3 |
4 """server side extension to advertise pre-generated bundles to seed clones. | 4 """advertise pre-generated bundles to seed clones (experimental) |
5 | 5 |
6 The extension essentially serves the content of a .hg/clonebundles.manifest | 6 "clonebundles" is a server-side extension used to advertise the existence |
7 file to clients that request it. | 7 of pre-generated, externally hosted bundle files to clients that are |
8 | 8 cloning so that cloning can be faster, more reliable, and require fewer |
9 The clonebundles.manifest file contains a list of URLs and attributes. URLs | 9 resources on the server. |
10 hold pre-generated bundles that a client fetches and applies. After applying | 10 |
11 the pre-generated bundle, the client will connect back to the original server | 11 Cloning can be a CPU and I/O intensive operation on servers. Traditionally, |
12 and pull data not in the pre-generated bundle. | 12 the server, in response to a client's request to clone, dynamically generates |
13 | 13 a bundle containing the entire repository content and sends it to the client. |
14 Manifest File Format: | 14 There is no caching on the server and the server will have to redundantly |
15 generate the same outgoing bundle in response to each clone request. For | |
16 servers with large repositories or with high clone volume, the load from | |
17 clones can make scaling the server challenging and costly. | |
18 | |
19 This extension provides server operators the ability to offload potentially | |
20 expensive clone load to an external service. Here's how it works. | |
21 | |
22 1. A server operator establishes a mechanism for making bundle files available | |
23 on a hosting service where Mercurial clients can fetch them. | |
24 2. A manifest file listing available bundle URLs and some optional metadata | |
25 is added to the Mercurial repository on the server. | |
26 3. A client initiates a clone against a clone bundles aware server. | |
27 4. The client sees the server is advertising clone bundles and fetches the | |
28 manifest listing available bundles. | |
29 5. The client filters and sorts the available bundles based on what it | |
30 supports and prefers. | |
31 6. The client downloads and applies an available bundle from the | |
32 server-specified URL. | |
33 7. The client reconnects to the original server and performs the equivalent | |
34 of :hg:`pull` to retrieve all repository data not in the bundle. (The | |
35 repository could have been updated between when the bundle was created | |
36 and when the client started the clone.) | |
37 | |
38 Instead of the server generating full repository bundles for every clone | |
39 request, it generates full bundles once and they are subsequently reused to | |
40 bootstrap new clones. The server may still transfer data at clone time. | |
41 However, this is only data that has been added/changed since the bundle was | |
42 created. For large, established repositories, this can reduce server load for | |
43 clones to less than 1% of original. | |
44 | |
45 To work, this extension requires the following of server operators: | |
46 | |
47 * Generating bundle files of repository content (typically periodically, | |
48 such as once per day). | |
49 * A file server that clients have network access to and that Python knows | |
50 how to talk to through its normal URL handling facility (typically an | |
51 HTTP server). | |
52 * A process for keeping the bundles manifest in sync with available bundle | |
53 files. | |
54 | |
55 Strictly speaking, using a static file hosting server isn't required: a server | |
56 operator could use a dynamic service for retrieving bundle data. However, | |
57 static file hosting services are simple and scalable and should be sufficient | |
58 for most needs. | |
59 | |
60 Bundle files can be generated with the :hg:`bundle` command. Typically | |
61 :hg:`bundle --all` is used to produce a bundle of the entire repository. | |
62 | |
63 :hg:`debugcreatestreamclonebundle` can be used to produce a special | |
64 *streaming clone bundle*. These are bundle files that are extremely efficient | |
65 to produce and consume (read: fast). However, they are larger than | |
66 traditional bundle formats and require that clients support the exact set | |
67 of repository data store formats in use by the repository that created them. | |
68 Typically, a newer server can serve data that is compatible with older clients. | |
69 However, *streaming clone bundles* don't have this guarantee. **Server | |
70 operators need to be aware that newer versions of Mercurial may produce | |
71 streaming clone bundles incompatible with older Mercurial versions.** | |
72 | |
73 The list of requirements printed by :hg:`debugcreatestreamclonebundle` should | |
74 be specified in the ``requirements`` parameter of the *bundle specification | |
75 string* for the ``BUNDLESPEC`` manifest property described below. e.g. | |
76 ``BUNDLESPEC=none-packed1;requirements%3Drevlogv1``. | |
77 | |
78 A server operator is responsible for creating a ``.hg/clonebundles.manifest`` | |
79 file containing the list of available bundle files suitable for seeding | |
80 clones. If this file does not exist, the repository will not advertise the | |
81 existence of clone bundles when clients connect. | |
15 | 82 |
16 The manifest file contains a newline (\n) delimited list of entries. | 83 The manifest file contains a newline (\n) delimited list of entries. |
17 | 84 |
18 Each line in this file defines an available bundle. Lines have the format: | 85 Each line in this file defines an available bundle. Lines have the format: |
19 | 86 |
20 <URL> [<key>=<value] | 87 <URL> [<key>=<value>[ <key>=<value>]] |
21 | 88 |
22 That is, a URL followed by extra metadata describing it. Metadata keys and | 89 That is, a URL followed by an optional, space-delimited list of key=value |
23 values should be URL encoded. | 90 pairs describing additional properties of this bundle. Both keys and values |
24 | 91 are URI encoded. |
25 This metadata is optional. It is up to server operators to populate this | 92 |
26 metadata. | 93 Keys in UPPERCASE are reserved for use by Mercurial and are defined below. |
27 | 94 All non-uppercase keys can be used by site installations. An example use |
28 Keys in UPPERCASE are reserved for use by Mercurial. All non-uppercase keys | 95 for custom properties is to use the *datacenter* attribute to define which |
29 can be used by site installations. | 96 data center a file is hosted in. Clients could then prefer a server in the |
30 | 97 data center closest to them. |
31 The server operator is responsible for generating the bundle manifest file. | 98 |
32 | 99 The following reserved keys are currently defined: |
33 Metadata Attributes: | |
34 | 100 |
35 BUNDLESPEC | 101 BUNDLESPEC |
36 A "bundle specification" string that describes the type of the bundle. | 102 A "bundle specification" string that describes the type of the bundle. |
37 | 103 |
38 These are string values that are accepted by the "--type" argument of | 104 These are string values that are accepted by the "--type" argument of |
39 `hg bundle`. | 105 :hg:`bundle`. |
40 | 106 |
41 The values are parsed in strict mode, which means they must be of the | 107 The values are parsed in strict mode, which means they must be of the |
42 "<compression>-<type>" form. See | 108 "<compression>-<type>" form. See |
43 mercurial.exchange.parsebundlespec() for more details. | 109 mercurial.exchange.parsebundlespec() for more details. |
44 | 110 |
47 apply. | 113 apply. |
48 | 114 |
49 The actual value doesn't impact client behavior beyond filtering: | 115 The actual value doesn't impact client behavior beyond filtering: |
50 clients will still sniff the bundle type from the header of downloaded | 116 clients will still sniff the bundle type from the header of downloaded |
51 files. | 117 files. |
118 | |
119 **Use of this key is highly recommended**, as it allows clients to | |
120 easily skip unsupported bundles. | |
52 | 121 |
53 REQUIRESNI | 122 REQUIRESNI |
54 Whether Server Name Indication (SNI) is required to connect to the URL. | 123 Whether Server Name Indication (SNI) is required to connect to the URL. |
55 SNI allows servers to use multiple certificates on the same IP. It is | 124 SNI allows servers to use multiple certificates on the same IP. It is |
56 somewhat common in CDNs and other hosting providers. Older Python | 125 somewhat common in CDNs and other hosting providers. Older Python |
57 versions do not support SNI. Defining this attribute enables clients | 126 versions do not support SNI. Defining this attribute enables clients |
58 with older Python versions to filter this entry. | 127 with older Python versions to filter this entry without experiencing |
128 an opaque SSL failure at connection time. | |
59 | 129 |
60 If this is defined, it is important to advertise a non-SNI fallback | 130 If this is defined, it is important to advertise a non-SNI fallback |
61 URL or clients running old Python releases may not be able to clone | 131 URL or clients running old Python releases may not be able to clone |
62 with the clonebundles facility. | 132 with the clonebundles facility. |
63 | 133 |
64 Value should be "true". | 134 Value should be "true". |
135 | |
136 Manifests can contain multiple entries. Assuming metadata is defined, clients | |
137 will filter entries from the manifest that they don't support. The remaining | |
138 entries are optionally sorted by client preferences | |
139 (``experimental.clonebundleprefers`` config option). The client then attempts | |
140 to fetch the bundle at the first URL in the remaining list. | |
141 | |
142 **Errors when downloading a bundle will fail the entire clone operation: | |
143 clients do not automatically fall back to a traditional clone.** The reason | |
144 for this is that if a server is using clone bundles, it is probably doing so | |
145 because the feature is necessary to help it scale. In other words, there | |
146 is an assumption that clone load will be offloaded to another service and | |
147 that the Mercurial server isn't responsible for serving this clone load. | |
148 If that other service experiences issues and clients start mass falling back to | |
149 the original Mercurial server, the added clone load could overwhelm the server | |
150 due to unexpected load and effectively take it offline. Not having clients | |
151 automatically fall back to cloning from the original server mitigates this | |
152 scenario. | |
153 | |
154 Because there is no automatic Mercurial server fallback on failure of the | |
155 bundle hosting service, it is important for server operators to view the bundle | |
156 hosting service as an extension of the Mercurial server in terms of | |
157 availability and service level agreements: if the bundle hosting service goes | |
158 down, so does the ability for clients to clone. Note: clients will see a | |
159 message informing them how to bypass the clone bundles facility when a failure | |
160 occurs. So server operators should prepare for some people to follow these | |
161 instructions when a failure occurs, thus driving more load to the original | |
162 Mercurial server when the bundle hosting service fails. | |
163 | |
164 The following config options influence the behavior of the clone bundles | |
165 feature: | |
166 | |
167 ui.clonebundleadvertise | |
168 Whether the server advertises the existence of the clone bundles feature | |
169 to compatible clients that aren't using it. | |
170 | |
171 When this is enabled (the default), a server will send a message to | |
172 compatible clients performing a traditional clone informing them of the | |
173 available clone bundles feature. Compatible clients are those that support | |
174 bundle2 and are advertising support for the clone bundles feature. | |
175 | |
176 ui.clonebundlefallback | |
177 Whether to automatically fall back to a traditional clone in case of | |
178 clone bundles failure. Defaults to false for reasons described above. | |
179 | |
180 experimental.clonebundles | |
181 Whether the clone bundles feature is enabled on clients. Defaults to true. | |
182 | |
183 experimental.clonebundleprefers | |
184 List of "key=value" properties the client prefers in bundles. Downloaded | |
185 bundle manifests will be sorted by the preferences in this list. e.g. | |
186 the value "BUNDLESPEC=gzip-v1, BUNDLESPEC=bzip2-v1" will prefer a gzipped | |
187 version 1 bundle type then bzip2 version 1 bundle type. | |
188 | |
189 If not defined, the order in the manifest will be used and the first | |
190 available bundle will be downloaded. | |
65 """ | 191 """ |
66 | 192 |
67 from mercurial.i18n import _ | 193 from mercurial.i18n import _ |
68 from mercurial.node import nullid | 194 from mercurial.node import nullid |
69 from mercurial import ( | 195 from mercurial import ( |