comparison hgext/clonebundles.py @ 37498:aacfca6f9767
wireproto: support for pullbundles
Pullbundles are similar to clonebundles, but served as normal inline
bundle streams. They are almost transparent to the client -- the only
visible effect is that the client might get less changes than what it
asked for, i.e. not all requested head revisions are provided.
The client announces support for the necessary retries with the
partial-pull capability. After receiving a partial bundle, it updates
the set of revisions shared with the server and drops all now-known
heads from the request list. It will then rerun getbundle until
no changes are received or all remote heads are present.
Extend badserverext to support per-socket limit, i.e. don't assume that
the same limits should be applied to all sockets.
Differential Revision: https://phab.mercurial-scm.org/D1856
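The client-side retry behaviour described in the commit message above can be sketched as a small loop. This is a toy model and not Mercurial's actual implementation: revisions are plain strings, and `pull_with_partial_bundles`, `make_toy_server`, and the `getbundle(heads, common)` callable are hypothetical names introduced only for illustration.

```python
def pull_with_partial_bundles(wanted_heads, known, getbundle):
    """Call getbundle repeatedly until nothing new arrives or all heads are known.

    wanted_heads: set of head ids the client asked for.
    known: set of revisions shared with the server (updated in place).
    getbundle: hypothetical callable(heads, common) -> set of delivered revisions.
    Returns the number of getbundle round trips performed.
    """
    remaining = set(wanted_heads) - known
    rounds = 0
    while remaining:
        received = getbundle(remaining, known) - known
        rounds += 1
        if not received:
            # No changes received: stop retrying.
            break
        # Update the shared set and drop all now-known heads from the request.
        known |= received
        remaining -= known
    return rounds


def make_toy_server(chunks):
    """Build a fake getbundle that serves at most one pre-built chunk per call."""
    def getbundle(heads, common):
        for chunk in chunks:
            if not chunk <= common:  # the chunk holds something the client lacks
                return set(chunk)
        return set()
    return getbundle


# Assumed toy history split into three pre-generated bundles.
known = set()
server = make_toy_server([{"a", "b"}, {"c"}, {"d", "e"}])
rounds = pull_with_partial_bundles({"e"}, known, server)
print(rounds, sorted(known))
```

Each round delivers fewer revisions than requested until the requested head finally arrives, which is the "partial pull" effect the message refers to.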
author | Joerg Sonnenberger <joerg@bec.de> |
---|---|
date | Thu, 18 Jan 2018 12:54:01 +0100 |
parents | d25802b0eef5 |
children | b4d85bc122bd |
37497:1541e1a8e87d | 37498:aacfca6f9767 |
---|---|
4 """advertise pre-generated bundles to seed clones | 4 """advertise pre-generated bundles to seed clones |
5 | 5 |
6 "clonebundles" is a server-side extension used to advertise the existence | 6 "clonebundles" is a server-side extension used to advertise the existence |
7 of pre-generated, externally hosted bundle files to clients that are | 7 of pre-generated, externally hosted bundle files to clients that are |
8 cloning so that cloning can be faster, more reliable, and require less | 8 cloning so that cloning can be faster, more reliable, and require less |
9 resources on the server. | 9 resources on the server. "pullbundles" is a related feature for sending |
| | 10 pre-generated bundle files to clients as part of pull operations. |
10 | 11 |
11 Cloning can be a CPU and I/O intensive operation on servers. Traditionally, | 12 Cloning can be a CPU and I/O intensive operation on servers. Traditionally, |
12 the server, in response to a client's request to clone, dynamically generates | 13 the server, in response to a client's request to clone, dynamically generates |
13 a bundle containing the entire repository content and sends it to the client. | 14 a bundle containing the entire repository content and sends it to the client. |
14 There is no caching on the server and the server will have to redundantly | 15 There is no caching on the server and the server will have to redundantly |
15 generate the same outgoing bundle in response to each clone request. For | 16 generate the same outgoing bundle in response to each clone request. For |
16 servers with large repositories or with high clone volume, the load from | 17 servers with large repositories or with high clone volume, the load from |
17 clones can make scaling the server challenging and costly. | 18 clones can make scaling the server challenging and costly. |
18 | 19 |
19 This extension provides server operators the ability to offload potentially | 20 This extension provides server operators the ability to offload |
20 expensive clone load to an external service. Here's how it works. | 21 potentially expensive clone load to an external service. Pre-generated |
| | 22 bundles also allow using more CPU intensive compression, reducing the |
| | 23 effective bandwidth requirements. |
| | 24 |
| | 25 Here's how clone bundles work: |
21 | 26 |
22 1. A server operator establishes a mechanism for making bundle files available | 27 1. A server operator establishes a mechanism for making bundle files available |
23 on a hosting service where Mercurial clients can fetch them. | 28 on a hosting service where Mercurial clients can fetch them. |
24 2. A manifest file listing available bundle URLs and some optional metadata | 29 2. A manifest file listing available bundle URLs and some optional metadata |
25 is added to the Mercurial repository on the server. | 30 is added to the Mercurial repository on the server. |
31 6. The client downloads and applies an available bundle from the | 36 6. The client downloads and applies an available bundle from the |
32 server-specified URL. | 37 server-specified URL. |
33 7. The client reconnects to the original server and performs the equivalent | 38 7. The client reconnects to the original server and performs the equivalent |
34 of :hg:`pull` to retrieve all repository data not in the bundle. (The | 39 of :hg:`pull` to retrieve all repository data not in the bundle. (The |
35 repository could have been updated between when the bundle was created | 40 repository could have been updated between when the bundle was created |
36 and when the client started the clone.) | 41 and when the client started the clone.) This may use "pullbundles". |
37 | 42 |
38 Instead of the server generating full repository bundles for every clone | 43 Instead of the server generating full repository bundles for every clone |
39 request, it generates full bundles once and they are subsequently reused to | 44 request, it generates full bundles once and they are subsequently reused to |
40 bootstrap new clones. The server may still transfer data at clone time. | 45 bootstrap new clones. The server may still transfer data at clone time. |
41 However, this is only data that has been added/changed since the bundle was | 46 However, this is only data that has been added/changed since the bundle was |
42 created. For large, established repositories, this can reduce server load for | 47 created. For large, established repositories, this can reduce server load for |
43 clones to less than 1% of original. | 48 clones to less than 1% of original. |
44 | 49 |
| | 50 Here's how pullbundles work: |
| | 51 |
| | 52 1. A manifest file listing available bundles and describing the revisions |
| | 53 is added to the Mercurial repository on the server. |
| | 54 2. A new-enough client informs the server that it supports partial pulls |
| | 55 and initiates a pull. |
| | 56 3. If the server has pull bundles enabled and sees the client advertising |
| | 57 partial pulls, it checks for a matching pull bundle in the manifest. |
| | 58 A bundle matches if the format is supported by the client, the client |
| | 59 has the required revisions already and needs something from the bundle. |
| | 60 4. If there is at least one matching bundle, the server sends it to the client. |
| | 61 5. The client applies the bundle and notices that the server reply was |
| | 62 incomplete. It initiates another pull. |
| | 63 |
45 To work, this extension requires the following of server operators: | 64 To work, this extension requires the following of server operators: |
46 | 65 |
47 * Generating bundle files of repository content (typically periodically, | 66 * Generating bundle files of repository content (typically periodically, |
48 such as once per day). | 67 such as once per day). |
49 * A file server that clients have network access to and that Python knows | 68 * Clone bundles: A file server that clients have network access to and that |
50 how to talk to through its normal URL handling facility (typically an | 69 Python knows how to talk to through its normal URL handling facility |
51 HTTP server). | 70 (typically an HTTP/HTTPS server). |
52 * A process for keeping the bundles manifest in sync with available bundle | 71 * A process for keeping the bundles manifest in sync with available bundle |
53 files. | 72 files. |
54 | 73 |
55 Strictly speaking, using a static file hosting server isn't required: a server | 74 Strictly speaking, using a static file hosting server isn't required: a server |
56 operator could use a dynamic service for retrieving bundle data. However, | 75 operator could use a dynamic service for retrieving bundle data. However, |
59 | 78 |
60 Bundle files can be generated with the :hg:`bundle` command. Typically | 79 Bundle files can be generated with the :hg:`bundle` command. Typically |
61 :hg:`bundle --all` is used to produce a bundle of the entire repository. | 80 :hg:`bundle --all` is used to produce a bundle of the entire repository. |
62 | 81 |
63 :hg:`debugcreatestreamclonebundle` can be used to produce a special | 82 :hg:`debugcreatestreamclonebundle` can be used to produce a special |
64 *streaming clone bundle*. These are bundle files that are extremely efficient | 83 *streaming clonebundle*. These are bundle files that are extremely efficient |
65 to produce and consume (read: fast). However, they are larger than | 84 to produce and consume (read: fast). However, they are larger than |
66 traditional bundle formats and require that clients support the exact set | 85 traditional bundle formats and require that clients support the exact set |
67 of repository data store formats in use by the repository that created them. | 86 of repository data store formats in use by the repository that created them. |
68 Typically, a newer server can serve data that is compatible with older clients. | 87 Typically, a newer server can serve data that is compatible with older clients. |
69 However, *streaming clone bundles* don't have this guarantee. **Server | 88 However, *streaming clone bundles* don't have this guarantee. **Server |
71 streaming clone bundles incompatible with older Mercurial versions.** | 90 streaming clone bundles incompatible with older Mercurial versions.** |
72 | 91 |
73 A server operator is responsible for creating a ``.hg/clonebundles.manifest`` | 92 A server operator is responsible for creating a ``.hg/clonebundles.manifest`` |
74 file containing the list of available bundle files suitable for seeding | 93 file containing the list of available bundle files suitable for seeding |
75 clones. If this file does not exist, the repository will not advertise the | 94 clones. If this file does not exist, the repository will not advertise the |
76 existence of clone bundles when clients connect. | 95 existence of clone bundles when clients connect. For pull bundles, |
| | 96 ``.hg/pullbundles.manifest`` is used. |
77 | 97 |
78 The manifest file contains a newline (\\n) delimited list of entries. | 98 The manifest file contains a newline (\\n) delimited list of entries. |
79 | 99 |
80 Each line in this file defines an available bundle. Lines have the format: | 100 Each line in this file defines an available bundle. Lines have the format: |
81 | 101 |
82 <URL> [<key>=<value>[ <key>=<value>]] | 102 <URL> [<key>=<value>[ <key>=<value>]] |
83 | 103 |
84 That is, a URL followed by an optional, space-delimited list of key=value | 104 That is, a URL followed by an optional, space-delimited list of key=value |
85 pairs describing additional properties of this bundle. Both keys and values | 105 pairs describing additional properties of this bundle. Both keys and values |
86 are URI encoded. | 106 are URI encoded. |
| | 107 |
| | 108 For pull bundles, the URL is a path under the ``.hg`` directory of the |
| | 109 repository. |
87 | 110 |
88 Keys in UPPERCASE are reserved for use by Mercurial and are defined below. | 111 Keys in UPPERCASE are reserved for use by Mercurial and are defined below. |
89 All non-uppercase keys can be used by site installations. An example use | 112 All non-uppercase keys can be used by site installations. An example use |
90 for custom properties is to use the *datacenter* attribute to define which | 113 for custom properties is to use the *datacenter* attribute to define which |
91 data center a file is hosted in. Clients could then prefer a server in the | 114 data center a file is hosted in. Clients could then prefer a server in the |
130 If this is defined, it is important to advertise a non-SNI fallback | 153 If this is defined, it is important to advertise a non-SNI fallback |
131 URL or clients running old Python releases may not be able to clone | 154 URL or clients running old Python releases may not be able to clone |
132 with the clonebundles facility. | 155 with the clonebundles facility. |
133 | 156 |
134 Value should be "true". | 157 Value should be "true". |
| | 158 |
| | 159 heads |
| | 160 Used for pull bundles. This contains the ``;`` separated changeset |
| | 161 hashes of the heads of the bundle content. |
| | 162 |
| | 163 bases |
| | 164 Used for pull bundles. This contains the ``;`` separated changeset |
| | 165 hashes of the roots of the bundle content. This can be skipped if |
| | 166 the bundle was created without ``--base``. |
135 | 167 |
136 Manifests can contain multiple entries. Assuming metadata is defined, clients | 168 Manifests can contain multiple entries. Assuming metadata is defined, clients |
137 will filter entries from the manifest that they don't support. The remaining | 169 will filter entries from the manifest that they don't support. The remaining |
138 entries are optionally sorted by client preferences | 170 entries are optionally sorted by client preferences |
139 (``ui.clonebundleprefers`` config option). The client then attempts | 171 (``ui.clonebundleprefers`` config option). The client then attempts |