Mercurial > hg
annotate notes.txt @ 1730:0f1d2c75db5e
add prechangegroup and pretxnchangegroup hooks.
prechangegroup lets you stop push, pull or unbundle before it begins.
pretxnchangegroup lets you inspect changegroup before transaction is
committed, and roll back if you not like it.
author | Vadim Gelfer <vadim.gelfer@gmail.com> |
---|---|
date | Wed, 15 Feb 2006 10:49:30 -0800 |
parents | 2073e5a71008 |
children | 4f072bb06e89 |
rev | line source |
---|---|
0
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
1 Some notes about Mercurial's design |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
2 |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
3 Revlogs: |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
4 |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
5 The fundamental storage type in Mercurial is a "revlog". A revlog is |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
6 the set of all revisions to a file. Each revision is either stored |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
7 compressed in its entirety or as a compressed binary delta against the |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
8 previous version. The decision of when to store a full version is made |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
9 based on how much data would be needed to reconstruct the file. This |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
10 lets us ensure that we never need to read huge amounts of data to |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
11 reconstruct a file, regardless of how many revisions of it we store. |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
12 |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
13 In fact, we should always be able to do it with a single read, |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
14 provided we know when and where to read. This is where the index comes |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
15 in. Each revlog has an index containing a special hash (nodeid) of the |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
16 text, hashes for its parents, and where and how much of the revlog |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
17 data we need to read to reconstruct it. Thus, with one read of the |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
18 index and one read of the data, we can reconstruct any version in time |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
19 proportional to the file size. |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
20 |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
21 Similarly, revlogs and their indices are append-only. This means that |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
22 adding a new version is also O(1) seeks. |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
23 |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
24 Generally revlogs are used to represent revisions of files, but they |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
25 also are used to represent manifests and changesets. |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
26 |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
27 Manifests: |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
28 |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
29 A manifest is simply a list of all files in a given revision of a |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
30 project along with the nodeids of the corresponding file revisions. So |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
31 grabbing a given version of the project means simply looking up its |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
32 manifest and reconstruction all the file revisions pointed to by it. |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
33 |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
34 Changesets: |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
35 |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
36 A changeset is a list of all files changed in a check-in along with a |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
37 change description and some metadata like user and date. It also |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
38 contains a nodeid to the relevent revision of the manifest. Changesets |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
39 and manifests are one-to-one, but contain different data for |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
40 convenience. |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
41 |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
42 Nodeids: |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
43 |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
44 Nodeids are unique ids that are used to represent the contents of a |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
45 file AND its position in the project history. That is, if you change a |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
46 file and then change it back, the result will have a different nodeid |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
47 because it has different history. This is accomplished by including |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
48 the parents of a given revision's nodeids with the revision's text |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
49 when calculating the hash. |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
50 |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
51 Graph merging: |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
52 |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
53 Nodeids are implemented as they are to simplify merging. Merging a |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
54 pair of directed acyclic graphs (aka "the family tree" of the file |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
55 history) requires some method of determining if nodes in different |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
56 graphs correspond. Simply comparing the contents of the node (by |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
57 comparing text of given revisions or their hashes) can get confused by |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
58 identical revisions in the tree. |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
59 |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
60 The nodeid approach makes it trivial - the hash uniquely describes a |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
61 revision's contents and its graph position relative to the root, so |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
62 merge is simply checking whether each nodeid in graph A is in the hash |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
63 table of graph B. If not, we pull them in, adding them sequentially to |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
64 the revlog. |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
65 |
260 | 66 Branching and merging: |
0
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
67 |
260 | 68 Everything in Mercurial is potentially a branch and every user |
69 effectively works in their own branch. When you do a checkout, | |
70 Mercurial remembers what the parent changeset was and uses it for the | |
71 next check in. | |
72 | |
73 To do a merge of branches in Mercurial, you check out the heads of the | |
74 two branches into the same working directory which causes a merge to | |
75 be performed, and then check in the result once you're happy with it. | |
76 The resulting checkin will have two parents. | |
0
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
77 |
260 | 78 It decides when a merge is necessary by first determining if there are |
79 any uncommitted changes in the working directory. This effectively | |
80 makes the working directory a branch off the checked in version it's | |
81 based on. Then it also determines if the working directory is a direct | |
82 ancestor or descendent of the second version we're attempting to | |
83 checkout. If neither is true, we simply replace the working directory | |
84 version with the new version. Otherwise we perform a merge between the | |
85 two versions. | |
0
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
86 |
260 | 87 Merging files and manifests: |
0
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
88 |
260 | 89 We begin by comparing two versions manifests and deciding which files |
90 need to be added, deleted, and merged. | |
0
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
91 |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
92 Then for each file, we perform a graph merge and resolve as above. |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
93 It's important to merge files using per-file DAGs rather than just |
1308
2073e5a71008
Cleanup of tabs and trailing spaces.
Thomas Arendsen Hein <thomas@intevation.de>
parents:
260
diff
changeset
|
94 changeset level DAGs as this diagram illustrates: |
0
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
95 |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
96 M M1 M2 |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
97 |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
98 AB |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
99 |`-------v M2 clones M |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
100 aB AB file A is change in mainline |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
101 |`---v AB' file B is changed in M2 |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
102 | aB / | M1 clones M |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
103 | ab/ | M1 changes B |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
104 | ab' | M1 merges from M2, changes to B conflict |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
105 | | A'B' M2 changes A |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
106 `---+--.| |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
107 | a'B' M2 merges from mainline, changes to A conflict |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
108 `--.| |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
109 ??? depending on which ancestor we choose, we will have |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
110 to redo A hand-merge, B hand-merge, or both |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
111 but if we look at the files independently, everything |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
112 is fine |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
113 |
260 | 114 The result is a merged version in the working directory, waiting for |
115 check-in. | |
0
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
116 |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
117 Rollback: |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
118 |
47 | 119 When performing a commit or a merge, we order things so that the |
120 changeset entry gets added last. We keep a transaction log of the name | |
121 of each file touched and its length prior to the transaction. On | |
122 abort, we simply truncate each file to its prior length. This is one | |
123 of the nice properties of the append-only structure of the revlogs. | |
260 | 124 We can also reuse this journal for "undo". |
0
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
125 |
260 | 126 Merging between repositories: |
0
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
127 |
260 | 128 One of the key features of Mercurial is the ability to merge between |
129 independent repositories in a decentralized fashion. Each repository | |
130 can act as a read-only server or a client. Clients operating by | |
131 pulling all branches that it hasn't seen from the server and adding | |
132 them into its graph. This is done in two steps: searching for new | |
133 "roots" and pulling a "changegroup" | |
0
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
134 |
260 | 135 Searching for new "roots" begins by finding all new heads and |
136 searching backwards from those heads to the first unknown nodes in | |
137 their respective branches. These nodes are the 'roots' that are used | |
138 to calculate the 'changegroup': the set of all changesets starting at | |
139 those roots. Mercurial takes pains to make this search efficient in | |
140 both bandwidth and round-trips. | |
0
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
141 |
260 | 142 Once the roots are found, the changegroup can be transferred as a |
143 single streaming transfer. This is organized as an ordered set of | |
144 deltas for changesets, manifests, and files. Large chunks of deltas | |
145 can be directly added to the repository without unpacking so it's | |
146 fairly fast. |