Mercurial > hg
annotate notes.txt @ 1696:4a3d4843a1bc
Merge with Peter van Dijk
author | Thomas Arendsen Hein <thomas@intevation.de> |
---|---|
date | Fri, 03 Feb 2006 12:58:40 +0100 |
parents | 2073e5a71008 |
children | 4f072bb06e89 |
rev | line source |
---|---|
0
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
1 Some notes about Mercurial's design |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
2 |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
3 Revlogs: |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
4 |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
5 The fundamental storage type in Mercurial is a "revlog". A revlog is |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
6 the set of all revisions to a file. Each revision is either stored |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
7 compressed in its entirety or as a compressed binary delta against the |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
8 previous version. The decision of when to store a full version is made |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
9 based on how much data would be needed to reconstruct the file. This |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
10 lets us ensure that we never need to read huge amounts of data to |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
11 reconstruct a file, regardless of how many revisions of it we store. |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
12 |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
13 In fact, we should always be able to do it with a single read, |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
14 provided we know when and where to read. This is where the index comes |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
15 in. Each revlog has an index containing a special hash (nodeid) of the |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
16 text, hashes for its parents, and where and how much of the revlog |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
17 data we need to read to reconstruct it. Thus, with one read of the |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
18 index and one read of the data, we can reconstruct any version in time |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
19 proportional to the file size. |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
20 |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
21 Similarly, revlogs and their indices are append-only. This means that |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
22 adding a new version is also O(1) seeks. |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
23 |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
24 Generally revlogs are used to represent revisions of files, but they |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
25 also are used to represent manifests and changesets. |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
26 |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
27 Manifests: |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
28 |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
29 A manifest is simply a list of all files in a given revision of a |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
30 project along with the nodeids of the corresponding file revisions. So |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
31 grabbing a given version of the project means simply looking up its |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
32 manifest and reconstruction all the file revisions pointed to by it. |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
33 |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
34 Changesets: |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
35 |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
36 A changeset is a list of all files changed in a check-in along with a |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
37 change description and some metadata like user and date. It also |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
38 contains a nodeid to the relevent revision of the manifest. Changesets |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
39 and manifests are one-to-one, but contain different data for |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
40 convenience. |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
41 |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
42 Nodeids: |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
43 |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
44 Nodeids are unique ids that are used to represent the contents of a |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
45 file AND its position in the project history. That is, if you change a |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
46 file and then change it back, the result will have a different nodeid |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
47 because it has different history. This is accomplished by including |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
48 the parents of a given revision's nodeids with the revision's text |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
49 when calculating the hash. |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
50 |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
51 Graph merging: |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
52 |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
53 Nodeids are implemented as they are to simplify merging. Merging a |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
54 pair of directed acyclic graphs (aka "the family tree" of the file |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
55 history) requires some method of determining if nodes in different |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
56 graphs correspond. Simply comparing the contents of the node (by |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
57 comparing text of given revisions or their hashes) can get confused by |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
58 identical revisions in the tree. |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
59 |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
60 The nodeid approach makes it trivial - the hash uniquely describes a |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
61 revision's contents and its graph position relative to the root, so |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
62 merge is simply checking whether each nodeid in graph A is in the hash |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
63 table of graph B. If not, we pull them in, adding them sequentially to |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
64 the revlog. |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
65 |
260 | 66 Branching and merging: |
0
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
67 |
260 | 68 Everything in Mercurial is potentially a branch and every user |
69 effectively works in their own branch. When you do a checkout, | |
70 Mercurial remembers what the parent changeset was and uses it for the | |
71 next check in. | |
72 | |
73 To do a merge of branches in Mercurial, you check out the heads of the | |
74 two branches into the same working directory which causes a merge to | |
75 be performed, and then check in the result once you're happy with it. | |
76 The resulting checkin will have two parents. | |
0
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
77 |
260 | 78 It decides when a merge is necessary by first determining if there are |
79 any uncommitted changes in the working directory. This effectively | |
80 makes the working directory a branch off the checked in version it's | |
81 based on. Then it also determines if the working directory is a direct | |
82 ancestor or descendent of the second version we're attempting to | |
83 checkout. If neither is true, we simply replace the working directory | |
84 version with the new version. Otherwise we perform a merge between the | |
85 two versions. | |
0
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
86 |
260 | 87 Merging files and manifests: |
0
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
88 |
260 | 89 We begin by comparing two versions manifests and deciding which files |
90 need to be added, deleted, and merged. | |
0
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
91 |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
92 Then for each file, we perform a graph merge and resolve as above. |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
93 It's important to merge files using per-file DAGs rather than just |
1308
2073e5a71008
Cleanup of tabs and trailing spaces.
Thomas Arendsen Hein <thomas@intevation.de>
parents:
260
diff
changeset
|
94 changeset level DAGs as this diagram illustrates: |
0
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
95 |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
96 M M1 M2 |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
97 |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
98 AB |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
99 |`-------v M2 clones M |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
100 aB AB file A is change in mainline |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
101 |`---v AB' file B is changed in M2 |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
102 | aB / | M1 clones M |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
103 | ab/ | M1 changes B |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
104 | ab' | M1 merges from M2, changes to B conflict |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
105 | | A'B' M2 changes A |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
106 `---+--.| |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
107 | a'B' M2 merges from mainline, changes to A conflict |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
108 `--.| |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
109 ??? depending on which ancestor we choose, we will have |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
110 to redo A hand-merge, B hand-merge, or both |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
111 but if we look at the files independently, everything |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
112 is fine |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
113 |
260 | 114 The result is a merged version in the working directory, waiting for |
115 check-in. | |
0
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
116 |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
117 Rollback: |
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
118 |
47 | 119 When performing a commit or a merge, we order things so that the |
120 changeset entry gets added last. We keep a transaction log of the name | |
121 of each file touched and its length prior to the transaction. On | |
122 abort, we simply truncate each file to its prior length. This is one | |
123 of the nice properties of the append-only structure of the revlogs. | |
260 | 124 We can also reuse this journal for "undo". |
0
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
125 |
260 | 126 Merging between repositories: |
0
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
127 |
260 | 128 One of the key features of Mercurial is the ability to merge between |
129 independent repositories in a decentralized fashion. Each repository | |
130 can act as a read-only server or a client. Clients operating by | |
131 pulling all branches that it hasn't seen from the server and adding | |
132 them into its graph. This is done in two steps: searching for new | |
133 "roots" and pulling a "changegroup" | |
0
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
134 |
260 | 135 Searching for new "roots" begins by finding all new heads and |
136 searching backwards from those heads to the first unknown nodes in | |
137 their respective branches. These nodes are the 'roots' that are used | |
138 to calculate the 'changegroup': the set of all changesets starting at | |
139 those roots. Mercurial takes pains to make this search efficient in | |
140 both bandwidth and round-trips. | |
0
9117c6561b0b
Add back links from file revisions to changeset revisions
mpm@selenic.com
parents:
diff
changeset
|
141 |
260 | 142 Once the roots are found, the changegroup can be transferred as a |
143 single streaming transfer. This is organized as an ordered set of | |
144 deltas for changesets, manifests, and files. Large chunks of deltas | |
145 can be directly added to the repository without unpacking so it's | |
146 fairly fast. |