changeset 3506:da9506fe2710

Remove some old documentation that belongs on the wiki
author Matt Mackall <mpm@selenic.com>
date Wed, 25 Oct 2006 16:25:44 -0500
parents 1295ad66fac8
children 72efff4be2ad
files comparison.txt notes.txt
diffstat 2 files changed, 0 insertions(+), 177 deletions(-) [+]
line wrap: on
line diff
--- a/comparison.txt	Wed Oct 25 16:24:28 2006 -0500
+++ /dev/null	Thu Jan 01 00:00:00 1970 +0000
@@ -1,31 +0,0 @@
-                    Mercurial      git                     BK (*)
-storage             revlog delta   compressed revisions    SCCS weave
-storage naming      by filename    by revision hash        by filename
-merge               file DAGs      changeset DAG           file DAGs?
-consistency         SHA1           SHA1                    CRC
-signable?           yes            yes                     no
-
-retrieve file tip   O(1)           O(1)                    O(revs)
-add rev             O(1)           O(1)                    O(revs)
-find prev file rev  O(1)           O(changesets)           O(revs)
-annotate file       O(revs)        O(changesets)           O(revs)
-find file changeset O(1)           O(changesets)           ?
-
-checkout            O(files)       O(files)                O(revs)?
-commit              O(changes)     O(changes)              ?
-                    6 patches/s    6 patches/s             slow
-diff working dir    O(changes)     O(changes)              ?
-                    < 1s           < 1s                    ?
-tree diff revs      O(changes)     O(changes)              ?
-                    < 1s           < 1s                    ?
-hardlink clone      O(files)       O(revisions)            O(files)
-
-find remote csets   O(log new)     rsync: O(revisions)     ?
-                                   git-http: O(changesets)
-pull remote csets   O(patch)       O(modified files)       O(patch)
-
-repo growth         O(patch)       O(revisions)            O(patch)
- kernel history     300M           3.5G?                   250M?
-lines of code       2500           6500 (+ cogito)         ??
-
-* I've never used BK so this is just guesses
--- a/notes.txt	Wed Oct 25 16:24:28 2006 -0500
+++ /dev/null	Thu Jan 01 00:00:00 1970 +0000
@@ -1,146 +0,0 @@
-Some notes about Mercurial's design
-
-Revlogs:
-
-The fundamental storage type in Mercurial is a "revlog". A revlog is
-the set of all revisions to a file. Each revision is either stored
-compressed in its entirety or as a compressed binary delta against the
-previous version. The decision of when to store a full version is made
-based on how much data would be needed to reconstruct the file. This
-lets us ensure that we never need to read huge amounts of data to
-reconstruct a file, regardless of how many revisions of it we store.
-
-In fact, we should always be able to do it with a single read,
-provided we know when and where to read. This is where the index comes
-in. Each revlog has an index containing a special hash (nodeid) of the
-text, hashes for its parents, and where and how much of the revlog
-data we need to read to reconstruct it. Thus, with one read of the
-index and one read of the data, we can reconstruct any version in time
-proportional to the file size.
-
-Similarly, revlogs and their indices are append-only. This means that
-adding a new version is also O(1) seeks.
-
-Generally revlogs are used to represent revisions of files, but they
-also are used to represent manifests and changesets.
-
-Manifests:
-
-A manifest is simply a list of all files in a given revision of a
-project along with the nodeids of the corresponding file revisions. So
-grabbing a given version of the project means simply looking up its
-manifest and reconstruction all the file revisions pointed to by it.
-
-Changesets:
-
-A changeset is a list of all files changed in a check-in along with a
-change description and some metadata like user and date. It also
-contains a nodeid to the relevent revision of the manifest. Changesets
-and manifests are one-to-one, but contain different data for
-convenience.
-
-Nodeids:
-
-Nodeids are unique ids that are used to represent the contents of a
-file AND its position in the project history. That is, if you change a
-file and then change it back, the result will have a different nodeid
-because it has different history. This is accomplished by including
-the parents of a given revision's nodeids with the revision's text
-when calculating the hash.
-
-Graph merging:
-
-Nodeids are implemented as they are to simplify merging. Merging a
-pair of directed acyclic graphs (aka "the family tree" of the file
-history) requires some method of determining if nodes in different
-graphs correspond. Simply comparing the contents of the node (by
-comparing text of given revisions or their hashes) can get confused by
-identical revisions in the tree.
-
-The nodeid approach makes it trivial - the hash uniquely describes a
-revision's contents and its graph position relative to the root, so
-merge is simply checking whether each nodeid in graph A is in the hash
-table of graph B. If not, we pull them in, adding them sequentially to
-the revlog.
-
-Branching and merging:
-
-Everything in Mercurial is potentially a branch and every user
-effectively works in their own branch. When you do a checkout,
-Mercurial remembers what the parent changeset was and uses it for the
-next check in.
-
-To do a merge of branches in Mercurial, you check out the heads of the
-two branches into the same working directory which causes a merge to
-be performed, and then check in the result once you're happy with it.
-The resulting checkin will have two parents.
-
-It decides when a merge is necessary by first determining if there are
-any uncommitted changes in the working directory. This effectively
-makes the working directory a branch off the checked in version it's
-based on. Then it also determines if the working directory is a direct
-ancestor or descendent of the second version we're attempting to
-checkout. If neither is true, we simply replace the working directory
-version with the new version. Otherwise we perform a merge between the
-two versions.
-
-Merging files and manifests:
-
-We begin by comparing two versions manifests and deciding which files
-need to be added, deleted, and merged.
-
-Then for each file, we perform a graph merge and resolve as above.
-It's important to merge files using per-file DAGs rather than just
-changeset level DAGs as this diagram illustrates:
-
-M   M1   M2
-
-AB
- |`-------v     M2 clones M
-aB       AB     file A is change in mainline
- |`---v  AB'    file B is changed in M2
- |   aB / |     M1 clones M
- |   ab/  |     M1 changes B
- |   ab'  |     M1 merges from M2, changes to B conflict
- |    |  A'B'   M2 changes A
-  `---+--.|
-      |  a'B'   M2 merges from mainline, changes to A conflict
-      `--.|
-         ???    depending on which ancestor we choose, we will have
-	        to redo A hand-merge, B hand-merge, or both
-                but if we look at the files independently, everything
-		is fine
-
-The result is a merged version in the working directory, waiting for
-check-in.
-
-Rollback:
-
-When performing a commit or a merge, we order things so that the
-changeset entry gets added last. We keep a transaction log of the name
-of each file touched and its length prior to the transaction. On
-abort, we simply truncate each file to its prior length. This is one
-of the nice properties of the append-only structure of the revlogs.
-We can also reuse this journal for "rollback".
-
-Merging between repositories:
-
-One of the key features of Mercurial is the ability to merge between
-independent repositories in a decentralized fashion. Each repository
-can act as a read-only server or a client. Clients operating by
-pulling all branches that it hasn't seen from the server and adding
-them into its graph. This is done in two steps: searching for new
-"roots" and pulling a "changegroup"
-
-Searching for new "roots" begins by finding all new heads and
-searching backwards from those heads to the first unknown nodes in
-their respective branches. These nodes are the 'roots' that are used
-to calculate the 'changegroup': the set of all changesets starting at
-those roots. Mercurial takes pains to make this search efficient in
-both bandwidth and round-trips.
-
-Once the roots are found, the changegroup can be transferred as a
-single streaming transfer. This is organized as an ordered set of
-deltas for changesets, manifests, and files. Large chunks of deltas
-can be directly added to the repository without unpacking so it's
-fairly fast.