Reduce the amount of stat traffic generated by a walk.

When we switched to the new walk code for commands, we stopped passing a
list of specific files to the repo or dirstate walk or changes methods.
As a result, we always walked and attempted to match everything, which
was inefficient.

Now, if we are given any patterns to match, or nothing at all, we still
walk everything. But if we are given only file names that contain no
glob characters, we walk only those files.
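
The decision described above can be sketched roughly as follows. This is
an illustration only, not the actual walk code: the function name
files_to_walk, the GLOB_CHARS set, and the use of None to mean "walk
everything" are all assumptions made for this example, and prefixed
pattern kinds are not modeled.

```python
# Hypothetical sketch of the "walk only named files" decision.
# GLOB_CHARS is an assumption about what counts as a glob character.
GLOB_CHARS = set("*?[{")


def files_to_walk(args):
    """Return the explicit files to walk, or None to walk everything.

    None is returned when no arguments were given or when any argument
    looks like a pattern, because a pattern may match any file.
    """
    if not args:
        # Nothing given: every file has to be walked and matched.
        return None
    for arg in args:
        if any(ch in GLOB_CHARS for ch in arg):
            # At least one pattern: fall back to walking everything.
            return None
    # Only plain file names: stat just these.
    return list(args)
```

With this shape, a caller that receives None keeps the old behaviour of
walking everything, while a caller that receives a list only needs to
stat the named files.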

                    Mercurial      git                     BK (*)
storage             revlog delta   compressed revisions    SCCS weave
storage naming      by filename    by revision hash        by filename
merge               file DAGs      changeset DAG           file DAGs?
consistency         SHA1           SHA1                    CRC
signable?           yes            yes                     no
retrieve file tip   O(1)           O(1)                    O(revs)
add rev             O(1)           O(1)                    O(revs)
find prev file rev  O(1)           O(changesets)           O(revs)
annotate file       O(revs)        O(changesets)           O(revs)
find file changeset O(1)           O(changesets)           ?
checkout            O(files)       O(files)                O(revs)?
commit              O(changes)     O(changes)              ?
                    6 patches/s    6 patches/s             slow
diff working dir    O(changes)     O(changes)              ?
                    < 1s           < 1s                    ?
tree diff revs      O(changes)     O(changes)              ?
                    < 1s           < 1s                    ?
hardlink clone      O(files)       O(revisions)            O(files)
find remote csets   O(log new)     rsync: O(revisions)     ?
                                   git-http: O(changesets)
pull remote csets   O(patch)       O(modified files)       O(patch)
repo growth         O(patch)       O(revisions)            O(patch)
kernel history      300M           3.5G?                   250M?
lines of code       2500           6500 (+ cogito)         ??
* I've never used BK, so these are just guesses.