strip: make tree stripping O(changes) instead of O(repo)

The old tree stripping logic iterated over every tree revlog in the repo looking
for commits that had revs to be stripped. That's very inefficient in large
repos. Instead, let's look at what files are touched by the strip and only
inspect those revlogs.

I don't have actual perf numbers, since internally we don't use a true
treemanifest, but simply iterating over hundreds of thousands of revlogs takes
many, many seconds, so this should help tremendously when stripping only a few
commits.
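
To illustrate the idea, here is a minimal standalone sketch of how the set of
directories to inspect can be derived from the files touched by the strip. The
helper name touched_dirs is hypothetical and is not part of Mercurial; it only
approximates what iterating over util.dirs(files) is assumed to yield, and it
makes no claim about how the real helper treats the repository root.

```python
def touched_dirs(files):
    """Return every ancestor directory of the given file paths.

    Hypothetical helper for illustration only; it is an assumption about
    the kind of result util.dirs(files) yields, not Mercurial's code.
    """
    dirs = set()
    for path in files:
        # Walk the '/' separators from right to left, recording each prefix.
        pos = path.rfind('/')
        while pos != -1:
            dirs.add(path[:pos])
            pos = path.rfind('/', 0, pos)
    return dirs


if __name__ == '__main__':
    changed = ['a/b/c.txt', 'a/d.txt', 'top.txt']
    # Only the directories above the changed files are visited, so the work
    # scales with the size of the change rather than the size of the repo.
    print(sorted(touched_dirs(changed)))
```

With this approach, the number of tree revlogs inspected is bounded by the
number of distinct ancestor directories of the touched files, regardless of how
many directories exist in the store.
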
--- a/mercurial/repair.py Mon May 08 11:35:23 2017 -0700
+++ b/mercurial/repair.py Mon May 08 11:35:23 2017 -0700
@@ -238,11 +238,12 @@
 def striptrees(repo, tr, striprev, files):
     if 'treemanifest' in repo.requirements: # safe but unnecessary
                                             # otherwise
-        for unencoded, encoded, size in repo.store.datafiles():
-            if (unencoded.startswith('meta/') and
-                unencoded.endswith('00manifest.i')):
-                dir = unencoded[5:-12]
-                repo.manifestlog._revlog.dirlog(dir).strip(striprev, tr)
+        treerevlog = repo.manifestlog._revlog
+        for dir in util.dirs(files):
+            # If the revlog doesn't exist, this returns an empty revlog and is a
+            # no-op.
+            rl = treerevlog.dirlog(dir)
+            rl.strip(striprev, tr)
 
 def rebuildfncache(ui, repo):
     """Rebuilds the fncache file from repo history.