changeset 32197:7bcc9a5ab96e

strip: make tree stripping O(changes) instead of O(repo)

The old tree stripping logic iterated over every tree revlog in the repo looking for commits that had revs to be stripped. That's very inefficient in large repos. Instead, let's look at what files are touched by the strip and only inspect those revlogs. I don't have actual perf numbers, since internally we don't use a true treemanifest, but simply iterating over hundreds of thousands of revlogs takes many, many seconds, so this should help tremendously when stripping only a few commits.
author Durham Goode <durham@fb.com>
date Mon, 08 May 2017 11:35:23 -0700
parents a2be2abe9476
children 2bf62ca7072f
files mercurial/repair.py
diffstat 1 files changed, 6 insertions(+), 5 deletions(-)
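
The idea in the commit message, visiting only the directories that contain touched files rather than every tree revlog in the store, can be sketched in standalone Python. This is an illustration only: `parent_dirs` is a hypothetical helper standing in for the directory enumeration the patch obtains from `util.dirs(files)`, and it is not Mercurial's implementation.

```python
def parent_dirs(files):
    """Return the set of every ancestor directory of the given paths.

    Hypothetical stand-in for the directory enumeration used by the
    patch; the work done is proportional to the touched files, not to
    the number of revlogs in the repository.
    """
    dirs = set()
    for path in files:
        # Walk upward one path component at a time.
        pos = path.rfind('/')
        while pos != -1:
            dirs.add(path[:pos])
            pos = path.rfind('/', 0, pos)
    return dirs


# Files touched by the revisions being stripped (made-up example paths).
touched = ['a/b/c.txt', 'a/d.txt', 'top.txt']
print(sorted(parent_dirs(touched)))  # ['a', 'a/b']
```

Only the directories in the result would need their tree revlogs inspected; a file at the top level contributes no directory entry in this sketch.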
--- a/mercurial/repair.py	Mon May 08 11:35:23 2017 -0700
+++ b/mercurial/repair.py	Mon May 08 11:35:23 2017 -0700
@@ -238,11 +238,12 @@
 def striptrees(repo, tr, striprev, files):
     if 'treemanifest' in repo.requirements: # safe but unnecessary
                                             # otherwise
-        for unencoded, encoded, size in repo.store.datafiles():
-            if (unencoded.startswith('meta/') and
-                unencoded.endswith('00manifest.i')):
-                dir = unencoded[5:-12]
-                repo.manifestlog._revlog.dirlog(dir).strip(striprev, tr)
+        treerevlog = repo.manifestlog._revlog
+        for dir in util.dirs(files):
+            # If the revlog doesn't exist, this returns an empty revlog and is a
+            # no-op.
+            rl = treerevlog.dirlog(dir)
+            rl.strip(striprev, tr)
 
 def rebuildfncache(ui, repo):
     """Rebuilds the fncache file from repo history.