diff mercurial/manifest.py @ 27343:c59647c6694d

treemanifest: don't iterate entire matching submanifests on match()

Before 2773540c3650 (match: remove unnecessary optimization where visitdir() returns 'all', 2015-05-06), match.visitdir() used to return the special value 'all' to indicate that all subdirectories were known to be included in the match. The purpose of that value was to avoid calling the matcher on every path. It turned out that calling the matcher was not a problem, so the special return value was removed and the code was simplified.

However, if we use the same special value not just to avoid calling the matcher on each file, but to avoid iterating over each file at all, it's a much bigger win. On commands like

  hg st --rev .^ --rev . dom/

we run the matcher (dom/) on the two manifests, then diff the narrowed manifests. If the size of the match is much larger than the size of the diff, this is wasteful. In the above case, we would end up iterating over the 15k-or-so files in dom/ for each of the manifests, only to later discover that they are mostly the same. This means that running the command above is usually slower than getting the status for the entire repo, because that code avoids calling treemanifest.match() and only calls treemanifest.diff(), which loads only what's needed for the diff.

Let's fix this by reintroducing the 'all' value in match.visitdir() and making treemanifest.match() return a lazy copy of the manifest from dom/ and down (in the above case). This speeds up the above command on the Firefox repo from 0.357s to 0.137s (best of 5). The wider the match, the bigger the speedup.
author Martin von Zweigbergk <martinvonz@google.com>
date Sat, 12 Dec 2015 09:57:05 -0800
parents 2a31433a59ba
children f888676a23d0
--- a/mercurial/manifest.py	Sat Dec 12 20:59:37 2015 -0800
+++ b/mercurial/manifest.py	Sat Dec 12 09:57:05 2015 -0800
@@ -740,9 +740,12 @@
     def _matches(self, match):
         '''recursively generate a new manifest filtered by the match argument.
         '''
+
+        visit = match.visitdir(self._dir[:-1] or '.')
+        if visit == 'all':
+            return self.copy()
         ret = treemanifest(self._dir)
-
-        if not match.visitdir(self._dir[:-1] or '.'):
+        if not visit:
             return ret
 
         self._load()
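
The short-circuit in the hunk above can be illustrated with a small standalone sketch. This is not Mercurial's code: `Tree` and `PrefixMatcher` are hypothetical stand-ins for treemanifest and a directory matcher, and the shallow `copy()` here only approximates the lazy copy the commit message describes. The matcher counts how often it is called on a file, which shows that nothing under a directory reported as 'all' is visited.

```python
class PrefixMatcher:
    """Hypothetical matcher: matches every file under one directory."""

    def __init__(self, prefix):
        self.prefix = prefix.rstrip('/')
        self.calls = 0  # number of per-file matcher calls

    def __call__(self, path):
        self.calls += 1
        return path.startswith(self.prefix + '/')

    def visitdir(self, d):
        # 'all': everything below d matches, no need to look inside.
        if d == self.prefix or d.startswith(self.prefix + '/'):
            return 'all'
        # True: d is an ancestor of the prefix, so keep descending.
        if d == '.' or self.prefix.startswith(d + '/'):
            return True
        return False


class Tree:
    """Hypothetical tree manifest: files and subtrees keyed by name."""

    def __init__(self, dir=''):
        self.dir = dir    # '' for the root, otherwise 'a/b/'
        self.files = {}   # file name -> node
        self.dirs = {}    # 'name/' -> Tree

    def add(self, path, node):
        rel = path[len(self.dir):]
        if '/' in rel:
            name = rel.split('/', 1)[0] + '/'
            sub = self.dirs.setdefault(name, Tree(self.dir + name))
            sub.add(path, node)
        else:
            self.files[rel] = node

    def copy(self):
        # Stand-in for a lazy copy: subtrees are shared, not walked.
        new = Tree(self.dir)
        new.files = dict(self.files)
        new.dirs = dict(self.dirs)
        return new

    def paths(self):
        out = [self.dir + f for f in self.files]
        for sub in self.dirs.values():
            out.extend(sub.paths())
        return sorted(out)

    def matches(self, match):
        visit = match.visitdir(self.dir[:-1] or '.')
        if visit == 'all':
            return self.copy()      # skip per-file iteration entirely
        ret = Tree(self.dir)
        if not visit:
            return ret
        for name, node in self.files.items():
            if match(self.dir + name):
                ret.files[name] = node
        for name, sub in self.dirs.items():
            m = sub.matches(match)
            if m.files or m.dirs:
                ret.dirs[name] = m
        return ret


root = Tree()
for p in ['README', 'dom/a.cpp', 'dom/base/b.cpp', 'js/c.cpp']:
    root.add(p, 'node')

matcher = PrefixMatcher('dom')
narrowed = root.matches(matcher)
```

In this sketch the matcher is called only on README in the root; the files under dom/ are carried over by the copy without being examined, and js/ is skipped because visitdir() returns False for it.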