changeset 18878:3cfaace0441e

copies._forwardcopies: use set operations to find missing files

This is a performance win for a number of reasons:
- We don't iterate over contexts, which avoids a completely unnecessary sorted call + the O(number of files) abstraction cost of doing that.
- We don't check membership in a context, which avoids another O(number of files) abstraction cost.
- We iterate over the manifests in C instead of Python.

For a large repo with 170,000 files, this improves perfpathcopies from 0.34 seconds to 0.07. Anything that uses pathcopies, such as rebase or diff --git between two revisions, benefits.
author Siddharth Agarwal <sid0@fb.com>
date Thu, 04 Apr 2013 20:22:29 -0700
parents 2e9fe9e2671f
children 565482e2ac6b 177774e4b04a
files mercurial/copies.py
diffstat 1 files changed, 7 insertions(+), 5 deletions(-)

--- a/mercurial/copies.py	Thu Apr 04 20:36:46 2013 -0700
+++ b/mercurial/copies.py	Thu Apr 04 20:22:29 2013 -0700
@@ -133,11 +133,13 @@
     # we currently don't try to find where old files went, too expensive
     # this means we can miss a case like 'hg rm b; hg cp a b'
     cm = {}
-    for f in b:
-        if f not in a:
-            ofctx = _tracefile(b[f], a)
-            if ofctx:
-                cm[f] = ofctx.path()
+    missing = set(b.manifest().iterkeys())
+    missing.difference_update(a.manifest().iterkeys())
+
+    for f in missing:
+        ofctx = _tracefile(b[f], a)
+        if ofctx:
+            cm[f] = ofctx.path()
 
     # combine copies from dirstate if necessary
     if w is not None:
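
Below is a minimal, self-contained sketch of the set-difference idea described in the commit message, for readers who want to try it outside Mercurial. The dicts manifest_a and manifest_b are hypothetical stand-ins for manifest objects; the real code in the hunk above builds the set from b.manifest().iterkeys() and subtracts a.manifest().iterkeys().

    # Hypothetical manifests: plain dicts mapping file path -> node id.
    manifest_a = {"a.txt": "n1", "common.txt": "n2"}
    manifest_b = {"b.txt": "n3", "common.txt": "n2"}

    # Old approach: walk every file in b and test membership in a, paying a
    # per-file abstraction cost (plus an unnecessary sort) at the context layer.
    missing_slow = [f for f in sorted(manifest_b) if f not in manifest_a]

    # New approach: one bulk set construction and one bulk difference, so the
    # per-file work happens inside the C implementations of set and dict.
    missing_fast = set(manifest_b)
    missing_fast.difference_update(manifest_b.keys() & manifest_a.keys()
                                   if False else manifest_a)

    assert missing_fast == set(missing_slow)  # both are {"b.txt"}

The bulk operations give the same result as the per-file loop; the win comes from moving the iteration and membership tests out of Python-level context objects and into native set/dict code, which is what the replaced hunk does with the two manifests.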