Mercurial > hg
changeset 18878:3cfaace0441e
copies._forwardcopies: use set operations to find missing files
This is a performance win for a number of reasons:
- We don't iterate over contexts, which avoids a completely unnecessary sorted
call + the O(number of files) abstraction cost of doing that.
- We don't check membership in a context, which avoids another
O(number of files) abstraction cost.
- We iterate over the manifests in C instead of Python.
For a large repo with 170,000 files, this improves perfpathcopies from 0.34
seconds to 0.07. Anything that uses pathcopies, such as rebase or diff --git
between two revisions, benefits.
author | Siddharth Agarwal <sid0@fb.com> |
---|---|
date | Thu, 04 Apr 2013 20:22:29 -0700 |
parents | 2e9fe9e2671f |
children | 565482e2ac6b 177774e4b04a |
files | mercurial/copies.py |
diffstat | 1 files changed, 7 insertions(+), 5 deletions(-) [+] |
line wrap: on
line diff
--- a/mercurial/copies.py Thu Apr 04 20:36:46 2013 -0700 +++ b/mercurial/copies.py Thu Apr 04 20:22:29 2013 -0700 @@ -133,11 +133,13 @@ # we currently don't try to find where old files went, too expensive # this means we can miss a case like 'hg rm b; hg cp a b' cm = {} - for f in b: - if f not in a: - ofctx = _tracefile(b[f], a) - if ofctx: - cm[f] = ofctx.path() + missing = set(b.manifest().iterkeys()) + missing.difference_update(a.manifest().iterkeys()) + + for f in missing: + ofctx = _tracefile(b[f], a) + if ofctx: + cm[f] = ofctx.path() # combine copies from dirstate if necessary if w is not None: