comparison mercurial/copies.py @ 18878:3cfaace0441e

copies._forwardcopies: use set operations to find missing files

This is a performance win for a number of reasons:
- We don't iterate over contexts, which avoids a completely unnecessary sorted
  call + the O(number of files) abstraction cost of doing that.
- We don't check membership in a context, which avoids another
  O(number of files) abstraction cost.
- We iterate over the manifests in C instead of Python.

For a large repo with 170,000 files, this improves perfpathcopies from 0.34
seconds to 0.07. Anything that uses pathcopies, such as rebase or diff --git
between two revisions, benefits.

author Siddharth Agarwal <sid0@fb.com>
date Thu, 04 Apr 2013 20:22:29 -0700
parents 5a4f220fbfca
children d8ff607ef721
--- mercurial/copies.py @ 18877:2e9fe9e2671f
+++ mercurial/copies.py @ 18878:3cfaace0441e
@@ -131,15 +131,17 @@
 
     # find where new files came from
     # we currently don't try to find where old files went, too expensive
     # this means we can miss a case like 'hg rm b; hg cp a b'
     cm = {}
-    for f in b:
-        if f not in a:
-            ofctx = _tracefile(b[f], a)
-            if ofctx:
-                cm[f] = ofctx.path()
+    missing = set(b.manifest().iterkeys())
+    missing.difference_update(a.manifest().iterkeys())
+
+    for f in missing:
+        ofctx = _tracefile(b[f], a)
+        if ofctx:
+            cm[f] = ofctx.path()
 
     # combine copies from dirstate if necessary
     if w is not None:
         cm = _chain(a, w, cm, _dirstatecopies(w))
 
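
The change swaps a per-file membership test for a bulk set difference. The sketch below illustrates that the two approaches select the same files. It is not Mercurial code: plain dicts stand in for manifests, the helper names missing_by_loop and missing_by_set_ops are hypothetical, and it uses Python 3 keys() where the changeset uses Python 2 iterkeys(). It shows the equivalence only. The speedup reported above comes from avoiding the context abstraction, which plain dicts do not reproduce.

```python
# Hypothetical stand-ins: plain dicts play the role of manifests
# (file path -> node id). Real Mercurial manifests are not plain dicts.


def missing_by_loop(a_manifest, b_manifest):
    """Old shape: walk b file by file and test membership in a."""
    return {f for f in b_manifest if f not in a_manifest}


def missing_by_set_ops(a_manifest, b_manifest):
    """New shape: build a set from b's keys, then subtract a's keys in bulk."""
    missing = set(b_manifest.keys())
    missing.difference_update(a_manifest.keys())
    return missing


# Illustrative data only: 'src/b.py' exists in b but not in a.
a_manifest = {"README": "n1", "src/a.py": "n2", "src/old.py": "n3"}
b_manifest = {"README": "n1", "src/a.py": "n4", "src/b.py": "n5"}

print(sorted(missing_by_loop(a_manifest, b_manifest)))     # ['src/b.py']
print(sorted(missing_by_set_ops(a_manifest, b_manifest)))  # ['src/b.py']
```

One behavioral difference to keep in mind: a set is iterated in arbitrary order, while the old loop walked the files in sorted order. Since the results are collected into the dict cm, the order of iteration should not affect the resulting mapping.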