comparison mercurial/copies.py @ 43300:ffd04bc9f57d
copies: move from a copy on branchpoint to a copy on write approach
Before this change, every branch point resulted in a copy of the dictionary containing the
copy information. This can be very costly for branchy history with little rename
information. Instead, we take a "copy on write" approach, copying the input data
only when we are about to update it.
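As a rough illustration of the copy-on-write idea, here is a minimal standalone sketch. It is not the code from `mercurial/copies.py`: the `propagate` helper and its `(added, removed)` inputs are hypothetical, and the merge step only stands in for `_chain` in the sense that it builds a new dictionary.

```python
def propagate(parent_copies, children_changes):
    """Hypothetical helper: hand the parent's copy mapping to each child.

    `children_changes` maps a child name to an `(added, removed)` pair,
    where `added` is a {dst: src} dict and `removed` a list of file names.
    The parent dict is shared until a child actually needs to modify it.
    """
    result = {}
    for child, (added, removed) in children_changes.items():
        newcopies = parent_copies  # share the parent's dict by default
        if added:
            # Merging builds a fresh dict, so no extra copy is needed
            # (a simplified stand-in for what `_chain` does).
            newcopies = dict(parent_copies)
            newcopies.update(added)
        for f in removed:
            if f in newcopies:
                if newcopies is parent_copies:
                    # Copy on write: other children may still share
                    # the parent's dict, so never mutate it in place.
                    newcopies = parent_copies.copy()
                del newcopies[f]
        result[child] = newcopies
    return result


parent = {"b": "a"}
out = propagate(
    parent,
    {
        "c1": ({}, []),          # no change: shares the parent's dict
        "c2": ({"d": "b"}, []),  # new copy: gets a merged dict
        "c3": ({}, ["b"]),       # removal: copied just before the delete
    },
)
```

In this sketch a child that neither adds nor removes anything receives the very same dictionary object as its parent, which is where the savings on branchy history come from.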
In practice we were already doing the copying in half of these cases (because
`_chain` makes a copy), so we don't add a significant cost here even in the
linear case. However, the speedup in the branchy case is very significant. Here are
some timings on the pypy repository.
revision: large amount; added files: large amount; rename small amount; c3b14617fbd7 9ba6ab77fd29
before: ! wall 1.399863 comb 1.400000 user 1.370000 sys 0.030000 (median of 10)
after: ! wall 0.766453 comb 0.770000 user 0.750000 sys 0.020000 (median of 11)
revision: large amount; added files: small amount; rename small amount; c3b14617fbd7 f650a9b140d2
before: ! wall 1.876748 comb 1.890000 user 1.870000 sys 0.020000 (median of 10)
after: ! wall 1.167223 comb 1.170000 user 1.150000 sys 0.020000 (median of 10)
revision: large amount; added files: large amount; rename large amount; 08ea3258278e d9fa043f30c0
before: ! wall 0.242457 comb 0.240000 user 0.240000 sys 0.000000 (median of 39)
after: ! wall 0.211476 comb 0.210000 user 0.210000 sys 0.000000 (median of 45)
revision: small amount; added files: large amount; rename large amount; df6f7a526b60 a83dc6a2d56f
before: ! wall 0.013193 comb 0.020000 user 0.020000 sys 0.000000 (median of 224)
after: ! wall 0.013290 comb 0.010000 user 0.010000 sys 0.000000 (median of 222)
revision: small amount; added files: large amount; rename small amount; 4aa4e1f8e19a 169138063d63
before: ! wall 0.001673 comb 0.000000 user 0.000000 sys 0.000000 (median of 1000)
after: ! wall 0.001677 comb 0.000000 user 0.000000 sys 0.000000 (median of 1000)
revision: small amount; added files: small amount; rename small amount; 4bc173b045a6 964879152e2e
before: ! wall 0.000119 comb 0.000000 user 0.000000 sys 0.000000 (median of 8023)
after: ! wall 0.000119 comb 0.000000 user 0.000000 sys 0.000000 (median of 7997)
revision: medium amount; added files: large amount; rename medium amount; c95f1ced15f2 2c68e87c3efe
before: ! wall 0.201898 comb 0.210000 user 0.200000 sys 0.010000 (median of 48)
after: ! wall 0.167415 comb 0.170000 user 0.160000 sys 0.010000 (median of 58)
revision: medium amount; added files: medium amount; rename small amount; d343da0c55a8 d7746d32bf9d
before: ! wall 0.036820 comb 0.040000 user 0.040000 sys 0.000000 (median of 100)
after: ! wall 0.035797 comb 0.040000 user 0.040000 sys 0.000000 (median of 100)
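To put the wall-clock numbers above in perspective, the short sketch below computes the before/after ratio for the four pairs with the largest gains. The `timings` dictionary is only a convenience for this illustration; the values are copied from the list above.

```python
# (before, after) wall-clock times in seconds, keyed by revision pair
timings = {
    "c3b14617fbd7 9ba6ab77fd29": (1.399863, 0.766453),
    "c3b14617fbd7 f650a9b140d2": (1.876748, 1.167223),
    "08ea3258278e d9fa043f30c0": (0.242457, 0.211476),
    "c95f1ced15f2 2c68e87c3efe": (0.201898, 0.167415),
}

# A ratio above 1.0 means the "after" run is faster
speedups = {pair: before / after for pair, (before, after) in timings.items()}

for pair, ratio in speedups.items():
    print(f"{pair}: {ratio:.2f}x")
```

The first two pairs come out at roughly 1.8x and 1.6x, while the smallest cases are essentially unchanged.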
The extra cost in the linear case can be reclaimed later with some extra logic.
Differential Revision: https://phab.mercurial-scm.org/D7124
author | Pierre-Yves David <pierre-yves.david@octobus.net> |
---|---|
date | Tue, 15 Oct 2019 18:23:34 +0200 |
parents | 83bb1e89ab9b |
children | 90213d027154 |
43299:83bb1e89ab9b | 43300:ffd04bc9f57d |
---|---|
268 childcopies = p2copies | 268 childcopies = p2copies |
269 if not alwaysmatch: | 269 if not alwaysmatch: |
270 childcopies = { | 270 childcopies = { |
271 dst: src for dst, src in childcopies.items() if match(dst) | 271 dst: src for dst, src in childcopies.items() if match(dst) |
272 } | 272 } |
273 # Copy the dict only if later iterations will also need it | 273 newcopies = copies |
274 if i != len(children[r]) - 1: | |
275 newcopies = copies.copy() | |
276 else: | |
277 newcopies = copies | |
278 if childcopies: | 274 if childcopies: |
279 newcopies = _chain(newcopies, childcopies) | 275 newcopies = _chain(newcopies, childcopies) |
| | 276 # _chain makes a copies, we can avoid doing so in some |
| | 277 # simple/linear cases. |
| | 278 assert newcopies is not copies |
280 for f in removed: | 279 for f in removed: |
281 if f in newcopies: | 280 if f in newcopies: |
| | 281 if newcopies is copies: |
| | 282 # copy on write to avoid affecting potential other |
| | 283 # branches. when there are no other branches, this |
| | 284 # could be avoided. |
| | 285 newcopies = copies.copy() |
282 del newcopies[f] | 286 del newcopies[f] |
283 othercopies = all_copies.get(c) | 287 othercopies = all_copies.get(c) |
284 if othercopies is None: | 288 if othercopies is None: |
285 all_copies[c] = newcopies | 289 all_copies[c] = newcopies |
286 else: | 290 else: |