diff mercurial/scmutil.py @ 38779:d750a6c9248d stable

scmutil: avoid quadratic membership testing (issue5969)

tr.changes['revs'] is an xrange, which has an O(n) __contains__ implementation. The `rev not in newrevs` lookup a few lines below will therefore be O(n^2) if all incoming changesets are public.

This issue isn't present on @ because 45e05d39d9ce introduced a custom type implementing an xrange primitive with O(1) contains and switched tr.changes['revs'] to be an instance of that type.

We work around the problem on the stable branch by casting the xrange to a set. This is a bit hacky because it requires allocating memory to hold each integer in the range. But we are already holding the full set of pulled revision numbers in memory multiple times (such as in `tr.changes['phases']`). So this is a relatively minor problem.

This issue has been present since the phases reporting code was introduced in the 4.7 cycle by eb9835014d20.

This change should be reverted/ignored when stable is merged into default.

On the mozilla-unified repository with 483492 changesets, `hg clone` time improves substantially:

before: 1843.700s user; 29.810s sys
after:   461.170s user; 29.360s sys
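
The cost difference described above can be illustrated with a small sketch. It is not Mercurial code: `LinearRange` and `count_preexisting` are hypothetical names invented for this example, and `LinearRange` only imitates a range-like object whose membership test scans linearly, as the message says xrange does.

```python
class LinearRange(object):
    """Hypothetical stand-in for a range whose `in` test scans linearly."""

    def __init__(self, start, stop):
        self._start = start
        self._stop = stop
        # Counts element comparisons so the cost can be observed.
        self.comparisons = 0

    def __iter__(self):
        return iter(range(self._start, self._stop))

    def __contains__(self, value):
        # O(n): walk the range until the value is found.
        for item in range(self._start, self._stop):
            self.comparisons += 1
            if item == value:
                return True
        return False


def count_preexisting(revs, newrevs):
    """Count revisions that are not in newrevs (illustrative helper)."""
    return sum(1 for rev in revs if rev not in newrevs)


n = 200
slow = LinearRange(0, n)
# Every rev is new, so each lookup scans up to its own position:
# 1 + 2 + ... + n comparisons in total, which grows as O(n^2).
slow_result = count_preexisting(range(n), slow)

# Converting to a set iterates once and makes each lookup O(1) on average.
fast = set(LinearRange(0, n))
fast_result = count_preexisting(range(n), fast)
```

With n = 200 the linear version performs n * (n + 1) / 2 comparisons, while the set version pays a one-time cost of n insertions plus the memory for n integers, which is the trade-off the message accepts.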
author Gregory Szorc <gregory.szorc@gmail.com>
date Fri, 24 Aug 2018 18:21:55 -0700
parents f8cbff2184d7
children f98d3c57906f
--- a/mercurial/scmutil.py	Sat Aug 18 10:24:57 2018 +0200
+++ b/mercurial/scmutil.py	Fri Aug 24 18:21:55 2018 -0700
@@ -1565,7 +1565,10 @@
             """Report statistics of phase changes for changesets pre-existing
             pull/unbundle.
             """
-            newrevs = tr.changes.get('revs', xrange(0, 0))
+            # TODO set() is only appropriate for 4.7 since revs post
+            # 45e05d39d9ce is a pycompat.membershiprange, which has O(1)
+            # membership testing.
+            newrevs = set(tr.changes.get('revs', xrange(0, 0)))
             phasetracking = tr.changes.get('phases', {})
             if not phasetracking:
                 return