changeset 39570:f296c0b366c8
util: lower water mark when removing nodes after cost limit reached
See the inline comment for the reasoning here. This is a pretty
common strategy for garbage collectors and other cache-like primitives.
The performance impact is substantial (where a heading has two timing
lines, the first is before this change and the second is after):
$ hg perflrucachedict --size 4 --gets 1000000 --sets 1000000 --mixed 1000000 --costlimit 100
! inserts w/ cost limit
! wall 1.659181 comb 1.650000 user 1.650000 sys 0.000000 (best of 7)
! wall 1.722122 comb 1.720000 user 1.720000 sys 0.000000 (best of 6)
! mixed w/ cost limit
! wall 1.139955 comb 1.140000 user 1.140000 sys 0.000000 (best of 9)
! wall 1.182513 comb 1.180000 user 1.180000 sys 0.000000 (best of 9)
$ hg perflrucachedict --size 1000 --gets 1000000 --sets 1000000 --mixed 1000000 --costlimit 10000
! inserts
! wall 0.679546 comb 0.680000 user 0.680000 sys 0.000000 (best of 15)
! sets
! wall 0.825147 comb 0.830000 user 0.830000 sys 0.000000 (best of 13)
! inserts w/ cost limit
! wall 25.105273 comb 25.080000 user 25.080000 sys 0.000000 (best of 3)
! wall 1.724397 comb 1.720000 user 1.720000 sys 0.000000 (best of 6)
! mixed
! wall 0.807096 comb 0.810000 user 0.810000 sys 0.000000 (best of 13)
! mixed w/ cost limit
! wall 12.104470 comb 12.070000 user 12.070000 sys 0.000000 (best of 3)
! wall 1.190563 comb 1.190000 user 1.190000 sys 0.000000 (best of 9)
$ hg perflrucachedict --size 1000 --gets 1000000 --sets 1000000 --mixed 1000000 --costlimit 10000 --mixedgetfreq 90
! inserts
! wall 0.711177 comb 0.710000 user 0.710000 sys 0.000000 (best of 14)
! sets
! wall 0.846992 comb 0.850000 user 0.850000 sys 0.000000 (best of 12)
! inserts w/ cost limit
! wall 25.963028 comb 25.960000 user 25.960000 sys 0.000000 (best of 3)
! wall 2.184311 comb 2.180000 user 2.180000 sys 0.000000 (best of 5)
! mixed
! wall 0.728256 comb 0.730000 user 0.730000 sys 0.000000 (best of 14)
! mixed w/ cost limit
! wall 3.174256 comb 3.170000 user 3.170000 sys 0.000000 (best of 4)
! wall 0.773186 comb 0.770000 user 0.770000 sys 0.000000 (best of 13)
$ hg perflrucachedict --size 100000 --gets 1000000 --sets 1000000 --mixed 1000000 --mixedgetfreq 90 --costlimit 5000000
! gets
! wall 1.191368 comb 1.190000 user 1.190000 sys 0.000000 (best of 9)
! wall 1.195304 comb 1.190000 user 1.190000 sys 0.000000 (best of 9)
! inserts
! wall 0.950995 comb 0.950000 user 0.950000 sys 0.000000 (best of 11)
! inserts w/ cost limit
! wall 1.589732 comb 1.590000 user 1.590000 sys 0.000000 (best of 7)
! sets
! wall 1.094941 comb 1.100000 user 1.090000 sys 0.010000 (best of 9)
! mixed
! wall 0.936420 comb 0.940000 user 0.930000 sys 0.010000 (best of 10)
! mixed w/ cost limit
! wall 0.882780 comb 0.870000 user 0.870000 sys 0.000000 (best of 11)
This puts inserts with cost accounting at ~2x slower than inserts on
caches without cost accounting. And for read-heavy workloads (the prime
use case for caches), performance is nearly identical.
In the worst case (pure write workloads with cost accounting enabled),
we're looking at ~1.5us per insert on large caches. That seems "fast
enough."
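The per-insert figure can be checked from the timings above. This is a minimal sketch that assumes the insert benchmark performs one insert per `--sets` entry (1,000,000 here), which is consistent with the ~1.5us estimate; the variable names are illustrative only.

```python
# Wall-clock seconds copied from the --size 100000 run above.
INSERTS = 1_000_000          # assumption: one insert per --sets entry
wall_plain = 0.950995        # "inserts"
wall_cost = 1.589732         # "inserts w/ cost limit"

# Convert total seconds into microseconds per insert.
per_insert_us = wall_cost / INSERTS * 1e6
# Ratio of cost-accounted inserts to plain inserts on the large cache.
slowdown = wall_cost / wall_plain

print(f"{per_insert_us:.2f} us per insert, {slowdown:.2f}x vs. no cost accounting")
```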
Differential Revision: https://phab.mercurial-scm.org/D4505
author | Gregory Szorc <gregory.szorc@gmail.com> |
---|---|
date | Thu, 06 Sep 2018 18:04:27 -0700 |
parents | cc23c09bc562 |
children | 8f2c0d1b454c |
files | mercurial/util.py tests/test-lrucachedict.py |
diffstat | 2 files changed, 13 insertions(+), 3 deletions(-) [+] |
--- a/mercurial/util.py Thu Sep 06 12:40:30 2018 -0700
+++ b/mercurial/util.py Thu Sep 06 18:04:27 2018 -0700
@@ -1472,11 +1472,21 @@
         # to walk the linked list and doing this in a loop would be
         # quadratic. So we find the first non-empty node and then
         # walk nodes until we free up enough capacity.
+        #
+        # If we only removed the minimum number of nodes to free enough
+        # cost at insert time, chances are high that the next insert would
+        # also require pruning. This would effectively constitute quadratic
+        # behavior for insert-heavy workloads. To mitigate this, we set a
+        # target cost that is a percentage of the max cost. This will tend
+        # to free more nodes when the high water mark is reached, which
+        # lowers the chances of needing to prune on the subsequent insert.
+        targetcost = int(self.maxcost * 0.75)
+
         n = self._head.prev
         while n.key is _notset:
             n = n.prev
 
-        while len(self) > 1 and self.totalcost > self.maxcost:
+        while len(self) > 1 and self.totalcost > targetcost:
             del self._cache[n.key]
             self.totalcost -= n.cost
             n.markempty()
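The effect described in the inline comment can be seen in a toy model. The sketch below is not Mercurial's `lrucachedict`: `count_prune_passes` and its parameters are hypothetical, every entry is assumed to have the same cost, and eviction is oldest-first. It only counts how many inserts trigger a prune pass when the target equals the max cost versus 75% of it.

```python
from collections import deque


def count_prune_passes(num_inserts, maxcost, target_ratio, item_cost=10):
    """Count inserts that trigger a prune pass in a toy cost-limited cache.

    Hypothetical model: uniform entry cost, oldest entries evicted first.
    """
    target = int(maxcost * target_ratio)
    costs = deque()  # oldest entry on the left
    total = 0
    passes = 0
    for _ in range(num_inserts):
        costs.append(item_cost)
        total += item_cost
        if total > maxcost:
            passes += 1
            # Evict oldest entries until total cost is at or below the target.
            while len(costs) > 1 and total > target:
                total -= costs.popleft()
    return passes


# Target equal to the max cost: only the minimum is freed each time.
exact = count_prune_passes(10_000, maxcost=1_000, target_ratio=1.0)
# Target at 75% of the max cost, as in the change above.
lowered = count_prune_passes(10_000, maxcost=1_000, target_ratio=0.75)
print(exact, lowered)
```

In this model, once the cache is full, freeing only the minimum makes nearly every insert prune, while the lower target spreads prune passes out. Since each pass in the real code first walks to the first non-empty node, fewer passes means less repeated walking.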
--- a/tests/test-lrucachedict.py Thu Sep 06 12:40:30 2018 -0700
+++ b/tests/test-lrucachedict.py Thu Sep 06 18:04:27 2018 -0700
@@ -314,10 +314,10 @@
         # Inserting new element should free multiple elements so we hit
         # low water mark.
         d.insert('e', 'vd', cost=25)
-        self.assertEqual(len(d), 3)
+        self.assertEqual(len(d), 2)
         self.assertNotIn('a', d)
         self.assertNotIn('b', d)
-        self.assertIn('c', d)
+        self.assertNotIn('c', d)
         self.assertIn('d', d)
         self.assertIn('e', d)
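The changed assertions can be reproduced with a small stand-in cache. This is a sketch under stated assumptions: `CostLimitedLRU` is a hypothetical class built on `OrderedDict`, not the real `lrucachedict`, and the max cost (100) and the costs of `a` through `d` are invented for illustration, since the test's setup is not shown in the hunk.

```python
from collections import OrderedDict


class CostLimitedLRU:
    """Toy cost-limited cache (hypothetical; not Mercurial's lrucachedict)."""

    def __init__(self, maxcost, target_ratio=0.75):
        self.maxcost = maxcost
        self.target_ratio = target_ratio
        self.totalcost = 0
        self._entries = OrderedDict()  # key -> (value, cost), oldest first

    def __len__(self):
        return len(self._entries)

    def __contains__(self, key):
        return key in self._entries

    def insert(self, key, value, cost=0):
        # Replacing a key first gives back the cost of the old entry.
        if key in self._entries:
            self.totalcost -= self._entries.pop(key)[1]
        self._entries[key] = (value, cost)
        self.totalcost += cost
        if self.totalcost > self.maxcost:
            self._prune()

    def _prune(self):
        # Free down to the low water mark, not just below maxcost.
        target = int(self.maxcost * self.target_ratio)
        while len(self) > 1 and self.totalcost > target:
            _key, (_value, cost) = self._entries.popitem(last=False)
            self.totalcost -= cost


def fill(cache):
    # Assumed costs; chosen so the total (90) stays under maxcost=100.
    for key, cost in [('a', 10), ('b', 10), ('c', 30), ('d', 40)]:
        cache.insert(key, 'v' + key, cost=cost)
    cache.insert('e', 've', cost=25)  # pushes the total over maxcost
    return cache


old = fill(CostLimitedLRU(maxcost=100, target_ratio=1.0))   # prune to maxcost
new = fill(CostLimitedLRU(maxcost=100, target_ratio=0.75))  # prune to 75%
print(len(old), 'c' in old, len(new), 'c' in new)
```

With these assumed costs, pruning only to the max cost keeps `c`, `d`, and `e`, while pruning to the 75% target also evicts `c`, which mirrors the direction of the assertion changes above.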