snapshot: refine candidate snapshot base upward
Once we have found a suitable snapshot base, it is useful to check whether it
has a "child" snapshot that would provide a better diff. This matters when the
picked base is not directly related to the stored revision. In those cases, we
"jumped" onto this new chain at an arbitrary point, so checking whether a higher
point is more appropriate helps provide better results and increases snapshot
reuse.
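
For illustration, the upward refinement can be sketched outside of Mercurial as
a generator that keeps proposing the children of the current base until the
consumer stops picking a new one. This is only a minimal sketch: `refine_up`,
`drive`, the `cost` table and the revision numbers are hypothetical names and
made-up data, not part of `deltas.py`, and the real consumer decides what is
"good" by actually computing deltas.

```python
import collections


def refine_up(snapshots, good):
    """Yield tuples of child snapshots until the chosen base stops changing.

    After each yield, the consumer sends back the best base it knows of.
    """
    previous = None
    while good != previous:
        previous = good
        children = tuple(sorted(snapshots[good]))
        good = yield children


def drive(snapshots, start, cost):
    """Hypothetical consumer: keep the cheapest base seen so far."""
    gen = refine_up(snapshots, start)
    best = start
    tried = []
    try:
        candidates = next(gen)
        while True:
            tried.append(candidates)
            for rev in candidates:
                if cost[rev] < cost[best]:
                    best = rev
            # Report the current best; the generator stops once it is stable.
            candidates = gen.send(best)
    except StopIteration:
        pass
    return best, tried


# Made-up data: snapshot 2 has children 5 and 7, snapshot 7 has child 9.
snapshots = collections.defaultdict(list)
snapshots[2].extend([7, 5])
snapshots[7].append(9)
cost = {2: 100, 5: 120, 7: 80, 9: 90}

best, tried = drive(snapshots, 2, cost)
```

In this toy run the consumer moves from base 2 up to its child 7, is then
offered 9, keeps 7, and the loop ends because the base no longer changes.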
--- a/mercurial/revlogutils/deltas.py Fri Sep 07 11:17:35 2018 -0400
+++ b/mercurial/revlogutils/deltas.py Fri Sep 07 11:17:36 2018 -0400
@@ -658,6 +658,19 @@
if base == nullrev:
break
good = yield (base,)
+ # refine snapshot up
+ #
+ # XXX the _findsnapshots call can be expensive and is "duplicated" with
+ # the one done in `_rawgroups`. Once we start working on performance,
+ # we should make the two logics share this computation.
+ snapshots = collections.defaultdict(list)
+ _findsnapshots(revlog, snapshots, good + 1)
+ previous = None
+ while good != previous:
+ previous = good
+ children = tuple(sorted(c for c in snapshots[good]))
+ good = yield children
+
# we have found nothing
yield None
--- a/tests/test-sparse-revlog.t Fri Sep 07 11:17:35 2018 -0400
+++ b/tests/test-sparse-revlog.t Fri Sep 07 11:17:36 2018 -0400
@@ -77,7 +77,7 @@
$ f -s .hg/store/data/*.d
- .hg/store/data/_s_p_a_r_s_e-_r_e_v_l_o_g-_t_e_s_t-_f_i_l_e.d: size=59303048
+ .hg/store/data/_s_p_a_r_s_e-_r_e_v_l_o_g-_t_e_s_t-_f_i_l_e.d: size=59302280
$ hg debugrevlog *
format : 1
flags : generaldelta
@@ -89,45 +89,45 @@
empty : 0 ( 0.00%)
text : 0 (100.00%)
delta : 0 (100.00%)
- snapshot : 165 ( 3.30%)
+ snapshot : 168 ( 3.36%)
lvl-0 : 4 ( 0.08%)
- lvl-1 : 17 ( 0.34%)
- lvl-2 : 46 ( 0.92%)
- lvl-3 : 62 ( 1.24%)
- lvl-4 : 36 ( 0.72%)
- deltas : 4836 (96.70%)
- revision size : 59303048
- snapshot : 6105443 (10.30%)
- lvl-0 : 804187 ( 1.36%)
- lvl-1 : 1476228 ( 2.49%)
- lvl-2 : 1752567 ( 2.96%)
- lvl-3 : 1461776 ( 2.46%)
- lvl-4 : 610685 ( 1.03%)
- deltas : 53197605 (89.70%)
+ lvl-1 : 18 ( 0.36%)
+ lvl-2 : 39 ( 0.78%)
+ lvl-3 : 54 ( 1.08%)
+ lvl-4 : 53 ( 1.06%)
+ deltas : 4833 (96.64%)
+ revision size : 59302280
+ snapshot : 5833942 ( 9.84%)
+ lvl-0 : 804068 ( 1.36%)
+ lvl-1 : 1378470 ( 2.32%)
+ lvl-2 : 1608138 ( 2.71%)
+ lvl-3 : 1222158 ( 2.06%)
+ lvl-4 : 821108 ( 1.38%)
+ deltas : 53468338 (90.16%)
chunks : 5001
0x78 (x) : 5001 (100.00%)
- chunks size : 59303048
- 0x78 (x) : 59303048 (100.00%)
+ chunks size : 59302280
+ 0x78 (x) : 59302280 (100.00%)
avg chain length : 17
max chain length : 45
- max chain reach : 26194433
+ max chain reach : 22744720
compression ratio : 29
uncompressed data size (min/max/avg) : 346468 / 346472 / 346471
- full revision size (min/max/avg) : 200992 / 201080 / 201046
- inter-snapshot size (min/max/avg) : 11610 / 172762 / 32927
- level-1 (min/max/avg) : 15619 / 172762 / 86836
- level-2 (min/max/avg) : 13055 / 85219 / 38099
- level-3 (min/max/avg) : 11610 / 42645 / 23577
- level-4 (min/max/avg) : 12928 / 20205 / 16963
- delta size (min/max/avg) : 10649 / 106863 / 11000
+ full revision size (min/max/avg) : 200985 / 201050 / 201017
+ inter-snapshot size (min/max/avg) : 11598 / 163304 / 30669
+ level-1 (min/max/avg) : 15616 / 163304 / 76581
+ level-2 (min/max/avg) : 11602 / 86428 / 41234
+ level-3 (min/max/avg) : 11598 / 42390 / 22632
+ level-4 (min/max/avg) : 11603 / 19649 / 15492
+ delta size (min/max/avg) : 10649 / 105465 / 11063
- deltas against prev : 4162 (86.06%)
- where prev = p1 : 4120 (98.99%)
+ deltas against prev : 4167 (86.22%)
+ where prev = p1 : 4129 (99.09%)
where prev = p2 : 0 ( 0.00%)
- other : 42 ( 1.01%)
- deltas against p1 : 653 (13.50%)
- deltas against p2 : 21 ( 0.43%)
+ other : 38 ( 0.91%)
+ deltas against p1 : 643 (13.30%)
+ deltas against p2 : 23 ( 0.48%)
deltas against other : 0 ( 0.00%)