revlog: ignore empty trailing chunks when reading segments
When a merge commit creates an empty diff in the revlog, the offset of that
empty chunk may still be quite far from the end of the previous chunk.
Reading the segment up to such a chunk pulls in the whole gap, so skipping
empty trailing chunks may reduce read size significantly.
In most cases there is no gain, and in some cases only a small one.
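The effect can be illustrated with a minimal sketch. It is not the revlog code itself: `INDEX`, `start`, `segment_size` and `trim_trailing_empty` are hypothetical names, the offsets are made up, and only the backward loop mirrors the one in the patch.

```python
# Hypothetical stand-in for the revlog index: rev -> (start offset, chunk length).
# The numbers are invented for illustration only.
INDEX = {
    0: (0, 100),
    1: (100, 50),
    2: (150, 30),
    3: (5000, 0),  # empty diff stored far past the end of rev 2
}


def start(rev):
    return INDEX[rev][0]


def length(rev):
    return INDEX[rev][1]


def segment_size(firstrev, lastrev):
    """Bytes covered by one contiguous read from firstrev through lastrev."""
    return start(lastrev) + length(lastrev) - start(firstrev)


def trim_trailing_empty(revs):
    """Return the last rev worth reading; revs must be non-empty.

    Walks backwards and stops at the first non-empty chunk. If every
    chunk is empty, the loop runs to completion and yields revs[0].
    """
    for lastrev in reversed(revs):
        if length(lastrev) != 0:
            break
    return lastrev


revs = [0, 1, 2, 3]
naive = segment_size(revs[0], revs[-1])
trimmed = segment_size(revs[0], trim_trailing_empty(revs))
print("naive read: %d bytes, trimmed read: %d bytes" % (naive, trimmed))
```

Only trailing empty chunks are dropped here; an empty chunk in the middle of the run still falls inside the segment that is read.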
On my clone of pypy, `hg manifest` reads 65% fewer bytes (96140 instead of
275943) for revision 4260 by ignoring the only empty trailing diff.
For revision 2229, it reads 35% fewer bytes (34557 instead of 53435).
Sadly, these numbers are difficult to reproduce, as `hg clone` can produce a
different revlog structure every time.
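The percentages quoted above follow directly from the byte counts in the message. A small check, where `reduction_percent` is a hypothetical helper and not part of Mercurial:

```python
def reduction_percent(bytes_after, bytes_before):
    """Percentage of bytes saved relative to the original read size."""
    return 100.0 * (bytes_before - bytes_after) / bytes_before


# (revision, bytes read with the patch, bytes read without it),
# taken from the figures quoted in this message.
measurements = [
    (4260, 96140, 275943),
    (2229, 34557, 53435),
]

for rev, after, before in measurements:
    print("rev %d: %.1f%% fewer bytes read"
          % (rev, reduction_percent(after, before)))
```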
--- a/mercurial/revlog.py Wed Sep 20 19:17:37 2017 +0200
+++ b/mercurial/revlog.py Mon Oct 09 15:13:41 2017 +0200
@@ -1327,8 +1327,14 @@
         l = []
         ladd = l.append
 
+        firstrev = revs[0]
+        # Skip trailing revisions with empty diff
+        for lastrev in revs[::-1]:
+            if length(lastrev) != 0:
+                break
+
         try:
-            offset, data = self._getsegmentforrevs(revs[0], revs[-1], df=df)
+            offset, data = self._getsegmentforrevs(firstrev, lastrev, df=df)
         except OverflowError:
             # issue4215 - we can't cache a run of chunks greater than
             # 2G on Windows