revlog: ignore empty trailing chunks when reading segments
author Paul Morelle <paul.morelle@octobus.net>
Mon, 09 Oct 2017 15:13:41 +0200
changeset 34823 7891d243d821
parent 34822 c1e7ce11db9b
child 34824 e2ad93bcc084
When a merge commit creates an empty diff in the revlog, its offset may still be quite far from the end of the previous chunk. Skipping these empty chunks may reduce read size significantly. In most cases, there is no gain, and in some cases, little gain.

On my clone of pypy, `hg manifest` reads 65% fewer bytes (96140 instead of 275943) for revision 4260 by ignoring the only empty trailing diff. For revision 2229, it reads 35% fewer bytes (34557 instead of 53435).

Sadly, this is difficult to reproduce, as hg clone can produce a different structure every time.
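The effect of skipping trailing empty chunks can be sketched outside of Mercurial. This is only an illustration under stated assumptions: `trim_trailing_empty`, `segment_size`, and the toy `index` mapping are hypothetical names that do not exist in `revlog.py`, and the offsets are made up to mimic an empty diff stored far from the previous chunk.

```python
def trim_trailing_empty(revs, length):
    """Return (firstrev, lastrev), skipping trailing revisions whose chunk is empty.

    `revs` is a non-empty list of revision numbers and `length` is a callable
    returning the stored chunk length of a revision (hypothetical helper).
    """
    firstrev = revs[0]
    lastrev = firstrev
    # Walk backwards and stop at the first revision that has data.
    for lastrev in revs[::-1]:
        if length(lastrev) != 0:
            break
    return firstrev, lastrev


# Hypothetical index: rev -> (start offset, chunk length).
# Revision 2 is an empty diff whose offset is far past the end of revision 1.
index = {0: (0, 100), 1: (100, 50), 2: (5000, 0)}


def segment_size(first, last):
    """Number of bytes in one contiguous read covering first..last."""
    start = index[first][0]
    end = index[last][0] + index[last][1]
    return end - start


revs = [0, 1, 2]
first, last = trim_trailing_empty(revs, lambda r: index[r][1])

print(segment_size(revs[0], revs[-1]))  # 5000 bytes without trimming
print(segment_size(first, last))        # 150 bytes with trimming
```

If every revision in `revs` is empty, the loop runs to completion and `lastrev` ends up equal to `revs[0]`, so the range stays valid.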
mercurial/revlog.py
--- a/mercurial/revlog.py	Wed Sep 20 19:17:37 2017 +0200
+++ b/mercurial/revlog.py	Mon Oct 09 15:13:41 2017 +0200
@@ -1327,8 +1327,14 @@
         l = []
         ladd = l.append
 
+        firstrev = revs[0]
+        # Skip trailing revisions with empty diff
+        for lastrev in revs[::-1]:
+            if length(lastrev) != 0:
+                break
+
         try:
-            offset, data = self._getsegmentforrevs(revs[0], revs[-1], df=df)
+            offset, data = self._getsegmentforrevs(firstrev, lastrev, df=df)
         except OverflowError:
             # issue4215 - we can't cache a run of chunks greater than
             # 2G on Windows