diff mercurial/changegroup.py @ 30211:6b0741d6d234

changegroup: skip delta when the underlying revlog do not use them Revlog can now be configured to store full snapshot only. This is used on the changelog. However, the changegroup packing was still recomputing deltas to be sent over the wire. We now just reuse the full snapshot directly in this case, skipping delta computation. This provides use with a large speed up(-30%): # perfchangegroupchangelog on mercurial ! wall 2.010326 comb 2.020000 user 2.000000 sys 0.020000 (best of 5) ! wall 1.382039 comb 1.380000 user 1.370000 sys 0.010000 (best of 8) # perfchangegroupchangelog on pypy ! wall 5.792589 comb 5.780000 user 5.780000 sys 0.000000 (best of 3) ! wall 3.911158 comb 3.920000 user 3.900000 sys 0.020000 (best of 3) # perfchangegroupchangelog on mozilla central ! wall 20.683727 comb 20.680000 user 20.630000 sys 0.050000 (best of 3) ! wall 14.190204 comb 14.190000 user 14.150000 sys 0.040000 (best of 3) Many tests have to be updated because of the change in bundle content. All theses update have been verified. Because diffing changelog was not very valuable, the resulting bundle have similar size (often a bit smaller): # full bundle of mozilla central with delta: 1142740533B without delta: 1142173300B So this is a win all over the board.
author Pierre-Yves David <pierre-yves.david@ens-lyon.org>
date Fri, 14 Oct 2016 01:31:11 +0200
parents edb49a90723c
children 260af19891f2
line wrap: on
line diff
--- a/mercurial/changegroup.py	Fri Oct 14 02:25:08 2016 +0200
+++ b/mercurial/changegroup.py	Fri Oct 14 01:31:11 2016 +0200
@@ -818,18 +818,24 @@
 
     def deltaparent(self, revlog, rev, p1, p2, prev):
         dp = revlog.deltaparent(rev)
-        # Avoid sending full revisions when delta parent is null. Pick
-        # prev in that case. It's tempting to pick p1 in this case, as p1
-        # will be smaller in the common case. However, computing a delta
-        # against p1 may require resolving the raw text of p1, which could
-        # be expensive. The revlog caches should have prev cached, meaning
-        # less CPU for changegroup generation. There is likely room to add
-        # a flag and/or config option to control this behavior.
-        #
-        # Pick prev when we can't be sure remote has the base revision.
-        if dp == nullrev or (dp != p1 and dp != p2 and dp != prev):
+        if dp == nullrev and revlog.storedeltachains:
+            # Avoid sending full revisions when delta parent is null. Pick prev
+            # in that case. It's tempting to pick p1 in this case, as p1 will
+            # be smaller in the common case. However, computing a delta against
+            # p1 may require resolving the raw text of p1, which could be
+            # expensive. The revlog caches should have prev cached, meaning
+            # less CPU for changegroup generation. There is likely room to add
+            # a flag and/or config option to control this behavior.
             return prev
-        return dp
+        elif dp == nullrev:
+            # revlog is configured to use full snapshot for a reason,
+            # stick to full snapshot.
+            return nullrev
+        elif dp not in (p1, p2, prev):
+            # Pick prev when we can't be sure remote has the base revision.
+            return prev
+        else:
+            return dp
 
     def builddeltaheader(self, node, p1n, p2n, basenode, linknode, flags):
         # Do nothing with flags, it is implicitly 0 in cg1 and cg2