changeset 30153:edb49a90723c

changegroup: document deltaparent's choice of previous revision

As part of debugging low-level changegroup generation, I came across what I initially thought was a weird behavior: changegroup v2 is choosing the previous revision in the changegroup as a delta base instead of p1. I was tempted to rewrite this to use p1, as p1 will delta better than prev in the common case. However, I realized that taking p1 as the base would potentially require resolving a revision fulltext and thus require more CPU for e.g. server-side processing of getbundle requests.

This patch tweaks the code comment to note the choice of behavior. It also notes there is room for a flag or config option to tweak this behavior later: using p1 as the delta base would likely make changegroups smaller at the expense of more CPU, which could be beneficial for things like clone bundles.
author Gregory Szorc <gregory.szorc@gmail.com>
date Thu, 13 Oct 2016 12:49:47 +0200
parents d65e246100ed
children 5e72129d75ed
files mercurial/changegroup.py
diffstat 1 files changed, 9 insertions(+), 2 deletions(-)
--- a/mercurial/changegroup.py	Sun Oct 09 03:11:18 2016 +0200
+++ b/mercurial/changegroup.py	Thu Oct 13 12:49:47 2016 +0200
@@ -818,8 +818,15 @@
 
     def deltaparent(self, revlog, rev, p1, p2, prev):
         dp = revlog.deltaparent(rev)
-        # avoid storing full revisions; pick prev in those cases
-        # also pick prev when we can't be sure remote has dp
+        # Avoid sending full revisions when delta parent is null. Pick
+        # prev in that case. It's tempting to pick p1 in this case, as p1
+        # will be smaller in the common case. However, computing a delta
+        # against p1 may require resolving the raw text of p1, which could
+        # be expensive. The revlog caches should have prev cached, meaning
+        # less CPU for changegroup generation. There is likely room to add
+        # a flag and/or config option to control this behavior.
+        #
+        # Pick prev when we can't be sure remote has the base revision.
         if dp == nullrev or (dp != p1 and dp != p2 and dp != prev):
             return prev
         return dp
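
The selection rule in the hunk above can be exercised in isolation. The following is a minimal standalone sketch, not the Mercurial implementation: `choose_delta_base` and `NULLREV` are hypothetical names introduced here for illustration, and the stored delta parent `dp` is passed in directly instead of being looked up through `revlog.deltaparent(rev)`.

```python
# Hypothetical standalone sketch of the rule in the patched deltaparent().
# NULLREV mirrors Mercurial's nullrev, which is -1.
NULLREV = -1


def choose_delta_base(dp, p1, p2, prev):
    """Return the revision to use as the delta base in the changegroup.

    dp   -- delta parent recorded in the revlog for this revision
    p1   -- first parent revision
    p2   -- second parent revision (NULLREV if there is none)
    prev -- revision emitted immediately before this one in the changegroup
    """
    # A null delta parent means the revlog stores a full revision; fall back
    # to prev so that a delta is sent instead of a fulltext.
    if dp == NULLREV:
        return prev
    # The remote is only assumed to have p1, p2, or prev. Any other stored
    # base is replaced by prev.
    if dp not in (p1, p2, prev):
        return prev
    return dp


if __name__ == "__main__":
    # Stored as a full revision: prev is chosen.
    print(choose_delta_base(NULLREV, p1=4, p2=NULLREV, prev=5))
    # Stored delta is against p1: it is reused as-is.
    print(choose_delta_base(4, p1=4, p2=NULLREV, prev=5))
    # Stored delta is against an unrelated revision: prev is chosen.
    print(choose_delta_base(2, p1=4, p2=NULLREV, prev=5))
```

Under these assumptions the stored base is reused only when it is one of p1, p2, or prev. A flag or config option of the kind the commit message mentions would presumably change the fallback from prev to p1, but no such option is part of this patch.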