changegroup: document deltaparent's choice of previous revision

While debugging low-level changegroup generation, I came across what
I initially thought was weird behavior: changegroup v2 chooses the
previous revision in the changegroup as the delta base instead of p1
when the revlog's delta parent is null. I was tempted to rewrite this
to use p1, as a delta against p1 will usually be smaller than one
against prev. However, I realized that taking p1 as the base could
require resolving a revision fulltext and thus more CPU, e.g. for
server-side processing of getbundle requests.

This patch expands the code comment to explain the choice of behavior.
It also notes there is room for a flag or config option to change
this behavior later: using p1 as the delta base would likely make
changegroups smaller at the expense of more CPU, which could be
beneficial for things like clone bundles.
--- a/mercurial/changegroup.py Sun Oct 09 03:11:18 2016 +0200
+++ b/mercurial/changegroup.py Thu Oct 13 12:49:47 2016 +0200
@@ -818,8 +818,15 @@
     def deltaparent(self, revlog, rev, p1, p2, prev):
         dp = revlog.deltaparent(rev)
-        # avoid storing full revisions; pick prev in those cases
-        # also pick prev when we can't be sure remote has dp
+        # Avoid sending full revisions when delta parent is null. Pick
+        # prev in that case. It's tempting to pick p1 in this case, as p1
+        # will be smaller in the common case. However, computing a delta
+        # against p1 may require resolving the raw text of p1, which could
+        # be expensive. The revlog caches should have prev cached, meaning
+        # less CPU for changegroup generation. There is likely room to add
+        # a flag and/or config option to control this behavior.
+        #
+        # Pick prev when we can't be sure remote has the base revision.
         if dp == nullrev or (dp != p1 and dp != p2 and dp != prev):
             return prev
         return dp
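
For readers who want to poke at the selection rule outside of Mercurial, here is a minimal standalone sketch of the same decision. It is an illustration only, not the changegroup code: choose_delta_base and NULLREV are hypothetical names, revisions are plain integers, and stored_dp stands in for the value returned by revlog.deltaparent(rev).

```python
# Illustrative sketch only; not the Mercurial implementation.
# NULLREV is an assumed stand-in for Mercurial's nullrev constant.
NULLREV = -1


def choose_delta_base(stored_dp, p1, p2, prev, nullrev=NULLREV):
    """Return the revision to use as the delta base in the changegroup.

    stored_dp: delta parent recorded in the revlog (hypothetical input).
    p1, p2:    parents of the revision being sent.
    prev:      revision sent immediately before this one.
    """
    # A null delta parent means the revlog stores a full revision.
    # Sending a delta against prev avoids shipping the full text.
    if stored_dp == nullrev:
        return prev
    # The receiver is only assumed to have p1, p2 and prev, so any
    # other stored base is replaced by prev.
    if stored_dp not in (p1, p2, prev):
        return prev
    return stored_dp


if __name__ == "__main__":
    print(choose_delta_base(NULLREV, p1=4, p2=NULLREV, prev=6))
    print(choose_delta_base(4, p1=4, p2=NULLREV, prev=6))
    print(choose_delta_base(2, p1=4, p2=NULLREV, prev=6))
```

A flag or config option like the one the comment hints at would presumably change only the first branch (returning p1 instead of prev); that is speculation about a possible follow-up, not something this patch implements.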