Mon, 21 Jan 2019 22:36:16 +0100 delta: move some delta chain related computation earlier in deltainfo
Valentin Gatien-Baron <vgatien-baron@janestreet.com> [Mon, 21 Jan 2019 22:36:16 +0100] rev 42465
delta: move some delta chain related computation earlier in deltainfo They are some more optimization change that will make use of this in the function. So we retrieve the data earlier.
Thu, 25 Apr 2019 22:50:33 +0200 deltas: skip if projected delta size is bigger than previous snapshot
Valentin Gatien-Baron <vgatien-baron@janestreet.com> [Thu, 25 Apr 2019 22:50:33 +0200] rev 42464
deltas: skip if projected delta size is bigger than previous snapshot Before computing any delta, we get a basic estimation of the delta size we can expect and the resulted compressed value. We then checks this projected size against the `size(snapshotⁿ) > size(snapshotⁿ⁺¹)` constraint. This allows to exclude potential base candidates before doing any expensive computation. This only apply to the intermediate-snapshot case since this constraint only apply to them. For some pathological cases of a private repository this step provide a significant performance boost (timing from `hg perfrevlogwrite`): before: 14.115908 seconds after: 3.145906 seconds
Thu, 25 Apr 2019 22:30:14 +0200 deltas: skip if projected delta size does not match text size constraint
Valentin Gatien-Baron <vgatien-baron@janestreet.com>, Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 25 Apr 2019 22:30:14 +0200] rev 42463
deltas: skip if projected delta size does not match text size constraint Before computing any delta, we get a basic estimation of the delta size we can expect and the resulted compressed value. We then checks this projected size against the ½ⁿ size constraints. This allows to exclude potential base candidates before doing any expensive computation. This only apply to the intermediate-snapshot case since this constraint only apply to them. In practice we only perform this new checks for the manifestlog. Manifest log combine two property: it is likely to have delta chain issue and its diffing/compression is fairly predictable. The initial author of this changeset is Valentin Gatien-Baron providing the initial idea and initial testing, Pierre-Yves David later consolidated the code in the right location and run more extensive testing.
Fri, 26 Apr 2019 00:28:22 +0200 revlog: add the option to track the expected compression upper bound
Pierre-Yves David <pierre-yves.david@octobus.net> [Fri, 26 Apr 2019 00:28:22 +0200] rev 42462
revlog: add the option to track the expected compression upper bound There are various optimization we can do if we can estimate the size of delta before actually spending CPU compressing them. So we add a attributed dedicated to tracking that. We only use it on Manifest because (1) it structure is quite stable across all Mercurial repository so its compression ratio is fairly universal. This is the revlog with most extreme delta (cf the sparse-revlog optimization). This will be put to use in later changesets. Right now the compression upper bound is set to 10. This is a fairly conservative value (observed value is more around 3), but I prefer to be safe while introducing the optimization principles. We can tune the optimization threshold later.
(0) -30000 -10000 -3000 -1000 -300 -100 -30 -10 -4 +4 +10 +30 +100 +300 +1000 +3000 tip