Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 28 Feb 2019 01:49:10 +0100] rev 41886
discovery: explicitly use `undecided` for the children mapping
Recent performance achievements makes the assumption that the `undecided` set
is used for sampling. That assumption is always true in practice. We stop
pretending that taking anything else would make sense and we directly use the
`undecided` set from the object. This provides a more honest API.
Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 28 Feb 2019 01:48:20 +0100] rev 41885
discovery: cache the children mapping used during each discovery
During discovery, the `undecided` set keep shrinking. Therefore, the map
computed for an iteration N will be valid for iteration N+1. Instead of
computing the same data over and over we cache it the first time.
Our private pathological case speed up from about 7.5 seconds to about 6.3
seconds.
(starting from over 70s at the start of the full series)
Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 28 Feb 2019 01:15:45 +0100] rev 41884
discovery: move children computation in its own method
This clarifies the main logic and starts to pave the way to some caching.
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 05 Mar 2019 15:39:54 +0100] rev 41883
discovery: simplify the building of the children mapping
Since we only care about the revisions inside the set we are sampling, we can
use simpler code (and probably sightly faster).
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 05 Mar 2019 15:52:14 +0100] rev 41882
discovery: simply walk the undecided revs when building the children mapping
The sampling only care about revisions in the undecided set, so building children
relationship within this set is sufficient.
The set of undecided changesets can be much smaller than the full span from its
smallest item to the tip of the repository. This restriction can significantly
speed up operations in some cases.
For example, on our private pathological case, this speeds things up from about
53 seconds to about 7.5 seconds.
Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 28 Feb 2019 00:56:27 +0100] rev 41881
discovery: use a lower level but faster way to retrieve parents
We already know that no revision in the undecided set are filtered, so we can
skip multiple checks and directly access lower level data.
In a private pathological case, this improves the timing from about 70 seconds
to about 50 seconds. There are other actions to be taken to improve that case,
however this gives an idea of the general overhead.
Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 28 Feb 2019 00:12:12 +0100] rev 41880
discovery: avoid computing identical sets of heads twice
The very same set of heads is computed in the previous statement, it seems more
efficient to just copy that result.