Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 28 Feb 2019 01:48:20 +0100] rev 41885
discovery: cache the children mapping used during each discovery
During discovery, the `undecided` set keep shrinking. Therefore, the map
computed for an iteration N will be valid for iteration N+1. Instead of
computing the same data over and over we cache it the first time.
Our private pathological case speed up from about 7.5 seconds to about 6.3
seconds.
(starting from over 70s at the start of the full series)
Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 28 Feb 2019 01:15:45 +0100] rev 41884
discovery: move children computation in its own method
This clarifies the main logic and starts to pave the way to some caching.
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 05 Mar 2019 15:39:54 +0100] rev 41883
discovery: simplify the building of the children mapping
Since we only care about the revisions inside the set we are sampling, we can
use simpler code (and probably sightly faster).
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 05 Mar 2019 15:52:14 +0100] rev 41882
discovery: simply walk the undecided revs when building the children mapping
The sampling only care about revisions in the undecided set, so building children
relationship within this set is sufficient.
The set of undecided changesets can be much smaller than the full span from its
smallest item to the tip of the repository. This restriction can significantly
speed up operations in some cases.
For example, on our private pathological case, this speeds things up from about
53 seconds to about 7.5 seconds.
Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 28 Feb 2019 00:56:27 +0100] rev 41881
discovery: use a lower level but faster way to retrieve parents
We already know that no revision in the undecided set are filtered, so we can
skip multiple checks and directly access lower level data.
In a private pathological case, this improves the timing from about 70 seconds
to about 50 seconds. There are other actions to be taken to improve that case,
however this gives an idea of the general overhead.
Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 28 Feb 2019 00:12:12 +0100] rev 41880
discovery: avoid computing identical sets of heads twice
The very same set of heads is computed in the previous statement, it seems more
efficient to just copy that result.
Georges Racinet <georges.racinet@octobus.net> [Wed, 27 Feb 2019 23:55:19 +0100] rev 41879
discovery: moved sampling functions inside discovery object
In this patch, we transform the sampling functions into
methods of the `partialdiscovery` class in the Python case.
This will provide multiple benefit. For example we can keep some cache from one
sampling to another. In addition this will help the Oxidation work as all graph
traversal related logic will be contained in a single object.
Georges Racinet <georges.racinet@octobus.net> [Wed, 27 Feb 2019 23:45:06 +0100] rev 41878
discovery: rename `srvheads` to `knownsrvheads`
The `srvheads` variable only contains the known set of remove heads. Renaming
the variable make it clearer.