dirstate-tree: Fold "tracked descendants" counter update in main walk
To implement `has_tracked_dir` (which means "has tracked descendants")
without an expensive sub-tree traversal, we maintain a counter
of tracked descendants on each "directory" node of the tree-shaped dirstate.
Before this changeset, mutating or inserting a node at a given path would
involve:
* Walking the tree from root through ancestors to find the node or the spot
  where to insert it
* Looking at the previous node if any to decide what counter update is needed
* Performing any node mutation
* Walking the tree *again* to update counters in ancestor nodes
When profiling `hg status` on a large repo, this second walk takes time
while loading the dirstate from disk.
It turns out we have enough information to decide before the first tree walk
what counter update is needed. This changeset merges the two walks, saving
~10% of the total time for `hg update` (in the same hyperfine benchmark as
the previous changeset).
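
The merged walk can be illustrated with a minimal Python sketch. This is not
the actual Rust implementation: the `Node` class, the `set_tracked` helper and
its `was_tracked` parameter are hypothetical names chosen for illustration,
and the sketch assumes the caller already knows the previous tracked state of
the path before the walk starts.

```python
class Node:
    """Hypothetical node of a tree-shaped dirstate (illustration only)."""

    def __init__(self):
        self.children = {}  # path component -> Node
        self.tracked = False  # whether this node itself is a tracked file
        self.tracked_descendants = 0  # tracked nodes strictly below this one


def set_tracked(root, path, was_tracked, tracked):
    """Insert or mutate the node at `path` in a single walk from the root.

    `was_tracked` is assumed to be known before the walk, so the counter
    update (+1, 0 or -1) is decided up front and applied to each ancestor
    while descending, instead of in a second walk.
    """
    delta = int(tracked) - int(was_tracked)
    node = root
    for component in path.split("/"):
        # Every node we descend *from* is an ancestor of the target node.
        node.tracked_descendants += delta
        node = node.children.setdefault(component, Node())
    node.tracked = tracked
    return node


def has_tracked_dir(root, path):
    """Answer "has tracked descendants" from the counter, without a
    sub-tree traversal."""
    node = root
    for component in path.split("/"):
        node = node.children.get(component)
        if node is None:
            return False
    return node.tracked_descendants > 0
```

For example, after `set_tracked(root, "a/b/c.txt", False, True)` the counters
of the root, `a` and `a/b` have each been incremented during the one descent,
so `has_tracked_dir(root, "a/b")` is true without visiting `c.txt`.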
---
Profiling was done by compiling with this `.cargo/config`:

    [profile.release]
    debug = true

then running with:

    py-spy record -r 500 -n -o /tmp/hg.json --format speedscope -- \
      ./hg status -R $REPO --config experimental.dirstate-tree.in-memory=1

then visualizing the recorded JSON file in https://www.speedscope.app/

Differential Revision: https://phab.mercurial-scm.org/D10554
from __future__ import absolute_import
import unittest

try:
    from mercurial import rustext

    rustext.__name__  # trigger immediate actual import
except ImportError:
    rustext = None
else:
    from mercurial.rustext import revlog

    # this would fail already without appropriate ancestor.__package__
    from mercurial.rustext.ancestor import LazyAncestors

from mercurial.testing import revlog as revlogtesting


@unittest.skipIf(
    rustext is None,
    "rustext module revlog relies on is not available",
)
class RustRevlogIndexTest(revlogtesting.RevlogBasedTestBase):
    def test_heads(self):
        idx = self.parseindex()
        rustidx = revlog.MixedIndex(idx)
        self.assertEqual(rustidx.headrevs(), idx.headrevs())

    def test_get_cindex(self):
        # drop me once we no longer need the method for shortest node
        idx = self.parseindex()
        rustidx = revlog.MixedIndex(idx)
        cidx = rustidx.get_cindex()
        self.assertTrue(idx is cidx)

    def test_len(self):
        idx = self.parseindex()
        rustidx = revlog.MixedIndex(idx)
        self.assertEqual(len(rustidx), len(idx))

    def test_ancestors(self):
        idx = self.parseindex()
        rustidx = revlog.MixedIndex(idx)
        lazy = LazyAncestors(rustidx, [3], 0, True)
        # we have two more references to the index:
        # - in its inner iterator for __contains__ and __bool__
        # - in the LazyAncestors instance itself (to spawn new iterators)
        self.assertTrue(2 in lazy)
        self.assertTrue(bool(lazy))
        self.assertEqual(list(lazy), [3, 2, 1, 0])
        # a second time to validate that we spawn new iterators
        self.assertEqual(list(lazy), [3, 2, 1, 0])

        # let's check bool for an empty one
        self.assertFalse(LazyAncestors(idx, [0], 0, False))


if __name__ == '__main__':
    import silenttestrunner

    silenttestrunner.main(__name__)