annotate contrib/undumprevlog @ 47120:7109a38830c9

dirstate-tree: Fold "tracked descendants" counter update in main walk For the purpose of implementing `has_tracked_dir` (which means "has tracked descendants) without an expensive sub-tree traversal, we maintaing a counter of tracked descendants on each "directory" node of the tree-shaped dirstate. Before this changeset, mutating or inserting a node at a given path would involve: * Walking the tree from root through ancestors to find the node or the spot where to insert it * Looking at the previous node if any to decide what counter update is needed * Performing any node mutation * Walking the tree *again* to update counters in ancestor nodes When profiling `hg status` on a large repo, this second walk takes times while loading a the dirstate from disk. It turns out we have enough information to decide before he first tree walk what counter update is needed. This changeset merges the two walks, gaining ~10% of the total time for `hg update` (in the same hyperfine benchmark as the previous changeset). --- Profiling was done by compiling with this `.cargo/config`: [profile.release] debug = true then running with: py-spy record -r 500 -n -o /tmp/hg.json --format speedscope -- \ ./hg status -R $REPO --config experimental.dirstate-tree.in-memory=1 then visualizing the recorded JSON file in https://www.speedscope.app/ Differential Revision: https://phab.mercurial-scm.org/D10554
author Simon Sapin <simon.sapin@octobus.net>
date Fri, 30 Apr 2021 14:22:14 +0200
parents 4c041c71ec01
children 8d3c2f9d4af7
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
rev   line source
45830
c102b704edb5 global: use python3 in shebangs
Gregory Szorc <gregory.szorc@gmail.com>
parents: 45055
diff changeset
1 #!/usr/bin/env python3
6433
ec5d77eb3431 add simple dump and undump scripts to contrib/
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
2 # Undump a dump from dumprevlog
ec5d77eb3431 add simple dump and undump scripts to contrib/
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
3 # $ hg init
ec5d77eb3431 add simple dump and undump scripts to contrib/
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
4 # $ undumprevlog < repo.dump
ec5d77eb3431 add simple dump and undump scripts to contrib/
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
5
33872
5d9890d8ca77 undumprevlog: update to valid Python 3 syntax
Augie Fackler <raf@durin42.com>
parents: 31248
diff changeset
6 from __future__ import absolute_import, print_function
29167
4f76c0c490b3 py3: make contrib/undumprevlog use absolute_import
Pulkit Goyal <7895pulkit@gmail.com>
parents: 23310
diff changeset
7
6433
ec5d77eb3431 add simple dump and undump scripts to contrib/
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
8 import sys
46113
59fa3890d40a node: import symbols explicitly
Joerg Sonnenberger <joerg@bec.de>
parents: 45830
diff changeset
9 from mercurial.node import bin
29167
4f76c0c490b3 py3: make contrib/undumprevlog use absolute_import
Pulkit Goyal <7895pulkit@gmail.com>
parents: 23310
diff changeset
10 from mercurial import (
39947
a063b84ce064 py3: byteify contrib/dumprevlog
Matt Harbison <matt_harbison@yahoo.com>
parents: 37120
diff changeset
11 encoding,
29167
4f76c0c490b3 py3: make contrib/undumprevlog use absolute_import
Pulkit Goyal <7895pulkit@gmail.com>
parents: 23310
diff changeset
12 revlog,
4f76c0c490b3 py3: make contrib/undumprevlog use absolute_import
Pulkit Goyal <7895pulkit@gmail.com>
parents: 23310
diff changeset
13 transaction,
31248
8d3e8c8c9049 vfs: use 'vfs' module directly in 'contrib/undumprevlog'
Pierre-Yves David <pierre-yves.david@ens-lyon.org>
parents: 31216
diff changeset
14 vfs as vfsmod,
29167
4f76c0c490b3 py3: make contrib/undumprevlog use absolute_import
Pulkit Goyal <7895pulkit@gmail.com>
parents: 23310
diff changeset
15 )
43659
99e231afc29c black: blacken scripts
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39947
diff changeset
16 from mercurial.utils import procutil
6433
ec5d77eb3431 add simple dump and undump scripts to contrib/
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
17
47072
4c041c71ec01 revlog: introduce an explicit tracking of what the revlog is about
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 46113
diff changeset
18 from mercurial.revlogutils import (
4c041c71ec01 revlog: introduce an explicit tracking of what the revlog is about
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 46113
diff changeset
19 constants as revlog_constants,
4c041c71ec01 revlog: introduce an explicit tracking of what the revlog is about
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 46113
diff changeset
20 )
4c041c71ec01 revlog: introduce an explicit tracking of what the revlog is about
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 46113
diff changeset
21
6466
9c426da6b03b contrib: fix binary file issues with dumprevlog on Windows
Adrian Buehlmann <adrian@cadifra.com>
parents: 6433
diff changeset
22 for fp in (sys.stdin, sys.stdout, sys.stderr):
37120
a8a902d7176e procutil: bulk-replace function calls to point to new module
Yuya Nishihara <yuya@tcha.org>
parents: 33872
diff changeset
23 procutil.setbinary(fp)
6466
9c426da6b03b contrib: fix binary file issues with dumprevlog on Windows
Adrian Buehlmann <adrian@cadifra.com>
parents: 6433
diff changeset
24
39947
a063b84ce064 py3: byteify contrib/dumprevlog
Matt Harbison <matt_harbison@yahoo.com>
parents: 37120
diff changeset
25 opener = vfsmod.vfs(b'.', False)
43659
99e231afc29c black: blacken scripts
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39947
diff changeset
26 tr = transaction.transaction(
99e231afc29c black: blacken scripts
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39947
diff changeset
27 sys.stderr.write, opener, {b'store': opener}, b"undump.journal"
99e231afc29c black: blacken scripts
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39947
diff changeset
28 )
19022
cba222f01056 tests: run check-code on Python files without .py extension
Mads Kiilerich <madski@unity3d.com>
parents: 14233
diff changeset
29 while True:
6433
ec5d77eb3431 add simple dump and undump scripts to contrib/
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
30 l = sys.stdin.readline()
ec5d77eb3431 add simple dump and undump scripts to contrib/
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
31 if not l:
ec5d77eb3431 add simple dump and undump scripts to contrib/
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
32 break
ec5d77eb3431 add simple dump and undump scripts to contrib/
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
33 if l.startswith("file:"):
39947
a063b84ce064 py3: byteify contrib/dumprevlog
Matt Harbison <matt_harbison@yahoo.com>
parents: 37120
diff changeset
34 f = encoding.strtolocal(l[6:-1])
47072
4c041c71ec01 revlog: introduce an explicit tracking of what the revlog is about
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 46113
diff changeset
35 r = revlog.revlog(
4c041c71ec01 revlog: introduce an explicit tracking of what the revlog is about
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 46113
diff changeset
36 opener,
4c041c71ec01 revlog: introduce an explicit tracking of what the revlog is about
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 46113
diff changeset
37 target=(revlog_constants.KIND_OTHER, b'undump-revlog'),
4c041c71ec01 revlog: introduce an explicit tracking of what the revlog is about
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 46113
diff changeset
38 indexfile=f,
4c041c71ec01 revlog: introduce an explicit tracking of what the revlog is about
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 46113
diff changeset
39 )
45055
4c1b4805db57 pycompat: change users of pycompat.{stdin,stdout,stderr} to use procutil.std*
Manuel Jacob <me@manueljacob.de>
parents: 43659
diff changeset
40 procutil.stdout.write(b'%s\n' % f)
6433
ec5d77eb3431 add simple dump and undump scripts to contrib/
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
41 elif l.startswith("node:"):
46113
59fa3890d40a node: import symbols explicitly
Joerg Sonnenberger <joerg@bec.de>
parents: 45830
diff changeset
42 n = bin(l[6:-1])
6433
ec5d77eb3431 add simple dump and undump scripts to contrib/
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
43 elif l.startswith("linkrev:"):
ec5d77eb3431 add simple dump and undump scripts to contrib/
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
44 lr = int(l[9:-1])
ec5d77eb3431 add simple dump and undump scripts to contrib/
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
45 elif l.startswith("parents:"):
ec5d77eb3431 add simple dump and undump scripts to contrib/
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
46 p = l[9:-1].split()
46113
59fa3890d40a node: import symbols explicitly
Joerg Sonnenberger <joerg@bec.de>
parents: 45830
diff changeset
47 p1 = bin(p[0])
59fa3890d40a node: import symbols explicitly
Joerg Sonnenberger <joerg@bec.de>
parents: 45830
diff changeset
48 p2 = bin(p[1])
6433
ec5d77eb3431 add simple dump and undump scripts to contrib/
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
49 elif l.startswith("length:"):
ec5d77eb3431 add simple dump and undump scripts to contrib/
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
50 length = int(l[8:-1])
43659
99e231afc29c black: blacken scripts
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39947
diff changeset
51 sys.stdin.readline() # start marker
39947
a063b84ce064 py3: byteify contrib/dumprevlog
Matt Harbison <matt_harbison@yahoo.com>
parents: 37120
diff changeset
52 d = encoding.strtolocal(sys.stdin.read(length))
43659
99e231afc29c black: blacken scripts
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39947
diff changeset
53 sys.stdin.readline() # end marker
6433
ec5d77eb3431 add simple dump and undump scripts to contrib/
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
54 r.addrevision(d, tr, lr, p1, p2)
ec5d77eb3431 add simple dump and undump scripts to contrib/
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
55
ec5d77eb3431 add simple dump and undump scripts to contrib/
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
56 tr.close()