annotate contrib/undumprevlog @ 40056:324b4b10351e

revlog: rewrite censoring logic I was able to corrupt a revlog relatively easily with the existing censoring code. The underlying problem is that the existing code doesn't fully take delta chains into account. When copying revisions that occur after the censored revision, the delta base can refer to a censored revision. Then at read time, things blow up due to the revision data not being a compressed delta. This commit rewrites the revlog censoring code to take a higher-level approach. We now create a new revlog instance pointing at temp files. We iterate through each revision in the source revlog and insert those revisions into the new revlog, replacing the censored revision's data along the way. The new implementation isn't as efficient as the old one. This is because it will fully engage delta computation on insertion. But I don't think it matters. The new implementation is a bit hacky because it attempts to reload the revlog instance with a new revlog index/data file. This is fragile. But this is needed because the index (which could be backed by C) would have a cached copy of the old, possibly changed data and that could lead to problems accessing index or revision data later. One benefit of the new approach is that we integrate with the transaction. The old revlog is backed up and if the transaction is rolled back, the original revlog is restored. As part of this, we had to teach the transaction about the store vfs. I'm not super keen about this. But this was the easiest way to hook things up to the transaction. We /could/ just ignore the transaction like we were doing before. But any file mutation should be governed by transaction semantics, including undo during rollback. Differential Revision: https://phab.mercurial-scm.org/D4869
author Gregory Szorc <gregory.szorc@gmail.com>
date Tue, 02 Oct 2018 17:34:34 -0700
parents a063b84ce064
children 99e231afc29c
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
rev   line source
6433
ec5d77eb3431 add simple dump and undump scripts to contrib/
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
1 #!/usr/bin/env python
ec5d77eb3431 add simple dump and undump scripts to contrib/
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
2 # Undump a dump from dumprevlog
ec5d77eb3431 add simple dump and undump scripts to contrib/
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
3 # $ hg init
ec5d77eb3431 add simple dump and undump scripts to contrib/
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
4 # $ undumprevlog < repo.dump
ec5d77eb3431 add simple dump and undump scripts to contrib/
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
5
33872
5d9890d8ca77 undumprevlog: update to valid Python 3 syntax
Augie Fackler <raf@durin42.com>
parents: 31248
diff changeset
6 from __future__ import absolute_import, print_function
29167
4f76c0c490b3 py3: make contrib/undumprevlog use absolute_import
Pulkit Goyal <7895pulkit@gmail.com>
parents: 23310
diff changeset
7
6433
ec5d77eb3431 add simple dump and undump scripts to contrib/
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
8 import sys
29167
4f76c0c490b3 py3: make contrib/undumprevlog use absolute_import
Pulkit Goyal <7895pulkit@gmail.com>
parents: 23310
diff changeset
9 from mercurial import (
39947
a063b84ce064 py3: byteify contrib/dumprevlog
Matt Harbison <matt_harbison@yahoo.com>
parents: 37120
diff changeset
10 encoding,
29167
4f76c0c490b3 py3: make contrib/undumprevlog use absolute_import
Pulkit Goyal <7895pulkit@gmail.com>
parents: 23310
diff changeset
11 node,
39947
a063b84ce064 py3: byteify contrib/dumprevlog
Matt Harbison <matt_harbison@yahoo.com>
parents: 37120
diff changeset
12 pycompat,
29167
4f76c0c490b3 py3: make contrib/undumprevlog use absolute_import
Pulkit Goyal <7895pulkit@gmail.com>
parents: 23310
diff changeset
13 revlog,
4f76c0c490b3 py3: make contrib/undumprevlog use absolute_import
Pulkit Goyal <7895pulkit@gmail.com>
parents: 23310
diff changeset
14 transaction,
31248
8d3e8c8c9049 vfs: use 'vfs' module directly in 'contrib/undumprevlog'
Pierre-Yves David <pierre-yves.david@ens-lyon.org>
parents: 31216
diff changeset
15 vfs as vfsmod,
29167
4f76c0c490b3 py3: make contrib/undumprevlog use absolute_import
Pulkit Goyal <7895pulkit@gmail.com>
parents: 23310
diff changeset
16 )
37120
a8a902d7176e procutil: bulk-replace function calls to point to new module
Yuya Nishihara <yuya@tcha.org>
parents: 33872
diff changeset
17 from mercurial.utils import (
a8a902d7176e procutil: bulk-replace function calls to point to new module
Yuya Nishihara <yuya@tcha.org>
parents: 33872
diff changeset
18 procutil,
a8a902d7176e procutil: bulk-replace function calls to point to new module
Yuya Nishihara <yuya@tcha.org>
parents: 33872
diff changeset
19 )
6433
ec5d77eb3431 add simple dump and undump scripts to contrib/
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
20
6466
9c426da6b03b contrib: fix binary file issues with dumprevlog on Windows
Adrian Buehlmann <adrian@cadifra.com>
parents: 6433
diff changeset
21 for fp in (sys.stdin, sys.stdout, sys.stderr):
37120
a8a902d7176e procutil: bulk-replace function calls to point to new module
Yuya Nishihara <yuya@tcha.org>
parents: 33872
diff changeset
22 procutil.setbinary(fp)
6466
9c426da6b03b contrib: fix binary file issues with dumprevlog on Windows
Adrian Buehlmann <adrian@cadifra.com>
parents: 6433
diff changeset
23
39947
a063b84ce064 py3: byteify contrib/dumprevlog
Matt Harbison <matt_harbison@yahoo.com>
parents: 37120
diff changeset
24 opener = vfsmod.vfs(b'.', False)
a063b84ce064 py3: byteify contrib/dumprevlog
Matt Harbison <matt_harbison@yahoo.com>
parents: 37120
diff changeset
25 tr = transaction.transaction(sys.stderr.write, opener, {b'store': opener},
a063b84ce064 py3: byteify contrib/dumprevlog
Matt Harbison <matt_harbison@yahoo.com>
parents: 37120
diff changeset
26 b"undump.journal")
19022
cba222f01056 tests: run check-code on Python files without .py extension
Mads Kiilerich <madski@unity3d.com>
parents: 14233
diff changeset
27 while True:
6433
ec5d77eb3431 add simple dump and undump scripts to contrib/
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
28 l = sys.stdin.readline()
ec5d77eb3431 add simple dump and undump scripts to contrib/
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
29 if not l:
ec5d77eb3431 add simple dump and undump scripts to contrib/
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
30 break
ec5d77eb3431 add simple dump and undump scripts to contrib/
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
31 if l.startswith("file:"):
39947
a063b84ce064 py3: byteify contrib/dumprevlog
Matt Harbison <matt_harbison@yahoo.com>
parents: 37120
diff changeset
32 f = encoding.strtolocal(l[6:-1])
6433
ec5d77eb3431 add simple dump and undump scripts to contrib/
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
33 r = revlog.revlog(opener, f)
39947
a063b84ce064 py3: byteify contrib/dumprevlog
Matt Harbison <matt_harbison@yahoo.com>
parents: 37120
diff changeset
34 pycompat.stdout.write(b'%s\n' % f)
6433
ec5d77eb3431 add simple dump and undump scripts to contrib/
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
35 elif l.startswith("node:"):
ec5d77eb3431 add simple dump and undump scripts to contrib/
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
36 n = node.bin(l[6:-1])
ec5d77eb3431 add simple dump and undump scripts to contrib/
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
37 elif l.startswith("linkrev:"):
ec5d77eb3431 add simple dump and undump scripts to contrib/
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
38 lr = int(l[9:-1])
ec5d77eb3431 add simple dump and undump scripts to contrib/
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
39 elif l.startswith("parents:"):
ec5d77eb3431 add simple dump and undump scripts to contrib/
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
40 p = l[9:-1].split()
ec5d77eb3431 add simple dump and undump scripts to contrib/
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
41 p1 = node.bin(p[0])
ec5d77eb3431 add simple dump and undump scripts to contrib/
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
42 p2 = node.bin(p[1])
ec5d77eb3431 add simple dump and undump scripts to contrib/
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
43 elif l.startswith("length:"):
ec5d77eb3431 add simple dump and undump scripts to contrib/
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
44 length = int(l[8:-1])
ec5d77eb3431 add simple dump and undump scripts to contrib/
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
45 sys.stdin.readline() # start marker
39947
a063b84ce064 py3: byteify contrib/dumprevlog
Matt Harbison <matt_harbison@yahoo.com>
parents: 37120
diff changeset
46 d = encoding.strtolocal(sys.stdin.read(length))
6433
ec5d77eb3431 add simple dump and undump scripts to contrib/
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
47 sys.stdin.readline() # end marker
ec5d77eb3431 add simple dump and undump scripts to contrib/
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
48 r.addrevision(d, tr, lr, p1, p2)
ec5d77eb3431 add simple dump and undump scripts to contrib/
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
49
ec5d77eb3431 add simple dump and undump scripts to contrib/
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
50 tr.close()