annotate mercurial/revlogutils/sidedata.py @ 46709:3d740058b467

sidedata: move to new sidedata storage in revlogv2 The current (experimental) sidedata system uses flagprocessors to signify the presence and store/retrieve sidedata from the raw revlog data. This proved to be quite fragile from an exchange perspective and a lot more complex than simply having a dedicated space in the new revlog format. This change does not handle exchange (ironically), so the test for amend - that uses a bundle - is broken. This functionality is split into the next patches. Differential Revision: https://phab.mercurial-scm.org/D9993
author Raphaël Gomès <rgomes@octobus.net>
date Mon, 18 Jan 2021 11:44:51 +0100
parents d6a9e690d620
children 3aab2330b7d3
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
rev   line source
43033
21025a4107d4 sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff changeset
1 # sidedata.py - Logic around store extra data alongside revlog revisions
21025a4107d4 sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff changeset
2 #
21025a4107d4 sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff changeset
3 # Copyright 2019 Pierre-Yves David <pierre-yves.david@octobus.net)
21025a4107d4 sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff changeset
4 #
21025a4107d4 sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff changeset
5 # This software may be used and distributed according to the terms of the
21025a4107d4 sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff changeset
6 # GNU General Public License version 2 or any later version.
21025a4107d4 sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff changeset
7 """core code for "sidedata" support
21025a4107d4 sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff changeset
8
21025a4107d4 sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff changeset
9 The "sidedata" are stored alongside the revision without actually being part of
21025a4107d4 sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff changeset
10 its content and not affecting its hash. It's main use cases is to cache
21025a4107d4 sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff changeset
11 important information related to a changesets.
21025a4107d4 sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff changeset
12
21025a4107d4 sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff changeset
13 The current implementation is experimental and subject to changes. Do not rely
21025a4107d4 sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff changeset
14 on it in production.
21025a4107d4 sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff changeset
15
46709
3d740058b467 sidedata: move to new sidedata storage in revlogv2
Raphaël Gomès <rgomes@octobus.net>
parents: 46195
diff changeset
16 Sidedata are stored in the revlog itself, thanks to a new version of the
3d740058b467 sidedata: move to new sidedata storage in revlogv2
Raphaël Gomès <rgomes@octobus.net>
parents: 46195
diff changeset
17 revlog. The following format is currently used::
43033
21025a4107d4 sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff changeset
18
21025a4107d4 sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff changeset
19 initial header:
21025a4107d4 sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff changeset
20 <number of sidedata; 2 bytes>
21025a4107d4 sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff changeset
21 sidedata (repeated N times):
21025a4107d4 sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff changeset
22 <sidedata-key; 2 bytes>
21025a4107d4 sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff changeset
23 <sidedata-entry-length: 4 bytes>
21025a4107d4 sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff changeset
24 <sidedata-content-sha1-digest: 20 bytes>
21025a4107d4 sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff changeset
25 <sidedata-content; X bytes>
21025a4107d4 sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff changeset
26 normal raw text:
21025a4107d4 sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff changeset
27 <all bytes remaining in the rawtext>
21025a4107d4 sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff changeset
28
46195
d6a9e690d620 comments: fix typos
Joerg Sonnenberger <joerg@bec.de>
parents: 45634
diff changeset
29 This is a simple and effective format. It should be enough to experiment with
43033
21025a4107d4 sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff changeset
30 the concept.
21025a4107d4 sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff changeset
31 """
21025a4107d4 sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff changeset
32
21025a4107d4 sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff changeset
33 from __future__ import absolute_import
43034
294afb982a88 sidedata: add a function to read sidedata from revlog raw text
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43033
diff changeset
34
294afb982a88 sidedata: add a function to read sidedata from revlog raw text
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43033
diff changeset
35 import struct
294afb982a88 sidedata: add a function to read sidedata from revlog raw text
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43033
diff changeset
36
294afb982a88 sidedata: add a function to read sidedata from revlog raw text
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43033
diff changeset
37 from .. import error
44060
a61287a95dc3 core: migrate uses of hashlib.sha1 to hashutil.sha1
Augie Fackler <augie@google.com>
parents: 43506
diff changeset
38 from ..utils import hashutil
43034
294afb982a88 sidedata: add a function to read sidedata from revlog raw text
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43033
diff changeset
39
43040
ba4072c0a911 sidedata: test we can successfully write sidedata
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43037
diff changeset
40 ## sidedata type constant
ba4072c0a911 sidedata: test we can successfully write sidedata
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43037
diff changeset
41 # reserve a block for testing purposes.
ba4072c0a911 sidedata: test we can successfully write sidedata
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43037
diff changeset
42 SD_TEST1 = 1
ba4072c0a911 sidedata: test we can successfully write sidedata
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43037
diff changeset
43 SD_TEST2 = 2
ba4072c0a911 sidedata: test we can successfully write sidedata
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43037
diff changeset
44 SD_TEST3 = 3
ba4072c0a911 sidedata: test we can successfully write sidedata
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43037
diff changeset
45 SD_TEST4 = 4
ba4072c0a911 sidedata: test we can successfully write sidedata
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43037
diff changeset
46 SD_TEST5 = 5
ba4072c0a911 sidedata: test we can successfully write sidedata
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43037
diff changeset
47 SD_TEST6 = 6
ba4072c0a911 sidedata: test we can successfully write sidedata
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43037
diff changeset
48 SD_TEST7 = 7
ba4072c0a911 sidedata: test we can successfully write sidedata
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43037
diff changeset
49
43142
beed7ce61681 sidedatacopies: write copies information in sidedata when applicable
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43077
diff changeset
50 # key to store copies related information
beed7ce61681 sidedatacopies: write copies information in sidedata when applicable
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43077
diff changeset
51 SD_P1COPIES = 8
beed7ce61681 sidedatacopies: write copies information in sidedata when applicable
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43077
diff changeset
52 SD_P2COPIES = 9
beed7ce61681 sidedatacopies: write copies information in sidedata when applicable
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43077
diff changeset
53 SD_FILESADDED = 10
beed7ce61681 sidedatacopies: write copies information in sidedata when applicable
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43077
diff changeset
54 SD_FILESREMOVED = 11
45634
9a6b409b8ebc changing-files: rework the way we store changed files in side-data
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 44060
diff changeset
55 SD_FILES = 12
43142
beed7ce61681 sidedatacopies: write copies information in sidedata when applicable
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43077
diff changeset
56
43040
ba4072c0a911 sidedata: test we can successfully write sidedata
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43037
diff changeset
57 # internal format constant
43506
9f70512ae2cf cleanup: remove pointless r-prefixes on single-quoted strings
Augie Fackler <augie@google.com>
parents: 43142
diff changeset
58 SIDEDATA_HEADER = struct.Struct('>H')
9f70512ae2cf cleanup: remove pointless r-prefixes on single-quoted strings
Augie Fackler <augie@google.com>
parents: 43142
diff changeset
59 SIDEDATA_ENTRY = struct.Struct('>HL20s')
43034
294afb982a88 sidedata: add a function to read sidedata from revlog raw text
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43033
diff changeset
60
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 43048
diff changeset
61
46709
3d740058b467 sidedata: move to new sidedata storage in revlogv2
Raphaël Gomès <rgomes@octobus.net>
parents: 46195
diff changeset
62 def serialize_sidedata(sidedata):
43035
ea83abf95630 sidedata: add a function to write sidedata into a raw text
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43034
diff changeset
63 sidedata = list(sidedata.items())
ea83abf95630 sidedata: add a function to write sidedata into a raw text
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43034
diff changeset
64 sidedata.sort()
46709
3d740058b467 sidedata: move to new sidedata storage in revlogv2
Raphaël Gomès <rgomes@octobus.net>
parents: 46195
diff changeset
65 buf = [SIDEDATA_HEADER.pack(len(sidedata))]
43035
ea83abf95630 sidedata: add a function to write sidedata into a raw text
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43034
diff changeset
66 for key, value in sidedata:
44060
a61287a95dc3 core: migrate uses of hashlib.sha1 to hashutil.sha1
Augie Fackler <augie@google.com>
parents: 43506
diff changeset
67 digest = hashutil.sha1(value).digest()
46709
3d740058b467 sidedata: move to new sidedata storage in revlogv2
Raphaël Gomès <rgomes@octobus.net>
parents: 46195
diff changeset
68 buf.append(SIDEDATA_ENTRY.pack(key, len(value), digest))
43035
ea83abf95630 sidedata: add a function to write sidedata into a raw text
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43034
diff changeset
69 for key, value in sidedata:
46709
3d740058b467 sidedata: move to new sidedata storage in revlogv2
Raphaël Gomès <rgomes@octobus.net>
parents: 46195
diff changeset
70 buf.append(value)
3d740058b467 sidedata: move to new sidedata storage in revlogv2
Raphaël Gomès <rgomes@octobus.net>
parents: 46195
diff changeset
71 buf = b''.join(buf)
3d740058b467 sidedata: move to new sidedata storage in revlogv2
Raphaël Gomès <rgomes@octobus.net>
parents: 46195
diff changeset
72 return buf
43035
ea83abf95630 sidedata: add a function to write sidedata into a raw text
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43034
diff changeset
73
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 43048
diff changeset
74
46709
3d740058b467 sidedata: move to new sidedata storage in revlogv2
Raphaël Gomès <rgomes@octobus.net>
parents: 46195
diff changeset
75 def deserialize_sidedata(blob):
43034
294afb982a88 sidedata: add a function to read sidedata from revlog raw text
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43033
diff changeset
76 sidedata = {}
294afb982a88 sidedata: add a function to read sidedata from revlog raw text
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43033
diff changeset
77 offset = 0
46709
3d740058b467 sidedata: move to new sidedata storage in revlogv2
Raphaël Gomès <rgomes@octobus.net>
parents: 46195
diff changeset
78 (nbentry,) = SIDEDATA_HEADER.unpack(blob[: SIDEDATA_HEADER.size])
43034
294afb982a88 sidedata: add a function to read sidedata from revlog raw text
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43033
diff changeset
79 offset += SIDEDATA_HEADER.size
294afb982a88 sidedata: add a function to read sidedata from revlog raw text
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43033
diff changeset
80 dataoffset = SIDEDATA_HEADER.size + (SIDEDATA_ENTRY.size * nbentry)
294afb982a88 sidedata: add a function to read sidedata from revlog raw text
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43033
diff changeset
81 for i in range(nbentry):
294afb982a88 sidedata: add a function to read sidedata from revlog raw text
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43033
diff changeset
82 nextoffset = offset + SIDEDATA_ENTRY.size
46709
3d740058b467 sidedata: move to new sidedata storage in revlogv2
Raphaël Gomès <rgomes@octobus.net>
parents: 46195
diff changeset
83 key, size, storeddigest = SIDEDATA_ENTRY.unpack(blob[offset:nextoffset])
43034
294afb982a88 sidedata: add a function to read sidedata from revlog raw text
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43033
diff changeset
84 offset = nextoffset
294afb982a88 sidedata: add a function to read sidedata from revlog raw text
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43033
diff changeset
85 # read the data associated with that entry
294afb982a88 sidedata: add a function to read sidedata from revlog raw text
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43033
diff changeset
86 nextdataoffset = dataoffset + size
46709
3d740058b467 sidedata: move to new sidedata storage in revlogv2
Raphaël Gomès <rgomes@octobus.net>
parents: 46195
diff changeset
87 entrytext = bytes(blob[dataoffset:nextdataoffset])
44060
a61287a95dc3 core: migrate uses of hashlib.sha1 to hashutil.sha1
Augie Fackler <augie@google.com>
parents: 43506
diff changeset
88 readdigest = hashutil.sha1(entrytext).digest()
43034
294afb982a88 sidedata: add a function to read sidedata from revlog raw text
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43033
diff changeset
89 if storeddigest != readdigest:
294afb982a88 sidedata: add a function to read sidedata from revlog raw text
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43033
diff changeset
90 raise error.SidedataHashError(key, storeddigest, readdigest)
294afb982a88 sidedata: add a function to read sidedata from revlog raw text
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43033
diff changeset
91 sidedata[key] = entrytext
294afb982a88 sidedata: add a function to read sidedata from revlog raw text
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 43033
diff changeset
92 dataoffset = nextdataoffset
46709
3d740058b467 sidedata: move to new sidedata storage in revlogv2
Raphaël Gomès <rgomes@octobus.net>
parents: 46195
diff changeset
93 return sidedata