Mercurial > hg
annotate mercurial/revlogutils/sidedata.py @ 44339:c7eebdb15139
nodemap: never read more than the expected data amount
Since we are tracking this number we can use it to detect corrupted rawdata file
and to only read the correct amount of data when possible.
Differential Revision: https://phab.mercurial-scm.org/D7892
author | Pierre-Yves David <pierre-yves.david@octobus.net> |
---|---|
date | Wed, 15 Jan 2020 15:50:52 +0100 |
parents | a61287a95dc3 |
children | 9a6b409b8ebc |
rev | line source |
---|---|
43033
21025a4107d4
sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff
changeset
|
1 # sidedata.py - Logic around store extra data alongside revlog revisions |
21025a4107d4
sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff
changeset
|
2 # |
21025a4107d4
sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff
changeset
|
3 # Copyright 2019 Pierre-Yves David <pierre-yves.david@octobus.net) |
21025a4107d4
sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff
changeset
|
4 # |
21025a4107d4
sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff
changeset
|
5 # This software may be used and distributed according to the terms of the |
21025a4107d4
sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff
changeset
|
6 # GNU General Public License version 2 or any later version. |
21025a4107d4
sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff
changeset
|
7 """core code for "sidedata" support |
21025a4107d4
sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff
changeset
|
8 |
21025a4107d4
sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff
changeset
|
9 The "sidedata" are stored alongside the revision without actually being part of |
21025a4107d4
sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff
changeset
|
10 its content and not affecting its hash. It's main use cases is to cache |
21025a4107d4
sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff
changeset
|
11 important information related to a changesets. |
21025a4107d4
sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff
changeset
|
12 |
21025a4107d4
sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff
changeset
|
13 The current implementation is experimental and subject to changes. Do not rely |
21025a4107d4
sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff
changeset
|
14 on it in production. |
21025a4107d4
sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff
changeset
|
15 |
21025a4107d4
sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff
changeset
|
16 Sidedata are stored in the revlog itself, withing the revision rawtext. They |
21025a4107d4
sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff
changeset
|
17 are inserted, removed from it using the flagprocessors mechanism. The following |
21025a4107d4
sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff
changeset
|
18 format is currently used:: |
21025a4107d4
sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff
changeset
|
19 |
21025a4107d4
sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff
changeset
|
20 initial header: |
21025a4107d4
sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff
changeset
|
21 <number of sidedata; 2 bytes> |
21025a4107d4
sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff
changeset
|
22 sidedata (repeated N times): |
21025a4107d4
sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff
changeset
|
23 <sidedata-key; 2 bytes> |
21025a4107d4
sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff
changeset
|
24 <sidedata-entry-length: 4 bytes> |
21025a4107d4
sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff
changeset
|
25 <sidedata-content-sha1-digest: 20 bytes> |
21025a4107d4
sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff
changeset
|
26 <sidedata-content; X bytes> |
21025a4107d4
sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff
changeset
|
27 normal raw text: |
21025a4107d4
sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff
changeset
|
28 <all bytes remaining in the rawtext> |
21025a4107d4
sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff
changeset
|
29 |
21025a4107d4
sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff
changeset
|
30 This is a simple and effective format. It should be enought to experiment with |
21025a4107d4
sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff
changeset
|
31 the concept. |
21025a4107d4
sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff
changeset
|
32 """ |
21025a4107d4
sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff
changeset
|
33 |
21025a4107d4
sidedata: add a new module with basic documentation
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
diff
changeset
|
34 from __future__ import absolute_import |
43034
294afb982a88
sidedata: add a function to read sidedata from revlog raw text
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
43033
diff
changeset
|
35 |
294afb982a88
sidedata: add a function to read sidedata from revlog raw text
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
43033
diff
changeset
|
36 import struct |
294afb982a88
sidedata: add a function to read sidedata from revlog raw text
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
43033
diff
changeset
|
37 |
294afb982a88
sidedata: add a function to read sidedata from revlog raw text
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
43033
diff
changeset
|
38 from .. import error |
44060
a61287a95dc3
core: migrate uses of hashlib.sha1 to hashutil.sha1
Augie Fackler <augie@google.com>
parents:
43506
diff
changeset
|
39 from ..utils import hashutil |
43034
294afb982a88
sidedata: add a function to read sidedata from revlog raw text
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
43033
diff
changeset
|
40 |
43040
ba4072c0a911
sidedata: test we can successfully write sidedata
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
43037
diff
changeset
|
41 ## sidedata type constant |
ba4072c0a911
sidedata: test we can successfully write sidedata
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
43037
diff
changeset
|
42 # reserve a block for testing purposes. |
ba4072c0a911
sidedata: test we can successfully write sidedata
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
43037
diff
changeset
|
43 SD_TEST1 = 1 |
ba4072c0a911
sidedata: test we can successfully write sidedata
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
43037
diff
changeset
|
44 SD_TEST2 = 2 |
ba4072c0a911
sidedata: test we can successfully write sidedata
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
43037
diff
changeset
|
45 SD_TEST3 = 3 |
ba4072c0a911
sidedata: test we can successfully write sidedata
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
43037
diff
changeset
|
46 SD_TEST4 = 4 |
ba4072c0a911
sidedata: test we can successfully write sidedata
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
43037
diff
changeset
|
47 SD_TEST5 = 5 |
ba4072c0a911
sidedata: test we can successfully write sidedata
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
43037
diff
changeset
|
48 SD_TEST6 = 6 |
ba4072c0a911
sidedata: test we can successfully write sidedata
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
43037
diff
changeset
|
49 SD_TEST7 = 7 |
ba4072c0a911
sidedata: test we can successfully write sidedata
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
43037
diff
changeset
|
50 |
43142
beed7ce61681
sidedatacopies: write copies information in sidedata when applicable
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
43077
diff
changeset
|
51 # key to store copies related information |
beed7ce61681
sidedatacopies: write copies information in sidedata when applicable
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
43077
diff
changeset
|
52 SD_P1COPIES = 8 |
beed7ce61681
sidedatacopies: write copies information in sidedata when applicable
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
43077
diff
changeset
|
53 SD_P2COPIES = 9 |
beed7ce61681
sidedatacopies: write copies information in sidedata when applicable
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
43077
diff
changeset
|
54 SD_FILESADDED = 10 |
beed7ce61681
sidedatacopies: write copies information in sidedata when applicable
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
43077
diff
changeset
|
55 SD_FILESREMOVED = 11 |
beed7ce61681
sidedatacopies: write copies information in sidedata when applicable
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
43077
diff
changeset
|
56 |
43040
ba4072c0a911
sidedata: test we can successfully write sidedata
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
43037
diff
changeset
|
57 # internal format constant |
43506
9f70512ae2cf
cleanup: remove pointless r-prefixes on single-quoted strings
Augie Fackler <augie@google.com>
parents:
43142
diff
changeset
|
58 SIDEDATA_HEADER = struct.Struct('>H') |
9f70512ae2cf
cleanup: remove pointless r-prefixes on single-quoted strings
Augie Fackler <augie@google.com>
parents:
43142
diff
changeset
|
59 SIDEDATA_ENTRY = struct.Struct('>HL20s') |
43034
294afb982a88
sidedata: add a function to read sidedata from revlog raw text
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
43033
diff
changeset
|
60 |
43076
2372284d9457
formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents:
43048
diff
changeset
|
61 |
43035
ea83abf95630
sidedata: add a function to write sidedata into a raw text
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
43034
diff
changeset
|
62 def sidedatawriteprocessor(rl, text, sidedata): |
ea83abf95630
sidedata: add a function to write sidedata into a raw text
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
43034
diff
changeset
|
63 sidedata = list(sidedata.items()) |
ea83abf95630
sidedata: add a function to write sidedata into a raw text
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
43034
diff
changeset
|
64 sidedata.sort() |
ea83abf95630
sidedata: add a function to write sidedata into a raw text
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
43034
diff
changeset
|
65 rawtext = [SIDEDATA_HEADER.pack(len(sidedata))] |
ea83abf95630
sidedata: add a function to write sidedata into a raw text
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
43034
diff
changeset
|
66 for key, value in sidedata: |
44060
a61287a95dc3
core: migrate uses of hashlib.sha1 to hashutil.sha1
Augie Fackler <augie@google.com>
parents:
43506
diff
changeset
|
67 digest = hashutil.sha1(value).digest() |
43035
ea83abf95630
sidedata: add a function to write sidedata into a raw text
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
43034
diff
changeset
|
68 rawtext.append(SIDEDATA_ENTRY.pack(key, len(value), digest)) |
ea83abf95630
sidedata: add a function to write sidedata into a raw text
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
43034
diff
changeset
|
69 for key, value in sidedata: |
ea83abf95630
sidedata: add a function to write sidedata into a raw text
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
43034
diff
changeset
|
70 rawtext.append(value) |
ea83abf95630
sidedata: add a function to write sidedata into a raw text
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
43034
diff
changeset
|
71 rawtext.append(bytes(text)) |
43077
687b865b95ad
formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents:
43076
diff
changeset
|
72 return b''.join(rawtext), False |
43035
ea83abf95630
sidedata: add a function to write sidedata into a raw text
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
43034
diff
changeset
|
73 |
43076
2372284d9457
formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents:
43048
diff
changeset
|
74 |
43034
294afb982a88
sidedata: add a function to read sidedata from revlog raw text
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
43033
diff
changeset
|
75 def sidedatareadprocessor(rl, text): |
294afb982a88
sidedata: add a function to read sidedata from revlog raw text
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
43033
diff
changeset
|
76 sidedata = {} |
294afb982a88
sidedata: add a function to read sidedata from revlog raw text
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
43033
diff
changeset
|
77 offset = 0 |
43076
2372284d9457
formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents:
43048
diff
changeset
|
78 (nbentry,) = SIDEDATA_HEADER.unpack(text[: SIDEDATA_HEADER.size]) |
43034
294afb982a88
sidedata: add a function to read sidedata from revlog raw text
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
43033
diff
changeset
|
79 offset += SIDEDATA_HEADER.size |
294afb982a88
sidedata: add a function to read sidedata from revlog raw text
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
43033
diff
changeset
|
80 dataoffset = SIDEDATA_HEADER.size + (SIDEDATA_ENTRY.size * nbentry) |
294afb982a88
sidedata: add a function to read sidedata from revlog raw text
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
43033
diff
changeset
|
81 for i in range(nbentry): |
294afb982a88
sidedata: add a function to read sidedata from revlog raw text
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
43033
diff
changeset
|
82 nextoffset = offset + SIDEDATA_ENTRY.size |
294afb982a88
sidedata: add a function to read sidedata from revlog raw text
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
43033
diff
changeset
|
83 key, size, storeddigest = SIDEDATA_ENTRY.unpack(text[offset:nextoffset]) |
294afb982a88
sidedata: add a function to read sidedata from revlog raw text
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
43033
diff
changeset
|
84 offset = nextoffset |
294afb982a88
sidedata: add a function to read sidedata from revlog raw text
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
43033
diff
changeset
|
85 # read the data associated with that entry |
294afb982a88
sidedata: add a function to read sidedata from revlog raw text
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
43033
diff
changeset
|
86 nextdataoffset = dataoffset + size |
294afb982a88
sidedata: add a function to read sidedata from revlog raw text
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
43033
diff
changeset
|
87 entrytext = text[dataoffset:nextdataoffset] |
44060
a61287a95dc3
core: migrate uses of hashlib.sha1 to hashutil.sha1
Augie Fackler <augie@google.com>
parents:
43506
diff
changeset
|
88 readdigest = hashutil.sha1(entrytext).digest() |
43034
294afb982a88
sidedata: add a function to read sidedata from revlog raw text
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
43033
diff
changeset
|
89 if storeddigest != readdigest: |
294afb982a88
sidedata: add a function to read sidedata from revlog raw text
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
43033
diff
changeset
|
90 raise error.SidedataHashError(key, storeddigest, readdigest) |
294afb982a88
sidedata: add a function to read sidedata from revlog raw text
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
43033
diff
changeset
|
91 sidedata[key] = entrytext |
294afb982a88
sidedata: add a function to read sidedata from revlog raw text
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
43033
diff
changeset
|
92 dataoffset = nextdataoffset |
294afb982a88
sidedata: add a function to read sidedata from revlog raw text
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
43033
diff
changeset
|
93 text = text[dataoffset:] |
294afb982a88
sidedata: add a function to read sidedata from revlog raw text
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
43033
diff
changeset
|
94 return text, True, sidedata |
43036
e8bc4c3d9a0b
sidedata: add a function for _processflagsraw usage
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
43035
diff
changeset
|
95 |
43076
2372284d9457
formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents:
43048
diff
changeset
|
96 |
43036
e8bc4c3d9a0b
sidedata: add a function for _processflagsraw usage
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
43035
diff
changeset
|
97 def sidedatarawprocessor(rl, text): |
e8bc4c3d9a0b
sidedata: add a function for _processflagsraw usage
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
43035
diff
changeset
|
98 # side data modifies rawtext and prevent rawtext hash validation |
e8bc4c3d9a0b
sidedata: add a function for _processflagsraw usage
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
43035
diff
changeset
|
99 return False |
43037
142deb539ccf
sidedata: register the flag processors if the repository allows for it
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
43036
diff
changeset
|
100 |
43076
2372284d9457
formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents:
43048
diff
changeset
|
101 |
43037
142deb539ccf
sidedata: register the flag processors if the repository allows for it
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
43036
diff
changeset
|
102 processors = ( |
142deb539ccf
sidedata: register the flag processors if the repository allows for it
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
43036
diff
changeset
|
103 sidedatareadprocessor, |
142deb539ccf
sidedata: register the flag processors if the repository allows for it
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
43036
diff
changeset
|
104 sidedatawriteprocessor, |
142deb539ccf
sidedata: register the flag processors if the repository allows for it
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
43036
diff
changeset
|
105 sidedatarawprocessor, |
142deb539ccf
sidedata: register the flag processors if the repository allows for it
Pierre-Yves David <pierre-yves.david@octobus.net>
parents:
43036
diff
changeset
|
106 ) |