annotate mercurial/revlogutils/deltas.py @ 47457:f8330a3fc39f

censor: implement censoring for revlogv2 It is a bit verbose and rough, but it works. Most of that logic can be common for `stripping`, so we can expect more refactoring of that code to accommodate both needs. However I wanted to keep this changesets "simple enough" and before moving forward. We also need to properly delete the older index/data/sidedata file, but this has implication for streaming clone and transaction, so this will come later. Differential Revision: https://phab.mercurial-scm.org/D10869
author Pierre-Yves David <pierre-yves.david@octobus.net>
date Mon, 07 Jun 2021 11:59:27 +0200
parents 93f4e183b3f5
children c514936d92b4
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
rev   line source
39330
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
1 # revlogdeltas.py - Logic around delta computation for revlog
8226
8b2cd04a6e97 put license and copyright info into comment blocks
Martin Geisler <mg@lazybytes.net>
parents: 8225
diff changeset
2 #
46819
d4ba4d51f85f contributor: change mentions of mpm to olivia
Raphaël Gomès <rgomes@octobus.net>
parents: 43089
diff changeset
3 # Copyright 2005-2007 Olivia Mackall <olivia@selenic.com>
39330
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
4 # Copyright 2018 Octobus <contact@octobus.net>
8226
8b2cd04a6e97 put license and copyright info into comment blocks
Martin Geisler <mg@lazybytes.net>
parents: 8225
diff changeset
5 #
8b2cd04a6e97 put license and copyright info into comment blocks
Martin Geisler <mg@lazybytes.net>
parents: 8225
diff changeset
6 # This software may be used and distributed according to the terms of the
10263
25e572394f5c Update license to GPLv2+
Matt Mackall <mpm@selenic.com>
parents: 10047
diff changeset
7 # GNU General Public License version 2 or any later version.
39330
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
8 """Helper class to compute deltas stored inside revlogs"""
8227
0a9542703300 turn some comments back into module docstrings
Martin Geisler <mg@lazybytes.net>
parents: 8226
diff changeset
9
27361
29f50344fa83 revlog: use absolute_import
Gregory Szorc <gregory.szorc@gmail.com>
parents: 27251
diff changeset
10 from __future__ import absolute_import
29f50344fa83 revlog: use absolute_import
Gregory Szorc <gregory.szorc@gmail.com>
parents: 27251
diff changeset
11
39493
3ca144f1c8dd snapshot: search for unrelated but reusable full-snapshot
Boris Feld <boris.feld@octobus.net>
parents: 39492
diff changeset
12 import collections
27361
29f50344fa83 revlog: use absolute_import
Gregory Szorc <gregory.szorc@gmail.com>
parents: 27251
diff changeset
13 import struct
29f50344fa83 revlog: use absolute_import
Gregory Szorc <gregory.szorc@gmail.com>
parents: 27251
diff changeset
14
29f50344fa83 revlog: use absolute_import
Gregory Szorc <gregory.szorc@gmail.com>
parents: 27251
diff changeset
15 # import stuff from node for others to import from revlog
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42992
diff changeset
16 from ..node import nullrev
39330
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
17 from ..i18n import _
43089
c59eb1560c44 py3: manually import getattr where it is needed
Gregory Szorc <gregory.szorc@gmail.com>
parents: 43077
diff changeset
18 from ..pycompat import getattr
39330
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
19
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
20 from .constants import (
47452
c6844912c327 revlog: factor the logic to determine the delta compression out
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 47401
diff changeset
21 COMP_MODE_DEFAULT,
c6844912c327 revlog: factor the logic to determine the delta compression out
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 47401
diff changeset
22 COMP_MODE_INLINE,
c6844912c327 revlog: factor the logic to determine the delta compression out
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 47401
diff changeset
23 COMP_MODE_PLAIN,
39329
729082bb9938 revlog: split constants into a new `revlogutils.constants` module
Boris Feld <boris.feld@octobus.net>
parents: 39232
diff changeset
24 REVIDX_ISCENSORED,
729082bb9938 revlog: split constants into a new `revlogutils.constants` module
Boris Feld <boris.feld@octobus.net>
parents: 39232
diff changeset
25 REVIDX_RAWTEXT_CHANGING_FLAGS,
729082bb9938 revlog: split constants into a new `revlogutils.constants` module
Boris Feld <boris.feld@octobus.net>
parents: 39232
diff changeset
26 )
39330
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
27
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42992
diff changeset
28 from ..thirdparty import attr
39330
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
29
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
30 from .. import (
27361
29f50344fa83 revlog: use absolute_import
Gregory Szorc <gregory.szorc@gmail.com>
parents: 27251
diff changeset
31 error,
29f50344fa83 revlog: use absolute_import
Gregory Szorc <gregory.szorc@gmail.com>
parents: 27251
diff changeset
32 mdiff,
41108
38e88450138c delta: have a native implementation of _findsnapshot
Boris Feld <boris.feld@octobus.net>
parents: 41079
diff changeset
33 util,
27361
29f50344fa83 revlog: use absolute_import
Gregory Szorc <gregory.szorc@gmail.com>
parents: 27251
diff changeset
34 )
10913
f2ecc5733c89 revlog: factor out _maxinline global.
Greg Ward <greg-hg@gerg.ca>
parents: 10404
diff changeset
35
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42992
diff changeset
36 from . import flagutil
42992
dff95420480f flagprocessors: make `processflagsraw` a module level function
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 42878
diff changeset
37
39330
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
38 # maximum <delta-chain-data>/<revision-text-length> ratio
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
39 LIMIT_DELTA2TEXT = 2
1091
d62130f99a73 Move hash function back to revlog from node
mpm@selenic.com
parents: 1089
diff changeset
40
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42992
diff changeset
41
38637
e33f784f2a44 revlog: introduce a tiny mock of a revlog class
Boris Feld <boris.feld@octobus.net>
parents: 38636
diff changeset
42 class _testrevlog(object):
e33f784f2a44 revlog: introduce a tiny mock of a revlog class
Boris Feld <boris.feld@octobus.net>
parents: 38636
diff changeset
43 """minimalist fake revlog to use in doctests"""
e33f784f2a44 revlog: introduce a tiny mock of a revlog class
Boris Feld <boris.feld@octobus.net>
parents: 38636
diff changeset
44
40641
85b14f0dc334 doctest: add a `issnapshot` method to _testrevlog
Boris Feld <boris.feld@octobus.net>
parents: 40608
diff changeset
45 def __init__(self, data, density=0.5, mingap=0, snapshot=()):
38637
e33f784f2a44 revlog: introduce a tiny mock of a revlog class
Boris Feld <boris.feld@octobus.net>
parents: 38636
diff changeset
46 """data is an list of revision payload boundaries"""
e33f784f2a44 revlog: introduce a tiny mock of a revlog class
Boris Feld <boris.feld@octobus.net>
parents: 38636
diff changeset
47 self._data = data
e33f784f2a44 revlog: introduce a tiny mock of a revlog class
Boris Feld <boris.feld@octobus.net>
parents: 38636
diff changeset
48 self._srdensitythreshold = density
e33f784f2a44 revlog: introduce a tiny mock of a revlog class
Boris Feld <boris.feld@octobus.net>
parents: 38636
diff changeset
49 self._srmingapsize = mingap
40641
85b14f0dc334 doctest: add a `issnapshot` method to _testrevlog
Boris Feld <boris.feld@octobus.net>
parents: 40608
diff changeset
50 self._snapshot = set(snapshot)
40709
39d29542fe40 sparse-revlog: put the native implementation of slicechunktodensity to use
Boris Feld <boris.feld@octobus.net>
parents: 40657
diff changeset
51 self.index = None
38637
e33f784f2a44 revlog: introduce a tiny mock of a revlog class
Boris Feld <boris.feld@octobus.net>
parents: 38636
diff changeset
52
e33f784f2a44 revlog: introduce a tiny mock of a revlog class
Boris Feld <boris.feld@octobus.net>
parents: 38636
diff changeset
53 def start(self, rev):
41079
88d813cd9acd revlog: fix pure python slicing test when chain contains nullrev
Boris Feld <boris.feld@octobus.net>
parents: 41033
diff changeset
54 if rev == nullrev:
88d813cd9acd revlog: fix pure python slicing test when chain contains nullrev
Boris Feld <boris.feld@octobus.net>
parents: 41033
diff changeset
55 return 0
38637
e33f784f2a44 revlog: introduce a tiny mock of a revlog class
Boris Feld <boris.feld@octobus.net>
parents: 38636
diff changeset
56 if rev == 0:
e33f784f2a44 revlog: introduce a tiny mock of a revlog class
Boris Feld <boris.feld@octobus.net>
parents: 38636
diff changeset
57 return 0
e33f784f2a44 revlog: introduce a tiny mock of a revlog class
Boris Feld <boris.feld@octobus.net>
parents: 38636
diff changeset
58 return self._data[rev - 1]
e33f784f2a44 revlog: introduce a tiny mock of a revlog class
Boris Feld <boris.feld@octobus.net>
parents: 38636
diff changeset
59
e33f784f2a44 revlog: introduce a tiny mock of a revlog class
Boris Feld <boris.feld@octobus.net>
parents: 38636
diff changeset
60 def end(self, rev):
41079
88d813cd9acd revlog: fix pure python slicing test when chain contains nullrev
Boris Feld <boris.feld@octobus.net>
parents: 41033
diff changeset
61 if rev == nullrev:
88d813cd9acd revlog: fix pure python slicing test when chain contains nullrev
Boris Feld <boris.feld@octobus.net>
parents: 41033
diff changeset
62 return 0
38637
e33f784f2a44 revlog: introduce a tiny mock of a revlog class
Boris Feld <boris.feld@octobus.net>
parents: 38636
diff changeset
63 return self._data[rev]
e33f784f2a44 revlog: introduce a tiny mock of a revlog class
Boris Feld <boris.feld@octobus.net>
parents: 38636
diff changeset
64
e33f784f2a44 revlog: introduce a tiny mock of a revlog class
Boris Feld <boris.feld@octobus.net>
parents: 38636
diff changeset
65 def length(self, rev):
e33f784f2a44 revlog: introduce a tiny mock of a revlog class
Boris Feld <boris.feld@octobus.net>
parents: 38636
diff changeset
66 return self.end(rev) - self.start(rev)
e33f784f2a44 revlog: introduce a tiny mock of a revlog class
Boris Feld <boris.feld@octobus.net>
parents: 38636
diff changeset
67
38718
f8762ea73e0d sparse-revlog: implement algorithm to write sparse delta chains (issue5480)
Paul Morelle <paul.morelle@octobus.net>
parents: 38717
diff changeset
68 def __len__(self):
f8762ea73e0d sparse-revlog: implement algorithm to write sparse delta chains (issue5480)
Paul Morelle <paul.morelle@octobus.net>
parents: 38717
diff changeset
69 return len(self._data)
f8762ea73e0d sparse-revlog: implement algorithm to write sparse delta chains (issue5480)
Paul Morelle <paul.morelle@octobus.net>
parents: 38717
diff changeset
70
40641
85b14f0dc334 doctest: add a `issnapshot` method to _testrevlog
Boris Feld <boris.feld@octobus.net>
parents: 40608
diff changeset
71 def issnapshot(self, rev):
41079
88d813cd9acd revlog: fix pure python slicing test when chain contains nullrev
Boris Feld <boris.feld@octobus.net>
parents: 41033
diff changeset
72 if rev == nullrev:
88d813cd9acd revlog: fix pure python slicing test when chain contains nullrev
Boris Feld <boris.feld@octobus.net>
parents: 41033
diff changeset
73 return True
40641
85b14f0dc334 doctest: add a `issnapshot` method to _testrevlog
Boris Feld <boris.feld@octobus.net>
parents: 40608
diff changeset
74 return rev in self._snapshot
85b14f0dc334 doctest: add a `issnapshot` method to _testrevlog
Boris Feld <boris.feld@octobus.net>
parents: 40608
diff changeset
75
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42992
diff changeset
76
40604
3ac23dad6364 sparse-revlog: drop unused deltainfo parameter from _slicechunktodensity
Boris Feld <boris.feld@octobus.net>
parents: 40603
diff changeset
77 def slicechunk(revlog, revs, targetsize=None):
34824
e2ad93bcc084 revlog: introduce an experimental flag to slice chunks reads when too sparse
Paul Morelle <paul.morelle@octobus.net>
parents: 34823
diff changeset
78 """slice revs to reduce the amount of unrelated data to be read from disk.
e2ad93bcc084 revlog: introduce an experimental flag to slice chunks reads when too sparse
Paul Morelle <paul.morelle@octobus.net>
parents: 34823
diff changeset
79
e2ad93bcc084 revlog: introduce an experimental flag to slice chunks reads when too sparse
Paul Morelle <paul.morelle@octobus.net>
parents: 34823
diff changeset
80 ``revs`` is sliced into groups that should be read in one time.
e2ad93bcc084 revlog: introduce an experimental flag to slice chunks reads when too sparse
Paul Morelle <paul.morelle@octobus.net>
parents: 34823
diff changeset
81 Assume that revs are sorted.
38640
f62b8fb0a484 revlog: document and test _slicechunk
Boris Feld <boris.feld@octobus.net>
parents: 38639
diff changeset
82
f62b8fb0a484 revlog: document and test _slicechunk
Boris Feld <boris.feld@octobus.net>
parents: 38639
diff changeset
83 The initial chunk is sliced until the overall density (payload/chunks-span
f62b8fb0a484 revlog: document and test _slicechunk
Boris Feld <boris.feld@octobus.net>
parents: 38639
diff changeset
84 ratio) is above `revlog._srdensitythreshold`. No gap smaller than
f62b8fb0a484 revlog: document and test _slicechunk
Boris Feld <boris.feld@octobus.net>
parents: 38639
diff changeset
85 `revlog._srmingapsize` is skipped.
f62b8fb0a484 revlog: document and test _slicechunk
Boris Feld <boris.feld@octobus.net>
parents: 38639
diff changeset
86
38643
967fee55e8d9 revlog: postprocess chunk to slice them down to a certain size
Boris Feld <boris.feld@octobus.net>
parents: 38642
diff changeset
87 If `targetsize` is set, no chunk larger than `targetsize` will be yield.
967fee55e8d9 revlog: postprocess chunk to slice them down to a certain size
Boris Feld <boris.feld@octobus.net>
parents: 38642
diff changeset
88 For consistency with other slicing choice, this limit won't go lower than
967fee55e8d9 revlog: postprocess chunk to slice them down to a certain size
Boris Feld <boris.feld@octobus.net>
parents: 38642
diff changeset
89 `revlog._srmingapsize`.
967fee55e8d9 revlog: postprocess chunk to slice them down to a certain size
Boris Feld <boris.feld@octobus.net>
parents: 38642
diff changeset
90
967fee55e8d9 revlog: postprocess chunk to slice them down to a certain size
Boris Feld <boris.feld@octobus.net>
parents: 38642
diff changeset
91 If individual revisions chunk are larger than this limit, they will still
967fee55e8d9 revlog: postprocess chunk to slice them down to a certain size
Boris Feld <boris.feld@octobus.net>
parents: 38642
diff changeset
92 be raised individually.
967fee55e8d9 revlog: postprocess chunk to slice them down to a certain size
Boris Feld <boris.feld@octobus.net>
parents: 38642
diff changeset
93
40642
9c3c697267db sparse-revlog: rework the way we enforce chunk size limit
Boris Feld <boris.feld@octobus.net>
parents: 40641
diff changeset
94 >>> data = [
38640
f62b8fb0a484 revlog: document and test _slicechunk
Boris Feld <boris.feld@octobus.net>
parents: 38639
diff changeset
95 ... 5, #00 (5)
f62b8fb0a484 revlog: document and test _slicechunk
Boris Feld <boris.feld@octobus.net>
parents: 38639
diff changeset
96 ... 10, #01 (5)
f62b8fb0a484 revlog: document and test _slicechunk
Boris Feld <boris.feld@octobus.net>
parents: 38639
diff changeset
97 ... 12, #02 (2)
f62b8fb0a484 revlog: document and test _slicechunk
Boris Feld <boris.feld@octobus.net>
parents: 38639
diff changeset
98 ... 12, #03 (empty)
f62b8fb0a484 revlog: document and test _slicechunk
Boris Feld <boris.feld@octobus.net>
parents: 38639
diff changeset
99 ... 27, #04 (15)
f62b8fb0a484 revlog: document and test _slicechunk
Boris Feld <boris.feld@octobus.net>
parents: 38639
diff changeset
100 ... 31, #05 (4)
f62b8fb0a484 revlog: document and test _slicechunk
Boris Feld <boris.feld@octobus.net>
parents: 38639
diff changeset
101 ... 31, #06 (empty)
f62b8fb0a484 revlog: document and test _slicechunk
Boris Feld <boris.feld@octobus.net>
parents: 38639
diff changeset
102 ... 42, #07 (11)
f62b8fb0a484 revlog: document and test _slicechunk
Boris Feld <boris.feld@octobus.net>
parents: 38639
diff changeset
103 ... 47, #08 (5)
f62b8fb0a484 revlog: document and test _slicechunk
Boris Feld <boris.feld@octobus.net>
parents: 38639
diff changeset
104 ... 47, #09 (empty)
f62b8fb0a484 revlog: document and test _slicechunk
Boris Feld <boris.feld@octobus.net>
parents: 38639
diff changeset
105 ... 48, #10 (1)
f62b8fb0a484 revlog: document and test _slicechunk
Boris Feld <boris.feld@octobus.net>
parents: 38639
diff changeset
106 ... 51, #11 (3)
f62b8fb0a484 revlog: document and test _slicechunk
Boris Feld <boris.feld@octobus.net>
parents: 38639
diff changeset
107 ... 74, #12 (23)
f62b8fb0a484 revlog: document and test _slicechunk
Boris Feld <boris.feld@octobus.net>
parents: 38639
diff changeset
108 ... 85, #13 (11)
f62b8fb0a484 revlog: document and test _slicechunk
Boris Feld <boris.feld@octobus.net>
parents: 38639
diff changeset
109 ... 86, #14 (1)
f62b8fb0a484 revlog: document and test _slicechunk
Boris Feld <boris.feld@octobus.net>
parents: 38639
diff changeset
110 ... 91, #15 (5)
40642
9c3c697267db sparse-revlog: rework the way we enforce chunk size limit
Boris Feld <boris.feld@octobus.net>
parents: 40641
diff changeset
111 ... ]
9c3c697267db sparse-revlog: rework the way we enforce chunk size limit
Boris Feld <boris.feld@octobus.net>
parents: 40641
diff changeset
112 >>> revlog = _testrevlog(data, snapshot=range(16))
38640
f62b8fb0a484 revlog: document and test _slicechunk
Boris Feld <boris.feld@octobus.net>
parents: 38639
diff changeset
113
39330
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
114 >>> list(slicechunk(revlog, list(range(16))))
38640
f62b8fb0a484 revlog: document and test _slicechunk
Boris Feld <boris.feld@octobus.net>
parents: 38639
diff changeset
115 [[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]]
39330
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
116 >>> list(slicechunk(revlog, [0, 15]))
38640
f62b8fb0a484 revlog: document and test _slicechunk
Boris Feld <boris.feld@octobus.net>
parents: 38639
diff changeset
117 [[0], [15]]
39330
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
118 >>> list(slicechunk(revlog, [0, 11, 15]))
38640
f62b8fb0a484 revlog: document and test _slicechunk
Boris Feld <boris.feld@octobus.net>
parents: 38639
diff changeset
119 [[0], [11], [15]]
39330
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
120 >>> list(slicechunk(revlog, [0, 11, 13, 15]))
38640
f62b8fb0a484 revlog: document and test _slicechunk
Boris Feld <boris.feld@octobus.net>
parents: 38639
diff changeset
121 [[0], [11, 13, 15]]
39330
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
122 >>> list(slicechunk(revlog, [1, 2, 3, 5, 8, 10, 11, 14]))
38640
f62b8fb0a484 revlog: document and test _slicechunk
Boris Feld <boris.feld@octobus.net>
parents: 38639
diff changeset
123 [[1, 2], [5, 8, 10, 11], [14]]
38643
967fee55e8d9 revlog: postprocess chunk to slice them down to a certain size
Boris Feld <boris.feld@octobus.net>
parents: 38642
diff changeset
124
967fee55e8d9 revlog: postprocess chunk to slice them down to a certain size
Boris Feld <boris.feld@octobus.net>
parents: 38642
diff changeset
125 Slicing with a maximum chunk size
39330
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
126 >>> list(slicechunk(revlog, [0, 11, 13, 15], targetsize=15))
38643
967fee55e8d9 revlog: postprocess chunk to slice them down to a certain size
Boris Feld <boris.feld@octobus.net>
parents: 38642
diff changeset
127 [[0], [11], [13], [15]]
39330
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
128 >>> list(slicechunk(revlog, [0, 11, 13, 15], targetsize=20))
38643
967fee55e8d9 revlog: postprocess chunk to slice them down to a certain size
Boris Feld <boris.feld@octobus.net>
parents: 38642
diff changeset
129 [[0], [11], [13, 15]]
41079
88d813cd9acd revlog: fix pure python slicing test when chain contains nullrev
Boris Feld <boris.feld@octobus.net>
parents: 41033
diff changeset
130
88d813cd9acd revlog: fix pure python slicing test when chain contains nullrev
Boris Feld <boris.feld@octobus.net>
parents: 41033
diff changeset
131 Slicing involving nullrev
88d813cd9acd revlog: fix pure python slicing test when chain contains nullrev
Boris Feld <boris.feld@octobus.net>
parents: 41033
diff changeset
132 >>> list(slicechunk(revlog, [-1, 0, 11, 13, 15], targetsize=20))
88d813cd9acd revlog: fix pure python slicing test when chain contains nullrev
Boris Feld <boris.feld@octobus.net>
parents: 41033
diff changeset
133 [[-1, 0], [11], [13, 15]]
88d813cd9acd revlog: fix pure python slicing test when chain contains nullrev
Boris Feld <boris.feld@octobus.net>
parents: 41033
diff changeset
134 >>> list(slicechunk(revlog, [-1, 13, 15], targetsize=5))
88d813cd9acd revlog: fix pure python slicing test when chain contains nullrev
Boris Feld <boris.feld@octobus.net>
parents: 41033
diff changeset
135 [[-1], [13], [15]]
34824
e2ad93bcc084 revlog: introduce an experimental flag to slice chunks reads when too sparse
Paul Morelle <paul.morelle@octobus.net>
parents: 34823
diff changeset
136 """
38643
967fee55e8d9 revlog: postprocess chunk to slice them down to a certain size
Boris Feld <boris.feld@octobus.net>
parents: 38642
diff changeset
137 if targetsize is not None:
967fee55e8d9 revlog: postprocess chunk to slice them down to a certain size
Boris Feld <boris.feld@octobus.net>
parents: 38642
diff changeset
138 targetsize = max(targetsize, revlog._srmingapsize)
38718
f8762ea73e0d sparse-revlog: implement algorithm to write sparse delta chains (issue5480)
Paul Morelle <paul.morelle@octobus.net>
parents: 38717
diff changeset
139 # targetsize should not be specified when evaluating delta candidates:
f8762ea73e0d sparse-revlog: implement algorithm to write sparse delta chains (issue5480)
Paul Morelle <paul.morelle@octobus.net>
parents: 38717
diff changeset
140 # * targetsize is used to ensure we stay within specification when reading,
40709
39d29542fe40 sparse-revlog: put the native implementation of slicechunktodensity to use
Boris Feld <boris.feld@octobus.net>
parents: 40657
diff changeset
141 densityslicing = getattr(revlog.index, 'slicechunktodensity', None)
39d29542fe40 sparse-revlog: put the native implementation of slicechunktodensity to use
Boris Feld <boris.feld@octobus.net>
parents: 40657
diff changeset
142 if densityslicing is None:
39d29542fe40 sparse-revlog: put the native implementation of slicechunktodensity to use
Boris Feld <boris.feld@octobus.net>
parents: 40657
diff changeset
143 densityslicing = lambda x, y, z: _slicechunktodensity(revlog, x, y, z)
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42992
diff changeset
144 for chunk in densityslicing(
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42992
diff changeset
145 revs, revlog._srdensitythreshold, revlog._srmingapsize
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42992
diff changeset
146 ):
38643
967fee55e8d9 revlog: postprocess chunk to slice them down to a certain size
Boris Feld <boris.feld@octobus.net>
parents: 38642
diff changeset
147 for subchunk in _slicechunktosize(revlog, chunk, targetsize):
967fee55e8d9 revlog: postprocess chunk to slice them down to a certain size
Boris Feld <boris.feld@octobus.net>
parents: 38642
diff changeset
148 yield subchunk
38641
feba6be0941b revlog: extract density based slicing into its own function
Boris Feld <boris.feld@octobus.net>
parents: 38640
diff changeset
149
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42992
diff changeset
150
38718
f8762ea73e0d sparse-revlog: implement algorithm to write sparse delta chains (issue5480)
Paul Morelle <paul.morelle@octobus.net>
parents: 38717
diff changeset
151 def _slicechunktosize(revlog, revs, targetsize=None):
38642
e59e27e52297 revlog: add function to slice chunk down to a given size
Boris Feld <boris.feld@octobus.net>
parents: 38641
diff changeset
152 """slice revs to match the target size
e59e27e52297 revlog: add function to slice chunk down to a given size
Boris Feld <boris.feld@octobus.net>
parents: 38641
diff changeset
153
e59e27e52297 revlog: add function to slice chunk down to a given size
Boris Feld <boris.feld@octobus.net>
parents: 38641
diff changeset
154 This is intended to be used on chunk that density slicing selected by that
e59e27e52297 revlog: add function to slice chunk down to a given size
Boris Feld <boris.feld@octobus.net>
parents: 38641
diff changeset
155 are still too large compared to the read garantee of revlog. This might
e59e27e52297 revlog: add function to slice chunk down to a given size
Boris Feld <boris.feld@octobus.net>
parents: 38641
diff changeset
156 happens when "minimal gap size" interrupted the slicing or when chain are
e59e27e52297 revlog: add function to slice chunk down to a given size
Boris Feld <boris.feld@octobus.net>
parents: 38641
diff changeset
157 built in a way that create large blocks next to each other.
e59e27e52297 revlog: add function to slice chunk down to a given size
Boris Feld <boris.feld@octobus.net>
parents: 38641
diff changeset
158
40642
9c3c697267db sparse-revlog: rework the way we enforce chunk size limit
Boris Feld <boris.feld@octobus.net>
parents: 40641
diff changeset
159 >>> data = [
38642
e59e27e52297 revlog: add function to slice chunk down to a given size
Boris Feld <boris.feld@octobus.net>
parents: 38641
diff changeset
160 ... 3, #0 (3)
e59e27e52297 revlog: add function to slice chunk down to a given size
Boris Feld <boris.feld@octobus.net>
parents: 38641
diff changeset
161 ... 5, #1 (2)
e59e27e52297 revlog: add function to slice chunk down to a given size
Boris Feld <boris.feld@octobus.net>
parents: 38641
diff changeset
162 ... 6, #2 (1)
e59e27e52297 revlog: add function to slice chunk down to a given size
Boris Feld <boris.feld@octobus.net>
parents: 38641
diff changeset
163 ... 8, #3 (2)
e59e27e52297 revlog: add function to slice chunk down to a given size
Boris Feld <boris.feld@octobus.net>
parents: 38641
diff changeset
164 ... 8, #4 (empty)
e59e27e52297 revlog: add function to slice chunk down to a given size
Boris Feld <boris.feld@octobus.net>
parents: 38641
diff changeset
165 ... 11, #5 (3)
e59e27e52297 revlog: add function to slice chunk down to a given size
Boris Feld <boris.feld@octobus.net>
parents: 38641
diff changeset
166 ... 12, #6 (1)
e59e27e52297 revlog: add function to slice chunk down to a given size
Boris Feld <boris.feld@octobus.net>
parents: 38641
diff changeset
167 ... 13, #7 (1)
e59e27e52297 revlog: add function to slice chunk down to a given size
Boris Feld <boris.feld@octobus.net>
parents: 38641
diff changeset
168 ... 14, #8 (1)
40642
9c3c697267db sparse-revlog: rework the way we enforce chunk size limit
Boris Feld <boris.feld@octobus.net>
parents: 40641
diff changeset
169 ... ]
9c3c697267db sparse-revlog: rework the way we enforce chunk size limit
Boris Feld <boris.feld@octobus.net>
parents: 40641
diff changeset
170
9c3c697267db sparse-revlog: rework the way we enforce chunk size limit
Boris Feld <boris.feld@octobus.net>
parents: 40641
diff changeset
171 == All snapshots cases ==
9c3c697267db sparse-revlog: rework the way we enforce chunk size limit
Boris Feld <boris.feld@octobus.net>
parents: 40641
diff changeset
172 >>> revlog = _testrevlog(data, snapshot=range(9))
38642
e59e27e52297 revlog: add function to slice chunk down to a given size
Boris Feld <boris.feld@octobus.net>
parents: 38641
diff changeset
173
e59e27e52297 revlog: add function to slice chunk down to a given size
Boris Feld <boris.feld@octobus.net>
parents: 38641
diff changeset
174 Cases where chunk is already small enough
e59e27e52297 revlog: add function to slice chunk down to a given size
Boris Feld <boris.feld@octobus.net>
parents: 38641
diff changeset
175 >>> list(_slicechunktosize(revlog, [0], 3))
e59e27e52297 revlog: add function to slice chunk down to a given size
Boris Feld <boris.feld@octobus.net>
parents: 38641
diff changeset
176 [[0]]
e59e27e52297 revlog: add function to slice chunk down to a given size
Boris Feld <boris.feld@octobus.net>
parents: 38641
diff changeset
177 >>> list(_slicechunktosize(revlog, [6, 7], 3))
e59e27e52297 revlog: add function to slice chunk down to a given size
Boris Feld <boris.feld@octobus.net>
parents: 38641
diff changeset
178 [[6, 7]]
e59e27e52297 revlog: add function to slice chunk down to a given size
Boris Feld <boris.feld@octobus.net>
parents: 38641
diff changeset
179 >>> list(_slicechunktosize(revlog, [0], None))
e59e27e52297 revlog: add function to slice chunk down to a given size
Boris Feld <boris.feld@octobus.net>
parents: 38641
diff changeset
180 [[0]]
e59e27e52297 revlog: add function to slice chunk down to a given size
Boris Feld <boris.feld@octobus.net>
parents: 38641
diff changeset
181 >>> list(_slicechunktosize(revlog, [6, 7], None))
e59e27e52297 revlog: add function to slice chunk down to a given size
Boris Feld <boris.feld@octobus.net>
parents: 38641
diff changeset
182 [[6, 7]]
e59e27e52297 revlog: add function to slice chunk down to a given size
Boris Feld <boris.feld@octobus.net>
parents: 38641
diff changeset
183
e59e27e52297 revlog: add function to slice chunk down to a given size
Boris Feld <boris.feld@octobus.net>
parents: 38641
diff changeset
184 cases where we need actual slicing
e59e27e52297 revlog: add function to slice chunk down to a given size
Boris Feld <boris.feld@octobus.net>
parents: 38641
diff changeset
185 >>> list(_slicechunktosize(revlog, [0, 1], 3))
e59e27e52297 revlog: add function to slice chunk down to a given size
Boris Feld <boris.feld@octobus.net>
parents: 38641
diff changeset
186 [[0], [1]]
e59e27e52297 revlog: add function to slice chunk down to a given size
Boris Feld <boris.feld@octobus.net>
parents: 38641
diff changeset
187 >>> list(_slicechunktosize(revlog, [1, 3], 3))
e59e27e52297 revlog: add function to slice chunk down to a given size
Boris Feld <boris.feld@octobus.net>
parents: 38641
diff changeset
188 [[1], [3]]
e59e27e52297 revlog: add function to slice chunk down to a given size
Boris Feld <boris.feld@octobus.net>
parents: 38641
diff changeset
189 >>> list(_slicechunktosize(revlog, [1, 2, 3], 3))
e59e27e52297 revlog: add function to slice chunk down to a given size
Boris Feld <boris.feld@octobus.net>
parents: 38641
diff changeset
190 [[1, 2], [3]]
e59e27e52297 revlog: add function to slice chunk down to a given size
Boris Feld <boris.feld@octobus.net>
parents: 38641
diff changeset
191 >>> list(_slicechunktosize(revlog, [3, 5], 3))
e59e27e52297 revlog: add function to slice chunk down to a given size
Boris Feld <boris.feld@octobus.net>
parents: 38641
diff changeset
192 [[3], [5]]
e59e27e52297 revlog: add function to slice chunk down to a given size
Boris Feld <boris.feld@octobus.net>
parents: 38641
diff changeset
193 >>> list(_slicechunktosize(revlog, [3, 4, 5], 3))
e59e27e52297 revlog: add function to slice chunk down to a given size
Boris Feld <boris.feld@octobus.net>
parents: 38641
diff changeset
194 [[3], [5]]
e59e27e52297 revlog: add function to slice chunk down to a given size
Boris Feld <boris.feld@octobus.net>
parents: 38641
diff changeset
195 >>> list(_slicechunktosize(revlog, [5, 6, 7, 8], 3))
e59e27e52297 revlog: add function to slice chunk down to a given size
Boris Feld <boris.feld@octobus.net>
parents: 38641
diff changeset
196 [[5], [6, 7, 8]]
e59e27e52297 revlog: add function to slice chunk down to a given size
Boris Feld <boris.feld@octobus.net>
parents: 38641
diff changeset
197 >>> list(_slicechunktosize(revlog, [0, 1, 2, 3, 4, 5, 6, 7, 8], 3))
e59e27e52297 revlog: add function to slice chunk down to a given size
Boris Feld <boris.feld@octobus.net>
parents: 38641
diff changeset
198 [[0], [1, 2], [3], [5], [6, 7, 8]]
e59e27e52297 revlog: add function to slice chunk down to a given size
Boris Feld <boris.feld@octobus.net>
parents: 38641
diff changeset
199
e59e27e52297 revlog: add function to slice chunk down to a given size
Boris Feld <boris.feld@octobus.net>
parents: 38641
diff changeset
200 Case with too large individual chunk (must return valid chunk)
e59e27e52297 revlog: add function to slice chunk down to a given size
Boris Feld <boris.feld@octobus.net>
parents: 38641
diff changeset
201 >>> list(_slicechunktosize(revlog, [0, 1], 2))
e59e27e52297 revlog: add function to slice chunk down to a given size
Boris Feld <boris.feld@octobus.net>
parents: 38641
diff changeset
202 [[0], [1]]
e59e27e52297 revlog: add function to slice chunk down to a given size
Boris Feld <boris.feld@octobus.net>
parents: 38641
diff changeset
203 >>> list(_slicechunktosize(revlog, [1, 3], 1))
e59e27e52297 revlog: add function to slice chunk down to a given size
Boris Feld <boris.feld@octobus.net>
parents: 38641
diff changeset
204 [[1], [3]]
e59e27e52297 revlog: add function to slice chunk down to a given size
Boris Feld <boris.feld@octobus.net>
parents: 38641
diff changeset
205 >>> list(_slicechunktosize(revlog, [3, 4, 5], 2))
e59e27e52297 revlog: add function to slice chunk down to a given size
Boris Feld <boris.feld@octobus.net>
parents: 38641
diff changeset
206 [[3], [5]]
40642
9c3c697267db sparse-revlog: rework the way we enforce chunk size limit
Boris Feld <boris.feld@octobus.net>
parents: 40641
diff changeset
207
9c3c697267db sparse-revlog: rework the way we enforce chunk size limit
Boris Feld <boris.feld@octobus.net>
parents: 40641
diff changeset
208 == No Snapshot cases ==
9c3c697267db sparse-revlog: rework the way we enforce chunk size limit
Boris Feld <boris.feld@octobus.net>
parents: 40641
diff changeset
209 >>> revlog = _testrevlog(data)
9c3c697267db sparse-revlog: rework the way we enforce chunk size limit
Boris Feld <boris.feld@octobus.net>
parents: 40641
diff changeset
210
9c3c697267db sparse-revlog: rework the way we enforce chunk size limit
Boris Feld <boris.feld@octobus.net>
parents: 40641
diff changeset
211 Cases where chunk is already small enough
9c3c697267db sparse-revlog: rework the way we enforce chunk size limit
Boris Feld <boris.feld@octobus.net>
parents: 40641
diff changeset
212 >>> list(_slicechunktosize(revlog, [0], 3))
9c3c697267db sparse-revlog: rework the way we enforce chunk size limit
Boris Feld <boris.feld@octobus.net>
parents: 40641
diff changeset
213 [[0]]
9c3c697267db sparse-revlog: rework the way we enforce chunk size limit
Boris Feld <boris.feld@octobus.net>
parents: 40641
diff changeset
214 >>> list(_slicechunktosize(revlog, [6, 7], 3))
9c3c697267db sparse-revlog: rework the way we enforce chunk size limit
Boris Feld <boris.feld@octobus.net>
parents: 40641
diff changeset
215 [[6, 7]]
9c3c697267db sparse-revlog: rework the way we enforce chunk size limit
Boris Feld <boris.feld@octobus.net>
parents: 40641
diff changeset
216 >>> list(_slicechunktosize(revlog, [0], None))
9c3c697267db sparse-revlog: rework the way we enforce chunk size limit
Boris Feld <boris.feld@octobus.net>
parents: 40641
diff changeset
217 [[0]]
9c3c697267db sparse-revlog: rework the way we enforce chunk size limit
Boris Feld <boris.feld@octobus.net>
parents: 40641
diff changeset
218 >>> list(_slicechunktosize(revlog, [6, 7], None))
9c3c697267db sparse-revlog: rework the way we enforce chunk size limit
Boris Feld <boris.feld@octobus.net>
parents: 40641
diff changeset
219 [[6, 7]]
9c3c697267db sparse-revlog: rework the way we enforce chunk size limit
Boris Feld <boris.feld@octobus.net>
parents: 40641
diff changeset
220
9c3c697267db sparse-revlog: rework the way we enforce chunk size limit
Boris Feld <boris.feld@octobus.net>
parents: 40641
diff changeset
221 cases where we need actual slicing
9c3c697267db sparse-revlog: rework the way we enforce chunk size limit
Boris Feld <boris.feld@octobus.net>
parents: 40641
diff changeset
222 >>> list(_slicechunktosize(revlog, [0, 1], 3))
9c3c697267db sparse-revlog: rework the way we enforce chunk size limit
Boris Feld <boris.feld@octobus.net>
parents: 40641
diff changeset
223 [[0], [1]]
9c3c697267db sparse-revlog: rework the way we enforce chunk size limit
Boris Feld <boris.feld@octobus.net>
parents: 40641
diff changeset
224 >>> list(_slicechunktosize(revlog, [1, 3], 3))
9c3c697267db sparse-revlog: rework the way we enforce chunk size limit
Boris Feld <boris.feld@octobus.net>
parents: 40641
diff changeset
225 [[1], [3]]
9c3c697267db sparse-revlog: rework the way we enforce chunk size limit
Boris Feld <boris.feld@octobus.net>
parents: 40641
diff changeset
226 >>> list(_slicechunktosize(revlog, [1, 2, 3], 3))
9c3c697267db sparse-revlog: rework the way we enforce chunk size limit
Boris Feld <boris.feld@octobus.net>
parents: 40641
diff changeset
227 [[1], [2, 3]]
9c3c697267db sparse-revlog: rework the way we enforce chunk size limit
Boris Feld <boris.feld@octobus.net>
parents: 40641
diff changeset
228 >>> list(_slicechunktosize(revlog, [3, 5], 3))
9c3c697267db sparse-revlog: rework the way we enforce chunk size limit
Boris Feld <boris.feld@octobus.net>
parents: 40641
diff changeset
229 [[3], [5]]
9c3c697267db sparse-revlog: rework the way we enforce chunk size limit
Boris Feld <boris.feld@octobus.net>
parents: 40641
diff changeset
230 >>> list(_slicechunktosize(revlog, [3, 4, 5], 3))
9c3c697267db sparse-revlog: rework the way we enforce chunk size limit
Boris Feld <boris.feld@octobus.net>
parents: 40641
diff changeset
231 [[3], [4, 5]]
9c3c697267db sparse-revlog: rework the way we enforce chunk size limit
Boris Feld <boris.feld@octobus.net>
parents: 40641
diff changeset
232 >>> list(_slicechunktosize(revlog, [5, 6, 7, 8], 3))
9c3c697267db sparse-revlog: rework the way we enforce chunk size limit
Boris Feld <boris.feld@octobus.net>
parents: 40641
diff changeset
233 [[5], [6, 7, 8]]
9c3c697267db sparse-revlog: rework the way we enforce chunk size limit
Boris Feld <boris.feld@octobus.net>
parents: 40641
diff changeset
234 >>> list(_slicechunktosize(revlog, [0, 1, 2, 3, 4, 5, 6, 7, 8], 3))
9c3c697267db sparse-revlog: rework the way we enforce chunk size limit
Boris Feld <boris.feld@octobus.net>
parents: 40641
diff changeset
235 [[0], [1, 2], [3], [5], [6, 7, 8]]
9c3c697267db sparse-revlog: rework the way we enforce chunk size limit
Boris Feld <boris.feld@octobus.net>
parents: 40641
diff changeset
236
9c3c697267db sparse-revlog: rework the way we enforce chunk size limit
Boris Feld <boris.feld@octobus.net>
parents: 40641
diff changeset
237 Case with too large individual chunk (must return valid chunk)
9c3c697267db sparse-revlog: rework the way we enforce chunk size limit
Boris Feld <boris.feld@octobus.net>
parents: 40641
diff changeset
238 >>> list(_slicechunktosize(revlog, [0, 1], 2))
9c3c697267db sparse-revlog: rework the way we enforce chunk size limit
Boris Feld <boris.feld@octobus.net>
parents: 40641
diff changeset
239 [[0], [1]]
9c3c697267db sparse-revlog: rework the way we enforce chunk size limit
Boris Feld <boris.feld@octobus.net>
parents: 40641
diff changeset
240 >>> list(_slicechunktosize(revlog, [1, 3], 1))
9c3c697267db sparse-revlog: rework the way we enforce chunk size limit
Boris Feld <boris.feld@octobus.net>
parents: 40641
diff changeset
241 [[1], [3]]
9c3c697267db sparse-revlog: rework the way we enforce chunk size limit
Boris Feld <boris.feld@octobus.net>
parents: 40641
diff changeset
242 >>> list(_slicechunktosize(revlog, [3, 4, 5], 2))
9c3c697267db sparse-revlog: rework the way we enforce chunk size limit
Boris Feld <boris.feld@octobus.net>
parents: 40641
diff changeset
243 [[3], [5]]
9c3c697267db sparse-revlog: rework the way we enforce chunk size limit
Boris Feld <boris.feld@octobus.net>
parents: 40641
diff changeset
244
9c3c697267db sparse-revlog: rework the way we enforce chunk size limit
Boris Feld <boris.feld@octobus.net>
parents: 40641
diff changeset
245 == mixed case ==
9c3c697267db sparse-revlog: rework the way we enforce chunk size limit
Boris Feld <boris.feld@octobus.net>
parents: 40641
diff changeset
246 >>> revlog = _testrevlog(data, snapshot=[0, 1, 2])
9c3c697267db sparse-revlog: rework the way we enforce chunk size limit
Boris Feld <boris.feld@octobus.net>
parents: 40641
diff changeset
247 >>> list(_slicechunktosize(revlog, list(range(9)), 5))
9c3c697267db sparse-revlog: rework the way we enforce chunk size limit
Boris Feld <boris.feld@octobus.net>
parents: 40641
diff changeset
248 [[0, 1], [2], [3, 4, 5], [6, 7, 8]]
38642
e59e27e52297 revlog: add function to slice chunk down to a given size
Boris Feld <boris.feld@octobus.net>
parents: 38641
diff changeset
249 """
e59e27e52297 revlog: add function to slice chunk down to a given size
Boris Feld <boris.feld@octobus.net>
parents: 38641
diff changeset
250 assert targetsize is None or 0 <= targetsize
40642
9c3c697267db sparse-revlog: rework the way we enforce chunk size limit
Boris Feld <boris.feld@octobus.net>
parents: 40641
diff changeset
251 startdata = revlog.start(revs[0])
9c3c697267db sparse-revlog: rework the way we enforce chunk size limit
Boris Feld <boris.feld@octobus.net>
parents: 40641
diff changeset
252 enddata = revlog.end(revs[-1])
9c3c697267db sparse-revlog: rework the way we enforce chunk size limit
Boris Feld <boris.feld@octobus.net>
parents: 40641
diff changeset
253 fullspan = enddata - startdata
9c3c697267db sparse-revlog: rework the way we enforce chunk size limit
Boris Feld <boris.feld@octobus.net>
parents: 40641
diff changeset
254 if targetsize is None or fullspan <= targetsize:
38642
e59e27e52297 revlog: add function to slice chunk down to a given size
Boris Feld <boris.feld@octobus.net>
parents: 38641
diff changeset
255 yield revs
e59e27e52297 revlog: add function to slice chunk down to a given size
Boris Feld <boris.feld@octobus.net>
parents: 38641
diff changeset
256 return
e59e27e52297 revlog: add function to slice chunk down to a given size
Boris Feld <boris.feld@octobus.net>
parents: 38641
diff changeset
257
e59e27e52297 revlog: add function to slice chunk down to a given size
Boris Feld <boris.feld@octobus.net>
parents: 38641
diff changeset
258 startrevidx = 0
40657
2eb48aa0acce sparse-revlog: align endrevidx usages in the _slicechunktosize
Boris Feld <boris.feld@octobus.net>
parents: 40654
diff changeset
259 endrevidx = 1
38642
e59e27e52297 revlog: add function to slice chunk down to a given size
Boris Feld <boris.feld@octobus.net>
parents: 38641
diff changeset
260 iterrevs = enumerate(revs)
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42992
diff changeset
261 next(iterrevs) # skip first rev.
40642
9c3c697267db sparse-revlog: rework the way we enforce chunk size limit
Boris Feld <boris.feld@octobus.net>
parents: 40641
diff changeset
262 # first step: get snapshots out of the way
38642
e59e27e52297 revlog: add function to slice chunk down to a given size
Boris Feld <boris.feld@octobus.net>
parents: 38641
diff changeset
263 for idx, r in iterrevs:
e59e27e52297 revlog: add function to slice chunk down to a given size
Boris Feld <boris.feld@octobus.net>
parents: 38641
diff changeset
264 span = revlog.end(r) - startdata
40642
9c3c697267db sparse-revlog: rework the way we enforce chunk size limit
Boris Feld <boris.feld@octobus.net>
parents: 40641
diff changeset
265 snapshot = revlog.issnapshot(r)
9c3c697267db sparse-revlog: rework the way we enforce chunk size limit
Boris Feld <boris.feld@octobus.net>
parents: 40641
diff changeset
266 if span <= targetsize and snapshot:
40657
2eb48aa0acce sparse-revlog: align endrevidx usages in the _slicechunktosize
Boris Feld <boris.feld@octobus.net>
parents: 40654
diff changeset
267 endrevidx = idx + 1
38642
e59e27e52297 revlog: add function to slice chunk down to a given size
Boris Feld <boris.feld@octobus.net>
parents: 38641
diff changeset
268 else:
40657
2eb48aa0acce sparse-revlog: align endrevidx usages in the _slicechunktosize
Boris Feld <boris.feld@octobus.net>
parents: 40654
diff changeset
269 chunk = _trimchunk(revlog, revs, startrevidx, endrevidx)
38642
e59e27e52297 revlog: add function to slice chunk down to a given size
Boris Feld <boris.feld@octobus.net>
parents: 38641
diff changeset
270 if chunk:
e59e27e52297 revlog: add function to slice chunk down to a given size
Boris Feld <boris.feld@octobus.net>
parents: 38641
diff changeset
271 yield chunk
e59e27e52297 revlog: add function to slice chunk down to a given size
Boris Feld <boris.feld@octobus.net>
parents: 38641
diff changeset
272 startrevidx = idx
e59e27e52297 revlog: add function to slice chunk down to a given size
Boris Feld <boris.feld@octobus.net>
parents: 38641
diff changeset
273 startdata = revlog.start(r)
40657
2eb48aa0acce sparse-revlog: align endrevidx usages in the _slicechunktosize
Boris Feld <boris.feld@octobus.net>
parents: 40654
diff changeset
274 endrevidx = idx + 1
40642
9c3c697267db sparse-revlog: rework the way we enforce chunk size limit
Boris Feld <boris.feld@octobus.net>
parents: 40641
diff changeset
275 if not snapshot:
9c3c697267db sparse-revlog: rework the way we enforce chunk size limit
Boris Feld <boris.feld@octobus.net>
parents: 40641
diff changeset
276 break
9c3c697267db sparse-revlog: rework the way we enforce chunk size limit
Boris Feld <boris.feld@octobus.net>
parents: 40641
diff changeset
277
9c3c697267db sparse-revlog: rework the way we enforce chunk size limit
Boris Feld <boris.feld@octobus.net>
parents: 40641
diff changeset
278 # for the others, we use binary slicing to quickly converge toward valid
9c3c697267db sparse-revlog: rework the way we enforce chunk size limit
Boris Feld <boris.feld@octobus.net>
parents: 40641
diff changeset
279 # chunks (otherwise, we might end up looking for start/end of many
9c3c697267db sparse-revlog: rework the way we enforce chunk size limit
Boris Feld <boris.feld@octobus.net>
parents: 40641
diff changeset
280 # revisions). This logic is not looking for the perfect slicing point, it
9c3c697267db sparse-revlog: rework the way we enforce chunk size limit
Boris Feld <boris.feld@octobus.net>
parents: 40641
diff changeset
281 # focuses on quickly converging toward valid chunks.
9c3c697267db sparse-revlog: rework the way we enforce chunk size limit
Boris Feld <boris.feld@octobus.net>
parents: 40641
diff changeset
282 nbitem = len(revs)
9c3c697267db sparse-revlog: rework the way we enforce chunk size limit
Boris Feld <boris.feld@octobus.net>
parents: 40641
diff changeset
283 while (enddata - startdata) > targetsize:
9c3c697267db sparse-revlog: rework the way we enforce chunk size limit
Boris Feld <boris.feld@octobus.net>
parents: 40641
diff changeset
284 endrevidx = nbitem
9c3c697267db sparse-revlog: rework the way we enforce chunk size limit
Boris Feld <boris.feld@octobus.net>
parents: 40641
diff changeset
285 if nbitem - startrevidx <= 1:
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42992
diff changeset
286 break # protect against individual chunk larger than limit
40642
9c3c697267db sparse-revlog: rework the way we enforce chunk size limit
Boris Feld <boris.feld@octobus.net>
parents: 40641
diff changeset
287 localenddata = revlog.end(revs[endrevidx - 1])
9c3c697267db sparse-revlog: rework the way we enforce chunk size limit
Boris Feld <boris.feld@octobus.net>
parents: 40641
diff changeset
288 span = localenddata - startdata
40654
fd1d41ccbe38 sparse-revlog: use `span` variable as intended
Boris Feld <boris.feld@octobus.net>
parents: 40642
diff changeset
289 while span > targetsize:
40642
9c3c697267db sparse-revlog: rework the way we enforce chunk size limit
Boris Feld <boris.feld@octobus.net>
parents: 40641
diff changeset
290 if endrevidx - startrevidx <= 1:
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42992
diff changeset
291 break # protect against individual chunk larger than limit
40642
9c3c697267db sparse-revlog: rework the way we enforce chunk size limit
Boris Feld <boris.feld@octobus.net>
parents: 40641
diff changeset
292 endrevidx -= (endrevidx - startrevidx) // 2
9c3c697267db sparse-revlog: rework the way we enforce chunk size limit
Boris Feld <boris.feld@octobus.net>
parents: 40641
diff changeset
293 localenddata = revlog.end(revs[endrevidx - 1])
9c3c697267db sparse-revlog: rework the way we enforce chunk size limit
Boris Feld <boris.feld@octobus.net>
parents: 40641
diff changeset
294 span = localenddata - startdata
9c3c697267db sparse-revlog: rework the way we enforce chunk size limit
Boris Feld <boris.feld@octobus.net>
parents: 40641
diff changeset
295 chunk = _trimchunk(revlog, revs, startrevidx, endrevidx)
9c3c697267db sparse-revlog: rework the way we enforce chunk size limit
Boris Feld <boris.feld@octobus.net>
parents: 40641
diff changeset
296 if chunk:
9c3c697267db sparse-revlog: rework the way we enforce chunk size limit
Boris Feld <boris.feld@octobus.net>
parents: 40641
diff changeset
297 yield chunk
9c3c697267db sparse-revlog: rework the way we enforce chunk size limit
Boris Feld <boris.feld@octobus.net>
parents: 40641
diff changeset
298 startrevidx = endrevidx
9c3c697267db sparse-revlog: rework the way we enforce chunk size limit
Boris Feld <boris.feld@octobus.net>
parents: 40641
diff changeset
299 startdata = revlog.start(revs[startrevidx])
9c3c697267db sparse-revlog: rework the way we enforce chunk size limit
Boris Feld <boris.feld@octobus.net>
parents: 40641
diff changeset
300
9c3c697267db sparse-revlog: rework the way we enforce chunk size limit
Boris Feld <boris.feld@octobus.net>
parents: 40641
diff changeset
301 chunk = _trimchunk(revlog, revs, startrevidx)
9c3c697267db sparse-revlog: rework the way we enforce chunk size limit
Boris Feld <boris.feld@octobus.net>
parents: 40641
diff changeset
302 if chunk:
9c3c697267db sparse-revlog: rework the way we enforce chunk size limit
Boris Feld <boris.feld@octobus.net>
parents: 40641
diff changeset
303 yield chunk
38642
e59e27e52297 revlog: add function to slice chunk down to a given size
Boris Feld <boris.feld@octobus.net>
parents: 38641
diff changeset
304
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42992
diff changeset
305
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42992
diff changeset
306 def _slicechunktodensity(revlog, revs, targetdensity=0.5, mingapsize=0):
38641
feba6be0941b revlog: extract density based slicing into its own function
Boris Feld <boris.feld@octobus.net>
parents: 38640
diff changeset
307 """slice revs to reduce the amount of unrelated data to be read from disk.
feba6be0941b revlog: extract density based slicing into its own function
Boris Feld <boris.feld@octobus.net>
parents: 38640
diff changeset
308
feba6be0941b revlog: extract density based slicing into its own function
Boris Feld <boris.feld@octobus.net>
parents: 38640
diff changeset
309 ``revs`` is sliced into groups that should be read in one time.
feba6be0941b revlog: extract density based slicing into its own function
Boris Feld <boris.feld@octobus.net>
parents: 38640
diff changeset
310 Assume that revs are sorted.
feba6be0941b revlog: extract density based slicing into its own function
Boris Feld <boris.feld@octobus.net>
parents: 38640
diff changeset
311
feba6be0941b revlog: extract density based slicing into its own function
Boris Feld <boris.feld@octobus.net>
parents: 38640
diff changeset
312 The initial chunk is sliced until the overall density (payload/chunks-span
feba6be0941b revlog: extract density based slicing into its own function
Boris Feld <boris.feld@octobus.net>
parents: 38640
diff changeset
313 ratio) is above `targetdensity`. No gap smaller than `mingapsize` is
feba6be0941b revlog: extract density based slicing into its own function
Boris Feld <boris.feld@octobus.net>
parents: 38640
diff changeset
314 skipped.
feba6be0941b revlog: extract density based slicing into its own function
Boris Feld <boris.feld@octobus.net>
parents: 38640
diff changeset
315
feba6be0941b revlog: extract density based slicing into its own function
Boris Feld <boris.feld@octobus.net>
parents: 38640
diff changeset
316 >>> revlog = _testrevlog([
feba6be0941b revlog: extract density based slicing into its own function
Boris Feld <boris.feld@octobus.net>
parents: 38640
diff changeset
317 ... 5, #00 (5)
feba6be0941b revlog: extract density based slicing into its own function
Boris Feld <boris.feld@octobus.net>
parents: 38640
diff changeset
318 ... 10, #01 (5)
feba6be0941b revlog: extract density based slicing into its own function
Boris Feld <boris.feld@octobus.net>
parents: 38640
diff changeset
319 ... 12, #02 (2)
feba6be0941b revlog: extract density based slicing into its own function
Boris Feld <boris.feld@octobus.net>
parents: 38640
diff changeset
320 ... 12, #03 (empty)
feba6be0941b revlog: extract density based slicing into its own function
Boris Feld <boris.feld@octobus.net>
parents: 38640
diff changeset
321 ... 27, #04 (15)
feba6be0941b revlog: extract density based slicing into its own function
Boris Feld <boris.feld@octobus.net>
parents: 38640
diff changeset
322 ... 31, #05 (4)
feba6be0941b revlog: extract density based slicing into its own function
Boris Feld <boris.feld@octobus.net>
parents: 38640
diff changeset
323 ... 31, #06 (empty)
feba6be0941b revlog: extract density based slicing into its own function
Boris Feld <boris.feld@octobus.net>
parents: 38640
diff changeset
324 ... 42, #07 (11)
feba6be0941b revlog: extract density based slicing into its own function
Boris Feld <boris.feld@octobus.net>
parents: 38640
diff changeset
325 ... 47, #08 (5)
feba6be0941b revlog: extract density based slicing into its own function
Boris Feld <boris.feld@octobus.net>
parents: 38640
diff changeset
326 ... 47, #09 (empty)
feba6be0941b revlog: extract density based slicing into its own function
Boris Feld <boris.feld@octobus.net>
parents: 38640
diff changeset
327 ... 48, #10 (1)
feba6be0941b revlog: extract density based slicing into its own function
Boris Feld <boris.feld@octobus.net>
parents: 38640
diff changeset
328 ... 51, #11 (3)
feba6be0941b revlog: extract density based slicing into its own function
Boris Feld <boris.feld@octobus.net>
parents: 38640
diff changeset
329 ... 74, #12 (23)
feba6be0941b revlog: extract density based slicing into its own function
Boris Feld <boris.feld@octobus.net>
parents: 38640
diff changeset
330 ... 85, #13 (11)
feba6be0941b revlog: extract density based slicing into its own function
Boris Feld <boris.feld@octobus.net>
parents: 38640
diff changeset
331 ... 86, #14 (1)
feba6be0941b revlog: extract density based slicing into its own function
Boris Feld <boris.feld@octobus.net>
parents: 38640
diff changeset
332 ... 91, #15 (5)
feba6be0941b revlog: extract density based slicing into its own function
Boris Feld <boris.feld@octobus.net>
parents: 38640
diff changeset
333 ... ])
feba6be0941b revlog: extract density based slicing into its own function
Boris Feld <boris.feld@octobus.net>
parents: 38640
diff changeset
334
38655
cd1c484e31e8 revlog: adjust doctest examples to be portable to Python 3
Augie Fackler <augie@google.com>
parents: 38644
diff changeset
335 >>> list(_slicechunktodensity(revlog, list(range(16))))
38641
feba6be0941b revlog: extract density based slicing into its own function
Boris Feld <boris.feld@octobus.net>
parents: 38640
diff changeset
336 [[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]]
feba6be0941b revlog: extract density based slicing into its own function
Boris Feld <boris.feld@octobus.net>
parents: 38640
diff changeset
337 >>> list(_slicechunktodensity(revlog, [0, 15]))
feba6be0941b revlog: extract density based slicing into its own function
Boris Feld <boris.feld@octobus.net>
parents: 38640
diff changeset
338 [[0], [15]]
feba6be0941b revlog: extract density based slicing into its own function
Boris Feld <boris.feld@octobus.net>
parents: 38640
diff changeset
339 >>> list(_slicechunktodensity(revlog, [0, 11, 15]))
feba6be0941b revlog: extract density based slicing into its own function
Boris Feld <boris.feld@octobus.net>
parents: 38640
diff changeset
340 [[0], [11], [15]]
feba6be0941b revlog: extract density based slicing into its own function
Boris Feld <boris.feld@octobus.net>
parents: 38640
diff changeset
341 >>> list(_slicechunktodensity(revlog, [0, 11, 13, 15]))
feba6be0941b revlog: extract density based slicing into its own function
Boris Feld <boris.feld@octobus.net>
parents: 38640
diff changeset
342 [[0], [11, 13, 15]]
feba6be0941b revlog: extract density based slicing into its own function
Boris Feld <boris.feld@octobus.net>
parents: 38640
diff changeset
343 >>> list(_slicechunktodensity(revlog, [1, 2, 3, 5, 8, 10, 11, 14]))
feba6be0941b revlog: extract density based slicing into its own function
Boris Feld <boris.feld@octobus.net>
parents: 38640
diff changeset
344 [[1, 2], [5, 8, 10, 11], [14]]
feba6be0941b revlog: extract density based slicing into its own function
Boris Feld <boris.feld@octobus.net>
parents: 38640
diff changeset
345 >>> list(_slicechunktodensity(revlog, [1, 2, 3, 5, 8, 10, 11, 14],
feba6be0941b revlog: extract density based slicing into its own function
Boris Feld <boris.feld@octobus.net>
parents: 38640
diff changeset
346 ... mingapsize=20))
feba6be0941b revlog: extract density based slicing into its own function
Boris Feld <boris.feld@octobus.net>
parents: 38640
diff changeset
347 [[1, 2, 3, 5, 8, 10, 11], [14]]
feba6be0941b revlog: extract density based slicing into its own function
Boris Feld <boris.feld@octobus.net>
parents: 38640
diff changeset
348 >>> list(_slicechunktodensity(revlog, [1, 2, 3, 5, 8, 10, 11, 14],
feba6be0941b revlog: extract density based slicing into its own function
Boris Feld <boris.feld@octobus.net>
parents: 38640
diff changeset
349 ... targetdensity=0.95))
feba6be0941b revlog: extract density based slicing into its own function
Boris Feld <boris.feld@octobus.net>
parents: 38640
diff changeset
350 [[1, 2], [5], [8, 10, 11], [14]]
feba6be0941b revlog: extract density based slicing into its own function
Boris Feld <boris.feld@octobus.net>
parents: 38640
diff changeset
351 >>> list(_slicechunktodensity(revlog, [1, 2, 3, 5, 8, 10, 11, 14],
feba6be0941b revlog: extract density based slicing into its own function
Boris Feld <boris.feld@octobus.net>
parents: 38640
diff changeset
352 ... targetdensity=0.95, mingapsize=12))
feba6be0941b revlog: extract density based slicing into its own function
Boris Feld <boris.feld@octobus.net>
parents: 38640
diff changeset
353 [[1, 2], [5, 8, 10, 11], [14]]
feba6be0941b revlog: extract density based slicing into its own function
Boris Feld <boris.feld@octobus.net>
parents: 38640
diff changeset
354 """
34824
e2ad93bcc084 revlog: introduce an experimental flag to slice chunks reads when too sparse
Paul Morelle <paul.morelle@octobus.net>
parents: 34823
diff changeset
355 start = revlog.start
e2ad93bcc084 revlog: introduce an experimental flag to slice chunks reads when too sparse
Paul Morelle <paul.morelle@octobus.net>
parents: 34823
diff changeset
356 length = revlog.length
e2ad93bcc084 revlog: introduce an experimental flag to slice chunks reads when too sparse
Paul Morelle <paul.morelle@octobus.net>
parents: 34823
diff changeset
357
34880
9e18ab7f7240 sparse-read: move from a recursive-based approach to a heap-based one
Boris Feld <boris.feld@octobus.net>
parents: 34825
diff changeset
358 if len(revs) <= 1:
9e18ab7f7240 sparse-read: move from a recursive-based approach to a heap-based one
Boris Feld <boris.feld@octobus.net>
parents: 34825
diff changeset
359 yield revs
9e18ab7f7240 sparse-read: move from a recursive-based approach to a heap-based one
Boris Feld <boris.feld@octobus.net>
parents: 34825
diff changeset
360 return
34824
e2ad93bcc084 revlog: introduce an experimental flag to slice chunks reads when too sparse
Paul Morelle <paul.morelle@octobus.net>
parents: 34823
diff changeset
361
40604
3ac23dad6364 sparse-revlog: drop unused deltainfo parameter from _slicechunktodensity
Boris Feld <boris.feld@octobus.net>
parents: 40603
diff changeset
362 deltachainspan = segmentspan(revlog, revs)
34880
9e18ab7f7240 sparse-read: move from a recursive-based approach to a heap-based one
Boris Feld <boris.feld@octobus.net>
parents: 34825
diff changeset
363
38641
feba6be0941b revlog: extract density based slicing into its own function
Boris Feld <boris.feld@octobus.net>
parents: 38640
diff changeset
364 if deltachainspan < mingapsize:
38635
d083ae26c325 revlog: early return in _slicechunk when span is already small enough
Boris Feld <boris.feld@octobus.net>
parents: 38634
diff changeset
365 yield revs
d083ae26c325 revlog: early return in _slicechunk when span is already small enough
Boris Feld <boris.feld@octobus.net>
parents: 38634
diff changeset
366 return
d083ae26c325 revlog: early return in _slicechunk when span is already small enough
Boris Feld <boris.feld@octobus.net>
parents: 38634
diff changeset
367
38718
f8762ea73e0d sparse-revlog: implement algorithm to write sparse delta chains (issue5480)
Paul Morelle <paul.morelle@octobus.net>
parents: 38717
diff changeset
368 readdata = deltachainspan
40606
bfbfd15d65bd sparse-revlog: fast-path before computing payload size
Boris Feld <boris.feld@octobus.net>
parents: 40605
diff changeset
369 chainpayload = sum(length(r) for r in revs)
34824
e2ad93bcc084 revlog: introduce an experimental flag to slice chunks reads when too sparse
Paul Morelle <paul.morelle@octobus.net>
parents: 34823
diff changeset
370
34880
9e18ab7f7240 sparse-read: move from a recursive-based approach to a heap-based one
Boris Feld <boris.feld@octobus.net>
parents: 34825
diff changeset
371 if deltachainspan:
9e18ab7f7240 sparse-read: move from a recursive-based approach to a heap-based one
Boris Feld <boris.feld@octobus.net>
parents: 34825
diff changeset
372 density = chainpayload / float(deltachainspan)
9e18ab7f7240 sparse-read: move from a recursive-based approach to a heap-based one
Boris Feld <boris.feld@octobus.net>
parents: 34825
diff changeset
373 else:
9e18ab7f7240 sparse-read: move from a recursive-based approach to a heap-based one
Boris Feld <boris.feld@octobus.net>
parents: 34825
diff changeset
374 density = 1.0
34824
e2ad93bcc084 revlog: introduce an experimental flag to slice chunks reads when too sparse
Paul Morelle <paul.morelle@octobus.net>
parents: 34823
diff changeset
375
38641
feba6be0941b revlog: extract density based slicing into its own function
Boris Feld <boris.feld@octobus.net>
parents: 38640
diff changeset
376 if density >= targetdensity:
38634
f0ea8b847831 revlog: early return in _slicechunk when density is already good
Paul Morelle <paul.morelle@octobus.net>
parents: 38632
diff changeset
377 yield revs
f0ea8b847831 revlog: early return in _slicechunk when density is already good
Paul Morelle <paul.morelle@octobus.net>
parents: 38632
diff changeset
378 return
f0ea8b847831 revlog: early return in _slicechunk when density is already good
Paul Morelle <paul.morelle@octobus.net>
parents: 38632
diff changeset
379
34880
9e18ab7f7240 sparse-read: move from a recursive-based approach to a heap-based one
Boris Feld <boris.feld@octobus.net>
parents: 34825
diff changeset
380 # Store the gaps in a heap to have them sorted by decreasing size
40607
54de23400b2a sparse-revlog: stop using a heap to track gaps
Boris Feld <boris.feld@octobus.net>
parents: 40606
diff changeset
381 gaps = []
34880
9e18ab7f7240 sparse-read: move from a recursive-based approach to a heap-based one
Boris Feld <boris.feld@octobus.net>
parents: 34825
diff changeset
382 prevend = None
9e18ab7f7240 sparse-read: move from a recursive-based approach to a heap-based one
Boris Feld <boris.feld@octobus.net>
parents: 34825
diff changeset
383 for i, rev in enumerate(revs):
40604
3ac23dad6364 sparse-revlog: drop unused deltainfo parameter from _slicechunktodensity
Boris Feld <boris.feld@octobus.net>
parents: 40603
diff changeset
384 revstart = start(rev)
3ac23dad6364 sparse-revlog: drop unused deltainfo parameter from _slicechunktodensity
Boris Feld <boris.feld@octobus.net>
parents: 40603
diff changeset
385 revlen = length(rev)
34824
e2ad93bcc084 revlog: introduce an experimental flag to slice chunks reads when too sparse
Paul Morelle <paul.morelle@octobus.net>
parents: 34823
diff changeset
386
34898
1bde8e8e5de0 sparse-read: ignore trailing empty revs in each read chunk
Paul Morelle <paul.morelle@octobus.net>
parents: 34881
diff changeset
387 # Skip empty revisions to form larger holes
1bde8e8e5de0 sparse-read: ignore trailing empty revs in each read chunk
Paul Morelle <paul.morelle@octobus.net>
parents: 34881
diff changeset
388 if revlen == 0:
1bde8e8e5de0 sparse-read: ignore trailing empty revs in each read chunk
Paul Morelle <paul.morelle@octobus.net>
parents: 34881
diff changeset
389 continue
1bde8e8e5de0 sparse-read: ignore trailing empty revs in each read chunk
Paul Morelle <paul.morelle@octobus.net>
parents: 34881
diff changeset
390
34880
9e18ab7f7240 sparse-read: move from a recursive-based approach to a heap-based one
Boris Feld <boris.feld@octobus.net>
parents: 34825
diff changeset
391 if prevend is not None:
9e18ab7f7240 sparse-read: move from a recursive-based approach to a heap-based one
Boris Feld <boris.feld@octobus.net>
parents: 34825
diff changeset
392 gapsize = revstart - prevend
34881
8c9b08a0c48c sparse-read: skip gaps too small to be worth splitting
Paul Morelle <paul.morelle@octobus.net>
parents: 34880
diff changeset
393 # only consider holes that are large enough
38641
feba6be0941b revlog: extract density based slicing into its own function
Boris Feld <boris.feld@octobus.net>
parents: 38640
diff changeset
394 if gapsize > mingapsize:
40607
54de23400b2a sparse-revlog: stop using a heap to track gaps
Boris Feld <boris.feld@octobus.net>
parents: 40606
diff changeset
395 gaps.append((gapsize, i))
34880
9e18ab7f7240 sparse-read: move from a recursive-based approach to a heap-based one
Boris Feld <boris.feld@octobus.net>
parents: 34825
diff changeset
396
9e18ab7f7240 sparse-read: move from a recursive-based approach to a heap-based one
Boris Feld <boris.feld@octobus.net>
parents: 34825
diff changeset
397 prevend = revstart + revlen
40607
54de23400b2a sparse-revlog: stop using a heap to track gaps
Boris Feld <boris.feld@octobus.net>
parents: 40606
diff changeset
398 # sort the gaps to pop them from largest to small
54de23400b2a sparse-revlog: stop using a heap to track gaps
Boris Feld <boris.feld@octobus.net>
parents: 40606
diff changeset
399 gaps.sort()
34880
9e18ab7f7240 sparse-read: move from a recursive-based approach to a heap-based one
Boris Feld <boris.feld@octobus.net>
parents: 34825
diff changeset
400
9e18ab7f7240 sparse-read: move from a recursive-based approach to a heap-based one
Boris Feld <boris.feld@octobus.net>
parents: 34825
diff changeset
401 # Collect the indices of the largest holes until the density is acceptable
40608
526ee887c4d5 sparse-revlog: stop using a heap to track selected gap
Boris Feld <boris.feld@octobus.net>
parents: 40607
diff changeset
402 selected = []
40607
54de23400b2a sparse-revlog: stop using a heap to track gaps
Boris Feld <boris.feld@octobus.net>
parents: 40606
diff changeset
403 while gaps and density < targetdensity:
54de23400b2a sparse-revlog: stop using a heap to track gaps
Boris Feld <boris.feld@octobus.net>
parents: 40606
diff changeset
404 gapsize, gapidx = gaps.pop()
34880
9e18ab7f7240 sparse-read: move from a recursive-based approach to a heap-based one
Boris Feld <boris.feld@octobus.net>
parents: 34825
diff changeset
405
40608
526ee887c4d5 sparse-revlog: stop using a heap to track selected gap
Boris Feld <boris.feld@octobus.net>
parents: 40607
diff changeset
406 selected.append(gapidx)
34824
e2ad93bcc084 revlog: introduce an experimental flag to slice chunks reads when too sparse
Paul Morelle <paul.morelle@octobus.net>
parents: 34823
diff changeset
407
34880
9e18ab7f7240 sparse-read: move from a recursive-based approach to a heap-based one
Boris Feld <boris.feld@octobus.net>
parents: 34825
diff changeset
408 # the gap sizes are stored as negatives to be sorted decreasingly
9e18ab7f7240 sparse-read: move from a recursive-based approach to a heap-based one
Boris Feld <boris.feld@octobus.net>
parents: 34825
diff changeset
409 # by the heap
40607
54de23400b2a sparse-revlog: stop using a heap to track gaps
Boris Feld <boris.feld@octobus.net>
parents: 40606
diff changeset
410 readdata -= gapsize
34880
9e18ab7f7240 sparse-read: move from a recursive-based approach to a heap-based one
Boris Feld <boris.feld@octobus.net>
parents: 34825
diff changeset
411 if readdata > 0:
9e18ab7f7240 sparse-read: move from a recursive-based approach to a heap-based one
Boris Feld <boris.feld@octobus.net>
parents: 34825
diff changeset
412 density = chainpayload / float(readdata)
9e18ab7f7240 sparse-read: move from a recursive-based approach to a heap-based one
Boris Feld <boris.feld@octobus.net>
parents: 34825
diff changeset
413 else:
9e18ab7f7240 sparse-read: move from a recursive-based approach to a heap-based one
Boris Feld <boris.feld@octobus.net>
parents: 34825
diff changeset
414 density = 1.0
40608
526ee887c4d5 sparse-revlog: stop using a heap to track selected gap
Boris Feld <boris.feld@octobus.net>
parents: 40607
diff changeset
415 selected.sort()
34824
e2ad93bcc084 revlog: introduce an experimental flag to slice chunks reads when too sparse
Paul Morelle <paul.morelle@octobus.net>
parents: 34823
diff changeset
416
34880
9e18ab7f7240 sparse-read: move from a recursive-based approach to a heap-based one
Boris Feld <boris.feld@octobus.net>
parents: 34825
diff changeset
417 # Cut the revs at collected indices
9e18ab7f7240 sparse-read: move from a recursive-based approach to a heap-based one
Boris Feld <boris.feld@octobus.net>
parents: 34825
diff changeset
418 previdx = 0
40608
526ee887c4d5 sparse-revlog: stop using a heap to track selected gap
Boris Feld <boris.feld@octobus.net>
parents: 40607
diff changeset
419 for idx in selected:
34898
1bde8e8e5de0 sparse-read: ignore trailing empty revs in each read chunk
Paul Morelle <paul.morelle@octobus.net>
parents: 34881
diff changeset
420
1bde8e8e5de0 sparse-read: ignore trailing empty revs in each read chunk
Paul Morelle <paul.morelle@octobus.net>
parents: 34881
diff changeset
421 chunk = _trimchunk(revlog, revs, previdx, idx)
1bde8e8e5de0 sparse-read: ignore trailing empty revs in each read chunk
Paul Morelle <paul.morelle@octobus.net>
parents: 34881
diff changeset
422 if chunk:
1bde8e8e5de0 sparse-read: ignore trailing empty revs in each read chunk
Paul Morelle <paul.morelle@octobus.net>
parents: 34881
diff changeset
423 yield chunk
1bde8e8e5de0 sparse-read: ignore trailing empty revs in each read chunk
Paul Morelle <paul.morelle@octobus.net>
parents: 34881
diff changeset
424
34880
9e18ab7f7240 sparse-read: move from a recursive-based approach to a heap-based one
Boris Feld <boris.feld@octobus.net>
parents: 34825
diff changeset
425 previdx = idx
34898
1bde8e8e5de0 sparse-read: ignore trailing empty revs in each read chunk
Paul Morelle <paul.morelle@octobus.net>
parents: 34881
diff changeset
426
1bde8e8e5de0 sparse-read: ignore trailing empty revs in each read chunk
Paul Morelle <paul.morelle@octobus.net>
parents: 34881
diff changeset
427 chunk = _trimchunk(revlog, revs, previdx)
1bde8e8e5de0 sparse-read: ignore trailing empty revs in each read chunk
Paul Morelle <paul.morelle@octobus.net>
parents: 34881
diff changeset
428 if chunk:
1bde8e8e5de0 sparse-read: ignore trailing empty revs in each read chunk
Paul Morelle <paul.morelle@octobus.net>
parents: 34881
diff changeset
429 yield chunk
34824
e2ad93bcc084 revlog: introduce an experimental flag to slice chunks reads when too sparse
Paul Morelle <paul.morelle@octobus.net>
parents: 34823
diff changeset
430
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42992
diff changeset
431
39330
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
432 def _trimchunk(revlog, revs, startidx, endidx=None):
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
433 """returns revs[startidx:endidx] without empty trailing revs
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
434
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
435 Doctest Setup
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
436 >>> revlog = _testrevlog([
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
437 ... 5, #0
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
438 ... 10, #1
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
439 ... 12, #2
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
440 ... 12, #3 (empty)
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
441 ... 17, #4
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
442 ... 21, #5
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
443 ... 21, #6 (empty)
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
444 ... ])
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
445
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
446 Contiguous cases:
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
447 >>> _trimchunk(revlog, [0, 1, 2, 3, 4, 5, 6], 0)
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
448 [0, 1, 2, 3, 4, 5]
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
449 >>> _trimchunk(revlog, [0, 1, 2, 3, 4, 5, 6], 0, 5)
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
450 [0, 1, 2, 3, 4]
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
451 >>> _trimchunk(revlog, [0, 1, 2, 3, 4, 5, 6], 0, 4)
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
452 [0, 1, 2]
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
453 >>> _trimchunk(revlog, [0, 1, 2, 3, 4, 5, 6], 2, 4)
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
454 [2]
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
455 >>> _trimchunk(revlog, [0, 1, 2, 3, 4, 5, 6], 3)
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
456 [3, 4, 5]
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
457 >>> _trimchunk(revlog, [0, 1, 2, 3, 4, 5, 6], 3, 5)
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
458 [3, 4]
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
459
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
460 Discontiguous cases:
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
461 >>> _trimchunk(revlog, [1, 3, 5, 6], 0)
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
462 [1, 3, 5]
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
463 >>> _trimchunk(revlog, [1, 3, 5, 6], 0, 2)
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
464 [1]
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
465 >>> _trimchunk(revlog, [1, 3, 5, 6], 1, 3)
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
466 [3, 5]
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
467 >>> _trimchunk(revlog, [1, 3, 5, 6], 1)
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
468 [3, 5]
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
469 """
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
470 length = revlog.length
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
471
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
472 if endidx is None:
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
473 endidx = len(revs)
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
474
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
475 # If we have a non-emtpy delta candidate, there are nothing to trim
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
476 if revs[endidx - 1] < len(revlog):
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
477 # Trim empty revs at the end, except the very first revision of a chain
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42992
diff changeset
478 while (
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42992
diff changeset
479 endidx > 1 and endidx > startidx and length(revs[endidx - 1]) == 0
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42992
diff changeset
480 ):
39330
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
481 endidx -= 1
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
482
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
483 return revs[startidx:endidx]
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
484
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42992
diff changeset
485
40605
a32ccd32982b sparse-revlog: drop unused deltainfo parameter from segmentspan
Boris Feld <boris.feld@octobus.net>
parents: 40604
diff changeset
486 def segmentspan(revlog, revs):
39330
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
487 """Get the byte span of a segment of revisions
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
488
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
489 revs is a sorted array of revision numbers
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
490
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
491 >>> revlog = _testrevlog([
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
492 ... 5, #0
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
493 ... 10, #1
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
494 ... 12, #2
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
495 ... 12, #3 (empty)
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
496 ... 17, #4
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
497 ... ])
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
498
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
499 >>> segmentspan(revlog, [0, 1, 2, 3, 4])
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
500 17
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
501 >>> segmentspan(revlog, [0, 4])
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
502 17
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
503 >>> segmentspan(revlog, [3, 4])
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
504 5
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
505 >>> segmentspan(revlog, [1, 2, 3,])
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
506 7
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
507 >>> segmentspan(revlog, [1, 3])
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
508 7
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
509 """
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
510 if not revs:
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
511 return 0
40605
a32ccd32982b sparse-revlog: drop unused deltainfo parameter from segmentspan
Boris Feld <boris.feld@octobus.net>
parents: 40604
diff changeset
512 end = revlog.end(revs[-1])
39330
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
513 return end - revlog.start(revs[0])
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
514
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42992
diff changeset
515
39331
fd0150a3c2fe revlogdeltas: extra fulltext building in its own function
Boris Feld <boris.feld@octobus.net>
parents: 39330
diff changeset
516 def _textfromdelta(fh, revlog, baserev, delta, p1, p2, flags, expectednode):
fd0150a3c2fe revlogdeltas: extra fulltext building in its own function
Boris Feld <boris.feld@octobus.net>
parents: 39330
diff changeset
517 """build full text from a (base, delta) pair and other metadata"""
fd0150a3c2fe revlogdeltas: extra fulltext building in its own function
Boris Feld <boris.feld@octobus.net>
parents: 39330
diff changeset
518 # special case deltas which replace entire base; no need to decode
fd0150a3c2fe revlogdeltas: extra fulltext building in its own function
Boris Feld <boris.feld@octobus.net>
parents: 39330
diff changeset
519 # base revision. this neatly avoids censored bases, which throw when
fd0150a3c2fe revlogdeltas: extra fulltext building in its own function
Boris Feld <boris.feld@octobus.net>
parents: 39330
diff changeset
520 # they're decoded.
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
521 hlen = struct.calcsize(b">lll")
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42992
diff changeset
522 if delta[:hlen] == mdiff.replacediffheader(
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42992
diff changeset
523 revlog.rawsize(baserev), len(delta) - hlen
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42992
diff changeset
524 ):
39331
fd0150a3c2fe revlogdeltas: extra fulltext building in its own function
Boris Feld <boris.feld@octobus.net>
parents: 39330
diff changeset
525 fulltext = delta[hlen:]
fd0150a3c2fe revlogdeltas: extra fulltext building in its own function
Boris Feld <boris.feld@octobus.net>
parents: 39330
diff changeset
526 else:
fd0150a3c2fe revlogdeltas: extra fulltext building in its own function
Boris Feld <boris.feld@octobus.net>
parents: 39330
diff changeset
527 # deltabase is rawtext before changed by flag processors, which is
fd0150a3c2fe revlogdeltas: extra fulltext building in its own function
Boris Feld <boris.feld@octobus.net>
parents: 39330
diff changeset
528 # equivalent to non-raw text
fd0150a3c2fe revlogdeltas: extra fulltext building in its own function
Boris Feld <boris.feld@octobus.net>
parents: 39330
diff changeset
529 basetext = revlog.revision(baserev, _df=fh, raw=False)
fd0150a3c2fe revlogdeltas: extra fulltext building in its own function
Boris Feld <boris.feld@octobus.net>
parents: 39330
diff changeset
530 fulltext = mdiff.patch(basetext, delta)
fd0150a3c2fe revlogdeltas: extra fulltext building in its own function
Boris Feld <boris.feld@octobus.net>
parents: 39330
diff changeset
531
fd0150a3c2fe revlogdeltas: extra fulltext building in its own function
Boris Feld <boris.feld@octobus.net>
parents: 39330
diff changeset
532 try:
42992
dff95420480f flagprocessors: make `processflagsraw` a module level function
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 42878
diff changeset
533 validatehash = flagutil.processflagsraw(revlog, fulltext, flags)
39331
fd0150a3c2fe revlogdeltas: extra fulltext building in its own function
Boris Feld <boris.feld@octobus.net>
parents: 39330
diff changeset
534 if validatehash:
fd0150a3c2fe revlogdeltas: extra fulltext building in its own function
Boris Feld <boris.feld@octobus.net>
parents: 39330
diff changeset
535 revlog.checkhash(fulltext, expectednode, p1=p1, p2=p2)
fd0150a3c2fe revlogdeltas: extra fulltext building in its own function
Boris Feld <boris.feld@octobus.net>
parents: 39330
diff changeset
536 if flags & REVIDX_ISCENSORED:
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42992
diff changeset
537 raise error.StorageError(
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
538 _(b'node %s is not censored') % expectednode
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42992
diff changeset
539 )
39774
4a2466b2a434 revlog: drop some more error aliases (API)
Gregory Szorc <gregory.szorc@gmail.com>
parents: 39773
diff changeset
540 except error.CensoredNodeError:
39331
fd0150a3c2fe revlogdeltas: extra fulltext building in its own function
Boris Feld <boris.feld@octobus.net>
parents: 39330
diff changeset
541 # must pass the censored index flag to add censored revisions
fd0150a3c2fe revlogdeltas: extra fulltext building in its own function
Boris Feld <boris.feld@octobus.net>
parents: 39330
diff changeset
542 if not flags & REVIDX_ISCENSORED:
fd0150a3c2fe revlogdeltas: extra fulltext building in its own function
Boris Feld <boris.feld@octobus.net>
parents: 39330
diff changeset
543 raise
fd0150a3c2fe revlogdeltas: extra fulltext building in its own function
Boris Feld <boris.feld@octobus.net>
parents: 39330
diff changeset
544 return fulltext
fd0150a3c2fe revlogdeltas: extra fulltext building in its own function
Boris Feld <boris.feld@octobus.net>
parents: 39330
diff changeset
545
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42992
diff changeset
546
35638
edc9330acac1 revlog: introduce 'deltainfo' to distinguish from 'delta'
Paul Morelle <paul.morelle@octobus.net>
parents: 35637
diff changeset
547 @attr.s(slots=True, frozen=True)
edc9330acac1 revlog: introduce 'deltainfo' to distinguish from 'delta'
Paul Morelle <paul.morelle@octobus.net>
parents: 35637
diff changeset
548 class _deltainfo(object):
edc9330acac1 revlog: introduce 'deltainfo' to distinguish from 'delta'
Paul Morelle <paul.morelle@octobus.net>
parents: 35637
diff changeset
549 distance = attr.ib()
edc9330acac1 revlog: introduce 'deltainfo' to distinguish from 'delta'
Paul Morelle <paul.morelle@octobus.net>
parents: 35637
diff changeset
550 deltalen = attr.ib()
edc9330acac1 revlog: introduce 'deltainfo' to distinguish from 'delta'
Paul Morelle <paul.morelle@octobus.net>
parents: 35637
diff changeset
551 data = attr.ib()
edc9330acac1 revlog: introduce 'deltainfo' to distinguish from 'delta'
Paul Morelle <paul.morelle@octobus.net>
parents: 35637
diff changeset
552 base = attr.ib()
edc9330acac1 revlog: introduce 'deltainfo' to distinguish from 'delta'
Paul Morelle <paul.morelle@octobus.net>
parents: 35637
diff changeset
553 chainbase = attr.ib()
edc9330acac1 revlog: introduce 'deltainfo' to distinguish from 'delta'
Paul Morelle <paul.morelle@octobus.net>
parents: 35637
diff changeset
554 chainlen = attr.ib()
edc9330acac1 revlog: introduce 'deltainfo' to distinguish from 'delta'
Paul Morelle <paul.morelle@octobus.net>
parents: 35637
diff changeset
555 compresseddeltalen = attr.ib()
39154
e0da43e2f71f revlog: compute snapshot depth on delta info
Boris Feld <boris.feld@octobus.net>
parents: 39152
diff changeset
556 snapshotdepth = attr.ib()
35638
edc9330acac1 revlog: introduce 'deltainfo' to distinguish from 'delta'
Paul Morelle <paul.morelle@octobus.net>
parents: 35637
diff changeset
557
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42992
diff changeset
558
47253
b876f0bf7366 revlog: introduce a plain compression mode
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 46819
diff changeset
559 def drop_u_compression(delta):
b876f0bf7366 revlog: introduce a plain compression mode
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 46819
diff changeset
560 """turn into a "u" (no-compression) into no-compression without header
b876f0bf7366 revlog: introduce a plain compression mode
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 46819
diff changeset
561
b876f0bf7366 revlog: introduce a plain compression mode
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 46819
diff changeset
562 This is useful for revlog format that has better compression method.
b876f0bf7366 revlog: introduce a plain compression mode
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 46819
diff changeset
563 """
b876f0bf7366 revlog: introduce a plain compression mode
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 46819
diff changeset
564 assert delta.data[0] == b'u', delta.data[0]
b876f0bf7366 revlog: introduce a plain compression mode
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 46819
diff changeset
565 return _deltainfo(
b876f0bf7366 revlog: introduce a plain compression mode
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 46819
diff changeset
566 delta.distance,
b876f0bf7366 revlog: introduce a plain compression mode
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 46819
diff changeset
567 delta.deltalen - 1,
b876f0bf7366 revlog: introduce a plain compression mode
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 46819
diff changeset
568 (b'', delta.data[1]),
b876f0bf7366 revlog: introduce a plain compression mode
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 46819
diff changeset
569 delta.base,
b876f0bf7366 revlog: introduce a plain compression mode
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 46819
diff changeset
570 delta.chainbase,
b876f0bf7366 revlog: introduce a plain compression mode
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 46819
diff changeset
571 delta.chainlen,
b876f0bf7366 revlog: introduce a plain compression mode
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 46819
diff changeset
572 delta.compresseddeltalen,
b876f0bf7366 revlog: introduce a plain compression mode
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 46819
diff changeset
573 delta.snapshotdepth,
b876f0bf7366 revlog: introduce a plain compression mode
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 46819
diff changeset
574 )
b876f0bf7366 revlog: introduce a plain compression mode
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 46819
diff changeset
575
b876f0bf7366 revlog: introduce a plain compression mode
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 46819
diff changeset
576
39330
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
577 def isgooddeltainfo(revlog, deltainfo, revinfo):
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
578 """Returns True if the given delta is good. Good means that it is within
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
579 the disk span, disk size, and chain length bounds that we know to be
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
580 performant."""
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
581 if deltainfo is None:
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
582 return False
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
583
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
584 # - 'deltainfo.distance' is the distance from the base revision --
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
585 # bounding it limits the amount of I/O we need to do.
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
586 # - 'deltainfo.compresseddeltalen' is the sum of the total size of
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
587 # deltas we need to apply -- bounding it limits the amount of CPU
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
588 # we consume.
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
589
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
590 textlen = revinfo.textlen
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
591 defaultmax = textlen * 4
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
592 maxdist = revlog._maxdeltachainspan
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
593 if not maxdist:
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42992
diff changeset
594 maxdist = deltainfo.distance # ensure the conditional pass
39330
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
595 maxdist = max(maxdist, defaultmax)
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
596
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
597 # Bad delta from read span:
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
598 #
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
599 # If the span of data read is larger than the maximum allowed.
40603
2f7e531ef3e7 sparse-revlog: skip the span check in the sparse-revlog case
Boris Feld <boris.feld@octobus.net>
parents: 40451
diff changeset
600 #
2f7e531ef3e7 sparse-revlog: skip the span check in the sparse-revlog case
Boris Feld <boris.feld@octobus.net>
parents: 40451
diff changeset
601 # In the sparse-revlog case, we rely on the associated "sparse reading"
2f7e531ef3e7 sparse-revlog: skip the span check in the sparse-revlog case
Boris Feld <boris.feld@octobus.net>
parents: 40451
diff changeset
602 # to avoid issue related to the span of data. In theory, it would be
2f7e531ef3e7 sparse-revlog: skip the span check in the sparse-revlog case
Boris Feld <boris.feld@octobus.net>
parents: 40451
diff changeset
603 # possible to build pathological revlog where delta pattern would lead
2f7e531ef3e7 sparse-revlog: skip the span check in the sparse-revlog case
Boris Feld <boris.feld@octobus.net>
parents: 40451
diff changeset
604 # to too many reads. However, they do not happen in practice at all. So
2f7e531ef3e7 sparse-revlog: skip the span check in the sparse-revlog case
Boris Feld <boris.feld@octobus.net>
parents: 40451
diff changeset
605 # we skip the span check entirely.
2f7e531ef3e7 sparse-revlog: skip the span check in the sparse-revlog case
Boris Feld <boris.feld@octobus.net>
parents: 40451
diff changeset
606 if not revlog._sparserevlog and maxdist < deltainfo.distance:
39330
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
607 return False
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
608
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
609 # Bad delta from new delta size:
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
610 #
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
611 # If the delta size is larger than the target text, storing the
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
612 # delta will be inefficient.
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
613 if textlen < deltainfo.deltalen:
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
614 return False
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
615
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
616 # Bad delta from cumulated payload size:
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
617 #
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
618 # If the sum of delta get larger than K * target text length.
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
619 if textlen * LIMIT_DELTA2TEXT < deltainfo.compresseddeltalen:
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
620 return False
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
621
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
622 # Bad delta from chain length:
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
623 #
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
624 # If the number of delta in the chain gets too high.
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42992
diff changeset
625 if revlog._maxchainlen and revlog._maxchainlen < deltainfo.chainlen:
39330
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
626 return False
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
627
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
628 # bad delta from intermediate snapshot size limit
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
629 #
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
630 # If an intermediate snapshot size is higher than the limit. The
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
631 # limit exist to prevent endless chain of intermediate delta to be
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
632 # created.
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42992
diff changeset
633 if (
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42992
diff changeset
634 deltainfo.snapshotdepth is not None
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42992
diff changeset
635 and (textlen >> deltainfo.snapshotdepth) < deltainfo.deltalen
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42992
diff changeset
636 ):
39330
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
637 return False
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
638
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
639 # bad delta if new intermediate snapshot is larger than the previous
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
640 # snapshot
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42992
diff changeset
641 if (
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42992
diff changeset
642 deltainfo.snapshotdepth
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42992
diff changeset
643 and revlog.length(deltainfo.base) < deltainfo.deltalen
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42992
diff changeset
644 ):
39330
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
645 return False
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
646
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
647 return True
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
648
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42992
diff changeset
649
40978
42f59d3f714d delta: exclude base candidate much smaller than the target
Boris Feld <boris.feld@octobus.net>
parents: 40957
diff changeset
650 # If a revision's full text is that much bigger than a base candidate full
42f59d3f714d delta: exclude base candidate much smaller than the target
Boris Feld <boris.feld@octobus.net>
parents: 40957
diff changeset
651 # text's, it is very unlikely that it will produce a valid delta. We no longer
42f59d3f714d delta: exclude base candidate much smaller than the target
Boris Feld <boris.feld@octobus.net>
parents: 40957
diff changeset
652 # consider these candidates.
41033
b373477948df revlog: limit base to rev size ratio to 500 instead of 50
Boris Feld <boris.feld@octobus.net>
parents: 40979
diff changeset
653 LIMIT_BASE2TEXT = 500
40978
42f59d3f714d delta: exclude base candidate much smaller than the target
Boris Feld <boris.feld@octobus.net>
parents: 40957
diff changeset
654
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42992
diff changeset
655
39337
37957e07138c revlogdeltas: move finddeltainfo filtering inside _candidategroups
Boris Feld <boris.feld@octobus.net>
parents: 39336
diff changeset
656 def _candidategroups(revlog, textlen, p1, p2, cachedelta):
39336
1c6ff52fe9cf revlogdeltas: split candidate groups selection from the filtering logic
Boris Feld <boris.feld@octobus.net>
parents: 39335
diff changeset
657 """Provides group of revision to be tested as delta base
1c6ff52fe9cf revlogdeltas: split candidate groups selection from the filtering logic
Boris Feld <boris.feld@octobus.net>
parents: 39335
diff changeset
658
1c6ff52fe9cf revlogdeltas: split candidate groups selection from the filtering logic
Boris Feld <boris.feld@octobus.net>
parents: 39335
diff changeset
659 This top level function focus on emitting groups with unique and worthwhile
1c6ff52fe9cf revlogdeltas: split candidate groups selection from the filtering logic
Boris Feld <boris.feld@octobus.net>
parents: 39335
diff changeset
660 content. See _raw_candidate_groups for details about the group order.
39334
507f5b1dd7c8 revlogdeltas: extract _getcandidaterevs in a function
Boris Feld <boris.feld@octobus.net>
parents: 39333
diff changeset
661 """
39336
1c6ff52fe9cf revlogdeltas: split candidate groups selection from the filtering logic
Boris Feld <boris.feld@octobus.net>
parents: 39335
diff changeset
662 # should we try to build a delta?
1c6ff52fe9cf revlogdeltas: split candidate groups selection from the filtering logic
Boris Feld <boris.feld@octobus.net>
parents: 39335
diff changeset
663 if not (len(revlog) and revlog._storedeltachains):
39497
5b308a4e6d03 snapshot: use None as a stop value when looking for a good delta
Boris Feld <boris.feld@octobus.net>
parents: 39496
diff changeset
664 yield None
39336
1c6ff52fe9cf revlogdeltas: split candidate groups selection from the filtering logic
Boris Feld <boris.feld@octobus.net>
parents: 39335
diff changeset
665 return
1c6ff52fe9cf revlogdeltas: split candidate groups selection from the filtering logic
Boris Feld <boris.feld@octobus.net>
parents: 39335
diff changeset
666
39337
37957e07138c revlogdeltas: move finddeltainfo filtering inside _candidategroups
Boris Feld <boris.feld@octobus.net>
parents: 39336
diff changeset
667 deltalength = revlog.length
37957e07138c revlogdeltas: move finddeltainfo filtering inside _candidategroups
Boris Feld <boris.feld@octobus.net>
parents: 39336
diff changeset
668 deltaparent = revlog.deltaparent
40978
42f59d3f714d delta: exclude base candidate much smaller than the target
Boris Feld <boris.feld@octobus.net>
parents: 40957
diff changeset
669 sparse = revlog._sparserevlog
39498
04b75f3a3f2a snapshot: add refining logic at the findeltainfo level
Boris Feld <boris.feld@octobus.net>
parents: 39497
diff changeset
670 good = None
39337
37957e07138c revlogdeltas: move finddeltainfo filtering inside _candidategroups
Boris Feld <boris.feld@octobus.net>
parents: 39336
diff changeset
671
37957e07138c revlogdeltas: move finddeltainfo filtering inside _candidategroups
Boris Feld <boris.feld@octobus.net>
parents: 39336
diff changeset
672 deltas_limit = textlen * LIMIT_DELTA2TEXT
37957e07138c revlogdeltas: move finddeltainfo filtering inside _candidategroups
Boris Feld <boris.feld@octobus.net>
parents: 39336
diff changeset
673
42057
566daffc607d cleanup: use set literals where possible
Martin von Zweigbergk <martinvonz@google.com>
parents: 41819
diff changeset
674 tested = {nullrev}
39499
51cec7fb672e snapshot: also use None as a stop value for `_refinegroup`
Boris Feld <boris.feld@octobus.net>
parents: 39498
diff changeset
675 candidates = _refinedgroups(revlog, p1, p2, cachedelta)
51cec7fb672e snapshot: also use None as a stop value for `_refinegroup`
Boris Feld <boris.feld@octobus.net>
parents: 39498
diff changeset
676 while True:
39500
cc85ebb68ff9 snapshot: turn _refinedgroups into a coroutine
Boris Feld <boris.feld@octobus.net>
parents: 39499
diff changeset
677 temptative = candidates.send(good)
39499
51cec7fb672e snapshot: also use None as a stop value for `_refinegroup`
Boris Feld <boris.feld@octobus.net>
parents: 39498
diff changeset
678 if temptative is None:
51cec7fb672e snapshot: also use None as a stop value for `_refinegroup`
Boris Feld <boris.feld@octobus.net>
parents: 39498
diff changeset
679 break
39337
37957e07138c revlogdeltas: move finddeltainfo filtering inside _candidategroups
Boris Feld <boris.feld@octobus.net>
parents: 39336
diff changeset
680 group = []
37957e07138c revlogdeltas: move finddeltainfo filtering inside _candidategroups
Boris Feld <boris.feld@octobus.net>
parents: 39336
diff changeset
681 for rev in temptative:
37957e07138c revlogdeltas: move finddeltainfo filtering inside _candidategroups
Boris Feld <boris.feld@octobus.net>
parents: 39336
diff changeset
682 # skip over empty delta (no need to include them in a chain)
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42992
diff changeset
683 while revlog._generaldelta and not (
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42992
diff changeset
684 rev == nullrev or rev in tested or deltalength(rev)
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42992
diff changeset
685 ):
39594
bdb41eaa8b59 snapshot: fix line order when skipping over empty deltas
Boris Feld <boris.feld@octobus.net>
parents: 39505
diff changeset
686 tested.add(rev)
39337
37957e07138c revlogdeltas: move finddeltainfo filtering inside _candidategroups
Boris Feld <boris.feld@octobus.net>
parents: 39336
diff changeset
687 rev = deltaparent(rev)
40957
f960c51eebf3 delta: filter nullrev out first
Boris Feld <boris.feld@octobus.net>
parents: 40709
diff changeset
688 # no need to try a delta against nullrev, this will be done as a
f960c51eebf3 delta: filter nullrev out first
Boris Feld <boris.feld@octobus.net>
parents: 40709
diff changeset
689 # last resort.
f960c51eebf3 delta: filter nullrev out first
Boris Feld <boris.feld@octobus.net>
parents: 40709
diff changeset
690 if rev == nullrev:
f960c51eebf3 delta: filter nullrev out first
Boris Feld <boris.feld@octobus.net>
parents: 40709
diff changeset
691 continue
39337
37957e07138c revlogdeltas: move finddeltainfo filtering inside _candidategroups
Boris Feld <boris.feld@octobus.net>
parents: 39336
diff changeset
692 # filter out revision we tested already
37957e07138c revlogdeltas: move finddeltainfo filtering inside _candidategroups
Boris Feld <boris.feld@octobus.net>
parents: 39336
diff changeset
693 if rev in tested:
37957e07138c revlogdeltas: move finddeltainfo filtering inside _candidategroups
Boris Feld <boris.feld@octobus.net>
parents: 39336
diff changeset
694 continue
37957e07138c revlogdeltas: move finddeltainfo filtering inside _candidategroups
Boris Feld <boris.feld@octobus.net>
parents: 39336
diff changeset
695 tested.add(rev)
37957e07138c revlogdeltas: move finddeltainfo filtering inside _candidategroups
Boris Feld <boris.feld@octobus.net>
parents: 39336
diff changeset
696 # filter out delta base that will never produce good delta
37957e07138c revlogdeltas: move finddeltainfo filtering inside _candidategroups
Boris Feld <boris.feld@octobus.net>
parents: 39336
diff changeset
697 if deltas_limit < revlog.length(rev):
37957e07138c revlogdeltas: move finddeltainfo filtering inside _candidategroups
Boris Feld <boris.feld@octobus.net>
parents: 39336
diff changeset
698 continue
40978
42f59d3f714d delta: exclude base candidate much smaller than the target
Boris Feld <boris.feld@octobus.net>
parents: 40957
diff changeset
699 if sparse and revlog.rawsize(rev) < (textlen // LIMIT_BASE2TEXT):
42f59d3f714d delta: exclude base candidate much smaller than the target
Boris Feld <boris.feld@octobus.net>
parents: 40957
diff changeset
700 continue
39337
37957e07138c revlogdeltas: move finddeltainfo filtering inside _candidategroups
Boris Feld <boris.feld@octobus.net>
parents: 39336
diff changeset
701 # no delta for rawtext-changing revs (see "candelta" for why)
37957e07138c revlogdeltas: move finddeltainfo filtering inside _candidategroups
Boris Feld <boris.feld@octobus.net>
parents: 39336
diff changeset
702 if revlog.flags(rev) & REVIDX_RAWTEXT_CHANGING_FLAGS:
37957e07138c revlogdeltas: move finddeltainfo filtering inside _candidategroups
Boris Feld <boris.feld@octobus.net>
parents: 39336
diff changeset
703 continue
40979
ba09db267cb6 delta: ignore base whose chains already don't match expectations
Boris Feld <boris.feld@octobus.net>
parents: 40978
diff changeset
704 # If we reach here, we are about to build and test a delta.
ba09db267cb6 delta: ignore base whose chains already don't match expectations
Boris Feld <boris.feld@octobus.net>
parents: 40978
diff changeset
705 # The delta building process will compute the chaininfo in all
ba09db267cb6 delta: ignore base whose chains already don't match expectations
Boris Feld <boris.feld@octobus.net>
parents: 40978
diff changeset
706 # case, since that computation is cached, it is fine to access it
ba09db267cb6 delta: ignore base whose chains already don't match expectations
Boris Feld <boris.feld@octobus.net>
parents: 40978
diff changeset
707 # here too.
ba09db267cb6 delta: ignore base whose chains already don't match expectations
Boris Feld <boris.feld@octobus.net>
parents: 40978
diff changeset
708 chainlen, chainsize = revlog._chaininfo(rev)
ba09db267cb6 delta: ignore base whose chains already don't match expectations
Boris Feld <boris.feld@octobus.net>
parents: 40978
diff changeset
709 # if chain will be too long, skip base
ba09db267cb6 delta: ignore base whose chains already don't match expectations
Boris Feld <boris.feld@octobus.net>
parents: 40978
diff changeset
710 if revlog._maxchainlen and chainlen >= revlog._maxchainlen:
ba09db267cb6 delta: ignore base whose chains already don't match expectations
Boris Feld <boris.feld@octobus.net>
parents: 40978
diff changeset
711 continue
ba09db267cb6 delta: ignore base whose chains already don't match expectations
Boris Feld <boris.feld@octobus.net>
parents: 40978
diff changeset
712 # if chain already have too much data, skip base
ba09db267cb6 delta: ignore base whose chains already don't match expectations
Boris Feld <boris.feld@octobus.net>
parents: 40978
diff changeset
713 if deltas_limit < chainsize:
ba09db267cb6 delta: ignore base whose chains already don't match expectations
Boris Feld <boris.feld@octobus.net>
parents: 40978
diff changeset
714 continue
42463
a0b26fc8fbba deltas: skip if projected delta size does not match text size constraint
Valentin Gatien-Baron <vgatien-baron@janestreet.com>, Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 42057
diff changeset
715 if sparse and revlog.upperboundcomp is not None:
a0b26fc8fbba deltas: skip if projected delta size does not match text size constraint
Valentin Gatien-Baron <vgatien-baron@janestreet.com>, Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 42057
diff changeset
716 maxcomp = revlog.upperboundcomp
a0b26fc8fbba deltas: skip if projected delta size does not match text size constraint
Valentin Gatien-Baron <vgatien-baron@janestreet.com>, Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 42057
diff changeset
717 basenotsnap = (p1, p2, nullrev)
a0b26fc8fbba deltas: skip if projected delta size does not match text size constraint
Valentin Gatien-Baron <vgatien-baron@janestreet.com>, Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 42057
diff changeset
718 if rev not in basenotsnap and revlog.issnapshot(rev):
a0b26fc8fbba deltas: skip if projected delta size does not match text size constraint
Valentin Gatien-Baron <vgatien-baron@janestreet.com>, Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 42057
diff changeset
719 snapshotdepth = revlog.snapshotdepth(rev)
a0b26fc8fbba deltas: skip if projected delta size does not match text size constraint
Valentin Gatien-Baron <vgatien-baron@janestreet.com>, Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 42057
diff changeset
720 # If text is significantly larger than the base, we can
a0b26fc8fbba deltas: skip if projected delta size does not match text size constraint
Valentin Gatien-Baron <vgatien-baron@janestreet.com>, Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 42057
diff changeset
721 # expect the resulting delta to be proportional to the size
a0b26fc8fbba deltas: skip if projected delta size does not match text size constraint
Valentin Gatien-Baron <vgatien-baron@janestreet.com>, Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 42057
diff changeset
722 # difference
a0b26fc8fbba deltas: skip if projected delta size does not match text size constraint
Valentin Gatien-Baron <vgatien-baron@janestreet.com>, Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 42057
diff changeset
723 revsize = revlog.rawsize(rev)
a0b26fc8fbba deltas: skip if projected delta size does not match text size constraint
Valentin Gatien-Baron <vgatien-baron@janestreet.com>, Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 42057
diff changeset
724 rawsizedistance = max(textlen - revsize, 0)
a0b26fc8fbba deltas: skip if projected delta size does not match text size constraint
Valentin Gatien-Baron <vgatien-baron@janestreet.com>, Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 42057
diff changeset
725 # use an estimate of the compression upper bound.
a0b26fc8fbba deltas: skip if projected delta size does not match text size constraint
Valentin Gatien-Baron <vgatien-baron@janestreet.com>, Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 42057
diff changeset
726 lowestrealisticdeltalen = rawsizedistance // maxcomp
a0b26fc8fbba deltas: skip if projected delta size does not match text size constraint
Valentin Gatien-Baron <vgatien-baron@janestreet.com>, Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 42057
diff changeset
727
a0b26fc8fbba deltas: skip if projected delta size does not match text size constraint
Valentin Gatien-Baron <vgatien-baron@janestreet.com>, Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 42057
diff changeset
728 # check the absolute constraint on the delta size
a0b26fc8fbba deltas: skip if projected delta size does not match text size constraint
Valentin Gatien-Baron <vgatien-baron@janestreet.com>, Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 42057
diff changeset
729 snapshotlimit = textlen >> snapshotdepth
a0b26fc8fbba deltas: skip if projected delta size does not match text size constraint
Valentin Gatien-Baron <vgatien-baron@janestreet.com>, Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 42057
diff changeset
730 if snapshotlimit < lowestrealisticdeltalen:
a0b26fc8fbba deltas: skip if projected delta size does not match text size constraint
Valentin Gatien-Baron <vgatien-baron@janestreet.com>, Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 42057
diff changeset
731 # delta lower bound is larger than accepted upper bound
a0b26fc8fbba deltas: skip if projected delta size does not match text size constraint
Valentin Gatien-Baron <vgatien-baron@janestreet.com>, Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 42057
diff changeset
732 continue
a0b26fc8fbba deltas: skip if projected delta size does not match text size constraint
Valentin Gatien-Baron <vgatien-baron@janestreet.com>, Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 42057
diff changeset
733
42464
66c27df1be84 deltas: skip if projected delta size is bigger than previous snapshot
Valentin Gatien-Baron <vgatien-baron@janestreet.com>
parents: 42463
diff changeset
734 # check the relative constraint on the delta size
66c27df1be84 deltas: skip if projected delta size is bigger than previous snapshot
Valentin Gatien-Baron <vgatien-baron@janestreet.com>
parents: 42463
diff changeset
735 revlength = revlog.length(rev)
66c27df1be84 deltas: skip if projected delta size is bigger than previous snapshot
Valentin Gatien-Baron <vgatien-baron@janestreet.com>
parents: 42463
diff changeset
736 if revlength < lowestrealisticdeltalen:
66c27df1be84 deltas: skip if projected delta size is bigger than previous snapshot
Valentin Gatien-Baron <vgatien-baron@janestreet.com>
parents: 42463
diff changeset
737 # delta probable lower bound is larger than target base
66c27df1be84 deltas: skip if projected delta size is bigger than previous snapshot
Valentin Gatien-Baron <vgatien-baron@janestreet.com>
parents: 42463
diff changeset
738 continue
66c27df1be84 deltas: skip if projected delta size is bigger than previous snapshot
Valentin Gatien-Baron <vgatien-baron@janestreet.com>
parents: 42463
diff changeset
739
39337
37957e07138c revlogdeltas: move finddeltainfo filtering inside _candidategroups
Boris Feld <boris.feld@octobus.net>
parents: 39336
diff changeset
740 group.append(rev)
39336
1c6ff52fe9cf revlogdeltas: split candidate groups selection from the filtering logic
Boris Feld <boris.feld@octobus.net>
parents: 39335
diff changeset
741 if group:
39493
3ca144f1c8dd snapshot: search for unrelated but reusable full-snapshot
Boris Feld <boris.feld@octobus.net>
parents: 39492
diff changeset
742 # XXX: in the sparse revlog case, group can become large,
3ca144f1c8dd snapshot: search for unrelated but reusable full-snapshot
Boris Feld <boris.feld@octobus.net>
parents: 39492
diff changeset
743 # impacting performances. Some bounding or slicing mecanism
3ca144f1c8dd snapshot: search for unrelated but reusable full-snapshot
Boris Feld <boris.feld@octobus.net>
parents: 39492
diff changeset
744 # would help to reduce this impact.
39498
04b75f3a3f2a snapshot: add refining logic at the findeltainfo level
Boris Feld <boris.feld@octobus.net>
parents: 39497
diff changeset
745 good = yield tuple(group)
39497
5b308a4e6d03 snapshot: use None as a stop value when looking for a good delta
Boris Feld <boris.feld@octobus.net>
parents: 39496
diff changeset
746 yield None
39336
1c6ff52fe9cf revlogdeltas: split candidate groups selection from the filtering logic
Boris Feld <boris.feld@octobus.net>
parents: 39335
diff changeset
747
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42992
diff changeset
748
39493
3ca144f1c8dd snapshot: search for unrelated but reusable full-snapshot
Boris Feld <boris.feld@octobus.net>
parents: 39492
diff changeset
749 def _findsnapshots(revlog, cache, start_rev):
3ca144f1c8dd snapshot: search for unrelated but reusable full-snapshot
Boris Feld <boris.feld@octobus.net>
parents: 39492
diff changeset
750 """find snapshot from start_rev to tip"""
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
751 if util.safehasattr(revlog.index, b'findsnapshots'):
41108
38e88450138c delta: have a native implementation of _findsnapshot
Boris Feld <boris.feld@octobus.net>
parents: 41079
diff changeset
752 revlog.index.findsnapshots(cache, start_rev)
38e88450138c delta: have a native implementation of _findsnapshot
Boris Feld <boris.feld@octobus.net>
parents: 41079
diff changeset
753 else:
38e88450138c delta: have a native implementation of _findsnapshot
Boris Feld <boris.feld@octobus.net>
parents: 41079
diff changeset
754 deltaparent = revlog.deltaparent
38e88450138c delta: have a native implementation of _findsnapshot
Boris Feld <boris.feld@octobus.net>
parents: 41079
diff changeset
755 issnapshot = revlog.issnapshot
38e88450138c delta: have a native implementation of _findsnapshot
Boris Feld <boris.feld@octobus.net>
parents: 41079
diff changeset
756 for rev in revlog.revs(start_rev):
38e88450138c delta: have a native implementation of _findsnapshot
Boris Feld <boris.feld@octobus.net>
parents: 41079
diff changeset
757 if issnapshot(rev):
38e88450138c delta: have a native implementation of _findsnapshot
Boris Feld <boris.feld@octobus.net>
parents: 41079
diff changeset
758 cache[deltaparent(rev)].append(rev)
39493
3ca144f1c8dd snapshot: search for unrelated but reusable full-snapshot
Boris Feld <boris.feld@octobus.net>
parents: 39492
diff changeset
759
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42992
diff changeset
760
39496
2f9f7889549b snapshot: introduce an intermediate `_refinedgroups` generator
Boris Feld <boris.feld@octobus.net>
parents: 39495
diff changeset
761 def _refinedgroups(revlog, p1, p2, cachedelta):
2f9f7889549b snapshot: introduce an intermediate `_refinedgroups` generator
Boris Feld <boris.feld@octobus.net>
parents: 39495
diff changeset
762 good = None
39501
993d7e2c8b79 snapshot: make sure we'll never refine delta base from a reused source
Boris Feld <boris.feld@octobus.net>
parents: 39500
diff changeset
763 # First we try to reuse a the delta contained in the bundle.
993d7e2c8b79 snapshot: make sure we'll never refine delta base from a reused source
Boris Feld <boris.feld@octobus.net>
parents: 39500
diff changeset
764 # (or from the source revlog)
993d7e2c8b79 snapshot: make sure we'll never refine delta base from a reused source
Boris Feld <boris.feld@octobus.net>
parents: 39500
diff changeset
765 #
993d7e2c8b79 snapshot: make sure we'll never refine delta base from a reused source
Boris Feld <boris.feld@octobus.net>
parents: 39500
diff changeset
766 # This logic only applies to general delta repositories and can be disabled
993d7e2c8b79 snapshot: make sure we'll never refine delta base from a reused source
Boris Feld <boris.feld@octobus.net>
parents: 39500
diff changeset
767 # through configuration. Disabling reuse source delta is useful when
993d7e2c8b79 snapshot: make sure we'll never refine delta base from a reused source
Boris Feld <boris.feld@octobus.net>
parents: 39500
diff changeset
768 # we want to make sure we recomputed "optimal" deltas.
993d7e2c8b79 snapshot: make sure we'll never refine delta base from a reused source
Boris Feld <boris.feld@octobus.net>
parents: 39500
diff changeset
769 if cachedelta and revlog._generaldelta and revlog._lazydeltabase:
993d7e2c8b79 snapshot: make sure we'll never refine delta base from a reused source
Boris Feld <boris.feld@octobus.net>
parents: 39500
diff changeset
770 # Assume what we received from the server is a good choice
993d7e2c8b79 snapshot: make sure we'll never refine delta base from a reused source
Boris Feld <boris.feld@octobus.net>
parents: 39500
diff changeset
771 # build delta will reuse the cache
993d7e2c8b79 snapshot: make sure we'll never refine delta base from a reused source
Boris Feld <boris.feld@octobus.net>
parents: 39500
diff changeset
772 good = yield (cachedelta[0],)
993d7e2c8b79 snapshot: make sure we'll never refine delta base from a reused source
Boris Feld <boris.feld@octobus.net>
parents: 39500
diff changeset
773 if good is not None:
993d7e2c8b79 snapshot: make sure we'll never refine delta base from a reused source
Boris Feld <boris.feld@octobus.net>
parents: 39500
diff changeset
774 yield None
993d7e2c8b79 snapshot: make sure we'll never refine delta base from a reused source
Boris Feld <boris.feld@octobus.net>
parents: 39500
diff changeset
775 return
41109
3e1960e23e6b delta: reuse _findsnapshot call from previous stage
Boris Feld <boris.feld@octobus.net>
parents: 41108
diff changeset
776 snapshots = collections.defaultdict(list)
3e1960e23e6b delta: reuse _findsnapshot call from previous stage
Boris Feld <boris.feld@octobus.net>
parents: 41108
diff changeset
777 for candidates in _rawgroups(revlog, p1, p2, cachedelta, snapshots):
39496
2f9f7889549b snapshot: introduce an intermediate `_refinedgroups` generator
Boris Feld <boris.feld@octobus.net>
parents: 39495
diff changeset
778 good = yield candidates
2f9f7889549b snapshot: introduce an intermediate `_refinedgroups` generator
Boris Feld <boris.feld@octobus.net>
parents: 39495
diff changeset
779 if good is not None:
2f9f7889549b snapshot: introduce an intermediate `_refinedgroups` generator
Boris Feld <boris.feld@octobus.net>
parents: 39495
diff changeset
780 break
39502
e4d4361d0bcd snapshot: try to refine new snapshot base down the chain
Boris Feld <boris.feld@octobus.net>
parents: 39501
diff changeset
781
40428
bafa1c4bb7a8 sparse-revlog: only refine delta candidates in the sparse case (issue6006)
Boris Feld <boris.feld@octobus.net>
parents: 39777
diff changeset
782 # If sparse revlog is enabled, we can try to refine the available deltas
bafa1c4bb7a8 sparse-revlog: only refine delta candidates in the sparse case (issue6006)
Boris Feld <boris.feld@octobus.net>
parents: 39777
diff changeset
783 if not revlog._sparserevlog:
bafa1c4bb7a8 sparse-revlog: only refine delta candidates in the sparse case (issue6006)
Boris Feld <boris.feld@octobus.net>
parents: 39777
diff changeset
784 yield None
bafa1c4bb7a8 sparse-revlog: only refine delta candidates in the sparse case (issue6006)
Boris Feld <boris.feld@octobus.net>
parents: 39777
diff changeset
785 return
bafa1c4bb7a8 sparse-revlog: only refine delta candidates in the sparse case (issue6006)
Boris Feld <boris.feld@octobus.net>
parents: 39777
diff changeset
786
39502
e4d4361d0bcd snapshot: try to refine new snapshot base down the chain
Boris Feld <boris.feld@octobus.net>
parents: 39501
diff changeset
787 # if we have a refinable value, try to refine it
e4d4361d0bcd snapshot: try to refine new snapshot base down the chain
Boris Feld <boris.feld@octobus.net>
parents: 39501
diff changeset
788 if good is not None and good not in (p1, p2) and revlog.issnapshot(good):
e4d4361d0bcd snapshot: try to refine new snapshot base down the chain
Boris Feld <boris.feld@octobus.net>
parents: 39501
diff changeset
789 # refine snapshot down
e4d4361d0bcd snapshot: try to refine new snapshot base down the chain
Boris Feld <boris.feld@octobus.net>
parents: 39501
diff changeset
790 previous = None
e4d4361d0bcd snapshot: try to refine new snapshot base down the chain
Boris Feld <boris.feld@octobus.net>
parents: 39501
diff changeset
791 while previous != good:
e4d4361d0bcd snapshot: try to refine new snapshot base down the chain
Boris Feld <boris.feld@octobus.net>
parents: 39501
diff changeset
792 previous = good
e4d4361d0bcd snapshot: try to refine new snapshot base down the chain
Boris Feld <boris.feld@octobus.net>
parents: 39501
diff changeset
793 base = revlog.deltaparent(good)
e4d4361d0bcd snapshot: try to refine new snapshot base down the chain
Boris Feld <boris.feld@octobus.net>
parents: 39501
diff changeset
794 if base == nullrev:
e4d4361d0bcd snapshot: try to refine new snapshot base down the chain
Boris Feld <boris.feld@octobus.net>
parents: 39501
diff changeset
795 break
e4d4361d0bcd snapshot: try to refine new snapshot base down the chain
Boris Feld <boris.feld@octobus.net>
parents: 39501
diff changeset
796 good = yield (base,)
39503
5aef5afa8654 snapshot: refine candidate snapshot base upward
Boris Feld <boris.feld@octobus.net>
parents: 39502
diff changeset
797 # refine snapshot up
41109
3e1960e23e6b delta: reuse _findsnapshot call from previous stage
Boris Feld <boris.feld@octobus.net>
parents: 41108
diff changeset
798 if not snapshots:
3e1960e23e6b delta: reuse _findsnapshot call from previous stage
Boris Feld <boris.feld@octobus.net>
parents: 41108
diff changeset
799 _findsnapshots(revlog, snapshots, good + 1)
39503
5aef5afa8654 snapshot: refine candidate snapshot base upward
Boris Feld <boris.feld@octobus.net>
parents: 39502
diff changeset
800 previous = None
5aef5afa8654 snapshot: refine candidate snapshot base upward
Boris Feld <boris.feld@octobus.net>
parents: 39502
diff changeset
801 while good != previous:
5aef5afa8654 snapshot: refine candidate snapshot base upward
Boris Feld <boris.feld@octobus.net>
parents: 39502
diff changeset
802 previous = good
5aef5afa8654 snapshot: refine candidate snapshot base upward
Boris Feld <boris.feld@octobus.net>
parents: 39502
diff changeset
803 children = tuple(sorted(c for c in snapshots[good]))
5aef5afa8654 snapshot: refine candidate snapshot base upward
Boris Feld <boris.feld@octobus.net>
parents: 39502
diff changeset
804 good = yield children
5aef5afa8654 snapshot: refine candidate snapshot base upward
Boris Feld <boris.feld@octobus.net>
parents: 39502
diff changeset
805
39499
51cec7fb672e snapshot: also use None as a stop value for `_refinegroup`
Boris Feld <boris.feld@octobus.net>
parents: 39498
diff changeset
806 # we have found nothing
51cec7fb672e snapshot: also use None as a stop value for `_refinegroup`
Boris Feld <boris.feld@octobus.net>
parents: 39498
diff changeset
807 yield None
39496
2f9f7889549b snapshot: introduce an intermediate `_refinedgroups` generator
Boris Feld <boris.feld@octobus.net>
parents: 39495
diff changeset
808
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42992
diff changeset
809
41109
3e1960e23e6b delta: reuse _findsnapshot call from previous stage
Boris Feld <boris.feld@octobus.net>
parents: 41108
diff changeset
810 def _rawgroups(revlog, p1, p2, cachedelta, snapshots=None):
39336
1c6ff52fe9cf revlogdeltas: split candidate groups selection from the filtering logic
Boris Feld <boris.feld@octobus.net>
parents: 39335
diff changeset
811 """Provides group of revision to be tested as delta base
1c6ff52fe9cf revlogdeltas: split candidate groups selection from the filtering logic
Boris Feld <boris.feld@octobus.net>
parents: 39335
diff changeset
812
1c6ff52fe9cf revlogdeltas: split candidate groups selection from the filtering logic
Boris Feld <boris.feld@octobus.net>
parents: 39335
diff changeset
813 This lower level function focus on emitting delta theorically interresting
1c6ff52fe9cf revlogdeltas: split candidate groups selection from the filtering logic
Boris Feld <boris.feld@octobus.net>
parents: 39335
diff changeset
814 without looking it any practical details.
1c6ff52fe9cf revlogdeltas: split candidate groups selection from the filtering logic
Boris Feld <boris.feld@octobus.net>
parents: 39335
diff changeset
815
1c6ff52fe9cf revlogdeltas: split candidate groups selection from the filtering logic
Boris Feld <boris.feld@octobus.net>
parents: 39335
diff changeset
816 The group order aims at providing fast or small candidates first.
39334
507f5b1dd7c8 revlogdeltas: extract _getcandidaterevs in a function
Boris Feld <boris.feld@octobus.net>
parents: 39333
diff changeset
817 """
507f5b1dd7c8 revlogdeltas: extract _getcandidaterevs in a function
Boris Feld <boris.feld@octobus.net>
parents: 39333
diff changeset
818 gdelta = revlog._generaldelta
41447
189e06b2d719 revlog: make sure we never use sparserevlog without general delta (issue6056)
Boris Feld <boris.feld@octobus.net>
parents: 41109
diff changeset
819 # gate sparse behind general-delta because of issue6056
189e06b2d719 revlog: make sure we never use sparserevlog without general delta (issue6056)
Boris Feld <boris.feld@octobus.net>
parents: 41109
diff changeset
820 sparse = gdelta and revlog._sparserevlog
39334
507f5b1dd7c8 revlogdeltas: extract _getcandidaterevs in a function
Boris Feld <boris.feld@octobus.net>
parents: 39333
diff changeset
821 curr = len(revlog)
507f5b1dd7c8 revlogdeltas: extract _getcandidaterevs in a function
Boris Feld <boris.feld@octobus.net>
parents: 39333
diff changeset
822 prev = curr - 1
39492
a33f394b2bfd snapshot: try intermediate snapshot against parents' base
Boris Feld <boris.feld@octobus.net>
parents: 39488
diff changeset
823 deltachain = lambda rev: revlog._deltachain(rev)[0]
39334
507f5b1dd7c8 revlogdeltas: extract _getcandidaterevs in a function
Boris Feld <boris.feld@octobus.net>
parents: 39333
diff changeset
824
39336
1c6ff52fe9cf revlogdeltas: split candidate groups selection from the filtering logic
Boris Feld <boris.feld@octobus.net>
parents: 39335
diff changeset
825 if gdelta:
1c6ff52fe9cf revlogdeltas: split candidate groups selection from the filtering logic
Boris Feld <boris.feld@octobus.net>
parents: 39335
diff changeset
826 # exclude already lazy tested base if any
1c6ff52fe9cf revlogdeltas: split candidate groups selection from the filtering logic
Boris Feld <boris.feld@octobus.net>
parents: 39335
diff changeset
827 parents = [p for p in (p1, p2) if p != nullrev]
39334
507f5b1dd7c8 revlogdeltas: extract _getcandidaterevs in a function
Boris Feld <boris.feld@octobus.net>
parents: 39333
diff changeset
828
39336
1c6ff52fe9cf revlogdeltas: split candidate groups selection from the filtering logic
Boris Feld <boris.feld@octobus.net>
parents: 39335
diff changeset
829 if not revlog._deltabothparents and len(parents) == 2:
1c6ff52fe9cf revlogdeltas: split candidate groups selection from the filtering logic
Boris Feld <boris.feld@octobus.net>
parents: 39335
diff changeset
830 parents.sort()
1c6ff52fe9cf revlogdeltas: split candidate groups selection from the filtering logic
Boris Feld <boris.feld@octobus.net>
parents: 39335
diff changeset
831 # To minimize the chance of having to build a fulltext,
1c6ff52fe9cf revlogdeltas: split candidate groups selection from the filtering logic
Boris Feld <boris.feld@octobus.net>
parents: 39335
diff changeset
832 # pick first whichever parent is closest to us (max rev)
1c6ff52fe9cf revlogdeltas: split candidate groups selection from the filtering logic
Boris Feld <boris.feld@octobus.net>
parents: 39335
diff changeset
833 yield (parents[1],)
1c6ff52fe9cf revlogdeltas: split candidate groups selection from the filtering logic
Boris Feld <boris.feld@octobus.net>
parents: 39335
diff changeset
834 # then the other one (min rev) if the first did not fit
1c6ff52fe9cf revlogdeltas: split candidate groups selection from the filtering logic
Boris Feld <boris.feld@octobus.net>
parents: 39335
diff changeset
835 yield (parents[0],)
1c6ff52fe9cf revlogdeltas: split candidate groups selection from the filtering logic
Boris Feld <boris.feld@octobus.net>
parents: 39335
diff changeset
836 elif len(parents) > 0:
1c6ff52fe9cf revlogdeltas: split candidate groups selection from the filtering logic
Boris Feld <boris.feld@octobus.net>
parents: 39335
diff changeset
837 # Test all parents (1 or 2), and keep the best candidate
1c6ff52fe9cf revlogdeltas: split candidate groups selection from the filtering logic
Boris Feld <boris.feld@octobus.net>
parents: 39335
diff changeset
838 yield parents
39334
507f5b1dd7c8 revlogdeltas: extract _getcandidaterevs in a function
Boris Feld <boris.feld@octobus.net>
parents: 39333
diff changeset
839
39492
a33f394b2bfd snapshot: try intermediate snapshot against parents' base
Boris Feld <boris.feld@octobus.net>
parents: 39488
diff changeset
840 if sparse and parents:
41109
3e1960e23e6b delta: reuse _findsnapshot call from previous stage
Boris Feld <boris.feld@octobus.net>
parents: 41108
diff changeset
841 if snapshots is None:
3e1960e23e6b delta: reuse _findsnapshot call from previous stage
Boris Feld <boris.feld@octobus.net>
parents: 41108
diff changeset
842 # map: base-rev: snapshot-rev
3e1960e23e6b delta: reuse _findsnapshot call from previous stage
Boris Feld <boris.feld@octobus.net>
parents: 41108
diff changeset
843 snapshots = collections.defaultdict(list)
39492
a33f394b2bfd snapshot: try intermediate snapshot against parents' base
Boris Feld <boris.feld@octobus.net>
parents: 39488
diff changeset
844 # See if we can use an existing snapshot in the parent chains to use as
a33f394b2bfd snapshot: try intermediate snapshot against parents' base
Boris Feld <boris.feld@octobus.net>
parents: 39488
diff changeset
845 # a base for a new intermediate-snapshot
39494
e72130f58f5d snapshot: consider all snapshots in the parents' chains
Boris Feld <boris.feld@octobus.net>
parents: 39493
diff changeset
846 #
e72130f58f5d snapshot: consider all snapshots in the parents' chains
Boris Feld <boris.feld@octobus.net>
parents: 39493
diff changeset
847 # search for snapshot in parents delta chain
e72130f58f5d snapshot: consider all snapshots in the parents' chains
Boris Feld <boris.feld@octobus.net>
parents: 39493
diff changeset
848 # map: snapshot-level: snapshot-rev
e72130f58f5d snapshot: consider all snapshots in the parents' chains
Boris Feld <boris.feld@octobus.net>
parents: 39493
diff changeset
849 parents_snaps = collections.defaultdict(set)
39504
05a165dc4f55 snapshot: extract parent chain computation
Boris Feld <boris.feld@octobus.net>
parents: 39503
diff changeset
850 candidate_chains = [deltachain(p) for p in parents]
05a165dc4f55 snapshot: extract parent chain computation
Boris Feld <boris.feld@octobus.net>
parents: 39503
diff changeset
851 for chain in candidate_chains:
05a165dc4f55 snapshot: extract parent chain computation
Boris Feld <boris.feld@octobus.net>
parents: 39503
diff changeset
852 for idx, s in enumerate(chain):
39494
e72130f58f5d snapshot: consider all snapshots in the parents' chains
Boris Feld <boris.feld@octobus.net>
parents: 39493
diff changeset
853 if not revlog.issnapshot(s):
e72130f58f5d snapshot: consider all snapshots in the parents' chains
Boris Feld <boris.feld@octobus.net>
parents: 39493
diff changeset
854 break
e72130f58f5d snapshot: consider all snapshots in the parents' chains
Boris Feld <boris.feld@octobus.net>
parents: 39493
diff changeset
855 parents_snaps[idx].add(s)
39495
6a53842727c1 snapshot: consider unrelated snapshots at a similar level first
Boris Feld <boris.feld@octobus.net>
parents: 39494
diff changeset
856 snapfloor = min(parents_snaps[0]) + 1
6a53842727c1 snapshot: consider unrelated snapshots at a similar level first
Boris Feld <boris.feld@octobus.net>
parents: 39494
diff changeset
857 _findsnapshots(revlog, snapshots, snapfloor)
39505
c6b8eab5db19 snapshot: also consider the snapshot chain of one unrelated revision
Boris Feld <boris.feld@octobus.net>
parents: 39504
diff changeset
858 # search for the highest "unrelated" revision
c6b8eab5db19 snapshot: also consider the snapshot chain of one unrelated revision
Boris Feld <boris.feld@octobus.net>
parents: 39504
diff changeset
859 #
c6b8eab5db19 snapshot: also consider the snapshot chain of one unrelated revision
Boris Feld <boris.feld@octobus.net>
parents: 39504
diff changeset
860 # Adding snapshots used by "unrelated" revision increase the odd we
c6b8eab5db19 snapshot: also consider the snapshot chain of one unrelated revision
Boris Feld <boris.feld@octobus.net>
parents: 39504
diff changeset
861 # reuse an independant, yet better snapshot chain.
c6b8eab5db19 snapshot: also consider the snapshot chain of one unrelated revision
Boris Feld <boris.feld@octobus.net>
parents: 39504
diff changeset
862 #
c6b8eab5db19 snapshot: also consider the snapshot chain of one unrelated revision
Boris Feld <boris.feld@octobus.net>
parents: 39504
diff changeset
863 # XXX instead of building a set of revisions, we could lazily enumerate
c6b8eab5db19 snapshot: also consider the snapshot chain of one unrelated revision
Boris Feld <boris.feld@octobus.net>
parents: 39504
diff changeset
864 # over the chains. That would be more efficient, however we stick to
c6b8eab5db19 snapshot: also consider the snapshot chain of one unrelated revision
Boris Feld <boris.feld@octobus.net>
parents: 39504
diff changeset
865 # simple code for now.
c6b8eab5db19 snapshot: also consider the snapshot chain of one unrelated revision
Boris Feld <boris.feld@octobus.net>
parents: 39504
diff changeset
866 all_revs = set()
c6b8eab5db19 snapshot: also consider the snapshot chain of one unrelated revision
Boris Feld <boris.feld@octobus.net>
parents: 39504
diff changeset
867 for chain in candidate_chains:
c6b8eab5db19 snapshot: also consider the snapshot chain of one unrelated revision
Boris Feld <boris.feld@octobus.net>
parents: 39504
diff changeset
868 all_revs.update(chain)
c6b8eab5db19 snapshot: also consider the snapshot chain of one unrelated revision
Boris Feld <boris.feld@octobus.net>
parents: 39504
diff changeset
869 other = None
c6b8eab5db19 snapshot: also consider the snapshot chain of one unrelated revision
Boris Feld <boris.feld@octobus.net>
parents: 39504
diff changeset
870 for r in revlog.revs(prev, snapfloor):
c6b8eab5db19 snapshot: also consider the snapshot chain of one unrelated revision
Boris Feld <boris.feld@octobus.net>
parents: 39504
diff changeset
871 if r not in all_revs:
c6b8eab5db19 snapshot: also consider the snapshot chain of one unrelated revision
Boris Feld <boris.feld@octobus.net>
parents: 39504
diff changeset
872 other = r
c6b8eab5db19 snapshot: also consider the snapshot chain of one unrelated revision
Boris Feld <boris.feld@octobus.net>
parents: 39504
diff changeset
873 break
c6b8eab5db19 snapshot: also consider the snapshot chain of one unrelated revision
Boris Feld <boris.feld@octobus.net>
parents: 39504
diff changeset
874 if other is not None:
c6b8eab5db19 snapshot: also consider the snapshot chain of one unrelated revision
Boris Feld <boris.feld@octobus.net>
parents: 39504
diff changeset
875 # To avoid unfair competition, we won't use unrelated intermediate
c6b8eab5db19 snapshot: also consider the snapshot chain of one unrelated revision
Boris Feld <boris.feld@octobus.net>
parents: 39504
diff changeset
876 # snapshot that are deeper than the ones from the parent delta
c6b8eab5db19 snapshot: also consider the snapshot chain of one unrelated revision
Boris Feld <boris.feld@octobus.net>
parents: 39504
diff changeset
877 # chain.
c6b8eab5db19 snapshot: also consider the snapshot chain of one unrelated revision
Boris Feld <boris.feld@octobus.net>
parents: 39504
diff changeset
878 max_depth = max(parents_snaps.keys())
c6b8eab5db19 snapshot: also consider the snapshot chain of one unrelated revision
Boris Feld <boris.feld@octobus.net>
parents: 39504
diff changeset
879 chain = deltachain(other)
c6b8eab5db19 snapshot: also consider the snapshot chain of one unrelated revision
Boris Feld <boris.feld@octobus.net>
parents: 39504
diff changeset
880 for idx, s in enumerate(chain):
c6b8eab5db19 snapshot: also consider the snapshot chain of one unrelated revision
Boris Feld <boris.feld@octobus.net>
parents: 39504
diff changeset
881 if s < snapfloor:
c6b8eab5db19 snapshot: also consider the snapshot chain of one unrelated revision
Boris Feld <boris.feld@octobus.net>
parents: 39504
diff changeset
882 continue
c6b8eab5db19 snapshot: also consider the snapshot chain of one unrelated revision
Boris Feld <boris.feld@octobus.net>
parents: 39504
diff changeset
883 if max_depth < idx:
c6b8eab5db19 snapshot: also consider the snapshot chain of one unrelated revision
Boris Feld <boris.feld@octobus.net>
parents: 39504
diff changeset
884 break
c6b8eab5db19 snapshot: also consider the snapshot chain of one unrelated revision
Boris Feld <boris.feld@octobus.net>
parents: 39504
diff changeset
885 if not revlog.issnapshot(s):
c6b8eab5db19 snapshot: also consider the snapshot chain of one unrelated revision
Boris Feld <boris.feld@octobus.net>
parents: 39504
diff changeset
886 break
c6b8eab5db19 snapshot: also consider the snapshot chain of one unrelated revision
Boris Feld <boris.feld@octobus.net>
parents: 39504
diff changeset
887 parents_snaps[idx].add(s)
39494
e72130f58f5d snapshot: consider all snapshots in the parents' chains
Boris Feld <boris.feld@octobus.net>
parents: 39493
diff changeset
888 # Test them as possible intermediate snapshot base
e72130f58f5d snapshot: consider all snapshots in the parents' chains
Boris Feld <boris.feld@octobus.net>
parents: 39493
diff changeset
889 # We test them from highest to lowest level. High level one are more
e72130f58f5d snapshot: consider all snapshots in the parents' chains
Boris Feld <boris.feld@octobus.net>
parents: 39493
diff changeset
890 # likely to result in small delta
39495
6a53842727c1 snapshot: consider unrelated snapshots at a similar level first
Boris Feld <boris.feld@octobus.net>
parents: 39494
diff changeset
891 floor = None
39494
e72130f58f5d snapshot: consider all snapshots in the parents' chains
Boris Feld <boris.feld@octobus.net>
parents: 39493
diff changeset
892 for idx, snaps in sorted(parents_snaps.items(), reverse=True):
39495
6a53842727c1 snapshot: consider unrelated snapshots at a similar level first
Boris Feld <boris.feld@octobus.net>
parents: 39494
diff changeset
893 siblings = set()
6a53842727c1 snapshot: consider unrelated snapshots at a similar level first
Boris Feld <boris.feld@octobus.net>
parents: 39494
diff changeset
894 for s in snaps:
6a53842727c1 snapshot: consider unrelated snapshots at a similar level first
Boris Feld <boris.feld@octobus.net>
parents: 39494
diff changeset
895 siblings.update(snapshots[s])
6a53842727c1 snapshot: consider unrelated snapshots at a similar level first
Boris Feld <boris.feld@octobus.net>
parents: 39494
diff changeset
896 # Before considering making a new intermediate snapshot, we check
6a53842727c1 snapshot: consider unrelated snapshots at a similar level first
Boris Feld <boris.feld@octobus.net>
parents: 39494
diff changeset
897 # if an existing snapshot, children of base we consider, would be
6a53842727c1 snapshot: consider unrelated snapshots at a similar level first
Boris Feld <boris.feld@octobus.net>
parents: 39494
diff changeset
898 # suitable.
6a53842727c1 snapshot: consider unrelated snapshots at a similar level first
Boris Feld <boris.feld@octobus.net>
parents: 39494
diff changeset
899 #
6a53842727c1 snapshot: consider unrelated snapshots at a similar level first
Boris Feld <boris.feld@octobus.net>
parents: 39494
diff changeset
900 # It give a change to reuse a delta chain "unrelated" to the
6a53842727c1 snapshot: consider unrelated snapshots at a similar level first
Boris Feld <boris.feld@octobus.net>
parents: 39494
diff changeset
901 # current revision instead of starting our own. Without such
6a53842727c1 snapshot: consider unrelated snapshots at a similar level first
Boris Feld <boris.feld@octobus.net>
parents: 39494
diff changeset
902 # re-use, topological branches would keep reopening new chains.
6a53842727c1 snapshot: consider unrelated snapshots at a similar level first
Boris Feld <boris.feld@octobus.net>
parents: 39494
diff changeset
903 # Creating more and more snapshot as the repository grow.
6a53842727c1 snapshot: consider unrelated snapshots at a similar level first
Boris Feld <boris.feld@octobus.net>
parents: 39494
diff changeset
904
6a53842727c1 snapshot: consider unrelated snapshots at a similar level first
Boris Feld <boris.feld@octobus.net>
parents: 39494
diff changeset
905 if floor is not None:
6a53842727c1 snapshot: consider unrelated snapshots at a similar level first
Boris Feld <boris.feld@octobus.net>
parents: 39494
diff changeset
906 # We only do this for siblings created after the one in our
6a53842727c1 snapshot: consider unrelated snapshots at a similar level first
Boris Feld <boris.feld@octobus.net>
parents: 39494
diff changeset
907 # parent's delta chain. Those created before has less chances
6a53842727c1 snapshot: consider unrelated snapshots at a similar level first
Boris Feld <boris.feld@octobus.net>
parents: 39494
diff changeset
908 # to be valid base since our ancestors had to create a new
6a53842727c1 snapshot: consider unrelated snapshots at a similar level first
Boris Feld <boris.feld@octobus.net>
parents: 39494
diff changeset
909 # snapshot.
6a53842727c1 snapshot: consider unrelated snapshots at a similar level first
Boris Feld <boris.feld@octobus.net>
parents: 39494
diff changeset
910 siblings = [r for r in siblings if floor < r]
6a53842727c1 snapshot: consider unrelated snapshots at a similar level first
Boris Feld <boris.feld@octobus.net>
parents: 39494
diff changeset
911 yield tuple(sorted(siblings))
6a53842727c1 snapshot: consider unrelated snapshots at a similar level first
Boris Feld <boris.feld@octobus.net>
parents: 39494
diff changeset
912 # then test the base from our parent's delta chain.
39494
e72130f58f5d snapshot: consider all snapshots in the parents' chains
Boris Feld <boris.feld@octobus.net>
parents: 39493
diff changeset
913 yield tuple(sorted(snaps))
39495
6a53842727c1 snapshot: consider unrelated snapshots at a similar level first
Boris Feld <boris.feld@octobus.net>
parents: 39494
diff changeset
914 floor = min(snaps)
39493
3ca144f1c8dd snapshot: search for unrelated but reusable full-snapshot
Boris Feld <boris.feld@octobus.net>
parents: 39492
diff changeset
915 # No suitable base found in the parent chain, search if any full
3ca144f1c8dd snapshot: search for unrelated but reusable full-snapshot
Boris Feld <boris.feld@octobus.net>
parents: 39492
diff changeset
916 # snapshots emitted since parent's base would be a suitable base for an
3ca144f1c8dd snapshot: search for unrelated but reusable full-snapshot
Boris Feld <boris.feld@octobus.net>
parents: 39492
diff changeset
917 # intermediate snapshot.
3ca144f1c8dd snapshot: search for unrelated but reusable full-snapshot
Boris Feld <boris.feld@octobus.net>
parents: 39492
diff changeset
918 #
3ca144f1c8dd snapshot: search for unrelated but reusable full-snapshot
Boris Feld <boris.feld@octobus.net>
parents: 39492
diff changeset
919 # It give a chance to reuse a delta chain unrelated to the current
3ca144f1c8dd snapshot: search for unrelated but reusable full-snapshot
Boris Feld <boris.feld@octobus.net>
parents: 39492
diff changeset
920 # revisions instead of starting our own. Without such re-use,
3ca144f1c8dd snapshot: search for unrelated but reusable full-snapshot
Boris Feld <boris.feld@octobus.net>
parents: 39492
diff changeset
921 # topological branches would keep reopening new full chains. Creating
3ca144f1c8dd snapshot: search for unrelated but reusable full-snapshot
Boris Feld <boris.feld@octobus.net>
parents: 39492
diff changeset
922 # more and more snapshot as the repository grow.
3ca144f1c8dd snapshot: search for unrelated but reusable full-snapshot
Boris Feld <boris.feld@octobus.net>
parents: 39492
diff changeset
923 yield tuple(snapshots[nullrev])
39492
a33f394b2bfd snapshot: try intermediate snapshot against parents' base
Boris Feld <boris.feld@octobus.net>
parents: 39488
diff changeset
924
39505
c6b8eab5db19 snapshot: also consider the snapshot chain of one unrelated revision
Boris Feld <boris.feld@octobus.net>
parents: 39504
diff changeset
925 if not sparse:
c6b8eab5db19 snapshot: also consider the snapshot chain of one unrelated revision
Boris Feld <boris.feld@octobus.net>
parents: 39504
diff changeset
926 # other approach failed try against prev to hopefully save us a
c6b8eab5db19 snapshot: also consider the snapshot chain of one unrelated revision
Boris Feld <boris.feld@octobus.net>
parents: 39504
diff changeset
927 # fulltext.
c6b8eab5db19 snapshot: also consider the snapshot chain of one unrelated revision
Boris Feld <boris.feld@octobus.net>
parents: 39504
diff changeset
928 yield (prev,)
39334
507f5b1dd7c8 revlogdeltas: extract _getcandidaterevs in a function
Boris Feld <boris.feld@octobus.net>
parents: 39333
diff changeset
929
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42992
diff changeset
930
39330
655b5b465953 revlog: split functionality related to deltas computation in a new module
Boris Feld <boris.feld@octobus.net>
parents: 39329
diff changeset
931 class deltacomputer(object):
35738
f90f6fd130c1 revlog: group delta computation methods under _deltacomputer object
Paul Morelle <paul.morelle@octobus.net>
parents: 35737
diff changeset
932 def __init__(self, revlog):
f90f6fd130c1 revlog: group delta computation methods under _deltacomputer object
Paul Morelle <paul.morelle@octobus.net>
parents: 35737
diff changeset
933 self.revlog = revlog
f90f6fd130c1 revlog: group delta computation methods under _deltacomputer object
Paul Morelle <paul.morelle@octobus.net>
parents: 35737
diff changeset
934
f90f6fd130c1 revlog: group delta computation methods under _deltacomputer object
Paul Morelle <paul.morelle@octobus.net>
parents: 35737
diff changeset
935 def buildtext(self, revinfo, fh):
f90f6fd130c1 revlog: group delta computation methods under _deltacomputer object
Paul Morelle <paul.morelle@octobus.net>
parents: 35737
diff changeset
936 """Builds a fulltext version of a revision
f90f6fd130c1 revlog: group delta computation methods under _deltacomputer object
Paul Morelle <paul.morelle@octobus.net>
parents: 35737
diff changeset
937
47399
34cc102c73f5 revlog: move `revisioninfo` in `revlogutils`
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 47253
diff changeset
938 revinfo: revisioninfo instance that contains all needed info
35738
f90f6fd130c1 revlog: group delta computation methods under _deltacomputer object
Paul Morelle <paul.morelle@octobus.net>
parents: 35737
diff changeset
939 fh: file handle to either the .i or the .d revlog file,
f90f6fd130c1 revlog: group delta computation methods under _deltacomputer object
Paul Morelle <paul.morelle@octobus.net>
parents: 35737
diff changeset
940 depending on whether it is inlined or not
f90f6fd130c1 revlog: group delta computation methods under _deltacomputer object
Paul Morelle <paul.morelle@octobus.net>
parents: 35737
diff changeset
941 """
f90f6fd130c1 revlog: group delta computation methods under _deltacomputer object
Paul Morelle <paul.morelle@octobus.net>
parents: 35737
diff changeset
942 btext = revinfo.btext
f90f6fd130c1 revlog: group delta computation methods under _deltacomputer object
Paul Morelle <paul.morelle@octobus.net>
parents: 35737
diff changeset
943 if btext[0] is not None:
f90f6fd130c1 revlog: group delta computation methods under _deltacomputer object
Paul Morelle <paul.morelle@octobus.net>
parents: 35737
diff changeset
944 return btext[0]
f90f6fd130c1 revlog: group delta computation methods under _deltacomputer object
Paul Morelle <paul.morelle@octobus.net>
parents: 35737
diff changeset
945
f90f6fd130c1 revlog: group delta computation methods under _deltacomputer object
Paul Morelle <paul.morelle@octobus.net>
parents: 35737
diff changeset
946 revlog = self.revlog
f90f6fd130c1 revlog: group delta computation methods under _deltacomputer object
Paul Morelle <paul.morelle@octobus.net>
parents: 35737
diff changeset
947 cachedelta = revinfo.cachedelta
f90f6fd130c1 revlog: group delta computation methods under _deltacomputer object
Paul Morelle <paul.morelle@octobus.net>
parents: 35737
diff changeset
948 baserev = cachedelta[0]
f90f6fd130c1 revlog: group delta computation methods under _deltacomputer object
Paul Morelle <paul.morelle@octobus.net>
parents: 35737
diff changeset
949 delta = cachedelta[1]
f90f6fd130c1 revlog: group delta computation methods under _deltacomputer object
Paul Morelle <paul.morelle@octobus.net>
parents: 35737
diff changeset
950
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42992
diff changeset
951 fulltext = btext[0] = _textfromdelta(
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42992
diff changeset
952 fh,
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42992
diff changeset
953 revlog,
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42992
diff changeset
954 baserev,
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42992
diff changeset
955 delta,
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42992
diff changeset
956 revinfo.p1,
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42992
diff changeset
957 revinfo.p2,
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42992
diff changeset
958 revinfo.flags,
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42992
diff changeset
959 revinfo.node,
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42992
diff changeset
960 )
39331
fd0150a3c2fe revlogdeltas: extra fulltext building in its own function
Boris Feld <boris.feld@octobus.net>
parents: 39330
diff changeset
961 return fulltext
35738
f90f6fd130c1 revlog: group delta computation methods under _deltacomputer object
Paul Morelle <paul.morelle@octobus.net>
parents: 35737
diff changeset
962
f90f6fd130c1 revlog: group delta computation methods under _deltacomputer object
Paul Morelle <paul.morelle@octobus.net>
parents: 35737
diff changeset
963 def _builddeltadiff(self, base, revinfo, fh):
f90f6fd130c1 revlog: group delta computation methods under _deltacomputer object
Paul Morelle <paul.morelle@octobus.net>
parents: 35737
diff changeset
964 revlog = self.revlog
f90f6fd130c1 revlog: group delta computation methods under _deltacomputer object
Paul Morelle <paul.morelle@octobus.net>
parents: 35737
diff changeset
965 t = self.buildtext(revinfo, fh)
f90f6fd130c1 revlog: group delta computation methods under _deltacomputer object
Paul Morelle <paul.morelle@octobus.net>
parents: 35737
diff changeset
966 if revlog.iscensored(base):
f90f6fd130c1 revlog: group delta computation methods under _deltacomputer object
Paul Morelle <paul.morelle@octobus.net>
parents: 35737
diff changeset
967 # deltas based on a censored revision must replace the
f90f6fd130c1 revlog: group delta computation methods under _deltacomputer object
Paul Morelle <paul.morelle@octobus.net>
parents: 35737
diff changeset
968 # full content in one patch, so delta works everywhere
f90f6fd130c1 revlog: group delta computation methods under _deltacomputer object
Paul Morelle <paul.morelle@octobus.net>
parents: 35737
diff changeset
969 header = mdiff.replacediffheader(revlog.rawsize(base), len(t))
f90f6fd130c1 revlog: group delta computation methods under _deltacomputer object
Paul Morelle <paul.morelle@octobus.net>
parents: 35737
diff changeset
970 delta = header + t
f90f6fd130c1 revlog: group delta computation methods under _deltacomputer object
Paul Morelle <paul.morelle@octobus.net>
parents: 35737
diff changeset
971 else:
42780
7a89b044eea4 rawdata: update callers in delta utils
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 42468
diff changeset
972 ptext = revlog.rawdata(base, _df=fh)
35738
f90f6fd130c1 revlog: group delta computation methods under _deltacomputer object
Paul Morelle <paul.morelle@octobus.net>
parents: 35737
diff changeset
973 delta = mdiff.textdiff(ptext, t)
f90f6fd130c1 revlog: group delta computation methods under _deltacomputer object
Paul Morelle <paul.morelle@octobus.net>
parents: 35737
diff changeset
974
f90f6fd130c1 revlog: group delta computation methods under _deltacomputer object
Paul Morelle <paul.morelle@octobus.net>
parents: 35737
diff changeset
975 return delta
f90f6fd130c1 revlog: group delta computation methods under _deltacomputer object
Paul Morelle <paul.morelle@octobus.net>
parents: 35737
diff changeset
976
f90f6fd130c1 revlog: group delta computation methods under _deltacomputer object
Paul Morelle <paul.morelle@octobus.net>
parents: 35737
diff changeset
977 def _builddeltainfo(self, revinfo, base, fh):
f90f6fd130c1 revlog: group delta computation methods under _deltacomputer object
Paul Morelle <paul.morelle@octobus.net>
parents: 35737
diff changeset
978 # can we use the cached delta?
42465
6e9ba867a946 delta: move some delta chain related computation earlier in deltainfo
Valentin Gatien-Baron <vgatien-baron@janestreet.com>
parents: 42464
diff changeset
979 revlog = self.revlog
6e9ba867a946 delta: move some delta chain related computation earlier in deltainfo
Valentin Gatien-Baron <vgatien-baron@janestreet.com>
parents: 42464
diff changeset
980 chainbase = revlog.chainbase(base)
6e9ba867a946 delta: move some delta chain related computation earlier in deltainfo
Valentin Gatien-Baron <vgatien-baron@janestreet.com>
parents: 42464
diff changeset
981 if revlog._generaldelta:
6e9ba867a946 delta: move some delta chain related computation earlier in deltainfo
Valentin Gatien-Baron <vgatien-baron@janestreet.com>
parents: 42464
diff changeset
982 deltabase = base
6e9ba867a946 delta: move some delta chain related computation earlier in deltainfo
Valentin Gatien-Baron <vgatien-baron@janestreet.com>
parents: 42464
diff changeset
983 else:
6e9ba867a946 delta: move some delta chain related computation earlier in deltainfo
Valentin Gatien-Baron <vgatien-baron@janestreet.com>
parents: 42464
diff changeset
984 deltabase = chainbase
6e9ba867a946 delta: move some delta chain related computation earlier in deltainfo
Valentin Gatien-Baron <vgatien-baron@janestreet.com>
parents: 42464
diff changeset
985 snapshotdepth = None
6e9ba867a946 delta: move some delta chain related computation earlier in deltainfo
Valentin Gatien-Baron <vgatien-baron@janestreet.com>
parents: 42464
diff changeset
986 if revlog._sparserevlog and deltabase == nullrev:
6e9ba867a946 delta: move some delta chain related computation earlier in deltainfo
Valentin Gatien-Baron <vgatien-baron@janestreet.com>
parents: 42464
diff changeset
987 snapshotdepth = 0
6e9ba867a946 delta: move some delta chain related computation earlier in deltainfo
Valentin Gatien-Baron <vgatien-baron@janestreet.com>
parents: 42464
diff changeset
988 elif revlog._sparserevlog and revlog.issnapshot(deltabase):
6e9ba867a946 delta: move some delta chain related computation earlier in deltainfo
Valentin Gatien-Baron <vgatien-baron@janestreet.com>
parents: 42464
diff changeset
989 # A delta chain should always be one full snapshot,
6e9ba867a946 delta: move some delta chain related computation earlier in deltainfo
Valentin Gatien-Baron <vgatien-baron@janestreet.com>
parents: 42464
diff changeset
990 # zero or more semi-snapshots, and zero or more deltas
6e9ba867a946 delta: move some delta chain related computation earlier in deltainfo
Valentin Gatien-Baron <vgatien-baron@janestreet.com>
parents: 42464
diff changeset
991 p1, p2 = revlog.rev(revinfo.p1), revlog.rev(revinfo.p2)
6e9ba867a946 delta: move some delta chain related computation earlier in deltainfo
Valentin Gatien-Baron <vgatien-baron@janestreet.com>
parents: 42464
diff changeset
992 if deltabase not in (p1, p2) and revlog.issnapshot(deltabase):
6e9ba867a946 delta: move some delta chain related computation earlier in deltainfo
Valentin Gatien-Baron <vgatien-baron@janestreet.com>
parents: 42464
diff changeset
993 snapshotdepth = len(revlog._deltachain(deltabase)[0])
39595
a911932d5003 revlog: reuse cached delta for identical base revision (issue5975)
Boris Feld <boris.feld@octobus.net>
parents: 39594
diff changeset
994 delta = None
a911932d5003 revlog: reuse cached delta for identical base revision (issue5975)
Boris Feld <boris.feld@octobus.net>
parents: 39594
diff changeset
995 if revinfo.cachedelta:
a911932d5003 revlog: reuse cached delta for identical base revision (issue5975)
Boris Feld <boris.feld@octobus.net>
parents: 39594
diff changeset
996 cachebase, cachediff = revinfo.cachedelta
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42992
diff changeset
997 # check if the diff still apply
39595
a911932d5003 revlog: reuse cached delta for identical base revision (issue5975)
Boris Feld <boris.feld@octobus.net>
parents: 39594
diff changeset
998 currentbase = cachebase
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42992
diff changeset
999 while (
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42992
diff changeset
1000 currentbase != nullrev
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42992
diff changeset
1001 and currentbase != base
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42992
diff changeset
1002 and self.revlog.length(currentbase) == 0
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42992
diff changeset
1003 ):
39595
a911932d5003 revlog: reuse cached delta for identical base revision (issue5975)
Boris Feld <boris.feld@octobus.net>
parents: 39594
diff changeset
1004 currentbase = self.revlog.deltaparent(currentbase)
41819
688fc33e105d storage: introduce a `revlog.reuse-external-delta` config
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 41447
diff changeset
1005 if self.revlog._lazydelta and currentbase == base:
39595
a911932d5003 revlog: reuse cached delta for identical base revision (issue5975)
Boris Feld <boris.feld@octobus.net>
parents: 39594
diff changeset
1006 delta = revinfo.cachedelta[1]
a911932d5003 revlog: reuse cached delta for identical base revision (issue5975)
Boris Feld <boris.feld@octobus.net>
parents: 39594
diff changeset
1007 if delta is None:
35738
f90f6fd130c1 revlog: group delta computation methods under _deltacomputer object
Paul Morelle <paul.morelle@octobus.net>
parents: 35737
diff changeset
1008 delta = self._builddeltadiff(base, revinfo, fh)
42467
c1c1872d25d1 deltas: skip if projected compressed size does not match text size constraint
Valentin Gatien-Baron <vgatien-baron@janestreet.com>
parents: 42466
diff changeset
1009 # snapshotdept need to be neither None nor 0 level snapshot
c1c1872d25d1 deltas: skip if projected compressed size does not match text size constraint
Valentin Gatien-Baron <vgatien-baron@janestreet.com>
parents: 42466
diff changeset
1010 if revlog.upperboundcomp is not None and snapshotdepth:
c1c1872d25d1 deltas: skip if projected compressed size does not match text size constraint
Valentin Gatien-Baron <vgatien-baron@janestreet.com>
parents: 42466
diff changeset
1011 lowestrealisticdeltalen = len(delta) // revlog.upperboundcomp
c1c1872d25d1 deltas: skip if projected compressed size does not match text size constraint
Valentin Gatien-Baron <vgatien-baron@janestreet.com>
parents: 42466
diff changeset
1012 snapshotlimit = revinfo.textlen >> snapshotdepth
c1c1872d25d1 deltas: skip if projected compressed size does not match text size constraint
Valentin Gatien-Baron <vgatien-baron@janestreet.com>
parents: 42466
diff changeset
1013 if snapshotlimit < lowestrealisticdeltalen:
c1c1872d25d1 deltas: skip if projected compressed size does not match text size constraint
Valentin Gatien-Baron <vgatien-baron@janestreet.com>
parents: 42466
diff changeset
1014 return None
42468
9b5fbe5ead89 deltas: skip if projected compressed size is bigger than previous snapshot
Valentin Gatien-Baron <vgatien-baron@janestreet.com>
parents: 42467
diff changeset
1015 if revlog.length(base) < lowestrealisticdeltalen:
9b5fbe5ead89 deltas: skip if projected compressed size is bigger than previous snapshot
Valentin Gatien-Baron <vgatien-baron@janestreet.com>
parents: 42467
diff changeset
1016 return None
35738
f90f6fd130c1 revlog: group delta computation methods under _deltacomputer object
Paul Morelle <paul.morelle@octobus.net>
parents: 35737
diff changeset
1017 header, data = revlog.compress(delta)
f90f6fd130c1 revlog: group delta computation methods under _deltacomputer object
Paul Morelle <paul.morelle@octobus.net>
parents: 35737
diff changeset
1018 deltalen = len(header) + len(data)
f90f6fd130c1 revlog: group delta computation methods under _deltacomputer object
Paul Morelle <paul.morelle@octobus.net>
parents: 35737
diff changeset
1019 offset = revlog.end(len(revlog) - 1)
f90f6fd130c1 revlog: group delta computation methods under _deltacomputer object
Paul Morelle <paul.morelle@octobus.net>
parents: 35737
diff changeset
1020 dist = deltalen + offset - revlog.start(chainbase)
f90f6fd130c1 revlog: group delta computation methods under _deltacomputer object
Paul Morelle <paul.morelle@octobus.net>
parents: 35737
diff changeset
1021 chainlen, compresseddeltalen = revlog._chaininfo(base)
f90f6fd130c1 revlog: group delta computation methods under _deltacomputer object
Paul Morelle <paul.morelle@octobus.net>
parents: 35737
diff changeset
1022 chainlen += 1
f90f6fd130c1 revlog: group delta computation methods under _deltacomputer object
Paul Morelle <paul.morelle@octobus.net>
parents: 35737
diff changeset
1023 compresseddeltalen += deltalen
39154
e0da43e2f71f revlog: compute snapshot depth on delta info
Boris Feld <boris.feld@octobus.net>
parents: 39152
diff changeset
1024
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42992
diff changeset
1025 return _deltainfo(
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42992
diff changeset
1026 dist,
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42992
diff changeset
1027 deltalen,
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42992
diff changeset
1028 (header, data),
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42992
diff changeset
1029 deltabase,
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42992
diff changeset
1030 chainbase,
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42992
diff changeset
1031 chainlen,
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42992
diff changeset
1032 compresseddeltalen,
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42992
diff changeset
1033 snapshotdepth,
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42992
diff changeset
1034 )
35738
f90f6fd130c1 revlog: group delta computation methods under _deltacomputer object
Paul Morelle <paul.morelle@octobus.net>
parents: 35737
diff changeset
1035
47456
93f4e183b3f5 deltas: at a `target_rev` parameter to finddeltainfo
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 47452
diff changeset
1036 def _fullsnapshotinfo(self, fh, revinfo, curr):
39333
5d343a24bff5 revlogdeltas: always return a delta info object in finddeltainfo
Boris Feld <boris.feld@octobus.net>
parents: 39332
diff changeset
1037 rawtext = self.buildtext(revinfo, fh)
5d343a24bff5 revlogdeltas: always return a delta info object in finddeltainfo
Boris Feld <boris.feld@octobus.net>
parents: 39332
diff changeset
1038 data = self.revlog.compress(rawtext)
5d343a24bff5 revlogdeltas: always return a delta info object in finddeltainfo
Boris Feld <boris.feld@octobus.net>
parents: 39332
diff changeset
1039 compresseddeltalen = deltalen = dist = len(data[1]) + len(data[0])
5d343a24bff5 revlogdeltas: always return a delta info object in finddeltainfo
Boris Feld <boris.feld@octobus.net>
parents: 39332
diff changeset
1040 deltabase = chainbase = curr
5d343a24bff5 revlogdeltas: always return a delta info object in finddeltainfo
Boris Feld <boris.feld@octobus.net>
parents: 39332
diff changeset
1041 snapshotdepth = 0
5d343a24bff5 revlogdeltas: always return a delta info object in finddeltainfo
Boris Feld <boris.feld@octobus.net>
parents: 39332
diff changeset
1042 chainlen = 1
5d343a24bff5 revlogdeltas: always return a delta info object in finddeltainfo
Boris Feld <boris.feld@octobus.net>
parents: 39332
diff changeset
1043
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42992
diff changeset
1044 return _deltainfo(
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42992
diff changeset
1045 dist,
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42992
diff changeset
1046 deltalen,
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42992
diff changeset
1047 data,
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42992
diff changeset
1048 deltabase,
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42992
diff changeset
1049 chainbase,
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42992
diff changeset
1050 chainlen,
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42992
diff changeset
1051 compresseddeltalen,
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42992
diff changeset
1052 snapshotdepth,
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42992
diff changeset
1053 )
39333
5d343a24bff5 revlogdeltas: always return a delta info object in finddeltainfo
Boris Feld <boris.feld@octobus.net>
parents: 39332
diff changeset
1054
47456
93f4e183b3f5 deltas: at a `target_rev` parameter to finddeltainfo
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 47452
diff changeset
1055 def finddeltainfo(self, revinfo, fh, excluded_bases=None, target_rev=None):
35738
f90f6fd130c1 revlog: group delta computation methods under _deltacomputer object
Paul Morelle <paul.morelle@octobus.net>
parents: 35737
diff changeset
1056 """Find an acceptable delta against a candidate revision
f90f6fd130c1 revlog: group delta computation methods under _deltacomputer object
Paul Morelle <paul.morelle@octobus.net>
parents: 35737
diff changeset
1057
f90f6fd130c1 revlog: group delta computation methods under _deltacomputer object
Paul Morelle <paul.morelle@octobus.net>
parents: 35737
diff changeset
1058 revinfo: information about the revision (instance of _revisioninfo)
f90f6fd130c1 revlog: group delta computation methods under _deltacomputer object
Paul Morelle <paul.morelle@octobus.net>
parents: 35737
diff changeset
1059 fh: file handle to either the .i or the .d revlog file,
f90f6fd130c1 revlog: group delta computation methods under _deltacomputer object
Paul Morelle <paul.morelle@octobus.net>
parents: 35737
diff changeset
1060 depending on whether it is inlined or not
f90f6fd130c1 revlog: group delta computation methods under _deltacomputer object
Paul Morelle <paul.morelle@octobus.net>
parents: 35737
diff changeset
1061
f90f6fd130c1 revlog: group delta computation methods under _deltacomputer object
Paul Morelle <paul.morelle@octobus.net>
parents: 35737
diff changeset
1062 Returns the first acceptable candidate revision, as ordered by
39334
507f5b1dd7c8 revlogdeltas: extract _getcandidaterevs in a function
Boris Feld <boris.feld@octobus.net>
parents: 39333
diff changeset
1063 _candidategroups
39333
5d343a24bff5 revlogdeltas: always return a delta info object in finddeltainfo
Boris Feld <boris.feld@octobus.net>
parents: 39332
diff changeset
1064
5d343a24bff5 revlogdeltas: always return a delta info object in finddeltainfo
Boris Feld <boris.feld@octobus.net>
parents: 39332
diff changeset
1065 If no suitable deltabase is found, we return delta info for a full
5d343a24bff5 revlogdeltas: always return a delta info object in finddeltainfo
Boris Feld <boris.feld@octobus.net>
parents: 39332
diff changeset
1066 snapshot.
47401
1efe3cdef53a revlog: add a ways to blacklist some revision when searching for a delta
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 47399
diff changeset
1067
1efe3cdef53a revlog: add a ways to blacklist some revision when searching for a delta
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 47399
diff changeset
1068 `excluded_bases` is an optional set of revision that cannot be used as
1efe3cdef53a revlog: add a ways to blacklist some revision when searching for a delta
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 47399
diff changeset
1069 a delta base. Use this to recompute delta suitable in censor or strip
1efe3cdef53a revlog: add a ways to blacklist some revision when searching for a delta
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 47399
diff changeset
1070 context.
35738
f90f6fd130c1 revlog: group delta computation methods under _deltacomputer object
Paul Morelle <paul.morelle@octobus.net>
parents: 35737
diff changeset
1071 """
47456
93f4e183b3f5 deltas: at a `target_rev` parameter to finddeltainfo
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 47452
diff changeset
1072 if target_rev is None:
47457
f8330a3fc39f censor: implement censoring for revlogv2
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 47456
diff changeset
1073 target_rev = len(self.revlog)
47456
93f4e183b3f5 deltas: at a `target_rev` parameter to finddeltainfo
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 47452
diff changeset
1074
39085
dbb3e9e44fce revlog: do not search for delta for empty content
Boris Feld <boris.feld@octobus.net>
parents: 39084
diff changeset
1075 if not revinfo.textlen:
47456
93f4e183b3f5 deltas: at a `target_rev` parameter to finddeltainfo
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 47452
diff changeset
1076 return self._fullsnapshotinfo(fh, revinfo, target_rev)
39085
dbb3e9e44fce revlog: do not search for delta for empty content
Boris Feld <boris.feld@octobus.net>
parents: 39084
diff changeset
1077
47401
1efe3cdef53a revlog: add a ways to blacklist some revision when searching for a delta
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 47399
diff changeset
1078 if excluded_bases is None:
1efe3cdef53a revlog: add a ways to blacklist some revision when searching for a delta
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 47399
diff changeset
1079 excluded_bases = set()
1efe3cdef53a revlog: add a ways to blacklist some revision when searching for a delta
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 47399
diff changeset
1080
39332
6f4b8f607a31 revlogdeltas: move special cases around raw revisions in finddeltainfo
Boris Feld <boris.feld@octobus.net>
parents: 39331
diff changeset
1081 # no delta for flag processor revision (see "candelta" for why)
6f4b8f607a31 revlogdeltas: move special cases around raw revisions in finddeltainfo
Boris Feld <boris.feld@octobus.net>
parents: 39331
diff changeset
1082 # not calling candelta since only one revision needs test, also to
6f4b8f607a31 revlogdeltas: move special cases around raw revisions in finddeltainfo
Boris Feld <boris.feld@octobus.net>
parents: 39331
diff changeset
1083 # avoid overhead fetching flags again.
6f4b8f607a31 revlogdeltas: move special cases around raw revisions in finddeltainfo
Boris Feld <boris.feld@octobus.net>
parents: 39331
diff changeset
1084 if revinfo.flags & REVIDX_RAWTEXT_CHANGING_FLAGS:
47456
93f4e183b3f5 deltas: at a `target_rev` parameter to finddeltainfo
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 47452
diff changeset
1085 return self._fullsnapshotinfo(fh, revinfo, target_rev)
39332
6f4b8f607a31 revlogdeltas: move special cases around raw revisions in finddeltainfo
Boris Feld <boris.feld@octobus.net>
parents: 39331
diff changeset
1086
35738
f90f6fd130c1 revlog: group delta computation methods under _deltacomputer object
Paul Morelle <paul.morelle@octobus.net>
parents: 35737
diff changeset
1087 cachedelta = revinfo.cachedelta
f90f6fd130c1 revlog: group delta computation methods under _deltacomputer object
Paul Morelle <paul.morelle@octobus.net>
parents: 35737
diff changeset
1088 p1 = revinfo.p1
f90f6fd130c1 revlog: group delta computation methods under _deltacomputer object
Paul Morelle <paul.morelle@octobus.net>
parents: 35737
diff changeset
1089 p2 = revinfo.p2
f90f6fd130c1 revlog: group delta computation methods under _deltacomputer object
Paul Morelle <paul.morelle@octobus.net>
parents: 35737
diff changeset
1090 revlog = self.revlog
f90f6fd130c1 revlog: group delta computation methods under _deltacomputer object
Paul Morelle <paul.morelle@octobus.net>
parents: 35737
diff changeset
1091
f90f6fd130c1 revlog: group delta computation methods under _deltacomputer object
Paul Morelle <paul.morelle@octobus.net>
parents: 35737
diff changeset
1092 deltainfo = None
39335
1441eb38849f revlogdeltas: pass revision number to _candidatesgroups
Boris Feld <boris.feld@octobus.net>
parents: 39334
diff changeset
1093 p1r, p2r = revlog.rev(p1), revlog.rev(p2)
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42992
diff changeset
1094 groups = _candidategroups(
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42992
diff changeset
1095 self.revlog, revinfo.textlen, p1r, p2r, cachedelta
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 42992
diff changeset
1096 )
39497
5b308a4e6d03 snapshot: use None as a stop value when looking for a good delta
Boris Feld <boris.feld@octobus.net>
parents: 39496
diff changeset
1097 candidaterevs = next(groups)
5b308a4e6d03 snapshot: use None as a stop value when looking for a good delta
Boris Feld <boris.feld@octobus.net>
parents: 39496
diff changeset
1098 while candidaterevs is not None:
35738
f90f6fd130c1 revlog: group delta computation methods under _deltacomputer object
Paul Morelle <paul.morelle@octobus.net>
parents: 35737
diff changeset
1099 nominateddeltas = []
39498
04b75f3a3f2a snapshot: add refining logic at the findeltainfo level
Boris Feld <boris.feld@octobus.net>
parents: 39497
diff changeset
1100 if deltainfo is not None:
04b75f3a3f2a snapshot: add refining logic at the findeltainfo level
Boris Feld <boris.feld@octobus.net>
parents: 39497
diff changeset
1101 # if we already found a good delta,
04b75f3a3f2a snapshot: add refining logic at the findeltainfo level
Boris Feld <boris.feld@octobus.net>
parents: 39497
diff changeset
1102 # challenge it against refined candidates
04b75f3a3f2a snapshot: add refining logic at the findeltainfo level
Boris Feld <boris.feld@octobus.net>
parents: 39497
diff changeset
1103 nominateddeltas.append(deltainfo)
35738
f90f6fd130c1 revlog: group delta computation methods under _deltacomputer object
Paul Morelle <paul.morelle@octobus.net>
parents: 35737
diff changeset
1104 for candidaterev in candidaterevs:
47401
1efe3cdef53a revlog: add a ways to blacklist some revision when searching for a delta
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 47399
diff changeset
1105 if candidaterev in excluded_bases:
1efe3cdef53a revlog: add a ways to blacklist some revision when searching for a delta
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 47399
diff changeset
1106 continue
47456
93f4e183b3f5 deltas: at a `target_rev` parameter to finddeltainfo
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 47452
diff changeset
1107 if candidaterev >= target_rev:
93f4e183b3f5 deltas: at a `target_rev` parameter to finddeltainfo
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 47452
diff changeset
1108 continue
35738
f90f6fd130c1 revlog: group delta computation methods under _deltacomputer object
Paul Morelle <paul.morelle@octobus.net>
parents: 35737
diff changeset
1109 candidatedelta = self._builddeltainfo(revinfo, candidaterev, fh)
42466
465f2d0df9ae deltas: accept and skip None return for delta info
Valentin Gatien-Baron <vgatien-baron@janestreet.com>
parents: 42465
diff changeset
1110 if candidatedelta is not None:
465f2d0df9ae deltas: accept and skip None return for delta info
Valentin Gatien-Baron <vgatien-baron@janestreet.com>
parents: 42465
diff changeset
1111 if isgooddeltainfo(self.revlog, candidatedelta, revinfo):
465f2d0df9ae deltas: accept and skip None return for delta info
Valentin Gatien-Baron <vgatien-baron@janestreet.com>
parents: 42465
diff changeset
1112 nominateddeltas.append(candidatedelta)
35738
f90f6fd130c1 revlog: group delta computation methods under _deltacomputer object
Paul Morelle <paul.morelle@octobus.net>
parents: 35737
diff changeset
1113 if nominateddeltas:
f90f6fd130c1 revlog: group delta computation methods under _deltacomputer object
Paul Morelle <paul.morelle@octobus.net>
parents: 35737
diff changeset
1114 deltainfo = min(nominateddeltas, key=lambda x: x.deltalen)
39498
04b75f3a3f2a snapshot: add refining logic at the findeltainfo level
Boris Feld <boris.feld@octobus.net>
parents: 39497
diff changeset
1115 if deltainfo is not None:
04b75f3a3f2a snapshot: add refining logic at the findeltainfo level
Boris Feld <boris.feld@octobus.net>
parents: 39497
diff changeset
1116 candidaterevs = groups.send(deltainfo.base)
04b75f3a3f2a snapshot: add refining logic at the findeltainfo level
Boris Feld <boris.feld@octobus.net>
parents: 39497
diff changeset
1117 else:
04b75f3a3f2a snapshot: add refining logic at the findeltainfo level
Boris Feld <boris.feld@octobus.net>
parents: 39497
diff changeset
1118 candidaterevs = next(groups)
35738
f90f6fd130c1 revlog: group delta computation methods under _deltacomputer object
Paul Morelle <paul.morelle@octobus.net>
parents: 35737
diff changeset
1119
39333
5d343a24bff5 revlogdeltas: always return a delta info object in finddeltainfo
Boris Feld <boris.feld@octobus.net>
parents: 39332
diff changeset
1120 if deltainfo is None:
47456
93f4e183b3f5 deltas: at a `target_rev` parameter to finddeltainfo
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 47452
diff changeset
1121 deltainfo = self._fullsnapshotinfo(fh, revinfo, target_rev)
35738
f90f6fd130c1 revlog: group delta computation methods under _deltacomputer object
Paul Morelle <paul.morelle@octobus.net>
parents: 35737
diff changeset
1122 return deltainfo
47452
c6844912c327 revlog: factor the logic to determine the delta compression out
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 47401
diff changeset
1123
c6844912c327 revlog: factor the logic to determine the delta compression out
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 47401
diff changeset
1124
c6844912c327 revlog: factor the logic to determine the delta compression out
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 47401
diff changeset
1125 def delta_compression(default_compression_header, deltainfo):
c6844912c327 revlog: factor the logic to determine the delta compression out
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 47401
diff changeset
1126 """return (COMPRESSION_MODE, deltainfo)
c6844912c327 revlog: factor the logic to determine the delta compression out
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 47401
diff changeset
1127
c6844912c327 revlog: factor the logic to determine the delta compression out
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 47401
diff changeset
1128 used by revlog v2+ format to dispatch between PLAIN and DEFAULT
c6844912c327 revlog: factor the logic to determine the delta compression out
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 47401
diff changeset
1129 compression.
c6844912c327 revlog: factor the logic to determine the delta compression out
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 47401
diff changeset
1130 """
c6844912c327 revlog: factor the logic to determine the delta compression out
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 47401
diff changeset
1131 h, d = deltainfo.data
c6844912c327 revlog: factor the logic to determine the delta compression out
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 47401
diff changeset
1132 compression_mode = COMP_MODE_INLINE
c6844912c327 revlog: factor the logic to determine the delta compression out
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 47401
diff changeset
1133 if not h and not d:
c6844912c327 revlog: factor the logic to determine the delta compression out
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 47401
diff changeset
1134 # not data to store at all... declare them uncompressed
c6844912c327 revlog: factor the logic to determine the delta compression out
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 47401
diff changeset
1135 compression_mode = COMP_MODE_PLAIN
c6844912c327 revlog: factor the logic to determine the delta compression out
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 47401
diff changeset
1136 elif not h:
c6844912c327 revlog: factor the logic to determine the delta compression out
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 47401
diff changeset
1137 t = d[0:1]
c6844912c327 revlog: factor the logic to determine the delta compression out
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 47401
diff changeset
1138 if t == b'\0':
c6844912c327 revlog: factor the logic to determine the delta compression out
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 47401
diff changeset
1139 compression_mode = COMP_MODE_PLAIN
c6844912c327 revlog: factor the logic to determine the delta compression out
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 47401
diff changeset
1140 elif t == default_compression_header:
c6844912c327 revlog: factor the logic to determine the delta compression out
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 47401
diff changeset
1141 compression_mode = COMP_MODE_DEFAULT
c6844912c327 revlog: factor the logic to determine the delta compression out
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 47401
diff changeset
1142 elif h == b'u':
c6844912c327 revlog: factor the logic to determine the delta compression out
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 47401
diff changeset
1143 # we have a more efficient way to declare uncompressed
c6844912c327 revlog: factor the logic to determine the delta compression out
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 47401
diff changeset
1144 h = b''
c6844912c327 revlog: factor the logic to determine the delta compression out
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 47401
diff changeset
1145 compression_mode = COMP_MODE_PLAIN
c6844912c327 revlog: factor the logic to determine the delta compression out
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 47401
diff changeset
1146 deltainfo = drop_u_compression(deltainfo)
c6844912c327 revlog: factor the logic to determine the delta compression out
Pierre-Yves David <pierre-yves.david@octobus.net>
parents: 47401
diff changeset
1147 return compression_mode, deltainfo