Mercurial > hg
annotate mercurial/pure/mpatch.py @ 48178:f12a19d03d2c
fix: reduce number of tool executions
By grouping together (path, ctx) pairs according to the inputs they would
provide to fixer tools, we can deduplicate executions of fixer tools to
significantly reduce the amount of time spent running slow tools.
This change does not handle clean files in the working copy, which could still
be deduplicated against the files in the checked out commit. It's a little
harder to do that because the filerev is not available in the workingfilectx
(and it doesn't exist for added files).
Anecdotally, this change makes some real uses cases at Google 10x faster. I
think we were originally hesitant to do this because the benefits weren't
obvious, and implementing it efficiently is kind of tricky. If we simply
memoized the formatter execution function, we would be keeping tons of file
content in memory.
Also included is a regression test for a corner case that I broke with my first
attempt at optimizing this code.
Differential Revision: https://phab.mercurial-scm.org/D11280
author | Danny Hooper <hooper@google.com> |
---|---|
date | Thu, 02 Sep 2021 14:08:45 -0700 |
parents | d4ba4d51f85f |
children | 5aafc3c5bdec |
rev | line source |
---|---|
7699
fac054f84600
pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff
changeset
|
1 # mpatch.py - Python implementation of mpatch.c |
fac054f84600
pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff
changeset
|
2 # |
46819
d4ba4d51f85f
contributor: change mentions of mpm to olivia
Raphaël Gomès <rgomes@octobus.net>
parents:
45942
diff
changeset
|
3 # Copyright 2009 Olivia Mackall <olivia@selenic.com> and others |
7699
fac054f84600
pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff
changeset
|
4 # |
8225
46293a0c7e9f
updated license to be explicit about GPL version 2
Martin Geisler <mg@lazybytes.net>
parents:
7775
diff
changeset
|
5 # This software may be used and distributed according to the terms of the |
10263 | 6 # GNU General Public License version 2 or any later version. |
7699
fac054f84600
pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff
changeset
|
7 |
27337
9a17576103a4
mpatch: use absolute_import
Gregory Szorc <gregory.szorc@gmail.com>
parents:
16683
diff
changeset
|
8 from __future__ import absolute_import |
9a17576103a4
mpatch: use absolute_import
Gregory Szorc <gregory.szorc@gmail.com>
parents:
16683
diff
changeset
|
9 |
7775
5280c39778b6
pure/mpatch: use StringIO instead of mmap (issue1493)
Martin Geisler <mg@daimi.au.dk>
parents:
7699
diff
changeset
|
10 import struct |
27337
9a17576103a4
mpatch: use absolute_import
Gregory Szorc <gregory.szorc@gmail.com>
parents:
16683
diff
changeset
|
11 |
32512
0e8b0b9a7acc
cffi: split modules from pure
Yuya Nishihara <yuya@tcha.org>
parents:
32506
diff
changeset
|
12 from .. import pycompat |
43076
2372284d9457
formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents:
36958
diff
changeset
|
13 |
36958
644a02f6b34f
util: prefer "bytesio" to "stringio"
Gregory Szorc <gregory.szorc@gmail.com>
parents:
34435
diff
changeset
|
14 stringio = pycompat.bytesio |
7699
fac054f84600
pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff
changeset
|
15 |
43076
2372284d9457
formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents:
36958
diff
changeset
|
16 |
28782
f736f98e16ca
mpatch: unify mpatchError (issue5182)
timeless <timeless@mozdev.org>
parents:
28589
diff
changeset
|
17 class mpatchError(Exception): |
45942
89a2afe31e82
formating: upgrade to black 20.8b1
Augie Fackler <raf@durin42.com>
parents:
43077
diff
changeset
|
18 """error raised when a delta cannot be decoded""" |
28782
f736f98e16ca
mpatch: unify mpatchError (issue5182)
timeless <timeless@mozdev.org>
parents:
28589
diff
changeset
|
19 |
43076
2372284d9457
formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents:
36958
diff
changeset
|
20 |
7699
fac054f84600
pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff
changeset
|
21 # This attempts to apply a series of patches in time proportional to |
fac054f84600
pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff
changeset
|
22 # the total size of the patches, rather than patches * len(text). This |
fac054f84600
pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff
changeset
|
23 # means rather than shuffling strings around, we shuffle around |
fac054f84600
pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff
changeset
|
24 # pointers to fragments with fragment lists. |
fac054f84600
pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff
changeset
|
25 # |
fac054f84600
pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff
changeset
|
26 # When the fragment lists get too long, we collapse them. To do this |
fac054f84600
pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff
changeset
|
27 # efficiently, we do all our operations inside a buffer created by |
fac054f84600
pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff
changeset
|
28 # mmap and simply use memmove. This avoids creating a bunch of large |
fac054f84600
pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff
changeset
|
29 # temporary string buffers. |
fac054f84600
pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff
changeset
|
30 |
43076
2372284d9457
formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents:
36958
diff
changeset
|
31 |
2372284d9457
formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents:
36958
diff
changeset
|
32 def _pull(dst, src, l): # pull l bytes from src |
28587
76d7cab13f04
mpatch: move pull() method to top level
Augie Fackler <augie@google.com>
parents:
27337
diff
changeset
|
33 while l: |
76d7cab13f04
mpatch: move pull() method to top level
Augie Fackler <augie@google.com>
parents:
27337
diff
changeset
|
34 f = src.pop() |
43076
2372284d9457
formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents:
36958
diff
changeset
|
35 if f[0] > l: # do we need to split? |
28587
76d7cab13f04
mpatch: move pull() method to top level
Augie Fackler <augie@google.com>
parents:
27337
diff
changeset
|
36 src.append((f[0] - l, f[1] + l)) |
76d7cab13f04
mpatch: move pull() method to top level
Augie Fackler <augie@google.com>
parents:
27337
diff
changeset
|
37 dst.append((l, f[1])) |
76d7cab13f04
mpatch: move pull() method to top level
Augie Fackler <augie@google.com>
parents:
27337
diff
changeset
|
38 return |
76d7cab13f04
mpatch: move pull() method to top level
Augie Fackler <augie@google.com>
parents:
27337
diff
changeset
|
39 dst.append(f) |
76d7cab13f04
mpatch: move pull() method to top level
Augie Fackler <augie@google.com>
parents:
27337
diff
changeset
|
40 l -= f[0] |
76d7cab13f04
mpatch: move pull() method to top level
Augie Fackler <augie@google.com>
parents:
27337
diff
changeset
|
41 |
43076
2372284d9457
formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents:
36958
diff
changeset
|
42 |
28588
6546afde350e
mpatch: un-nest the move() method
Augie Fackler <augie@google.com>
parents:
28587
diff
changeset
|
43 def _move(m, dest, src, count): |
6546afde350e
mpatch: un-nest the move() method
Augie Fackler <augie@google.com>
parents:
28587
diff
changeset
|
44 """move count bytes from src to dest |
6546afde350e
mpatch: un-nest the move() method
Augie Fackler <augie@google.com>
parents:
28587
diff
changeset
|
45 |
6546afde350e
mpatch: un-nest the move() method
Augie Fackler <augie@google.com>
parents:
28587
diff
changeset
|
46 The file pointer is left at the end of dest. |
6546afde350e
mpatch: un-nest the move() method
Augie Fackler <augie@google.com>
parents:
28587
diff
changeset
|
47 """ |
6546afde350e
mpatch: un-nest the move() method
Augie Fackler <augie@google.com>
parents:
28587
diff
changeset
|
48 m.seek(src) |
6546afde350e
mpatch: un-nest the move() method
Augie Fackler <augie@google.com>
parents:
28587
diff
changeset
|
49 buf = m.read(count) |
6546afde350e
mpatch: un-nest the move() method
Augie Fackler <augie@google.com>
parents:
28587
diff
changeset
|
50 m.seek(dest) |
6546afde350e
mpatch: un-nest the move() method
Augie Fackler <augie@google.com>
parents:
28587
diff
changeset
|
51 m.write(buf) |
6546afde350e
mpatch: un-nest the move() method
Augie Fackler <augie@google.com>
parents:
28587
diff
changeset
|
52 |
43076
2372284d9457
formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents:
36958
diff
changeset
|
53 |
28589
c4c7be9f0554
mpatch: move collect() to module level
Augie Fackler <augie@google.com>
parents:
28588
diff
changeset
|
54 def _collect(m, buf, list): |
c4c7be9f0554
mpatch: move collect() to module level
Augie Fackler <augie@google.com>
parents:
28588
diff
changeset
|
55 start = buf |
c4c7be9f0554
mpatch: move collect() to module level
Augie Fackler <augie@google.com>
parents:
28588
diff
changeset
|
56 for l, p in reversed(list): |
c4c7be9f0554
mpatch: move collect() to module level
Augie Fackler <augie@google.com>
parents:
28588
diff
changeset
|
57 _move(m, buf, p, l) |
c4c7be9f0554
mpatch: move collect() to module level
Augie Fackler <augie@google.com>
parents:
28588
diff
changeset
|
58 buf += l |
c4c7be9f0554
mpatch: move collect() to module level
Augie Fackler <augie@google.com>
parents:
28588
diff
changeset
|
59 return (buf - start, start) |
c4c7be9f0554
mpatch: move collect() to module level
Augie Fackler <augie@google.com>
parents:
28588
diff
changeset
|
60 |
43076
2372284d9457
formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents:
36958
diff
changeset
|
61 |
7699
fac054f84600
pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff
changeset
|
62 def patches(a, bins): |
10282
08a0f04b56bd
many, many trivial check-code fixups
Matt Mackall <mpm@selenic.com>
parents:
10263
diff
changeset
|
63 if not bins: |
08a0f04b56bd
many, many trivial check-code fixups
Matt Mackall <mpm@selenic.com>
parents:
10263
diff
changeset
|
64 return a |
7699
fac054f84600
pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff
changeset
|
65 |
fac054f84600
pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff
changeset
|
66 plens = [len(x) for x in bins] |
fac054f84600
pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff
changeset
|
67 pl = sum(plens) |
fac054f84600
pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff
changeset
|
68 bl = len(a) + pl |
43076
2372284d9457
formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents:
36958
diff
changeset
|
69 tl = bl + bl + pl # enough for the patches and two working texts |
7699
fac054f84600
pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff
changeset
|
70 b1, b2 = 0, bl |
fac054f84600
pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff
changeset
|
71 |
10282
08a0f04b56bd
many, many trivial check-code fixups
Matt Mackall <mpm@selenic.com>
parents:
10263
diff
changeset
|
72 if not tl: |
08a0f04b56bd
many, many trivial check-code fixups
Matt Mackall <mpm@selenic.com>
parents:
10263
diff
changeset
|
73 return a |
7699
fac054f84600
pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff
changeset
|
74 |
28861
86db5cb55d46
pycompat: switch to util.stringio for py3 compat
timeless <timeless@mozdev.org>
parents:
28782
diff
changeset
|
75 m = stringio() |
7699
fac054f84600
pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff
changeset
|
76 |
fac054f84600
pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff
changeset
|
77 # load our original text |
fac054f84600
pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff
changeset
|
78 m.write(a) |
fac054f84600
pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff
changeset
|
79 frags = [(len(a), b1)] |
fac054f84600
pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff
changeset
|
80 |
fac054f84600
pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff
changeset
|
81 # copy all the patches into our segment so we can memmove from them |
fac054f84600
pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff
changeset
|
82 pos = b2 + bl |
fac054f84600
pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff
changeset
|
83 m.seek(pos) |
34435
5326e4ef1dab
style: never put multiple statements on one line
Alex Gaynor <agaynor@mozilla.com>
parents:
32512
diff
changeset
|
84 for p in bins: |
5326e4ef1dab
style: never put multiple statements on one line
Alex Gaynor <agaynor@mozilla.com>
parents:
32512
diff
changeset
|
85 m.write(p) |
7699
fac054f84600
pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff
changeset
|
86 |
fac054f84600
pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff
changeset
|
87 for plen in plens: |
fac054f84600
pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff
changeset
|
88 # if our list gets too long, execute it |
fac054f84600
pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff
changeset
|
89 if len(frags) > 128: |
fac054f84600
pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff
changeset
|
90 b2, b1 = b1, b2 |
28589
c4c7be9f0554
mpatch: move collect() to module level
Augie Fackler <augie@google.com>
parents:
28588
diff
changeset
|
91 frags = [_collect(m, b1, frags)] |
7699
fac054f84600
pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff
changeset
|
92 |
fac054f84600
pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff
changeset
|
93 new = [] |
fac054f84600
pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff
changeset
|
94 end = pos + plen |
fac054f84600
pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff
changeset
|
95 last = 0 |
fac054f84600
pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff
changeset
|
96 while pos < end: |
7775
5280c39778b6
pure/mpatch: use StringIO instead of mmap (issue1493)
Martin Geisler <mg@daimi.au.dk>
parents:
7699
diff
changeset
|
97 m.seek(pos) |
28782
f736f98e16ca
mpatch: unify mpatchError (issue5182)
timeless <timeless@mozdev.org>
parents:
28589
diff
changeset
|
98 try: |
43077
687b865b95ad
formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents:
43076
diff
changeset
|
99 p1, p2, l = struct.unpack(b">lll", m.read(12)) |
28782
f736f98e16ca
mpatch: unify mpatchError (issue5182)
timeless <timeless@mozdev.org>
parents:
28589
diff
changeset
|
100 except struct.error: |
43077
687b865b95ad
formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents:
43076
diff
changeset
|
101 raise mpatchError(b"patch cannot be decoded") |
43076
2372284d9457
formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents:
36958
diff
changeset
|
102 _pull(new, frags, p1 - last) # what didn't change |
2372284d9457
formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents:
36958
diff
changeset
|
103 _pull([], frags, p2 - p1) # what got deleted |
2372284d9457
formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents:
36958
diff
changeset
|
104 new.append((l, pos + 12)) # what got added |
7699
fac054f84600
pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff
changeset
|
105 pos += l + 12 |
fac054f84600
pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff
changeset
|
106 last = p2 |
43076
2372284d9457
formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents:
36958
diff
changeset
|
107 frags.extend(reversed(new)) # what was left at the end |
7699
fac054f84600
pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff
changeset
|
108 |
28589
c4c7be9f0554
mpatch: move collect() to module level
Augie Fackler <augie@google.com>
parents:
28588
diff
changeset
|
109 t = _collect(m, b2, frags) |
7699
fac054f84600
pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff
changeset
|
110 |
7775
5280c39778b6
pure/mpatch: use StringIO instead of mmap (issue1493)
Martin Geisler <mg@daimi.au.dk>
parents:
7699
diff
changeset
|
111 m.seek(t[1]) |
5280c39778b6
pure/mpatch: use StringIO instead of mmap (issue1493)
Martin Geisler <mg@daimi.au.dk>
parents:
7699
diff
changeset
|
112 return m.read(t[0]) |
7699
fac054f84600
pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff
changeset
|
113 |
43076
2372284d9457
formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents:
36958
diff
changeset
|
114 |
7699
fac054f84600
pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff
changeset
|
115 def patchedsize(orig, delta): |
fac054f84600
pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff
changeset
|
116 outlen, last, bin = 0, 0, 0 |
fac054f84600
pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff
changeset
|
117 binend = len(delta) |
fac054f84600
pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff
changeset
|
118 data = 12 |
fac054f84600
pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff
changeset
|
119 |
fac054f84600
pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff
changeset
|
120 while data <= binend: |
43076
2372284d9457
formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents:
36958
diff
changeset
|
121 decode = delta[bin : bin + 12] |
43077
687b865b95ad
formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents:
43076
diff
changeset
|
122 start, end, length = struct.unpack(b">lll", decode) |
7699
fac054f84600
pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff
changeset
|
123 if start > end: |
fac054f84600
pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff
changeset
|
124 break |
fac054f84600
pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff
changeset
|
125 bin = data + length |
fac054f84600
pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff
changeset
|
126 data = bin + 12 |
fac054f84600
pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff
changeset
|
127 outlen += start - last |
fac054f84600
pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff
changeset
|
128 last = end |
fac054f84600
pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff
changeset
|
129 outlen += length |
fac054f84600
pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff
changeset
|
130 |
fac054f84600
pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff
changeset
|
131 if bin != binend: |
43077
687b865b95ad
formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents:
43076
diff
changeset
|
132 raise mpatchError(b"patch cannot be decoded") |
7699
fac054f84600
pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff
changeset
|
133 |
fac054f84600
pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff
changeset
|
134 outlen += orig - last |
fac054f84600
pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff
changeset
|
135 return outlen |