mercurial/pure/mpatch.py
author Gregory Szorc <gregory.szorc@gmail.com>
Sat, 28 Mar 2020 12:18:58 -0700
changeset 44651 00e0c5c06ed5
parent 43077 687b865b95ad
child 45942 89a2afe31e82
permissions -rw-r--r--
pycompat: change argv conversion semantics Use of os.fsencode() to convert Python's sys.argv back to bytes was not correct because it isn't the logically inverse operation from what CPython was doing under the hood. This commit changes the logic for doing the str -> bytes conversion. This required a separate implementation for POSIX and Windows. The Windows behavior is arguably not ideal. The previous behavior on Windows was leading to failing tests, such as test-http-branchmap.t, which defines a utf-8 branch name via a command argument. Previously, Mercurial's argument parser looked to be receiving wchar_t bytes in some cases. After this commit, behavior on Windows is compatible with Python 2, where CPython did not implement `int wmain()` and Windows was performing a Unicode to ANSI conversion on the wchar_t native command line. Arguably better behavior on Windows would be for Mercurial to preserve the original Unicode sequence coming from Python and to wrap this in a bytes-like type so we can round trip safely. But, this would be new, backwards incompatible behavior. My goal for this commit was to converge Mercurial behavior on Python 3 on Windows to fix busted tests. And I believe I was successful, as this commit fixes 9 tests on my Windows machine and 14 tests in the AWS CI environment! Differential Revision: https://phab.mercurial-scm.org/D8337
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
7699
fac054f84600 pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff changeset
     1
# mpatch.py - Python implementation of mpatch.c
fac054f84600 pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff changeset
     2
#
fac054f84600 pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff changeset
     3
# Copyright 2009 Matt Mackall <mpm@selenic.com> and others
fac054f84600 pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff changeset
     4
#
8225
46293a0c7e9f updated license to be explicit about GPL version 2
Martin Geisler <mg@lazybytes.net>
parents: 7775
diff changeset
     5
# This software may be used and distributed according to the terms of the
10263
25e572394f5c Update license to GPLv2+
Matt Mackall <mpm@selenic.com>
parents: 8225
diff changeset
     6
# GNU General Public License version 2 or any later version.
7699
fac054f84600 pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff changeset
     7
27337
9a17576103a4 mpatch: use absolute_import
Gregory Szorc <gregory.szorc@gmail.com>
parents: 16683
diff changeset
     8
from __future__ import absolute_import
9a17576103a4 mpatch: use absolute_import
Gregory Szorc <gregory.szorc@gmail.com>
parents: 16683
diff changeset
     9
7775
5280c39778b6 pure/mpatch: use StringIO instead of mmap (issue1493)
Martin Geisler <mg@daimi.au.dk>
parents: 7699
diff changeset
    10
import struct
27337
9a17576103a4 mpatch: use absolute_import
Gregory Szorc <gregory.szorc@gmail.com>
parents: 16683
diff changeset
    11
32512
0e8b0b9a7acc cffi: split modules from pure
Yuya Nishihara <yuya@tcha.org>
parents: 32506
diff changeset
    12
from .. import pycompat
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 36958
diff changeset
    13
36958
644a02f6b34f util: prefer "bytesio" to "stringio"
Gregory Szorc <gregory.szorc@gmail.com>
parents: 34435
diff changeset
    14
stringio = pycompat.bytesio
7699
fac054f84600 pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff changeset
    15
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 36958
diff changeset
    16
28782
f736f98e16ca mpatch: unify mpatchError (issue5182)
timeless <timeless@mozdev.org>
parents: 28589
diff changeset
    17
class mpatchError(Exception):
f736f98e16ca mpatch: unify mpatchError (issue5182)
timeless <timeless@mozdev.org>
parents: 28589
diff changeset
    18
    """error raised when a delta cannot be decoded
f736f98e16ca mpatch: unify mpatchError (issue5182)
timeless <timeless@mozdev.org>
parents: 28589
diff changeset
    19
    """
f736f98e16ca mpatch: unify mpatchError (issue5182)
timeless <timeless@mozdev.org>
parents: 28589
diff changeset
    20
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 36958
diff changeset
    21
7699
fac054f84600 pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff changeset
    22
# This attempts to apply a series of patches in time proportional to
fac054f84600 pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff changeset
    23
# the total size of the patches, rather than patches * len(text). This
fac054f84600 pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff changeset
    24
# means rather than shuffling strings around, we shuffle around
fac054f84600 pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff changeset
    25
# pointers to fragments with fragment lists.
fac054f84600 pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff changeset
    26
#
fac054f84600 pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff changeset
    27
# When the fragment lists get too long, we collapse them. To do this
fac054f84600 pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff changeset
    28
# efficiently, we do all our operations inside a buffer created by
fac054f84600 pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff changeset
    29
# mmap and simply use memmove. This avoids creating a bunch of large
fac054f84600 pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff changeset
    30
# temporary string buffers.
fac054f84600 pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff changeset
    31
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 36958
diff changeset
    32
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 36958
diff changeset
    33
def _pull(dst, src, l):  # pull l bytes from src
28587
76d7cab13f04 mpatch: move pull() method to top level
Augie Fackler <augie@google.com>
parents: 27337
diff changeset
    34
    while l:
76d7cab13f04 mpatch: move pull() method to top level
Augie Fackler <augie@google.com>
parents: 27337
diff changeset
    35
        f = src.pop()
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 36958
diff changeset
    36
        if f[0] > l:  # do we need to split?
28587
76d7cab13f04 mpatch: move pull() method to top level
Augie Fackler <augie@google.com>
parents: 27337
diff changeset
    37
            src.append((f[0] - l, f[1] + l))
76d7cab13f04 mpatch: move pull() method to top level
Augie Fackler <augie@google.com>
parents: 27337
diff changeset
    38
            dst.append((l, f[1]))
76d7cab13f04 mpatch: move pull() method to top level
Augie Fackler <augie@google.com>
parents: 27337
diff changeset
    39
            return
76d7cab13f04 mpatch: move pull() method to top level
Augie Fackler <augie@google.com>
parents: 27337
diff changeset
    40
        dst.append(f)
76d7cab13f04 mpatch: move pull() method to top level
Augie Fackler <augie@google.com>
parents: 27337
diff changeset
    41
        l -= f[0]
76d7cab13f04 mpatch: move pull() method to top level
Augie Fackler <augie@google.com>
parents: 27337
diff changeset
    42
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 36958
diff changeset
    43
28588
6546afde350e mpatch: un-nest the move() method
Augie Fackler <augie@google.com>
parents: 28587
diff changeset
    44
def _move(m, dest, src, count):
6546afde350e mpatch: un-nest the move() method
Augie Fackler <augie@google.com>
parents: 28587
diff changeset
    45
    """move count bytes from src to dest
6546afde350e mpatch: un-nest the move() method
Augie Fackler <augie@google.com>
parents: 28587
diff changeset
    46
6546afde350e mpatch: un-nest the move() method
Augie Fackler <augie@google.com>
parents: 28587
diff changeset
    47
    The file pointer is left at the end of dest.
6546afde350e mpatch: un-nest the move() method
Augie Fackler <augie@google.com>
parents: 28587
diff changeset
    48
    """
6546afde350e mpatch: un-nest the move() method
Augie Fackler <augie@google.com>
parents: 28587
diff changeset
    49
    m.seek(src)
6546afde350e mpatch: un-nest the move() method
Augie Fackler <augie@google.com>
parents: 28587
diff changeset
    50
    buf = m.read(count)
6546afde350e mpatch: un-nest the move() method
Augie Fackler <augie@google.com>
parents: 28587
diff changeset
    51
    m.seek(dest)
6546afde350e mpatch: un-nest the move() method
Augie Fackler <augie@google.com>
parents: 28587
diff changeset
    52
    m.write(buf)
6546afde350e mpatch: un-nest the move() method
Augie Fackler <augie@google.com>
parents: 28587
diff changeset
    53
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 36958
diff changeset
    54
28589
c4c7be9f0554 mpatch: move collect() to module level
Augie Fackler <augie@google.com>
parents: 28588
diff changeset
    55
def _collect(m, buf, list):
c4c7be9f0554 mpatch: move collect() to module level
Augie Fackler <augie@google.com>
parents: 28588
diff changeset
    56
    start = buf
c4c7be9f0554 mpatch: move collect() to module level
Augie Fackler <augie@google.com>
parents: 28588
diff changeset
    57
    for l, p in reversed(list):
c4c7be9f0554 mpatch: move collect() to module level
Augie Fackler <augie@google.com>
parents: 28588
diff changeset
    58
        _move(m, buf, p, l)
c4c7be9f0554 mpatch: move collect() to module level
Augie Fackler <augie@google.com>
parents: 28588
diff changeset
    59
        buf += l
c4c7be9f0554 mpatch: move collect() to module level
Augie Fackler <augie@google.com>
parents: 28588
diff changeset
    60
    return (buf - start, start)
c4c7be9f0554 mpatch: move collect() to module level
Augie Fackler <augie@google.com>
parents: 28588
diff changeset
    61
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 36958
diff changeset
    62
7699
fac054f84600 pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff changeset
    63
def patches(a, bins):
10282
08a0f04b56bd many, many trivial check-code fixups
Matt Mackall <mpm@selenic.com>
parents: 10263
diff changeset
    64
    if not bins:
08a0f04b56bd many, many trivial check-code fixups
Matt Mackall <mpm@selenic.com>
parents: 10263
diff changeset
    65
        return a
7699
fac054f84600 pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff changeset
    66
fac054f84600 pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff changeset
    67
    plens = [len(x) for x in bins]
fac054f84600 pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff changeset
    68
    pl = sum(plens)
fac054f84600 pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff changeset
    69
    bl = len(a) + pl
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 36958
diff changeset
    70
    tl = bl + bl + pl  # enough for the patches and two working texts
7699
fac054f84600 pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff changeset
    71
    b1, b2 = 0, bl
fac054f84600 pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff changeset
    72
10282
08a0f04b56bd many, many trivial check-code fixups
Matt Mackall <mpm@selenic.com>
parents: 10263
diff changeset
    73
    if not tl:
08a0f04b56bd many, many trivial check-code fixups
Matt Mackall <mpm@selenic.com>
parents: 10263
diff changeset
    74
        return a
7699
fac054f84600 pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff changeset
    75
28861
86db5cb55d46 pycompat: switch to util.stringio for py3 compat
timeless <timeless@mozdev.org>
parents: 28782
diff changeset
    76
    m = stringio()
7699
fac054f84600 pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff changeset
    77
fac054f84600 pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff changeset
    78
    # load our original text
fac054f84600 pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff changeset
    79
    m.write(a)
fac054f84600 pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff changeset
    80
    frags = [(len(a), b1)]
fac054f84600 pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff changeset
    81
fac054f84600 pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff changeset
    82
    # copy all the patches into our segment so we can memmove from them
fac054f84600 pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff changeset
    83
    pos = b2 + bl
fac054f84600 pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff changeset
    84
    m.seek(pos)
34435
5326e4ef1dab style: never put multiple statements on one line
Alex Gaynor <agaynor@mozilla.com>
parents: 32512
diff changeset
    85
    for p in bins:
5326e4ef1dab style: never put multiple statements on one line
Alex Gaynor <agaynor@mozilla.com>
parents: 32512
diff changeset
    86
        m.write(p)
7699
fac054f84600 pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff changeset
    87
fac054f84600 pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff changeset
    88
    for plen in plens:
fac054f84600 pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff changeset
    89
        # if our list gets too long, execute it
fac054f84600 pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff changeset
    90
        if len(frags) > 128:
fac054f84600 pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff changeset
    91
            b2, b1 = b1, b2
28589
c4c7be9f0554 mpatch: move collect() to module level
Augie Fackler <augie@google.com>
parents: 28588
diff changeset
    92
            frags = [_collect(m, b1, frags)]
7699
fac054f84600 pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff changeset
    93
fac054f84600 pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff changeset
    94
        new = []
fac054f84600 pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff changeset
    95
        end = pos + plen
fac054f84600 pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff changeset
    96
        last = 0
fac054f84600 pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff changeset
    97
        while pos < end:
7775
5280c39778b6 pure/mpatch: use StringIO instead of mmap (issue1493)
Martin Geisler <mg@daimi.au.dk>
parents: 7699
diff changeset
    98
            m.seek(pos)
28782
f736f98e16ca mpatch: unify mpatchError (issue5182)
timeless <timeless@mozdev.org>
parents: 28589
diff changeset
    99
            try:
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   100
                p1, p2, l = struct.unpack(b">lll", m.read(12))
28782
f736f98e16ca mpatch: unify mpatchError (issue5182)
timeless <timeless@mozdev.org>
parents: 28589
diff changeset
   101
            except struct.error:
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   102
                raise mpatchError(b"patch cannot be decoded")
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 36958
diff changeset
   103
            _pull(new, frags, p1 - last)  # what didn't change
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 36958
diff changeset
   104
            _pull([], frags, p2 - p1)  # what got deleted
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 36958
diff changeset
   105
            new.append((l, pos + 12))  # what got added
7699
fac054f84600 pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff changeset
   106
            pos += l + 12
fac054f84600 pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff changeset
   107
            last = p2
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 36958
diff changeset
   108
        frags.extend(reversed(new))  # what was left at the end
7699
fac054f84600 pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff changeset
   109
28589
c4c7be9f0554 mpatch: move collect() to module level
Augie Fackler <augie@google.com>
parents: 28588
diff changeset
   110
    t = _collect(m, b2, frags)
7699
fac054f84600 pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff changeset
   111
7775
5280c39778b6 pure/mpatch: use StringIO instead of mmap (issue1493)
Martin Geisler <mg@daimi.au.dk>
parents: 7699
diff changeset
   112
    m.seek(t[1])
5280c39778b6 pure/mpatch: use StringIO instead of mmap (issue1493)
Martin Geisler <mg@daimi.au.dk>
parents: 7699
diff changeset
   113
    return m.read(t[0])
7699
fac054f84600 pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff changeset
   114
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 36958
diff changeset
   115
7699
fac054f84600 pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff changeset
   116
def patchedsize(orig, delta):
fac054f84600 pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff changeset
   117
    outlen, last, bin = 0, 0, 0
fac054f84600 pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff changeset
   118
    binend = len(delta)
fac054f84600 pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff changeset
   119
    data = 12
fac054f84600 pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff changeset
   120
fac054f84600 pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff changeset
   121
    while data <= binend:
43076
2372284d9457 formatting: blacken the codebase
Augie Fackler <augie@google.com>
parents: 36958
diff changeset
   122
        decode = delta[bin : bin + 12]
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   123
        start, end, length = struct.unpack(b">lll", decode)
7699
fac054f84600 pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff changeset
   124
        if start > end:
fac054f84600 pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff changeset
   125
            break
fac054f84600 pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff changeset
   126
        bin = data + length
fac054f84600 pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff changeset
   127
        data = bin + 12
fac054f84600 pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff changeset
   128
        outlen += start - last
fac054f84600 pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff changeset
   129
        last = end
fac054f84600 pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff changeset
   130
        outlen += length
fac054f84600 pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff changeset
   131
fac054f84600 pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff changeset
   132
    if bin != binend:
43077
687b865b95ad formatting: byteify all mercurial/ and hgext/ string literals
Augie Fackler <augie@google.com>
parents: 43076
diff changeset
   133
        raise mpatchError(b"patch cannot be decoded")
7699
fac054f84600 pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff changeset
   134
fac054f84600 pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff changeset
   135
    outlen += orig - last
fac054f84600 pure Python implementation of mpatch.c
Martin Geisler <mg@daimi.au.dk>
parents:
diff changeset
   136
    return outlen