Mercurial > hg
view mercurial/cffi/bdiff.py @ 39037:ede768cfe83e
mail: always fall back to iso-8859-1 if us-ascii won't work (BC)
It looks like this was a well-intentioned backwards compat hack for
previewing the output of `hg email` in a stable way. Unfortunately I
think this hack's time has come, because Python 3 does a much better
job of ensuring it actually emits *valid* email messages. In
particular, Python 2 would blindly trust us that the bytes we handed
it were valid for the encoding we claimed, but Python 3 has some more
sniff-tests that we end up failing.
As a result, if we're going to print an email to the terminal, try
us-ascii first, but if that fails go straight to iso-8859-1 which
should be reasonably readable for ascii-compatible patch bodies. This
*will* be a breaking change for ascii-incompatible textual patch
content, but I don't think that's avoidable if we want to continue
using the email library from the stdlib.
.. bc::
Emails from the patchbomb extension will always be printed as though
they are iso-8859-1 if they're not valid us-ascii. Previously,
previewed emails were always claimed to be us-ascii and might
contain invalid byte sequences.
Differential Revision: https://phab.mercurial-scm.org/D4231
author | Augie Fackler <augie@google.com> |
---|---|
date | Thu, 09 Aug 2018 21:04:15 -0400 |
parents | 857876ebaed4 |
children | 2372284d9457 |
line wrap: on
line source
# bdiff.py - CFFI implementation of bdiff.c # # Copyright 2016 Maciej Fijalkowski <fijall@gmail.com> # # This software may be used and distributed according to the terms of the # GNU General Public License version 2 or any later version. from __future__ import absolute_import import struct from ..pure.bdiff import * from . import _bdiff ffi = _bdiff.ffi lib = _bdiff.lib def blocks(sa, sb): a = ffi.new("struct bdiff_line**") b = ffi.new("struct bdiff_line**") ac = ffi.new("char[]", str(sa)) bc = ffi.new("char[]", str(sb)) l = ffi.new("struct bdiff_hunk*") try: an = lib.bdiff_splitlines(ac, len(sa), a) bn = lib.bdiff_splitlines(bc, len(sb), b) if not a[0] or not b[0]: raise MemoryError count = lib.bdiff_diff(a[0], an, b[0], bn, l) if count < 0: raise MemoryError rl = [None] * count h = l.next i = 0 while h: rl[i] = (h.a1, h.a2, h.b1, h.b2) h = h.next i += 1 finally: lib.free(a[0]) lib.free(b[0]) lib.bdiff_freehunks(l.next) return rl def bdiff(sa, sb): a = ffi.new("struct bdiff_line**") b = ffi.new("struct bdiff_line**") ac = ffi.new("char[]", str(sa)) bc = ffi.new("char[]", str(sb)) l = ffi.new("struct bdiff_hunk*") try: an = lib.bdiff_splitlines(ac, len(sa), a) bn = lib.bdiff_splitlines(bc, len(sb), b) if not a[0] or not b[0]: raise MemoryError count = lib.bdiff_diff(a[0], an, b[0], bn, l) if count < 0: raise MemoryError rl = [] h = l.next la = lb = 0 while h: if h.a1 != la or h.b1 != lb: lgt = (b[0] + h.b1).l - (b[0] + lb).l rl.append(struct.pack(">lll", (a[0] + la).l - a[0].l, (a[0] + h.a1).l - a[0].l, lgt)) rl.append(str(ffi.buffer((b[0] + lb).l, lgt))) la = h.a2 lb = h.b2 h = h.next finally: lib.free(a[0]) lib.free(b[0]) lib.bdiff_freehunks(l.next) return "".join(rl)