mercurial/cext/bdiff.c
author Pierre-Yves David <pierre-yves.david@octobus.net>
Wed, 21 Feb 2024 10:41:09 +0100
changeset 51409 2f39c7aeb549
parent 48821 b0dd39b91e7a
permissions -rw-r--r--
phases: large rewrite on retract boundary

The new code is still pure Python, so there is still room to go significantly faster. However, the complexity of its complex part is now `O(|[min_new_draft, tip]|)` instead of `O(|[min_draft, tip]|)`, which should help tremendously on repositories with old drafts (like mercurial-devel or mozilla-try).

This is especially useful because the most common "retract boundary" operations happen when we commit or rewrite new drafts, or when we push new drafts to a non-publishing server. In these cases, the smallest new_revs is very close to the tip and there is very little work to do.

A few smaller optimisations could be done for these cases and will be introduced in later changesets. We still have to iterate over large sets of roots, but this is already a great improvement for a very small amount of work.

We gather information on the affected changesets as we go so that we can put it to use in the next changesets. This extra data collection might slow down the `register_new` case a bit; however, for `register_new` it should not really matter. Either the set of new nodes is small, so the impact is negligible, or the set of new nodes is large, and the amount of work needed to add them will dominate the overhead of collecting information in `changed_revs`.

As this new code computes the changes on the fly, it unlocks other interesting improvements to be done in later changesets.
29541
9631ff5ebbeb bdiff: split bdiff into cpy-aware and cpy-agnostic part
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff changeset
     1
/*
9631ff5ebbeb bdiff: split bdiff into cpy-aware and cpy-agnostic part
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff changeset
     2
 bdiff.c - efficient binary diff extension for Mercurial
9631ff5ebbeb bdiff: split bdiff into cpy-aware and cpy-agnostic part
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff changeset
     3
46819
d4ba4d51f85f contributor: change mentions of mpm to olivia
Raphaël Gomès <rgomes@octobus.net>
parents: 41336
diff changeset
     4
 Copyright 2005, 2006 Olivia Mackall <olivia@selenic.com>
29541
9631ff5ebbeb bdiff: split bdiff into cpy-aware and cpy-agnostic part
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff changeset
     5
9631ff5ebbeb bdiff: split bdiff into cpy-aware and cpy-agnostic part
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff changeset
     6
 This software may be used and distributed according to the terms of
9631ff5ebbeb bdiff: split bdiff into cpy-aware and cpy-agnostic part
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff changeset
     7
 the GNU General Public License, incorporated herein by reference.
9631ff5ebbeb bdiff: split bdiff into cpy-aware and cpy-agnostic part
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff changeset
     8
9631ff5ebbeb bdiff: split bdiff into cpy-aware and cpy-agnostic part
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff changeset
     9
 Based roughly on Python difflib
9631ff5ebbeb bdiff: split bdiff into cpy-aware and cpy-agnostic part
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff changeset
    10
*/
9631ff5ebbeb bdiff: split bdiff into cpy-aware and cpy-agnostic part
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff changeset
    11
9631ff5ebbeb bdiff: split bdiff into cpy-aware and cpy-agnostic part
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff changeset
    12
#define PY_SSIZE_T_CLEAN
9631ff5ebbeb bdiff: split bdiff into cpy-aware and cpy-agnostic part
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff changeset
    13
#include <Python.h>
34438
b90e8da190da cext: reorder #include
Gregory Szorc <gregory.szorc@gmail.com>
parents: 32369
diff changeset
    14
#include <limits.h>
29541
9631ff5ebbeb bdiff: split bdiff into cpy-aware and cpy-agnostic part
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff changeset
    15
#include <stdlib.h>
9631ff5ebbeb bdiff: split bdiff into cpy-aware and cpy-agnostic part
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff changeset
    16
#include <string.h>
9631ff5ebbeb bdiff: split bdiff into cpy-aware and cpy-agnostic part
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff changeset
    17
9631ff5ebbeb bdiff: split bdiff into cpy-aware and cpy-agnostic part
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff changeset
    18
#include "bdiff.h"
9631ff5ebbeb bdiff: split bdiff into cpy-aware and cpy-agnostic part
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff changeset
    19
#include "bitmanipulation.h"
36675
430fdb717549 bdiff: add a xdiffblocks method
Jun Wu <quark@fb.com>
parents: 36655
diff changeset
    20
#include "thirdparty/xdiff/xdiff.h"
30170
15635d8b17e0 bdiff: include util.h
Gregory Szorc <gregory.szorc@gmail.com>
parents: 29541
diff changeset
    21
#include "util.h"
29541
9631ff5ebbeb bdiff: split bdiff into cpy-aware and cpy-agnostic part
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff changeset
    22
9631ff5ebbeb bdiff: split bdiff into cpy-aware and cpy-agnostic part
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff changeset
    23
static PyObject *blocks(PyObject *self, PyObject *args)
9631ff5ebbeb bdiff: split bdiff into cpy-aware and cpy-agnostic part
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff changeset
    24
{
9631ff5ebbeb bdiff: split bdiff into cpy-aware and cpy-agnostic part
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff changeset
    25
	PyObject *sa, *sb, *rl = NULL, *m;
9631ff5ebbeb bdiff: split bdiff into cpy-aware and cpy-agnostic part
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff changeset
    26
	struct bdiff_line *a, *b;
9631ff5ebbeb bdiff: split bdiff into cpy-aware and cpy-agnostic part
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff changeset
    27
	struct bdiff_hunk l, *h;
9631ff5ebbeb bdiff: split bdiff into cpy-aware and cpy-agnostic part
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff changeset
    28
	int an, bn, count, pos = 0;
9631ff5ebbeb bdiff: split bdiff into cpy-aware and cpy-agnostic part
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff changeset
    29
9631ff5ebbeb bdiff: split bdiff into cpy-aware and cpy-agnostic part
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff changeset
    30
	l.next = NULL;
9631ff5ebbeb bdiff: split bdiff into cpy-aware and cpy-agnostic part
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff changeset
    31
41336
763b45bc4483 cleanup: use clang-tidy to add missing {} around one-line statements
Augie Fackler <augie@google.com>
parents: 39421
diff changeset
    32
	if (!PyArg_ParseTuple(args, "SS:bdiff", &sa, &sb)) {
29541
9631ff5ebbeb bdiff: split bdiff into cpy-aware and cpy-agnostic part
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff changeset
    33
		return NULL;
41336
763b45bc4483 cleanup: use clang-tidy to add missing {} around one-line statements
Augie Fackler <augie@google.com>
parents: 39421
diff changeset
    34
	}
29541
9631ff5ebbeb bdiff: split bdiff into cpy-aware and cpy-agnostic part
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff changeset
    35
9631ff5ebbeb bdiff: split bdiff into cpy-aware and cpy-agnostic part
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff changeset
    36
	an = bdiff_splitlines(PyBytes_AsString(sa), PyBytes_Size(sa), &a);
9631ff5ebbeb bdiff: split bdiff into cpy-aware and cpy-agnostic part
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff changeset
    37
	bn = bdiff_splitlines(PyBytes_AsString(sb), PyBytes_Size(sb), &b);
9631ff5ebbeb bdiff: split bdiff into cpy-aware and cpy-agnostic part
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff changeset
    38
41336
763b45bc4483 cleanup: use clang-tidy to add missing {} around one-line statements
Augie Fackler <augie@google.com>
parents: 39421
diff changeset
    39
	if (!a || !b) {
29541
9631ff5ebbeb bdiff: split bdiff into cpy-aware and cpy-agnostic part
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff changeset
    40
		goto nomem;
41336
763b45bc4483 cleanup: use clang-tidy to add missing {} around one-line statements
Augie Fackler <augie@google.com>
parents: 39421
diff changeset
    41
	}
29541
9631ff5ebbeb bdiff: split bdiff into cpy-aware and cpy-agnostic part
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff changeset
    42
9631ff5ebbeb bdiff: split bdiff into cpy-aware and cpy-agnostic part
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff changeset
    43
	count = bdiff_diff(a, an, b, bn, &l);
41336
763b45bc4483 cleanup: use clang-tidy to add missing {} around one-line statements
Augie Fackler <augie@google.com>
parents: 39421
diff changeset
    44
	if (count < 0) {
29541
9631ff5ebbeb bdiff: split bdiff into cpy-aware and cpy-agnostic part
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff changeset
    45
		goto nomem;
41336
763b45bc4483 cleanup: use clang-tidy to add missing {} around one-line statements
Augie Fackler <augie@google.com>
parents: 39421
diff changeset
    46
	}
29541
9631ff5ebbeb bdiff: split bdiff into cpy-aware and cpy-agnostic part
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff changeset
    47
9631ff5ebbeb bdiff: split bdiff into cpy-aware and cpy-agnostic part
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff changeset
    48
	rl = PyList_New(count);
41336
763b45bc4483 cleanup: use clang-tidy to add missing {} around one-line statements
Augie Fackler <augie@google.com>
parents: 39421
diff changeset
    49
	if (!rl) {
29541
9631ff5ebbeb bdiff: split bdiff into cpy-aware and cpy-agnostic part
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff changeset
    50
		goto nomem;
41336
763b45bc4483 cleanup: use clang-tidy to add missing {} around one-line statements
Augie Fackler <augie@google.com>
parents: 39421
diff changeset
    51
	}
29541
9631ff5ebbeb bdiff: split bdiff into cpy-aware and cpy-agnostic part
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff changeset
    52
9631ff5ebbeb bdiff: split bdiff into cpy-aware and cpy-agnostic part
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff changeset
    53
	for (h = l.next; h; h = h->next) {
9631ff5ebbeb bdiff: split bdiff into cpy-aware and cpy-agnostic part
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff changeset
    54
		m = Py_BuildValue("iiii", h->a1, h->a2, h->b1, h->b2);
9631ff5ebbeb bdiff: split bdiff into cpy-aware and cpy-agnostic part
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff changeset
    55
		PyList_SetItem(rl, pos, m);
9631ff5ebbeb bdiff: split bdiff into cpy-aware and cpy-agnostic part
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff changeset
    56
		pos++;
9631ff5ebbeb bdiff: split bdiff into cpy-aware and cpy-agnostic part
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff changeset
    57
	}
9631ff5ebbeb bdiff: split bdiff into cpy-aware and cpy-agnostic part
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff changeset
    58
9631ff5ebbeb bdiff: split bdiff into cpy-aware and cpy-agnostic part
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff changeset
    59
nomem:
9631ff5ebbeb bdiff: split bdiff into cpy-aware and cpy-agnostic part
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff changeset
    60
	free(a);
9631ff5ebbeb bdiff: split bdiff into cpy-aware and cpy-agnostic part
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff changeset
    61
	free(b);
9631ff5ebbeb bdiff: split bdiff into cpy-aware and cpy-agnostic part
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff changeset
    62
	bdiff_freehunks(l.next);
9631ff5ebbeb bdiff: split bdiff into cpy-aware and cpy-agnostic part
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff changeset
    63
	return rl ? rl : PyErr_NoMemory();
9631ff5ebbeb bdiff: split bdiff into cpy-aware and cpy-agnostic part
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff changeset
    64
}
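The tuples built by `blocks()` describe matching regions, so the changed regions are the gaps between consecutive matches. The sketch below walks such a list and records each gap. It is only an illustration under stated assumptions: the hunks are held in a plain array rather than the linked `struct bdiff_hunk` list, the list is assumed to end with an empty block at the end of both inputs so that a trailing change is visible, and `struct match`, `struct change` and `changed_ranges` are hypothetical names that do not exist in Mercurial.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for one matching block: lines [a1, a2) of the
 * old text are identical to lines [b1, b2) of the new text. */
struct match {
	int a1, a2, b1, b2;
};

/* A changed region: old lines [a1, a2) are replaced by new lines
 * [b1, b2). */
struct change {
	int a1, a2, b1, b2;
};

/*
 * Walk the matching blocks in order and record every gap between them.
 * out must have room for n entries. Returns the number of changes
 * written to out.
 */
static size_t changed_ranges(const struct match *m, size_t n,
                             struct change *out)
{
	int la = 0, lb = 0; /* end of the previous matching block */
	size_t i, count = 0;

	assert(n == 0 || (m != NULL && out != NULL));
	for (i = 0; i < n; i++) {
		if (m[i].a1 != la || m[i].b1 != lb) {
			out[count].a1 = la;
			out[count].a2 = m[i].a1;
			out[count].b1 = lb;
			out[count].b2 = m[i].b1;
			count++;
		}
		la = m[i].a2;
		lb = m[i].b2;
	}
	return count;
}
```

The same "previous end versus next start" comparison is what decides, in `bdiff()`, whether a patch fragment has to be emitted for a given hunk.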
9631ff5ebbeb bdiff: split bdiff into cpy-aware and cpy-agnostic part
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff changeset
    65
9631ff5ebbeb bdiff: split bdiff into cpy-aware and cpy-agnostic part
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff changeset
    66
static PyObject *bdiff(PyObject *self, PyObject *args)
9631ff5ebbeb bdiff: split bdiff into cpy-aware and cpy-agnostic part
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff changeset
    67
{
36655
68026dd7c4f9 cext: accept arguments as Py_buffer
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36654
diff changeset
    68
	Py_buffer ba, bb;
68026dd7c4f9 cext: accept arguments as Py_buffer
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36654
diff changeset
    69
	char *rb, *ia, *ib;
29541
9631ff5ebbeb bdiff: split bdiff into cpy-aware and cpy-agnostic part
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff changeset
    70
	PyObject *result = NULL;
36654
b864f4536ca8 cext: refactor cleanup code in bdiff()
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36620
diff changeset
    71
	struct bdiff_line *al = NULL, *bl = NULL;
29541
9631ff5ebbeb bdiff: split bdiff into cpy-aware and cpy-agnostic part
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff changeset
    72
	struct bdiff_hunk l, *h;
9631ff5ebbeb bdiff: split bdiff into cpy-aware and cpy-agnostic part
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff changeset
    73
	int an, bn, count;
30561
7c0c722d568d bdiff: early pruning of common prefix before doing expensive computations
Mads Kiilerich <madski@unity3d.com>
parents: 30170
diff changeset
    74
	Py_ssize_t len = 0, la, lb, li = 0, lcommon = 0, lmax;
36654
b864f4536ca8 cext: refactor cleanup code in bdiff()
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36620
diff changeset
    75
	PyThreadState *_save = NULL;
29541
9631ff5ebbeb bdiff: split bdiff into cpy-aware and cpy-agnostic part
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff changeset
    76
9631ff5ebbeb bdiff: split bdiff into cpy-aware and cpy-agnostic part
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff changeset
    77
	l.next = NULL;
9631ff5ebbeb bdiff: split bdiff into cpy-aware and cpy-agnostic part
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff changeset
    78
48821
b0dd39b91e7a cext: remove PY23()
Gregory Szorc <gregory.szorc@gmail.com>
parents: 48810
diff changeset
    79
	if (!PyArg_ParseTuple(args, "y*y*:bdiff", &ba, &bb)) {
29541
9631ff5ebbeb bdiff: split bdiff into cpy-aware and cpy-agnostic part
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff changeset
    80
		return NULL;
41336
763b45bc4483 cleanup: use clang-tidy to add missing {} around one-line statements
Augie Fackler <augie@google.com>
parents: 39421
diff changeset
    81
	}
29541
9631ff5ebbeb bdiff: split bdiff into cpy-aware and cpy-agnostic part
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff changeset
    82
36655
68026dd7c4f9 cext: accept arguments as Py_buffer
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36654
diff changeset
    83
	if (!PyBuffer_IsContiguous(&ba, 'C') || ba.ndim > 1) {
68026dd7c4f9 cext: accept arguments as Py_buffer
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36654
diff changeset
    84
		PyErr_SetString(PyExc_ValueError, "bdiff input not contiguous");
68026dd7c4f9 cext: accept arguments as Py_buffer
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36654
diff changeset
    85
		goto cleanup;
68026dd7c4f9 cext: accept arguments as Py_buffer
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36654
diff changeset
    86
	}
68026dd7c4f9 cext: accept arguments as Py_buffer
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36654
diff changeset
    87
68026dd7c4f9 cext: accept arguments as Py_buffer
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36654
diff changeset
    88
	if (!PyBuffer_IsContiguous(&bb, 'C') || bb.ndim > 1) {
68026dd7c4f9 cext: accept arguments as Py_buffer
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36654
diff changeset
    89
		PyErr_SetString(PyExc_ValueError, "bdiff input not contiguous");
68026dd7c4f9 cext: accept arguments as Py_buffer
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36654
diff changeset
    90
		goto cleanup;
68026dd7c4f9 cext: accept arguments as Py_buffer
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36654
diff changeset
    91
	}
68026dd7c4f9 cext: accept arguments as Py_buffer
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36654
diff changeset
    92
68026dd7c4f9 cext: accept arguments as Py_buffer
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36654
diff changeset
    93
	la = ba.len;
68026dd7c4f9 cext: accept arguments as Py_buffer
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36654
diff changeset
    94
	lb = bb.len;
68026dd7c4f9 cext: accept arguments as Py_buffer
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36654
diff changeset
    95
29541
9631ff5ebbeb bdiff: split bdiff into cpy-aware and cpy-agnostic part
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff changeset
    96
	if (la > UINT_MAX || lb > UINT_MAX) {
9631ff5ebbeb bdiff: split bdiff into cpy-aware and cpy-agnostic part
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff changeset
    97
		PyErr_SetString(PyExc_ValueError, "bdiff inputs too large");
36655
68026dd7c4f9 cext: accept arguments as Py_buffer
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36654
diff changeset
    98
		goto cleanup;
29541
9631ff5ebbeb bdiff: split bdiff into cpy-aware and cpy-agnostic part
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff changeset
    99
	}
9631ff5ebbeb bdiff: split bdiff into cpy-aware and cpy-agnostic part
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff changeset
   100
9631ff5ebbeb bdiff: split bdiff into cpy-aware and cpy-agnostic part
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff changeset
   101
	_save = PyEval_SaveThread();
30561
7c0c722d568d bdiff: early pruning of common prefix before doing expensive computations
Mads Kiilerich <madski@unity3d.com>
parents: 30170
diff changeset
   102
7c0c722d568d bdiff: early pruning of common prefix before doing expensive computations
Mads Kiilerich <madski@unity3d.com>
parents: 30170
diff changeset
   103
	lmax = la > lb ? lb : la;
36655
68026dd7c4f9 cext: accept arguments as Py_buffer
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36654
diff changeset
   104
	for (ia = ba.buf, ib = bb.buf; li < lmax && *ia == *ib;
68026dd7c4f9 cext: accept arguments as Py_buffer
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36654
diff changeset
   105
	     ++li, ++ia, ++ib) {
41336
763b45bc4483 cleanup: use clang-tidy to add missing {} around one-line statements
Augie Fackler <augie@google.com>
parents: 39421
diff changeset
   106
		if (*ia == '\n') {
30561
7c0c722d568d bdiff: early pruning of common prefix before doing expensive computations
Mads Kiilerich <madski@unity3d.com>
parents: 30170
diff changeset
   107
			lcommon = li + 1;
41336
763b45bc4483 cleanup: use clang-tidy to add missing {} around one-line statements
Augie Fackler <augie@google.com>
parents: 39421
diff changeset
   108
		}
36655
68026dd7c4f9 cext: accept arguments as Py_buffer
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36654
diff changeset
   109
	}
30561
7c0c722d568d bdiff: early pruning of common prefix before doing expensive computations
Mads Kiilerich <madski@unity3d.com>
parents: 30170
diff changeset
   110
	/* we can almost add: if (li == lmax) lcommon = li; */
7c0c722d568d bdiff: early pruning of common prefix before doing expensive computations
Mads Kiilerich <madski@unity3d.com>
parents: 30170
diff changeset
   111
36681
340e4b711df7 bdiff: avoid pointer arithmetic on void*
Matt Harbison <matt_harbison@yahoo.com>
parents: 36675
diff changeset
   112
	an = bdiff_splitlines((char *)ba.buf + lcommon, la - lcommon, &al);
340e4b711df7 bdiff: avoid pointer arithmetic on void*
Matt Harbison <matt_harbison@yahoo.com>
parents: 36675
diff changeset
   113
	bn = bdiff_splitlines((char *)bb.buf + lcommon, lb - lcommon, &bl);
36654
b864f4536ca8 cext: refactor cleanup code in bdiff()
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36620
diff changeset
   114
	if (!al || !bl) {
b864f4536ca8 cext: refactor cleanup code in bdiff()
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36620
diff changeset
   115
		PyErr_NoMemory();
b864f4536ca8 cext: refactor cleanup code in bdiff()
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36620
diff changeset
   116
		goto cleanup;
b864f4536ca8 cext: refactor cleanup code in bdiff()
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36620
diff changeset
   117
	}
29541
9631ff5ebbeb bdiff: split bdiff into cpy-aware and cpy-agnostic part
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff changeset
   118
9631ff5ebbeb bdiff: split bdiff into cpy-aware and cpy-agnostic part
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff changeset
   119
	count = bdiff_diff(al, an, bl, bn, &l);
36654
b864f4536ca8 cext: refactor cleanup code in bdiff()
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36620
diff changeset
   120
	if (count < 0) {
b864f4536ca8 cext: refactor cleanup code in bdiff()
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36620
diff changeset
   121
		PyErr_NoMemory();
b864f4536ca8 cext: refactor cleanup code in bdiff()
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36620
diff changeset
   122
		goto cleanup;
b864f4536ca8 cext: refactor cleanup code in bdiff()
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36620
diff changeset
   123
	}
29541
9631ff5ebbeb bdiff: split bdiff into cpy-aware and cpy-agnostic part
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff changeset
   124
9631ff5ebbeb bdiff: split bdiff into cpy-aware and cpy-agnostic part
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff changeset
   125
	/* calculate length of output */
9631ff5ebbeb bdiff: split bdiff into cpy-aware and cpy-agnostic part
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff changeset
   126
	la = lb = 0;
9631ff5ebbeb bdiff: split bdiff into cpy-aware and cpy-agnostic part
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff changeset
   127
	for (h = l.next; h; h = h->next) {
41336
763b45bc4483 cleanup: use clang-tidy to add missing {} around one-line statements
Augie Fackler <augie@google.com>
parents: 39421
diff changeset
   128
		if (h->a1 != la || h->b1 != lb) {
29541
9631ff5ebbeb bdiff: split bdiff into cpy-aware and cpy-agnostic part
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff changeset
   129
			len += 12 + bl[h->b1].l - bl[lb].l;
41336
763b45bc4483 cleanup: use clang-tidy to add missing {} around one-line statements
Augie Fackler <augie@google.com>
parents: 39421
diff changeset
   130
		}
29541
9631ff5ebbeb bdiff: split bdiff into cpy-aware and cpy-agnostic part
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff changeset
   131
		la = h->a2;
9631ff5ebbeb bdiff: split bdiff into cpy-aware and cpy-agnostic part
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff changeset
   132
		lb = h->b2;
9631ff5ebbeb bdiff: split bdiff into cpy-aware and cpy-agnostic part
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff changeset
   133
	}
9631ff5ebbeb bdiff: split bdiff into cpy-aware and cpy-agnostic part
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff changeset
   134
	PyEval_RestoreThread(_save);
9631ff5ebbeb bdiff: split bdiff into cpy-aware and cpy-agnostic part
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff changeset
   135
	_save = NULL;
9631ff5ebbeb bdiff: split bdiff into cpy-aware and cpy-agnostic part
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff changeset
   136
9631ff5ebbeb bdiff: split bdiff into cpy-aware and cpy-agnostic part
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff changeset
   137
	result = PyBytes_FromStringAndSize(NULL, len);
9631ff5ebbeb bdiff: split bdiff into cpy-aware and cpy-agnostic part
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff changeset
   138
41336
763b45bc4483 cleanup: use clang-tidy to add missing {} around one-line statements
Augie Fackler <augie@google.com>
parents: 39421
diff changeset
   139
	if (!result) {
36654
b864f4536ca8 cext: refactor cleanup code in bdiff()
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36620
diff changeset
   140
		goto cleanup;
41336
763b45bc4483 cleanup: use clang-tidy to add missing {} around one-line statements
Augie Fackler <augie@google.com>
parents: 39421
diff changeset
   141
	}
29541
9631ff5ebbeb bdiff: split bdiff into cpy-aware and cpy-agnostic part
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff changeset
   142
9631ff5ebbeb bdiff: split bdiff into cpy-aware and cpy-agnostic part
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff changeset
   143
	/* build binary patch */
9631ff5ebbeb bdiff: split bdiff into cpy-aware and cpy-agnostic part
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff changeset
   144
	rb = PyBytes_AsString(result);
9631ff5ebbeb bdiff: split bdiff into cpy-aware and cpy-agnostic part
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff changeset
   145
	la = lb = 0;
9631ff5ebbeb bdiff: split bdiff into cpy-aware and cpy-agnostic part
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff changeset
   146
9631ff5ebbeb bdiff: split bdiff into cpy-aware and cpy-agnostic part
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff changeset
   147
	for (h = l.next; h; h = h->next) {
9631ff5ebbeb bdiff: split bdiff into cpy-aware and cpy-agnostic part
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff changeset
   148
		if (h->a1 != la || h->b1 != lb) {
9631ff5ebbeb bdiff: split bdiff into cpy-aware and cpy-agnostic part
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff changeset
   149
			len = bl[h->b1].l - bl[lb].l;
30561
7c0c722d568d bdiff: early pruning of common prefix before doing expensive computations
Mads Kiilerich <madski@unity3d.com>
parents: 30170
diff changeset
   150
			putbe32((uint32_t)(al[la].l + lcommon - al->l), rb);
36055
b4fdc6177b29 bdiff: add to clang-format oversight
Augie Fackler <augie@google.com>
parents: 34438
diff changeset
   151
			putbe32((uint32_t)(al[h->a1].l + lcommon - al->l),
b4fdc6177b29 bdiff: add to clang-format oversight
Augie Fackler <augie@google.com>
parents: 34438
diff changeset
   152
			        rb + 4);
29541
9631ff5ebbeb bdiff: split bdiff into cpy-aware and cpy-agnostic part
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff changeset
   153
			putbe32((uint32_t)len, rb + 8);
9631ff5ebbeb bdiff: split bdiff into cpy-aware and cpy-agnostic part
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff changeset
   154
			memcpy(rb + 12, bl[lb].l, len);
9631ff5ebbeb bdiff: split bdiff into cpy-aware and cpy-agnostic part
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff changeset
   155
			rb += 12 + len;
9631ff5ebbeb bdiff: split bdiff into cpy-aware and cpy-agnostic part
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff changeset
   156
		}
9631ff5ebbeb bdiff: split bdiff into cpy-aware and cpy-agnostic part
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff changeset
   157
		la = h->a2;
9631ff5ebbeb bdiff: split bdiff into cpy-aware and cpy-agnostic part
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff changeset
   158
		lb = h->b2;
9631ff5ebbeb bdiff: split bdiff into cpy-aware and cpy-agnostic part
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff changeset
   159
	}
9631ff5ebbeb bdiff: split bdiff into cpy-aware and cpy-agnostic part
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff changeset
   160
36654
b864f4536ca8 cext: refactor cleanup code in bdiff()
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36620
diff changeset
   161
cleanup:
41336
763b45bc4483 cleanup: use clang-tidy to add missing {} around one-line statements
Augie Fackler <augie@google.com>
parents: 39421
diff changeset
   162
	if (_save) {
29541
9631ff5ebbeb bdiff: split bdiff into cpy-aware and cpy-agnostic part
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff changeset
   163
		PyEval_RestoreThread(_save);
41336
763b45bc4483 cleanup: use clang-tidy to add missing {} around one-line statements
Augie Fackler <augie@google.com>
parents: 39421
diff changeset
   164
	}
36655
68026dd7c4f9 cext: accept arguments as Py_buffer
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36654
diff changeset
   165
	PyBuffer_Release(&ba);
68026dd7c4f9 cext: accept arguments as Py_buffer
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36654
diff changeset
   166
	PyBuffer_Release(&bb);
38301
d9e87566f879 cext: stop worrying and love the free(NULL)
Josef 'Jeff' Sipek <jeffpc@josefsipek.net>
parents: 37980
diff changeset
   167
	free(al);
d9e87566f879 cext: stop worrying and love the free(NULL)
Josef 'Jeff' Sipek <jeffpc@josefsipek.net>
parents: 37980
diff changeset
   168
	free(bl);
38309
93b812d5b818 bdiff: one more safe call of bdiff_freehunks(NULL)
Yuya Nishihara <yuya@tcha.org>
parents: 38301
diff changeset
   169
	bdiff_freehunks(l.next);
36654
b864f4536ca8 cext: refactor cleanup code in bdiff()
Gregory Szorc <gregory.szorc@gmail.com>
parents: 36620
diff changeset
   170
	return result;
29541
9631ff5ebbeb bdiff: split bdiff into cpy-aware and cpy-agnostic part
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff changeset
   171
}
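Reading the loop in `bdiff()` above, each fragment of the binary patch appears to be a 12-byte header followed by the replacement data: three 32-bit big-endian integers giving the start offset and end offset of the replaced bytes in the old text, then the length of the new data. The sketch below encodes one such fragment. It is an illustration, not Mercurial code: `put_be32` is a local stand-in that assumes `putbe32` from `bitmanipulation.h` stores a big-endian 32-bit value, and `encode_fragment` is a hypothetical helper.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Local stand-in for putbe32(): store x as 4 big-endian bytes
 * (assumed behaviour). */
static void put_be32(uint32_t x, unsigned char *dst)
{
	dst[0] = (unsigned char)(x >> 24);
	dst[1] = (unsigned char)(x >> 16);
	dst[2] = (unsigned char)(x >> 8);
	dst[3] = (unsigned char)x;
}

/*
 * Hypothetical helper: write one fragment that replaces bytes
 * [start, end) of the old text with data[0..len). The caller must
 * provide at least 12 + len bytes in out. Returns the bytes written.
 */
static size_t encode_fragment(unsigned char *out, uint32_t start,
                              uint32_t end, const void *data, uint32_t len)
{
	assert(out != NULL && start <= end);
	put_be32(start, out);
	put_be32(end, out + 4);
	put_be32(len, out + 8);
	if (len > 0) {
		memcpy(out + 12, data, len);
	}
	return 12 + (size_t)len;
}
```

The fixed 12-byte header is also why the first pass in `bdiff()` adds `12 +` the data length per changed hunk when sizing the output buffer.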
9631ff5ebbeb bdiff: split bdiff into cpy-aware and cpy-agnostic part
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff changeset
   172
9631ff5ebbeb bdiff: split bdiff into cpy-aware and cpy-agnostic part
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff changeset
   173
/*
9631ff5ebbeb bdiff: split bdiff into cpy-aware and cpy-agnostic part
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff changeset
   174
 * If allws != 0, remove all whitespace (' ', \t and \r). Otherwise,
9631ff5ebbeb bdiff: split bdiff into cpy-aware and cpy-agnostic part
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff changeset
   175
 * reduce whitespace sequences to a single space and trim remaining whitespace
9631ff5ebbeb bdiff: split bdiff into cpy-aware and cpy-agnostic part
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff changeset
   176
 * from end of lines.
9631ff5ebbeb bdiff: split bdiff into cpy-aware and cpy-agnostic part
Maciej Fijalkowski <fijall@gmail.com>
parents:
diff changeset
   177
 */
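The rule in the comment above can be sketched as a small function that is independent of the Python C API. This is one possible reading of the comment, not the Mercurial implementation: `normalize_ws` is a hypothetical name, and it works on caller-supplied buffers instead of Python bytes objects.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical stand-alone version of the whitespace rule. Reads
 * src[0..n) and writes the normalised bytes to dst, which must hold at
 * least n bytes. Returns the number of bytes written.
 */
static size_t normalize_ws(const char *src, size_t n, char *dst, int allws)
{
	size_t i, w = 0;

	assert(n == 0 || (src != NULL && dst != NULL));
	for (i = 0; i < n; i++) {
		char c = src[i];
		if (c == ' ' || c == '\t' || c == '\r') {
			/* keep one space per run unless stripping everything */
			if (!allws && (w == 0 || dst[w - 1] != ' ')) {
				dst[w++] = ' ';
			}
		} else if (c == '\n' && !allws && w > 0 &&
		           dst[w - 1] == ' ') {
			/* drop the space left at the end of the line */
			dst[w - 1] = '\n';
		} else {
			dst[w++] = c;
		}
	}
	return w;
}
```

Because the output is never longer than the input, a destination buffer of the input's size is always sufficient.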

/* Normalize whitespace in a bytes object for whitespace-insensitive diffs.
 * If allws is true, every space, tab and carriage return is dropped.
 * Otherwise each run of them collapses to a single space, and a space
 * directly before a newline is dropped. */
static PyObject *fixws(PyObject *self, PyObject *args)
{
	PyObject *s, *result = NULL;
	char allws, c;
	const char *r;
	Py_ssize_t i, rlen, wlen = 0;
	char *w;

	if (!PyArg_ParseTuple(args, "Sb:fixws", &s, &allws)) {
		return NULL;
	}
	r = PyBytes_AsString(s);
	rlen = PyBytes_Size(s);

	/* The output is never longer than the input; allocate at least one
	 * byte so that an empty input does not request a zero-sized block. */
	w = (char *)PyMem_Malloc(rlen ? rlen : 1);
	if (!w) {
		goto nomem;
	}

	for (i = 0; i != rlen; i++) {
		c = r[i];
		if (c == ' ' || c == '\t' || c == '\r') {
			if (!allws && (wlen == 0 || w[wlen - 1] != ' ')) {
				w[wlen++] = ' ';
			}
		} else if (c == '\n' && !allws && wlen > 0 &&
		           w[wlen - 1] == ' ') {
			/* overwrite the trailing space with the newline */
			w[wlen - 1] = '\n';
		} else {
			w[wlen++] = c;
		}
	}

	result = PyBytes_FromStringAndSize(w, wlen);

nomem:
	PyMem_Free(w);
	return result ? result : PyErr_NoMemory();
}
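
/* The whitespace rules applied by fixws() can be tried without the Python
 * C API. The sketch below is illustrative only: normalize_ws() is a
 * hypothetical helper that is not part of bdiff.c, and it assumes the
 * caller supplies an output buffer at least as large as the input. */

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical helper (not in bdiff.c): copy `len` bytes of `src` into
 * `dst`, applying the same rules as the loop in fixws(), and return the
 * number of bytes written. `dst` must hold at least `len` bytes. */
static size_t normalize_ws(const char *src, size_t len, int allws, char *dst)
{
	size_t i, wlen = 0;

	for (i = 0; i < len; i++) {
		char c = src[i];
		if (c == ' ' || c == '\t' || c == '\r') {
			/* keep one space per run unless allws is set */
			if (!allws && (wlen == 0 || dst[wlen - 1] != ' ')) {
				dst[wlen++] = ' ';
			}
		} else if (c == '\n' && !allws && wlen > 0 &&
		           dst[wlen - 1] == ' ') {
			/* drop the space that precedes the newline */
			dst[wlen - 1] = '\n';
		} else {
			dst[wlen++] = c;
		}
	}
	return wlen;
}
```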

/* Store source[0:len] as a new bytes object in list[destidx].
 * PyList_SET_ITEM steals the reference to the new object, so no
 * Py_DECREF is needed on success. Returns false on allocation failure. */
static bool sliceintolist(PyObject *list, Py_ssize_t destidx,
                          const char *source, Py_ssize_t len)
{
	PyObject *sliced = PyBytes_FromStringAndSize(source, len);
	if (sliced == NULL) {
		return false;
	}
	PyList_SET_ITEM(list, destidx, sliced);
	return true;
}

/* Split a bytes object into a list of lines, keeping each trailing '\n'.
 * Only '\n' is treated as a line separator. */
static PyObject *splitnewlines(PyObject *self, PyObject *args)
{
	const char *text;
	Py_ssize_t nelts = 0, size, i, start = 0;
	PyObject *result = NULL;

	if (!PyArg_ParseTuple(args, "y#", &text, &size)) {
		goto abort;
	}
	if (!size) {
		return PyList_New(0);
	}
	/* This loops to size-1 because if the last byte is a newline,
	 * we don't want to perform a split there. */
	for (i = 0; i < size - 1; ++i) {
		if (text[i] == '\n') {
			++nelts;
		}
	}
	/* First pass counted the splits; the list needs one more slot for
	 * the final piece. */
	if ((result = PyList_New(nelts + 1)) == NULL) {
		goto abort;
	}
	nelts = 0;
	for (i = 0; i < size - 1; ++i) {
		if (text[i] == '\n') {
			if (!sliceintolist(result, nelts++, text + start,
			                   i - start + 1)) {
				goto abort;
			}
			start = i + 1;
		}
	}
	/* Final piece: everything after the last split point. */
	if (!sliceintolist(result, nelts++, text + start, size - start)) {
		goto abort;
	}
	return result;
abort:
	Py_XDECREF(result);
	return NULL;
}
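
/* The two-pass split above can be illustrated without Python objects.
 * The sketch below is an assumption-laden stand-in, not code from
 * bdiff.c: struct piece and split_newlines() are hypothetical names, and
 * pieces are reported as (start, len) pairs instead of bytes objects. */

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical record describing one output line. */
struct piece {
	size_t start; /* offset of the first byte of the line */
	size_t len;   /* length, including a trailing '\n' if present */
};

/* Hypothetical helper (not in bdiff.c): fill `out` with up to `cap`
 * pieces and return the total number of pieces in `text`. */
static size_t split_newlines(const char *text, size_t size,
                             struct piece *out, size_t cap)
{
	size_t i, n = 0, start = 0;

	if (size == 0) {
		return 0;
	}
	/* Stop one byte early: a newline in the last position must not
	 * start a new, empty piece. */
	for (i = 0; i + 1 < size; i++) {
		if (text[i] == '\n') {
			if (n < cap) {
				out[n].start = start;
				out[n].len = i - start + 1;
			}
			n++;
			start = i + 1;
		}
	}
	/* The remainder is always one final piece. */
	if (n < cap) {
		out[n].start = start;
		out[n].len = size - start;
	}
	return n + 1;
}
```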

/* xdiff callback: append the hunk (a1, a2, b1, b2) as a tuple to the
 * Python list passed through priv. Returns 0 on success, -1 on error. */
static int hunk_consumer(int64_t a1, int64_t a2, int64_t b1, int64_t b2,
                         void *priv)
{
	PyObject *rl = (PyObject *)priv;
	PyObject *m = Py_BuildValue("LLLL", a1, a2, b1, b2);
	int r;
	if (!m) {
		return -1;
	}
	r = PyList_Append(rl, m);
	/* PyList_Append takes its own reference; release ours */
	Py_DECREF(m);
	return r;
}

/* Diff two bytes objects with xdiff and return the hunks collected by
 * hunk_consumer() as a list of (a1, a2, b1, b2) tuples. */
static PyObject *xdiffblocks(PyObject *self, PyObject *args)
{
	Py_ssize_t la, lb;
	mmfile_t a, b;
	PyObject *rl;

	xpparam_t xpp = {
	    XDF_INDENT_HEURISTIC, /* flags */
	};
	xdemitconf_t xecfg = {
	    XDL_EMIT_BDIFFHUNK, /* flags */
	    hunk_consumer,      /* hunk_consume_func */
	};
	xdemitcb_t ecb = {
	    NULL, /* priv */
	};

	if (!PyArg_ParseTuple(args, "y#y#", &a.ptr, &la, &b.ptr, &lb)) {
		return NULL;
	}

	a.size = la;
	b.size = lb;

	rl = PyList_New(0);
	if (!rl) {
		return PyErr_NoMemory();
	}

	ecb.priv = rl;

	/* any xdl_diff failure is reported as MemoryError */
	if (xdl_diff(&a, &b, &xpp, &xecfg, &ecb) != 0) {
		Py_DECREF(rl);
		return PyErr_NoMemory();
	}

	return rl;
}

static char mdiff_doc[] = "Efficient binary diff.";

static PyMethodDef methods[] = {
    {"bdiff", bdiff, METH_VARARGS, "calculate a binary diff\n"},
    {"blocks", blocks, METH_VARARGS, "find a list of matching lines\n"},
    {"fixws", fixws, METH_VARARGS, "normalize diff whitespaces\n"},
    {"splitnewlines", splitnewlines, METH_VARARGS,
     "like str.splitlines, but only split on newlines\n"},
    {"xdiffblocks", xdiffblocks, METH_VARARGS,
     "find a list of matching lines using xdiff algorithm\n"},
    {NULL, NULL},
};

static const int version = 3;

static struct PyModuleDef bdiff_module = {
    PyModuleDef_HEAD_INIT, "bdiff", mdiff_doc, -1, methods,
};

PyMODINIT_FUNC PyInit_bdiff(void)
{
	PyObject *m;
	m = PyModule_Create(&bdiff_module);
	/* PyModule_Create returns NULL on failure; do not touch m then */
	if (m == NULL) {
		return NULL;
	}
	if (PyModule_AddIntConstant(m, "version", version) < 0) {
		Py_DECREF(m);
		return NULL;
	}
	return m;
}