annotate mercurial/bdiff.c @ 15063:c20688b7c061 stable

setdiscovery: fix hang when #heads>200 (issue2971) When setting up the next sample, we always add all of the heads, regardless of the desired max sample size. But if the number of heads exceeds this size, then we don't add any more nodes from the still undecided set. (This is debatable per se, and I'll investigate it, but it's how we designed it at the moment.) The bug was that we always added the overall heads, not the heads of the remaining undecided set. Thus, if #heads>200 (desired sample size), we did not make progress any longer.
author Peter Arrenbrecht <peter.arrenbrecht@gmail.com>
date Thu, 25 Aug 2011 21:25:14 +0200
parents afe9269dccec
children 73015301db86
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
rev   line source
400
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
1 /*
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
2 bdiff.c - efficient binary diff extension for Mercurial
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
3
2859
345bac2bc4ec update copyrights.
Vadim Gelfer <vadim.gelfer@gmail.com>
parents: 2600
diff changeset
4 Copyright 2005, 2006 Matt Mackall <mpm@selenic.com>
400
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
5
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
6 This software may be used and distributed according to the terms of
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
7 the GNU General Public License, incorporated herein by reference.
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
8
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
9 Based roughly on Python difflib
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
10 */
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
11
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
12 #include <Python.h>
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
13 #include <stdlib.h>
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
14 #include <string.h>
5341
458acf92b49e bdiff: use INT_MAX to avoid some inner loop comparisons
Matt Mackall <mpm@selenic.com>
parents: 5340
diff changeset
15 #include <limits.h>
867
0cd2ee61b10a Allow Mercurial to build on HP-UX 11
tksoh@users.sourceforge.net
parents: 839
diff changeset
16
3561
8c617d48564a add AIX to the list of compilers that don't have inline keyword
Benoit Boissinot <benoit.boissinot@ens-lyon.org>
parents: 3369
diff changeset
17 #if defined __hpux || defined __SUNPRO_C || defined _AIX
10282
08a0f04b56bd many, many trivial check-code fixups
Matt Mackall <mpm@selenic.com>
parents: 9534
diff changeset
18 #define inline
2600
c4325f0a9b91 clean up trailing white space.
Vadim Gelfer <vadim.gelfer@gmail.com>
parents: 2577
diff changeset
19 #endif
1759
5afd459db177 Sunpro compiler patch
Fabian Otto <sigsegv@alchiba.ni.cs.tu-berlin.de>
parents: 1542
diff changeset
20
8858
16f6c13706df bdiff: fix compile with GCC -ansi (issue1690)
Matt Mackall <mpm@selenic.com>
parents: 7625
diff changeset
21 #ifdef __linux
10282
08a0f04b56bd many, many trivial check-code fixups
Matt Mackall <mpm@selenic.com>
parents: 9534
diff changeset
22 #define inline __inline
8858
16f6c13706df bdiff: fix compile with GCC -ansi (issue1690)
Matt Mackall <mpm@selenic.com>
parents: 7625
diff changeset
23 #endif
16f6c13706df bdiff: fix compile with GCC -ansi (issue1690)
Matt Mackall <mpm@selenic.com>
parents: 7625
diff changeset
24
411
9e9f7ab43ce2 Add 'other OS' bits to bdiff.c / style cleanups
mpm@selenic.com
parents: 400
diff changeset
25 #ifdef _WIN32
551
b460a2fd8bb7 [PATCH] bdiff/mpatch under MSVC
mpm@selenic.com
parents: 510
diff changeset
26 #ifdef _MSC_VER
b460a2fd8bb7 [PATCH] bdiff/mpatch under MSVC
mpm@selenic.com
parents: 510
diff changeset
27 #define inline __inline
b460a2fd8bb7 [PATCH] bdiff/mpatch under MSVC
mpm@selenic.com
parents: 510
diff changeset
28 typedef unsigned long uint32_t;
b460a2fd8bb7 [PATCH] bdiff/mpatch under MSVC
mpm@selenic.com
parents: 510
diff changeset
29 #else
510
7f3fc8fd427e More fiddling with uint32_t includes for extensions
mpm@selenic.com
parents: 495
diff changeset
30 #include <stdint.h>
551
b460a2fd8bb7 [PATCH] bdiff/mpatch under MSVC
mpm@selenic.com
parents: 510
diff changeset
31 #endif
411
9e9f7ab43ce2 Add 'other OS' bits to bdiff.c / style cleanups
mpm@selenic.com
parents: 400
diff changeset
32 static uint32_t htonl(uint32_t x)
9e9f7ab43ce2 Add 'other OS' bits to bdiff.c / style cleanups
mpm@selenic.com
parents: 400
diff changeset
33 {
9e9f7ab43ce2 Add 'other OS' bits to bdiff.c / style cleanups
mpm@selenic.com
parents: 400
diff changeset
34 return ((x & 0x000000ffUL) << 24) |
9e9f7ab43ce2 Add 'other OS' bits to bdiff.c / style cleanups
mpm@selenic.com
parents: 400
diff changeset
35 ((x & 0x0000ff00UL) << 8) |
9e9f7ab43ce2 Add 'other OS' bits to bdiff.c / style cleanups
mpm@selenic.com
parents: 400
diff changeset
36 ((x & 0x00ff0000UL) >> 8) |
9e9f7ab43ce2 Add 'other OS' bits to bdiff.c / style cleanups
mpm@selenic.com
parents: 400
diff changeset
37 ((x & 0xff000000UL) >> 24);
9e9f7ab43ce2 Add 'other OS' bits to bdiff.c / style cleanups
mpm@selenic.com
parents: 400
diff changeset
38 }
9e9f7ab43ce2 Add 'other OS' bits to bdiff.c / style cleanups
mpm@selenic.com
parents: 400
diff changeset
39 #else
510
7f3fc8fd427e More fiddling with uint32_t includes for extensions
mpm@selenic.com
parents: 495
diff changeset
40 #include <sys/types.h>
7036
bfad9865b1dc allow Mercurial to compile on Haiku
Scott McCreary <scottmc2@gmail.com>
parents: 5620
diff changeset
41 #if defined __BEOS__ && !defined __HAIKU__
4073
95ffa36d1d2a BeOS compatibility support
Andrew Bachmann <andrewbachmann@gmail.com>
parents: 3561
diff changeset
42 #include <ByteOrder.h>
95ffa36d1d2a BeOS compatibility support
Andrew Bachmann <andrewbachmann@gmail.com>
parents: 3561
diff changeset
43 #else
597
e530637ea060 [PATCH] use <arpa/inet.h> instead of <netinet/in.h> for ntohl/htonl
mpm@selenic.com
parents: 553
diff changeset
44 #include <arpa/inet.h>
4073
95ffa36d1d2a BeOS compatibility support
Andrew Bachmann <andrewbachmann@gmail.com>
parents: 3561
diff changeset
45 #endif
2543
860e9c83fc59 Include inttypes.h instead of stdint.h (fixes issue299)
Thomas Arendsen Hein <thomas@intevation.de>
parents: 2483
diff changeset
46 #include <inttypes.h>
411
9e9f7ab43ce2 Add 'other OS' bits to bdiff.c / style cleanups
mpm@selenic.com
parents: 400
diff changeset
47 #endif
400
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
48
11364
0044193a1c45 bdiff.c: Added support for py3k.
Renato Cunha <renatoc@gmail.com>
parents: 10500
diff changeset
49 #include "util.h"
0044193a1c45 bdiff.c: Added support for py3k.
Renato Cunha <renatoc@gmail.com>
parents: 10500
diff changeset
50
400
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
51 struct line {
13732
afe9269dccec bdiff.c: rename all variables which hold a hash value to "hash"
Markus F.X.J. Oberhumer <markus@oberhumer.com>
parents: 13731
diff changeset
52 int hash, len, n, e;
400
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
53 const char *l;
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
54 };
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
55
474
b2ae8283d1a6 Minor speed improvements for bdiff
mpm@selenic.com
parents: 472
diff changeset
56 struct pos {
b2ae8283d1a6 Minor speed improvements for bdiff
mpm@selenic.com
parents: 472
diff changeset
57 int pos, len;
b2ae8283d1a6 Minor speed improvements for bdiff
mpm@selenic.com
parents: 472
diff changeset
58 };
b2ae8283d1a6 Minor speed improvements for bdiff
mpm@selenic.com
parents: 472
diff changeset
59
13089
faee0ffbc24b bdiff: dynamically allocate hunks
Matt Mackall <mpm@selenic.com>
parents: 11364
diff changeset
60 struct hunk;
400
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
61 struct hunk {
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
62 int a1, a2, b1, b2;
13089
faee0ffbc24b bdiff: dynamically allocate hunks
Matt Mackall <mpm@selenic.com>
parents: 11364
diff changeset
63 struct hunk *next;
400
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
64 };
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
65
13729
4a9c09239ba1 bdiff.c: make all local functions static
Markus F.X.J. Oberhumer <markus@oberhumer.com>
parents: 13302
diff changeset
66 static int splitlines(const char *a, int len, struct line **lr)
400
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
67 {
13732
afe9269dccec bdiff.c: rename all variables which hold a hash value to "hash"
Markus F.X.J. Oberhumer <markus@oberhumer.com>
parents: 13731
diff changeset
68 unsigned hash;
13731
5d0cdf4ec338 bdiff.c: use unsigned arithmetic for hash computation
Markus F.X.J. Oberhumer <markus@oberhumer.com>
parents: 13730
diff changeset
69 int i;
400
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
70 const char *p, *b = a;
5340
5737845fd974 bdiff: simple splitlines optimization
Christoph Spiel <cspiel@freenet.de>
parents: 5339
diff changeset
71 const char * const plast = a + len - 1;
400
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
72 struct line *l;
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
73
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
74 /* count the lines */
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
75 i = 1; /* extra line for sentinel */
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
76 for (p = a; p < a + len; p++)
5340
5737845fd974 bdiff: simple splitlines optimization
Christoph Spiel <cspiel@freenet.de>
parents: 5339
diff changeset
77 if (*p == '\n' || p == plast)
400
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
78 i++;
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
79
1978
10606ee61107 do proper typecasting on malloc() and calloc() calls
TK Soh <teekaysoh@yahoo.com>
parents: 1759
diff changeset
80 *lr = l = (struct line *)malloc(sizeof(struct line) * i);
400
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
81 if (!l)
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
82 return -1;
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
83
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
84 /* build the line array and calculate hashes */
13732
afe9269dccec bdiff.c: rename all variables which hold a hash value to "hash"
Markus F.X.J. Oberhumer <markus@oberhumer.com>
parents: 13731
diff changeset
85 hash = 0;
400
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
86 for (p = a; p < a + len; p++) {
5342
d0c48891dd4a bdiff: switch to lyhash
Matt Mackall <mpm@selenic.com>
parents: 5341
diff changeset
87 /* Leonid Yuriev's hash */
13732
afe9269dccec bdiff.c: rename all variables which hold a hash value to "hash"
Markus F.X.J. Oberhumer <markus@oberhumer.com>
parents: 13731
diff changeset
88 hash = (hash * 1664525) + (unsigned char)*p + 1013904223;
5342
d0c48891dd4a bdiff: switch to lyhash
Matt Mackall <mpm@selenic.com>
parents: 5341
diff changeset
89
5340
5737845fd974 bdiff: simple splitlines optimization
Christoph Spiel <cspiel@freenet.de>
parents: 5339
diff changeset
90 if (*p == '\n' || p == plast) {
13732
afe9269dccec bdiff.c: rename all variables which hold a hash value to "hash"
Markus F.X.J. Oberhumer <markus@oberhumer.com>
parents: 13731
diff changeset
91 l->hash = hash;
afe9269dccec bdiff.c: rename all variables which hold a hash value to "hash"
Markus F.X.J. Oberhumer <markus@oberhumer.com>
parents: 13731
diff changeset
92 hash = 0;
400
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
93 l->len = p - b + 1;
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
94 l->l = b;
5341
458acf92b49e bdiff: use INT_MAX to avoid some inner loop comparisons
Matt Mackall <mpm@selenic.com>
parents: 5340
diff changeset
95 l->n = INT_MAX;
400
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
96 l++;
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
97 b = p + 1;
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
98 }
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
99 }
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
100
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
101 /* set up a sentinel */
13732
afe9269dccec bdiff.c: rename all variables which hold a hash value to "hash"
Markus F.X.J. Oberhumer <markus@oberhumer.com>
parents: 13731
diff changeset
102 l->hash = 0;
13731
5d0cdf4ec338 bdiff.c: use unsigned arithmetic for hash computation
Markus F.X.J. Oberhumer <markus@oberhumer.com>
parents: 13730
diff changeset
103 l->len = 0;
400
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
104 l->l = a + len;
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
105 return i - 1;
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
106 }
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
107
13729
4a9c09239ba1 bdiff.c: make all local functions static
Markus F.X.J. Oberhumer <markus@oberhumer.com>
parents: 13302
diff changeset
108 static inline int cmp(struct line *a, struct line *b)
400
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
109 {
13732
afe9269dccec bdiff.c: rename all variables which hold a hash value to "hash"
Markus F.X.J. Oberhumer <markus@oberhumer.com>
parents: 13731
diff changeset
110 return a->hash != b->hash || a->len != b->len || memcmp(a->l, b->l, a->len);
400
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
111 }
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
112
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
113 static int equatelines(struct line *a, int an, struct line *b, int bn)
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
114 {
5452
82b4ff3abbcd bdiff: tweaks for large files
Matt Mackall <mpm@selenic.com>
parents: 5342
diff changeset
115 int i, j, buckets = 1, t, scale;
82b4ff3abbcd bdiff: tweaks for large files
Matt Mackall <mpm@selenic.com>
parents: 5342
diff changeset
116 struct pos *h = NULL;
400
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
117
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
118 /* build a hash table of the next highest power of 2 */
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
119 while (buckets < bn + 1)
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
120 buckets *= 2;
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
121
5339
058e93c3d07d I have spotted the biggest bottleneck in "bdiff.c". Actually it was
Christoph Spiel <cspiel@freenet.de>
parents: 4134
diff changeset
122 /* try to allocate a large hash table to avoid collisions */
5452
82b4ff3abbcd bdiff: tweaks for large files
Matt Mackall <mpm@selenic.com>
parents: 5342
diff changeset
123 for (scale = 4; scale; scale /= 2) {
5339
058e93c3d07d I have spotted the biggest bottleneck in "bdiff.c". Actually it was
Christoph Spiel <cspiel@freenet.de>
parents: 4134
diff changeset
124 h = (struct pos *)malloc(scale * buckets * sizeof(struct pos));
5452
82b4ff3abbcd bdiff: tweaks for large files
Matt Mackall <mpm@selenic.com>
parents: 5342
diff changeset
125 if (h)
82b4ff3abbcd bdiff: tweaks for large files
Matt Mackall <mpm@selenic.com>
parents: 5342
diff changeset
126 break;
82b4ff3abbcd bdiff: tweaks for large files
Matt Mackall <mpm@selenic.com>
parents: 5342
diff changeset
127 }
5339
058e93c3d07d I have spotted the biggest bottleneck in "bdiff.c". Actually it was
Christoph Spiel <cspiel@freenet.de>
parents: 4134
diff changeset
128
474
b2ae8283d1a6 Minor speed improvements for bdiff
mpm@selenic.com
parents: 472
diff changeset
129 if (!h)
400
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
130 return 0;
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
131
5339
058e93c3d07d I have spotted the biggest bottleneck in "bdiff.c". Actually it was
Christoph Spiel <cspiel@freenet.de>
parents: 4134
diff changeset
132 buckets = buckets * scale - 1;
058e93c3d07d I have spotted the biggest bottleneck in "bdiff.c". Actually it was
Christoph Spiel <cspiel@freenet.de>
parents: 4134
diff changeset
133
400
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
134 /* clear the hash table */
474
b2ae8283d1a6 Minor speed improvements for bdiff
mpm@selenic.com
parents: 472
diff changeset
135 for (i = 0; i <= buckets; i++) {
5341
458acf92b49e bdiff: use INT_MAX to avoid some inner loop comparisons
Matt Mackall <mpm@selenic.com>
parents: 5340
diff changeset
136 h[i].pos = INT_MAX;
474
b2ae8283d1a6 Minor speed improvements for bdiff
mpm@selenic.com
parents: 472
diff changeset
137 h[i].len = 0;
b2ae8283d1a6 Minor speed improvements for bdiff
mpm@selenic.com
parents: 472
diff changeset
138 }
400
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
139
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
140 /* add lines to the hash table chains */
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
141 for (i = bn - 1; i >= 0; i--) {
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
142 /* find the equivalence class */
13732
afe9269dccec bdiff.c: rename all variables which hold a hash value to "hash"
Markus F.X.J. Oberhumer <markus@oberhumer.com>
parents: 13731
diff changeset
143 for (j = b[i].hash & buckets; h[j].pos != INT_MAX;
474
b2ae8283d1a6 Minor speed improvements for bdiff
mpm@selenic.com
parents: 472
diff changeset
144 j = (j + 1) & buckets)
b2ae8283d1a6 Minor speed improvements for bdiff
mpm@selenic.com
parents: 472
diff changeset
145 if (!cmp(b + i, b + h[j].pos))
400
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
146 break;
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
147
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
148 /* add to the head of the equivalence class */
474
b2ae8283d1a6 Minor speed improvements for bdiff
mpm@selenic.com
parents: 472
diff changeset
149 b[i].n = h[j].pos;
433
79c694462294 Add bdiff.blocks / minor performance tweaks
mpm@selenic.com
parents: 411
diff changeset
150 b[i].e = j;
474
b2ae8283d1a6 Minor speed improvements for bdiff
mpm@selenic.com
parents: 472
diff changeset
151 h[j].pos = i;
b2ae8283d1a6 Minor speed improvements for bdiff
mpm@selenic.com
parents: 472
diff changeset
152 h[j].len++; /* keep track of popularity */
400
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
153 }
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
154
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
155 /* compute popularity threshold */
9534
8e202431d620 bdiff: gradually enable the popularity hack
Benoit Boissinot <benoit.boissinot@ens-lyon.org>
parents: 8858
diff changeset
156 t = (bn >= 31000) ? bn / 1000 : 1000000 / (bn + 1);
400
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
157
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
158 /* match items in a to their equivalence class in b */
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
159 for (i = 0; i < an; i++) {
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
160 /* find the equivalence class */
13732
afe9269dccec bdiff.c: rename all variables which hold a hash value to "hash"
Markus F.X.J. Oberhumer <markus@oberhumer.com>
parents: 13731
diff changeset
161 for (j = a[i].hash & buckets; h[j].pos != INT_MAX;
474
b2ae8283d1a6 Minor speed improvements for bdiff
mpm@selenic.com
parents: 472
diff changeset
162 j = (j + 1) & buckets)
b2ae8283d1a6 Minor speed improvements for bdiff
mpm@selenic.com
parents: 472
diff changeset
163 if (!cmp(a + i, b + h[j].pos))
400
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
164 break;
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
165
433
79c694462294 Add bdiff.blocks / minor performance tweaks
mpm@selenic.com
parents: 411
diff changeset
166 a[i].e = j; /* use equivalence class for quick compare */
1542
8e80eefb3de6 made C src formatting more consistent
twaldmann@thinkmo.de
parents: 1397
diff changeset
167 if (h[j].len <= t)
474
b2ae8283d1a6 Minor speed improvements for bdiff
mpm@selenic.com
parents: 472
diff changeset
168 a[i].n = h[j].pos; /* point to head of match list */
400
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
169 else
5341
458acf92b49e bdiff: use INT_MAX to avoid some inner loop comparisons
Matt Mackall <mpm@selenic.com>
parents: 5340
diff changeset
170 a[i].n = INT_MAX; /* too popular */
400
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
171 }
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
172
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
173 /* discard hash tables */
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
174 free(h);
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
175 return 1;
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
176 }
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
177
474
b2ae8283d1a6 Minor speed improvements for bdiff
mpm@selenic.com
parents: 472
diff changeset
178 static int longest_match(struct line *a, struct line *b, struct pos *pos,
400
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
179 int a1, int a2, int b1, int b2, int *omi, int *omj)
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
180 {
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
181 int mi = a1, mj = b1, mk = 0, mb = 0, i, j, k;
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
182
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
183 for (i = a1; i < a2; i++) {
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
184 /* skip things before the current block */
5341
458acf92b49e bdiff: use INT_MAX to avoid some inner loop comparisons
Matt Mackall <mpm@selenic.com>
parents: 5340
diff changeset
185 for (j = a[i].n; j < b1; j = b[j].n)
400
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
186 ;
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
187
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
188 /* loop through all lines match a[i] in b */
5341
458acf92b49e bdiff: use INT_MAX to avoid some inner loop comparisons
Matt Mackall <mpm@selenic.com>
parents: 5340
diff changeset
189 for (; j < b2; j = b[j].n) {
400
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
190 /* does this extend an earlier match? */
474
b2ae8283d1a6 Minor speed improvements for bdiff
mpm@selenic.com
parents: 472
diff changeset
191 if (i > a1 && j > b1 && pos[j - 1].pos == i - 1)
b2ae8283d1a6 Minor speed improvements for bdiff
mpm@selenic.com
parents: 472
diff changeset
192 k = pos[j - 1].len + 1;
400
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
193 else
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
194 k = 1;
474
b2ae8283d1a6 Minor speed improvements for bdiff
mpm@selenic.com
parents: 472
diff changeset
195 pos[j].pos = i;
b2ae8283d1a6 Minor speed improvements for bdiff
mpm@selenic.com
parents: 472
diff changeset
196 pos[j].len = k;
400
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
197
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
198 /* best match so far? */
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
199 if (k > mk) {
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
200 mi = i;
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
201 mj = j;
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
202 mk = k;
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
203 }
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
204 }
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
205 }
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
206
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
207 if (mk) {
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
208 mi = mi - mk + 1;
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
209 mj = mj - mk + 1;
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
210 }
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
211
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
212 /* expand match to include neighboring popular lines */
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
213 while (mi - mb > a1 && mj - mb > b1 &&
433
79c694462294 Add bdiff.blocks / minor performance tweaks
mpm@selenic.com
parents: 411
diff changeset
214 a[mi - mb - 1].e == b[mj - mb - 1].e)
400
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
215 mb++;
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
216 while (mi + mk < a2 && mj + mk < b2 &&
433
79c694462294 Add bdiff.blocks / minor performance tweaks
mpm@selenic.com
parents: 411
diff changeset
217 a[mi + mk].e == b[mj + mk].e)
400
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
218 mk++;
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
219
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
220 *omi = mi - mb;
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
221 *omj = mj - mb;
5341
458acf92b49e bdiff: use INT_MAX to avoid some inner loop comparisons
Matt Mackall <mpm@selenic.com>
parents: 5340
diff changeset
222
400
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
223 return mk + mb;
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
224 }
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
225
13089
faee0ffbc24b bdiff: dynamically allocate hunks
Matt Mackall <mpm@selenic.com>
parents: 11364
diff changeset
226 static struct hunk *recurse(struct line *a, struct line *b, struct pos *pos,
faee0ffbc24b bdiff: dynamically allocate hunks
Matt Mackall <mpm@selenic.com>
parents: 11364
diff changeset
227 int a1, int a2, int b1, int b2, struct hunk *l)
400
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
228 {
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
229 int i, j, k;
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
230
10500
e96597c8d0ea bdiff: do not use recursion / avoid stackoverflow (issue1940)
Alistair Bell <alistair@bellsonline.com>
parents: 10282
diff changeset
231 while (1) {
e96597c8d0ea bdiff: do not use recursion / avoid stackoverflow (issue1940)
Alistair Bell <alistair@bellsonline.com>
parents: 10282
diff changeset
232 /* find the longest match in this chunk */
e96597c8d0ea bdiff: do not use recursion / avoid stackoverflow (issue1940)
Alistair Bell <alistair@bellsonline.com>
parents: 10282
diff changeset
233 k = longest_match(a, b, pos, a1, a2, b1, b2, &i, &j);
e96597c8d0ea bdiff: do not use recursion / avoid stackoverflow (issue1940)
Alistair Bell <alistair@bellsonline.com>
parents: 10282
diff changeset
234 if (!k)
13089
faee0ffbc24b bdiff: dynamically allocate hunks
Matt Mackall <mpm@selenic.com>
parents: 11364
diff changeset
235 return l;
400
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
236
10500
e96597c8d0ea bdiff: do not use recursion / avoid stackoverflow (issue1940)
Alistair Bell <alistair@bellsonline.com>
parents: 10282
diff changeset
237 /* and recurse on the remaining chunks on either side */
13089
faee0ffbc24b bdiff: dynamically allocate hunks
Matt Mackall <mpm@selenic.com>
parents: 11364
diff changeset
238 l = recurse(a, b, pos, a1, i, b1, j, l);
faee0ffbc24b bdiff: dynamically allocate hunks
Matt Mackall <mpm@selenic.com>
parents: 11364
diff changeset
239 if (!l)
faee0ffbc24b bdiff: dynamically allocate hunks
Matt Mackall <mpm@selenic.com>
parents: 11364
diff changeset
240 return NULL;
faee0ffbc24b bdiff: dynamically allocate hunks
Matt Mackall <mpm@selenic.com>
parents: 11364
diff changeset
241
faee0ffbc24b bdiff: dynamically allocate hunks
Matt Mackall <mpm@selenic.com>
parents: 11364
diff changeset
242 l->next = (struct hunk *)malloc(sizeof(struct hunk));
faee0ffbc24b bdiff: dynamically allocate hunks
Matt Mackall <mpm@selenic.com>
parents: 11364
diff changeset
243 if (!l->next)
faee0ffbc24b bdiff: dynamically allocate hunks
Matt Mackall <mpm@selenic.com>
parents: 11364
diff changeset
244 return NULL;
faee0ffbc24b bdiff: dynamically allocate hunks
Matt Mackall <mpm@selenic.com>
parents: 11364
diff changeset
245
faee0ffbc24b bdiff: dynamically allocate hunks
Matt Mackall <mpm@selenic.com>
parents: 11364
diff changeset
246 l = l->next;
faee0ffbc24b bdiff: dynamically allocate hunks
Matt Mackall <mpm@selenic.com>
parents: 11364
diff changeset
247 l->a1 = i;
faee0ffbc24b bdiff: dynamically allocate hunks
Matt Mackall <mpm@selenic.com>
parents: 11364
diff changeset
248 l->a2 = i + k;
faee0ffbc24b bdiff: dynamically allocate hunks
Matt Mackall <mpm@selenic.com>
parents: 11364
diff changeset
249 l->b1 = j;
faee0ffbc24b bdiff: dynamically allocate hunks
Matt Mackall <mpm@selenic.com>
parents: 11364
diff changeset
250 l->b2 = j + k;
faee0ffbc24b bdiff: dynamically allocate hunks
Matt Mackall <mpm@selenic.com>
parents: 11364
diff changeset
251 l->next = NULL;
faee0ffbc24b bdiff: dynamically allocate hunks
Matt Mackall <mpm@selenic.com>
parents: 11364
diff changeset
252
faee0ffbc24b bdiff: dynamically allocate hunks
Matt Mackall <mpm@selenic.com>
parents: 11364
diff changeset
253 /* tail-recursion didn't happen, so do equivalent iteration */
10500
e96597c8d0ea bdiff: do not use recursion / avoid stackoverflow (issue1940)
Alistair Bell <alistair@bellsonline.com>
parents: 10282
diff changeset
254 a1 = i + k;
e96597c8d0ea bdiff: do not use recursion / avoid stackoverflow (issue1940)
Alistair Bell <alistair@bellsonline.com>
parents: 10282
diff changeset
255 b1 = j + k;
e96597c8d0ea bdiff: do not use recursion / avoid stackoverflow (issue1940)
Alistair Bell <alistair@bellsonline.com>
parents: 10282
diff changeset
256 }
400
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
257 }
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
258
13089
faee0ffbc24b bdiff: dynamically allocate hunks
Matt Mackall <mpm@selenic.com>
parents: 11364
diff changeset
259 static int diff(struct line *a, int an, struct line *b, int bn,
faee0ffbc24b bdiff: dynamically allocate hunks
Matt Mackall <mpm@selenic.com>
parents: 11364
diff changeset
260 struct hunk *base)
400
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
261 {
7104
9514cbb6e4f6 bdiff: normalize the diff (issue1295)
Benoit Boissinot <benoit.boissinot@ens-lyon.org>
parents: 7036
diff changeset
262 struct hunk *curr;
474
b2ae8283d1a6 Minor speed improvements for bdiff
mpm@selenic.com
parents: 472
diff changeset
263 struct pos *pos;
13089
faee0ffbc24b bdiff: dynamically allocate hunks
Matt Mackall <mpm@selenic.com>
parents: 11364
diff changeset
264 int t, count = 0;
433
79c694462294 Add bdiff.blocks / minor performance tweaks
mpm@selenic.com
parents: 411
diff changeset
265
79c694462294 Add bdiff.blocks / minor performance tweaks
mpm@selenic.com
parents: 411
diff changeset
266 /* allocate and fill arrays */
79c694462294 Add bdiff.blocks / minor performance tweaks
mpm@selenic.com
parents: 411
diff changeset
267 t = equatelines(a, an, b, bn);
5571
f84bb2e1cc3a fix calloc(0, ...) issue
Jim Hague <jim.hague@acm.org>
parents: 5452
diff changeset
268 pos = (struct pos *)calloc(bn ? bn : 1, sizeof(struct pos));
13089
faee0ffbc24b bdiff: dynamically allocate hunks
Matt Mackall <mpm@selenic.com>
parents: 11364
diff changeset
269
faee0ffbc24b bdiff: dynamically allocate hunks
Matt Mackall <mpm@selenic.com>
parents: 11364
diff changeset
270 if (pos && t) {
faee0ffbc24b bdiff: dynamically allocate hunks
Matt Mackall <mpm@selenic.com>
parents: 11364
diff changeset
271 /* generate the matching block list */
faee0ffbc24b bdiff: dynamically allocate hunks
Matt Mackall <mpm@selenic.com>
parents: 11364
diff changeset
272
faee0ffbc24b bdiff: dynamically allocate hunks
Matt Mackall <mpm@selenic.com>
parents: 11364
diff changeset
273 curr = recurse(a, b, pos, 0, an, 0, bn, base);
faee0ffbc24b bdiff: dynamically allocate hunks
Matt Mackall <mpm@selenic.com>
parents: 11364
diff changeset
274 if (!curr)
faee0ffbc24b bdiff: dynamically allocate hunks
Matt Mackall <mpm@selenic.com>
parents: 11364
diff changeset
275 return -1;
433
79c694462294 Add bdiff.blocks / minor performance tweaks
mpm@selenic.com
parents: 411
diff changeset
276
13089
faee0ffbc24b bdiff: dynamically allocate hunks
Matt Mackall <mpm@selenic.com>
parents: 11364
diff changeset
277 /* sentinel end hunk */
faee0ffbc24b bdiff: dynamically allocate hunks
Matt Mackall <mpm@selenic.com>
parents: 11364
diff changeset
278 curr->next = (struct hunk *)malloc(sizeof(struct hunk));
faee0ffbc24b bdiff: dynamically allocate hunks
Matt Mackall <mpm@selenic.com>
parents: 11364
diff changeset
279 if (!curr->next)
13090
c73745762f33 bdiff: Fix bogus NULL return
Matt Mackall <mpm@selenic.com>
parents: 13089
diff changeset
280 return -1;
13089
faee0ffbc24b bdiff: dynamically allocate hunks
Matt Mackall <mpm@selenic.com>
parents: 11364
diff changeset
281 curr = curr->next;
faee0ffbc24b bdiff: dynamically allocate hunks
Matt Mackall <mpm@selenic.com>
parents: 11364
diff changeset
282 curr->a1 = curr->a2 = an;
faee0ffbc24b bdiff: dynamically allocate hunks
Matt Mackall <mpm@selenic.com>
parents: 11364
diff changeset
283 curr->b1 = curr->b2 = bn;
faee0ffbc24b bdiff: dynamically allocate hunks
Matt Mackall <mpm@selenic.com>
parents: 11364
diff changeset
284 curr->next = NULL;
433
79c694462294 Add bdiff.blocks / minor performance tweaks
mpm@selenic.com
parents: 411
diff changeset
285 }
79c694462294 Add bdiff.blocks / minor performance tweaks
mpm@selenic.com
parents: 411
diff changeset
286
474
b2ae8283d1a6 Minor speed improvements for bdiff
mpm@selenic.com
parents: 472
diff changeset
287 free(pos);
7104
9514cbb6e4f6 bdiff: normalize the diff (issue1295)
Benoit Boissinot <benoit.boissinot@ens-lyon.org>
parents: 7036
diff changeset
288
7625
930a2be7e875 bdiff: add comment about normalization
Benoit Boissinot <benoit.boissinot@ens-lyon.org>
parents: 7189
diff changeset
289 /* normalize the hunk list, try to push each hunk towards the end */
13089
faee0ffbc24b bdiff: dynamically allocate hunks
Matt Mackall <mpm@selenic.com>
parents: 11364
diff changeset
290 for (curr = base->next; curr; curr = curr->next) {
faee0ffbc24b bdiff: dynamically allocate hunks
Matt Mackall <mpm@selenic.com>
parents: 11364
diff changeset
291 struct hunk *next = curr->next;
7104
9514cbb6e4f6 bdiff: normalize the diff (issue1295)
Benoit Boissinot <benoit.boissinot@ens-lyon.org>
parents: 7036
diff changeset
292 int shift = 0;
9514cbb6e4f6 bdiff: normalize the diff (issue1295)
Benoit Boissinot <benoit.boissinot@ens-lyon.org>
parents: 7036
diff changeset
293
13089
faee0ffbc24b bdiff: dynamically allocate hunks
Matt Mackall <mpm@selenic.com>
parents: 11364
diff changeset
294 if (!next)
7104
9514cbb6e4f6 bdiff: normalize the diff (issue1295)
Benoit Boissinot <benoit.boissinot@ens-lyon.org>
parents: 7036
diff changeset
295 break;
9514cbb6e4f6 bdiff: normalize the diff (issue1295)
Benoit Boissinot <benoit.boissinot@ens-lyon.org>
parents: 7036
diff changeset
296
9514cbb6e4f6 bdiff: normalize the diff (issue1295)
Benoit Boissinot <benoit.boissinot@ens-lyon.org>
parents: 7036
diff changeset
297 if (curr->a2 == next->a1)
10282
08a0f04b56bd many, many trivial check-code fixups
Matt Mackall <mpm@selenic.com>
parents: 9534
diff changeset
298 while (curr->a2 + shift < an && curr->b2 + shift < bn
08a0f04b56bd many, many trivial check-code fixups
Matt Mackall <mpm@selenic.com>
parents: 9534
diff changeset
299 && !cmp(a + curr->a2 + shift,
08a0f04b56bd many, many trivial check-code fixups
Matt Mackall <mpm@selenic.com>
parents: 9534
diff changeset
300 b + curr->b2 + shift))
7104
9514cbb6e4f6 bdiff: normalize the diff (issue1295)
Benoit Boissinot <benoit.boissinot@ens-lyon.org>
parents: 7036
diff changeset
301 shift++;
9514cbb6e4f6 bdiff: normalize the diff (issue1295)
Benoit Boissinot <benoit.boissinot@ens-lyon.org>
parents: 7036
diff changeset
302 else if (curr->b2 == next->b1)
10282
08a0f04b56bd many, many trivial check-code fixups
Matt Mackall <mpm@selenic.com>
parents: 9534
diff changeset
303 while (curr->b2 + shift < bn && curr->a2 + shift < an
08a0f04b56bd many, many trivial check-code fixups
Matt Mackall <mpm@selenic.com>
parents: 9534
diff changeset
304 && !cmp(b + curr->b2 + shift,
08a0f04b56bd many, many trivial check-code fixups
Matt Mackall <mpm@selenic.com>
parents: 9534
diff changeset
305 a + curr->a2 + shift))
7104
9514cbb6e4f6 bdiff: normalize the diff (issue1295)
Benoit Boissinot <benoit.boissinot@ens-lyon.org>
parents: 7036
diff changeset
306 shift++;
9514cbb6e4f6 bdiff: normalize the diff (issue1295)
Benoit Boissinot <benoit.boissinot@ens-lyon.org>
parents: 7036
diff changeset
307 if (!shift)
9514cbb6e4f6 bdiff: normalize the diff (issue1295)
Benoit Boissinot <benoit.boissinot@ens-lyon.org>
parents: 7036
diff changeset
308 continue;
9514cbb6e4f6 bdiff: normalize the diff (issue1295)
Benoit Boissinot <benoit.boissinot@ens-lyon.org>
parents: 7036
diff changeset
309 curr->b2 += shift;
9514cbb6e4f6 bdiff: normalize the diff (issue1295)
Benoit Boissinot <benoit.boissinot@ens-lyon.org>
parents: 7036
diff changeset
310 next->b1 += shift;
9514cbb6e4f6 bdiff: normalize the diff (issue1295)
Benoit Boissinot <benoit.boissinot@ens-lyon.org>
parents: 7036
diff changeset
311 curr->a2 += shift;
9514cbb6e4f6 bdiff: normalize the diff (issue1295)
Benoit Boissinot <benoit.boissinot@ens-lyon.org>
parents: 7036
diff changeset
312 next->a1 += shift;
9514cbb6e4f6 bdiff: normalize the diff (issue1295)
Benoit Boissinot <benoit.boissinot@ens-lyon.org>
parents: 7036
diff changeset
313 }
9514cbb6e4f6 bdiff: normalize the diff (issue1295)
Benoit Boissinot <benoit.boissinot@ens-lyon.org>
parents: 7036
diff changeset
314
13089
faee0ffbc24b bdiff: dynamically allocate hunks
Matt Mackall <mpm@selenic.com>
parents: 11364
diff changeset
315 for (curr = base->next; curr; curr = curr->next)
faee0ffbc24b bdiff: dynamically allocate hunks
Matt Mackall <mpm@selenic.com>
parents: 11364
diff changeset
316 count++;
faee0ffbc24b bdiff: dynamically allocate hunks
Matt Mackall <mpm@selenic.com>
parents: 11364
diff changeset
317 return count;
faee0ffbc24b bdiff: dynamically allocate hunks
Matt Mackall <mpm@selenic.com>
parents: 11364
diff changeset
318 }
faee0ffbc24b bdiff: dynamically allocate hunks
Matt Mackall <mpm@selenic.com>
parents: 11364
diff changeset
319
faee0ffbc24b bdiff: dynamically allocate hunks
Matt Mackall <mpm@selenic.com>
parents: 11364
diff changeset
320 static void freehunks(struct hunk *l)
faee0ffbc24b bdiff: dynamically allocate hunks
Matt Mackall <mpm@selenic.com>
parents: 11364
diff changeset
321 {
faee0ffbc24b bdiff: dynamically allocate hunks
Matt Mackall <mpm@selenic.com>
parents: 11364
diff changeset
322 struct hunk *n;
faee0ffbc24b bdiff: dynamically allocate hunks
Matt Mackall <mpm@selenic.com>
parents: 11364
diff changeset
323 for (; l; l = n) {
faee0ffbc24b bdiff: dynamically allocate hunks
Matt Mackall <mpm@selenic.com>
parents: 11364
diff changeset
324 n = l->next;
faee0ffbc24b bdiff: dynamically allocate hunks
Matt Mackall <mpm@selenic.com>
parents: 11364
diff changeset
325 free(l);
faee0ffbc24b bdiff: dynamically allocate hunks
Matt Mackall <mpm@selenic.com>
parents: 11364
diff changeset
326 }
433
79c694462294 Add bdiff.blocks / minor performance tweaks
mpm@selenic.com
parents: 411
diff changeset
327 }
79c694462294 Add bdiff.blocks / minor performance tweaks
mpm@selenic.com
parents: 411
diff changeset
328
79c694462294 Add bdiff.blocks / minor performance tweaks
mpm@selenic.com
parents: 411
diff changeset
329 static PyObject *blocks(PyObject *self, PyObject *args)
79c694462294 Add bdiff.blocks / minor performance tweaks
mpm@selenic.com
parents: 411
diff changeset
330 {
435
e731d25ddab2 Fix a compile warning for bdiff
mpm@selenic.com
parents: 433
diff changeset
331 PyObject *sa, *sb, *rl = NULL, *m;
433
79c694462294 Add bdiff.blocks / minor performance tweaks
mpm@selenic.com
parents: 411
diff changeset
332 struct line *a, *b;
13089
faee0ffbc24b bdiff: dynamically allocate hunks
Matt Mackall <mpm@selenic.com>
parents: 11364
diff changeset
333 struct hunk l, *h;
faee0ffbc24b bdiff: dynamically allocate hunks
Matt Mackall <mpm@selenic.com>
parents: 11364
diff changeset
334 int an, bn, count, pos = 0;
400
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
335
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
336 if (!PyArg_ParseTuple(args, "SS:bdiff", &sa, &sb))
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
337 return NULL;
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
338
11364
0044193a1c45 bdiff.c: Added support for py3k.
Renato Cunha <renatoc@gmail.com>
parents: 10500
diff changeset
339 an = splitlines(PyBytes_AsString(sa), PyBytes_Size(sa), &a);
0044193a1c45 bdiff.c: Added support for py3k.
Renato Cunha <renatoc@gmail.com>
parents: 10500
diff changeset
340 bn = splitlines(PyBytes_AsString(sb), PyBytes_Size(sb), &b);
0044193a1c45 bdiff.c: Added support for py3k.
Renato Cunha <renatoc@gmail.com>
parents: 10500
diff changeset
341
433
79c694462294 Add bdiff.blocks / minor performance tweaks
mpm@selenic.com
parents: 411
diff changeset
342 if (!a || !b)
79c694462294 Add bdiff.blocks / minor performance tweaks
mpm@selenic.com
parents: 411
diff changeset
343 goto nomem;
79c694462294 Add bdiff.blocks / minor performance tweaks
mpm@selenic.com
parents: 411
diff changeset
344
13089
faee0ffbc24b bdiff: dynamically allocate hunks
Matt Mackall <mpm@selenic.com>
parents: 11364
diff changeset
345 l.next = NULL;
faee0ffbc24b bdiff: dynamically allocate hunks
Matt Mackall <mpm@selenic.com>
parents: 11364
diff changeset
346 count = diff(a, an, b, bn, &l);
faee0ffbc24b bdiff: dynamically allocate hunks
Matt Mackall <mpm@selenic.com>
parents: 11364
diff changeset
347 if (count < 0)
433
79c694462294 Add bdiff.blocks / minor performance tweaks
mpm@selenic.com
parents: 411
diff changeset
348 goto nomem;
79c694462294 Add bdiff.blocks / minor performance tweaks
mpm@selenic.com
parents: 411
diff changeset
349
13089
faee0ffbc24b bdiff: dynamically allocate hunks
Matt Mackall <mpm@selenic.com>
parents: 11364
diff changeset
350 rl = PyList_New(count);
faee0ffbc24b bdiff: dynamically allocate hunks
Matt Mackall <mpm@selenic.com>
parents: 11364
diff changeset
351 if (!rl)
faee0ffbc24b bdiff: dynamically allocate hunks
Matt Mackall <mpm@selenic.com>
parents: 11364
diff changeset
352 goto nomem;
faee0ffbc24b bdiff: dynamically allocate hunks
Matt Mackall <mpm@selenic.com>
parents: 11364
diff changeset
353
faee0ffbc24b bdiff: dynamically allocate hunks
Matt Mackall <mpm@selenic.com>
parents: 11364
diff changeset
354 for (h = l.next; h; h = h->next) {
433
79c694462294 Add bdiff.blocks / minor performance tweaks
mpm@selenic.com
parents: 411
diff changeset
355 m = Py_BuildValue("iiii", h->a1, h->a2, h->b1, h->b2);
79c694462294 Add bdiff.blocks / minor performance tweaks
mpm@selenic.com
parents: 411
diff changeset
356 PyList_SetItem(rl, pos, m);
79c694462294 Add bdiff.blocks / minor performance tweaks
mpm@selenic.com
parents: 411
diff changeset
357 pos++;
79c694462294 Add bdiff.blocks / minor performance tweaks
mpm@selenic.com
parents: 411
diff changeset
358 }
79c694462294 Add bdiff.blocks / minor performance tweaks
mpm@selenic.com
parents: 411
diff changeset
359
79c694462294 Add bdiff.blocks / minor performance tweaks
mpm@selenic.com
parents: 411
diff changeset
360 nomem:
79c694462294 Add bdiff.blocks / minor performance tweaks
mpm@selenic.com
parents: 411
diff changeset
361 free(a);
79c694462294 Add bdiff.blocks / minor performance tweaks
mpm@selenic.com
parents: 411
diff changeset
362 free(b);
13089
faee0ffbc24b bdiff: dynamically allocate hunks
Matt Mackall <mpm@selenic.com>
parents: 11364
diff changeset
363 freehunks(l.next);
433
79c694462294 Add bdiff.blocks / minor performance tweaks
mpm@selenic.com
parents: 411
diff changeset
364 return rl ? rl : PyErr_NoMemory();
79c694462294 Add bdiff.blocks / minor performance tweaks
mpm@selenic.com
parents: 411
diff changeset
365 }
79c694462294 Add bdiff.blocks / minor performance tweaks
mpm@selenic.com
parents: 411
diff changeset
366
79c694462294 Add bdiff.blocks / minor performance tweaks
mpm@selenic.com
parents: 411
diff changeset
367 static PyObject *bdiff(PyObject *self, PyObject *args)
79c694462294 Add bdiff.blocks / minor performance tweaks
mpm@selenic.com
parents: 411
diff changeset
368 {
3335
9061613c1593 Teach bdiff to support buffer objects
Brendan Cully <brendan@kublai.com>
parents: 2859
diff changeset
369 char *sa, *sb;
9061613c1593 Teach bdiff to support buffer objects
Brendan Cully <brendan@kublai.com>
parents: 2859
diff changeset
370 PyObject *result = NULL;
433
79c694462294 Add bdiff.blocks / minor performance tweaks
mpm@selenic.com
parents: 411
diff changeset
371 struct line *al, *bl;
13089
faee0ffbc24b bdiff: dynamically allocate hunks
Matt Mackall <mpm@selenic.com>
parents: 11364
diff changeset
372 struct hunk l, *h;
433
79c694462294 Add bdiff.blocks / minor performance tweaks
mpm@selenic.com
parents: 411
diff changeset
373 char encode[12], *rb;
13089
faee0ffbc24b bdiff: dynamically allocate hunks
Matt Mackall <mpm@selenic.com>
parents: 11364
diff changeset
374 int an, bn, len = 0, la, lb, count;
433
79c694462294 Add bdiff.blocks / minor performance tweaks
mpm@selenic.com
parents: 411
diff changeset
375
3369
4bad632913d8 python2.5 PyArg_ParseTuple fix
Alexis S. L. Carvalho <alexis@cecm.usp.br>
parents: 3335
diff changeset
376 if (!PyArg_ParseTuple(args, "s#s#:bdiff", &sa, &la, &sb, &lb))
433
79c694462294 Add bdiff.blocks / minor performance tweaks
mpm@selenic.com
parents: 411
diff changeset
377 return NULL;
79c694462294 Add bdiff.blocks / minor performance tweaks
mpm@selenic.com
parents: 411
diff changeset
378
3335
9061613c1593 Teach bdiff to support buffer objects
Brendan Cully <brendan@kublai.com>
parents: 2859
diff changeset
379 an = splitlines(sa, la, &al);
9061613c1593 Teach bdiff to support buffer objects
Brendan Cully <brendan@kublai.com>
parents: 2859
diff changeset
380 bn = splitlines(sb, lb, &bl);
433
79c694462294 Add bdiff.blocks / minor performance tweaks
mpm@selenic.com
parents: 411
diff changeset
381 if (!al || !bl)
400
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
382 goto nomem;
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
383
13089
faee0ffbc24b bdiff: dynamically allocate hunks
Matt Mackall <mpm@selenic.com>
parents: 11364
diff changeset
384 l.next = NULL;
faee0ffbc24b bdiff: dynamically allocate hunks
Matt Mackall <mpm@selenic.com>
parents: 11364
diff changeset
385 count = diff(al, an, bl, bn, &l);
faee0ffbc24b bdiff: dynamically allocate hunks
Matt Mackall <mpm@selenic.com>
parents: 11364
diff changeset
386 if (count < 0)
433
79c694462294 Add bdiff.blocks / minor performance tweaks
mpm@selenic.com
parents: 411
diff changeset
387 goto nomem;
400
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
388
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
389 /* calculate length of output */
3335
9061613c1593 Teach bdiff to support buffer objects
Brendan Cully <brendan@kublai.com>
parents: 2859
diff changeset
390 la = lb = 0;
13089
faee0ffbc24b bdiff: dynamically allocate hunks
Matt Mackall <mpm@selenic.com>
parents: 11364
diff changeset
391 for (h = l.next; h; h = h->next) {
400
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
392 if (h->a1 != la || h->b1 != lb)
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
393 len += 12 + bl[h->b1].l - bl[lb].l;
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
394 la = h->a2;
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
395 lb = h->b2;
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
396 }
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
397
11364
0044193a1c45 bdiff.c: Added support for py3k.
Renato Cunha <renatoc@gmail.com>
parents: 10500
diff changeset
398 result = PyBytes_FromStringAndSize(NULL, len);
0044193a1c45 bdiff.c: Added support for py3k.
Renato Cunha <renatoc@gmail.com>
parents: 10500
diff changeset
399
400
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
400 if (!result)
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
401 goto nomem;
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
402
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
403 /* build binary patch */
11364
0044193a1c45 bdiff.c: Added support for py3k.
Renato Cunha <renatoc@gmail.com>
parents: 10500
diff changeset
404 rb = PyBytes_AsString(result);
400
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
405 la = lb = 0;
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
406
13089
faee0ffbc24b bdiff: dynamically allocate hunks
Matt Mackall <mpm@selenic.com>
parents: 11364
diff changeset
407 for (h = l.next; h; h = h->next) {
400
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
408 if (h->a1 != la || h->b1 != lb) {
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
409 len = bl[h->b1].l - bl[lb].l;
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
410 *(uint32_t *)(encode) = htonl(al[la].l - al->l);
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
411 *(uint32_t *)(encode + 4) = htonl(al[h->a1].l - al->l);
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
412 *(uint32_t *)(encode + 8) = htonl(len);
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
413 memcpy(rb, encode, 12);
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
414 memcpy(rb + 12, bl[lb].l, len);
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
415 rb += 12 + len;
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
416 }
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
417 la = h->a2;
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
418 lb = h->b2;
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
419 }
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
420
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
421 nomem:
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
422 free(al);
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
423 free(bl);
13089
faee0ffbc24b bdiff: dynamically allocate hunks
Matt Mackall <mpm@selenic.com>
parents: 11364
diff changeset
424 freehunks(l.next);
400
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
425 return result ? result : PyErr_NoMemory();
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
426 }
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
427
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
428 static char mdiff_doc[] = "Efficient binary diff.";
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
429
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
430 static PyMethodDef methods[] = {
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
431 {"bdiff", bdiff, METH_VARARGS, "calculate a binary diff\n"},
433
79c694462294 Add bdiff.blocks / minor performance tweaks
mpm@selenic.com
parents: 411
diff changeset
432 {"blocks", blocks, METH_VARARGS, "find a list of matching lines\n"},
400
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
433 {NULL, NULL}
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
434 };
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
435
11364
0044193a1c45 bdiff.c: Added support for py3k.
Renato Cunha <renatoc@gmail.com>
parents: 10500
diff changeset
436 #ifdef IS_PY3K
0044193a1c45 bdiff.c: Added support for py3k.
Renato Cunha <renatoc@gmail.com>
parents: 10500
diff changeset
437 static struct PyModuleDef bdiff_module = {
0044193a1c45 bdiff.c: Added support for py3k.
Renato Cunha <renatoc@gmail.com>
parents: 10500
diff changeset
438 PyModuleDef_HEAD_INIT,
0044193a1c45 bdiff.c: Added support for py3k.
Renato Cunha <renatoc@gmail.com>
parents: 10500
diff changeset
439 "bdiff",
0044193a1c45 bdiff.c: Added support for py3k.
Renato Cunha <renatoc@gmail.com>
parents: 10500
diff changeset
440 mdiff_doc,
0044193a1c45 bdiff.c: Added support for py3k.
Renato Cunha <renatoc@gmail.com>
parents: 10500
diff changeset
441 -1,
0044193a1c45 bdiff.c: Added support for py3k.
Renato Cunha <renatoc@gmail.com>
parents: 10500
diff changeset
442 methods
0044193a1c45 bdiff.c: Added support for py3k.
Renato Cunha <renatoc@gmail.com>
parents: 10500
diff changeset
443 };
0044193a1c45 bdiff.c: Added support for py3k.
Renato Cunha <renatoc@gmail.com>
parents: 10500
diff changeset
444
0044193a1c45 bdiff.c: Added support for py3k.
Renato Cunha <renatoc@gmail.com>
parents: 10500
diff changeset
445 PyMODINIT_FUNC PyInit_bdiff(void)
0044193a1c45 bdiff.c: Added support for py3k.
Renato Cunha <renatoc@gmail.com>
parents: 10500
diff changeset
446 {
0044193a1c45 bdiff.c: Added support for py3k.
Renato Cunha <renatoc@gmail.com>
parents: 10500
diff changeset
447 return PyModule_Create(&bdiff_module);
0044193a1c45 bdiff.c: Added support for py3k.
Renato Cunha <renatoc@gmail.com>
parents: 10500
diff changeset
448 }
0044193a1c45 bdiff.c: Added support for py3k.
Renato Cunha <renatoc@gmail.com>
parents: 10500
diff changeset
449 #else
400
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
450 PyMODINIT_FUNC initbdiff(void)
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
451 {
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
452 Py_InitModule3("bdiff", methods, mdiff_doc);
8b067bde6679 Add a fast binary diff extension (not yet used)
mpm@selenic.com
parents:
diff changeset
453 }
11364
0044193a1c45 bdiff.c: Added support for py3k.
Renato Cunha <renatoc@gmail.com>
parents: 10500
diff changeset
454 #endif
1542
8e80eefb3de6 made C src formatting more consistent
twaldmann@thinkmo.de
parents: 1397
diff changeset
455