Fix array overflow bug in bdiff
I ran into a bug while importing a large repository into mercurial.
The diff algorithm does not allocate a big enough array of hunks
for some test cases. This results in memory corruption, and possibly,
as in my case, a seg fault.
You should be able to reproduce this problem with any case of more
than a few lines that follows this pattern:
a b
= =
1 1
2
2 3
4
3 5
.
4 .
.
5
.
.
.
I.e., "a" has blank lines on every other line that have been removed in
"b". In this case, the number of matching hunks is equal to the number
of lines in "b". This is more than ((an + bn)/4 + 2). I'm not sure what
motivates this formula, but when I changed it to the smaller of an or
bn (+ 1), it works.
[comment added by mpm]
Mercurial git BK (*)
storage revlog delta compressed revisions SCCS weave
storage naming by filename by revision hash by filename
merge file DAGs changeset DAG file DAGs?
consistency SHA1 SHA1 CRC
signable? yes yes no
retrieve file tip O(1) O(1) O(revs)
add rev O(1) O(1) O(revs)
find prev file rev O(1) O(changesets) O(revs)
annotate file O(revs) O(changesets) O(revs)
find file changeset O(1) O(changesets) ?
checkout O(files) O(files) O(revs)?
commit O(changes) O(changes) ?
6 patches/s 6 patches/s slow
diff working dir O(changes) O(changes) ?
< 1s < 1s ?
tree diff revs O(changes) O(changes) ?
< 1s < 1s ?
hardlink clone O(files) O(revisions) O(files)
find remote csets O(log new) rsync: O(revisions) ?
git-http: O(changesets)
pull remote csets O(patch) O(modified files) O(patch)
repo growth O(patch) O(revisions) O(patch)
kernel history 300M 3.5G? 250M?
lines of code 2500 6500 (+ cogito) ??
* I've never used BK so this is just guesses