bdiff: further restrain potential quadratic performance
This causes the longest_match search to limit itself to a window of
30000 lines during search (roughly 1MB), thus avoiding a full O(N*M)
search that might occur in repetitive structured inputs. For a
particular class of many MB pathological test cases, this generated
the following timings:
size before after
10x 1.25s 1.24s
100x 57s 33s
1000x >8400s 400s
The times on the right quickly become much faster and appear more linear.
While windowing means deltas are no longer "optimal", the resulting
deltas were within a couple percent of expected size. While we've yet
to have a report of a file with the level of repetition necessary to
hit this case, some JSON/XML database dump scenario is fairly likely
to hit it.
This may also slightly improve the average-case performance for deltas
of large binaries.
Mercurial
=========
Mercurial is a fast, easy to use, distributed revision control tool
for software developers.
Basic install:
$ make # see install targets
$ make install # do a system-wide install
$ hg debuginstall # sanity-check setup
$ hg # see help
Running without installing:
$ make local # build for inplace usage
$ ./hg --version # should show the latest version
See https://mercurial-scm.org/ for detailed installation
instructions, platform-specific notes, and Mercurial user information.