tests/test-issue4074.t
author Matt Mackall <mpm@selenic.com>
Thu, 21 Apr 2016 22:04:11 -0500
branchstable
changeset 29014 f1ca249696ed
child 32958 75be14993fda
permissions -rw-r--r--
bdiff: balance recursion to avoid quadratic behavior (issue4704) For highly structured files like JSON or XML dumps with large numbers of duplicate lines (eg braces) and isolated matching lines, bdiff could find large numbers of equally good spans. Because it prefers earlier matches, this would result in pathologically unbalance recursion that resulted in quadratic performance. This patch makes it prefer matches closer to the middle that tend to balance recursion. This change improves the speed of a pathological test case from 1100s to 9s. Included is a smaller test that has a roughly 50x safety margin on the performance it accepts. It's likely to fail on pure builds because difflib also has a recursion-balancing problem.
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
29014
f1ca249696ed bdiff: balance recursion to avoid quadratic behavior (issue4704)
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
     1
#require no-pure
f1ca249696ed bdiff: balance recursion to avoid quadratic behavior (issue4704)
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
     2
f1ca249696ed bdiff: balance recursion to avoid quadratic behavior (issue4704)
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
     3
A script to generate nasty diff worst-case scenarios:
f1ca249696ed bdiff: balance recursion to avoid quadratic behavior (issue4704)
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
     4
f1ca249696ed bdiff: balance recursion to avoid quadratic behavior (issue4704)
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
     5
  $ cat > s.py <<EOF
f1ca249696ed bdiff: balance recursion to avoid quadratic behavior (issue4704)
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
     6
  > import random
f1ca249696ed bdiff: balance recursion to avoid quadratic behavior (issue4704)
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
     7
  > for x in xrange(100000):
f1ca249696ed bdiff: balance recursion to avoid quadratic behavior (issue4704)
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
     8
  >     print
f1ca249696ed bdiff: balance recursion to avoid quadratic behavior (issue4704)
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
     9
  >     if random.randint(0, 100) >= 50:
f1ca249696ed bdiff: balance recursion to avoid quadratic behavior (issue4704)
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    10
  >         x += 1
f1ca249696ed bdiff: balance recursion to avoid quadratic behavior (issue4704)
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    11
  >     print hex(x)
f1ca249696ed bdiff: balance recursion to avoid quadratic behavior (issue4704)
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    12
  > EOF
f1ca249696ed bdiff: balance recursion to avoid quadratic behavior (issue4704)
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    13
f1ca249696ed bdiff: balance recursion to avoid quadratic behavior (issue4704)
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    14
  $ hg init a
f1ca249696ed bdiff: balance recursion to avoid quadratic behavior (issue4704)
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    15
  $ cd a
f1ca249696ed bdiff: balance recursion to avoid quadratic behavior (issue4704)
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    16
f1ca249696ed bdiff: balance recursion to avoid quadratic behavior (issue4704)
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    17
Check in a big file:
f1ca249696ed bdiff: balance recursion to avoid quadratic behavior (issue4704)
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    18
f1ca249696ed bdiff: balance recursion to avoid quadratic behavior (issue4704)
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    19
  $ python ../s.py > a
f1ca249696ed bdiff: balance recursion to avoid quadratic behavior (issue4704)
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    20
  $ hg ci -qAm0
f1ca249696ed bdiff: balance recursion to avoid quadratic behavior (issue4704)
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    21
f1ca249696ed bdiff: balance recursion to avoid quadratic behavior (issue4704)
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    22
Modify it:
f1ca249696ed bdiff: balance recursion to avoid quadratic behavior (issue4704)
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    23
f1ca249696ed bdiff: balance recursion to avoid quadratic behavior (issue4704)
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    24
  $ python ../s.py > a
f1ca249696ed bdiff: balance recursion to avoid quadratic behavior (issue4704)
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    25
f1ca249696ed bdiff: balance recursion to avoid quadratic behavior (issue4704)
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    26
Time a check-in, should never take more than 10 seconds user time:
f1ca249696ed bdiff: balance recursion to avoid quadratic behavior (issue4704)
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    27
f1ca249696ed bdiff: balance recursion to avoid quadratic behavior (issue4704)
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    28
  $ hg ci --time -m1
f1ca249696ed bdiff: balance recursion to avoid quadratic behavior (issue4704)
Matt Mackall <mpm@selenic.com>
parents:
diff changeset
    29
  time: real .* secs .user [0-9][.].* sys .* (re)