CONTRIBUTORS
author Brodie Rao <brodie@bitheap.org>
Mon, 19 Sep 2011 15:58:03 -0700
changeset 15141 16dc9a32ca04
parent 5514 c29efd272395
permissions -rw-r--r--
mdiff: speed up showfunc for large diffs

This addresses the following issues with showfunc:

- Silly usage of regular expressions.
- Doing str.rstrip() needlessly in an inner loop.
- Doing catastrophic backtracking when trying to find a function line.

Finding function text is now at worst O(n lines in the old file), and
at best close to O(n hunks).

Given a diff like this[1]:

 src/main/antlr3/uk/ac/cam/ch/wwmm/pregenerated/ChemicalChunker.g        |     4 +-
 src/main/java/uk/ac/cam/ch/wwmm/pregenerated/ChemicalChunkerLexer.java  |     2 +-
 src/main/java/uk/ac/cam/ch/wwmm/pregenerated/ChemicalChunkerParser.java | 29189 +++++----
 3 files changed, 14741 insertions(+), 14454 deletions(-)

[1]: https://bitbucket.org/wwmm/chemicaltagger/changeset/d2bfbaecd4fc/raw

Without this change, hg log --stat --config diff.showfunc=1 takes an
absurdly long time to complete:

 CallCount   Recursive   Total(ms)  Inline(ms)  module:lineno(function)
     32813           0     80.3546     40.6086  mercurial.mdiff:160(yieldhunk)
 +65062746           0     25.7227     25.7227  +<method 'match' of '_sre.SRE_Pattern' objects>
 +65062746           0     14.0221     14.0221  +<method 'rstrip' of 'str' objects>
     +1809           0      0.0009      0.0009  +mercurial.mdiff:148(contextend)
     +1809           0      0.0003      0.0003  +<len>
  65062746           0     25.7227     25.7227  <method 'match' of '_sre.SRE_Pattern' objects>
  65062763           0     14.0221     14.0221  <method 'rstrip' of 'str' objects>
       543           0      0.1631      0.1631  <zlib.decompress>
         3           0      0.0505      0.0505  <mercurial.bdiff.blocks>
     31007           0     80.4564      0.0477  mercurial.mdiff:147(_unidiff)
    +32813           0     80.3546     40.6086  +mercurial.mdiff:160(yieldhunk)
        +3           0      0.0505      0.0505  +<mercurial.bdiff.blocks>
     +3618           0      0.0022      0.0022  +mercurial.mdiff:154(contextstart)
     +5427           0      0.0013      0.0013  +<len>
        +3           0      0.0001      0.0000  +re:188(compile)
         1           0     80.8381      0.0322  mercurial.patch:1777(diffstatdata)
   +107499           0      0.0235      0.0235  +<method 'startswith' of 'str' objects>
    +31014           0     80.7820      0.0071  +mercurial.util:1284(iterlines)
        +3           0      0.0000      0.0000  +<method 'search' of '_sre.SRE_Pattern' objects>
        +4           0      0.0000      0.0000  +mercurial.patch:1783(addresult)
        +3           0      0.0000      0.0000  +<method 'group' of '_sre.SRE_Match' objects>
         6           0      0.0444      0.0283  mercurial.mdiff:12(splitnewlines)
        +6           0      0.0160      0.0160  +<method 'split' of 'str' objects>
        32           0      0.0246      0.0246  <method 'update' of '_hashlib.HASH' objects>
        11           0      0.0236      0.0236  <method 'read' of 'file' objects>

Time: real 80.880 secs (user 80.200+0.000 sys 0.380+0.000)

With this change, it's almost as fast as not using showfunc at all:

 CallCount   Recursive   Total(ms)  Inline(ms)  module:lineno(function)
       543           0      0.1699      0.1699  <zlib.decompress>
         3           0      0.0501      0.0501  <mercurial.bdiff.blocks>
     32813           0      0.0415      0.0348  mercurial.mdiff:161(yieldhunk)
    +70837           0      0.0058      0.0058  +<method 'isalnum' of 'str' objects>
     +1809           0      0.0006      0.0006  +mercurial.mdiff:148(contextend)
     +1809           0      0.0002      0.0002  +<len>
         1           0      0.4879      0.0310  mercurial.patch:1777(diffstatdata)
   +107499           0      0.0230      0.0230  +<method 'startswith' of 'str' objects>
    +31014           0      0.4335      0.0065  +mercurial.util:1284(iterlines)
        +3           0      0.0000      0.0000  +<method 'search' of '_sre.SRE_Pattern' objects>
        +4           0      0.0000      0.0000  +mercurial.patch:1783(addresult)
        +1           0      0.0004      0.0000  +re:188(compile)
        32           0      0.0293      0.0293  <method 'update' of '_hashlib.HASH' objects>
         6           0      0.0427      0.0279  mercurial.mdiff:12(splitnewlines)
        +6           0      0.0147      0.0147  +<method 'split' of 'str' objects>
     31007           0      0.1169      0.0235  mercurial.mdiff:147(_unidiff)
        +3           0      0.0501      0.0501  +<mercurial.bdiff.blocks>
    +32813           0      0.0415      0.0348  +mercurial.mdiff:161(yieldhunk)
     +3618           0      0.0012      0.0012  +mercurial.mdiff:154(contextstart)
     +5427           0      0.0006      0.0006  +<len>
    107597           0      0.0230      0.0230  <method 'startswith' of 'str' objects>
        16           0      0.0213      0.0213  <mercurial.mpatch.patches>
       194           0      0.0149      0.0149  <method 'split' of 'str' objects>

Time: real 0.530 secs (user 0.450+0.000 sys 0.070+0.000)
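
One way to reach the bound described in the commit message is to walk backwards from each hunk and stop at the previous hunk's start, so that no line of the old file is inspected more than once. The sketch below illustrates that idea only; it is not the code from mdiff.py. The function name find_function_lines and its arguments are hypothetical, and the isalnum test on the first character is an assumption suggested by the 'isalnum' entry in the second profile.

```python
def find_function_lines(old_lines, hunk_starts):
    """Return, for each hunk, the nearest preceding "function" line.

    old_lines   -- lines of the old file (hypothetical input)
    hunk_starts -- ascending 0-based indexes where each hunk begins
    A "function" line is assumed to be any line whose first character
    is alphanumeric.
    """
    results = []
    last_pos = 0    # never scan above the previous hunk's start
    last_func = ""  # carried forward when no new function line is found
    for start in hunk_starts:
        # Walk backwards from just above the hunk to the previous start.
        for i in range(start - 1, last_pos - 1, -1):
            line = old_lines[i]
            if line[:1].isalnum():
                # rstrip() runs once per match, not once per scanned line.
                last_func = line.rstrip()
                break
        last_pos = start
        results.append(last_func)
    return results


if __name__ == "__main__":
    source = [
        "def a():\n",
        "    x = 1\n",
        "\n",
        "def b():\n",
        "    y = 2\n",
        "    z = 3\n",
    ]
    print(find_function_lines(source, [2, 5]))
```

Because each scan stops at the previous hunk's start, the total work is bounded by the number of lines in the old file, and it approaches one check per hunk when a function line sits directly above each hunk.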
5514
c29efd272395 Add note to CONTRIBUTORS file
Matt Mackall <mpm@selenic.com>
parents: 2947
     1
[This file is here for historical purposes, all recent contributors
     2
should appear in the changelog directly]
     3
     4
Andrea Arcangeli <andrea at suse.de>
519
50768efaf6f2 Add a CONTRIBUTORS file
mpm@selenic.com
parents:
     5
Thomas Arendsen Hein <thomas at intevation.de>
     6
Goffredo Baroncelli <kreijack at libero.it>
756
5d79dfa5e98f Added new code contributors, fixed Vincent's name, added hint on encoding.
Thomas Arendsen Hein <thomas@intevation.de>
parents: 594
     7
Muli Ben-Yehuda <mulix at mulix.org>
     8
Mikael Berthe <mikael at lilotux.net>
1450
199bb2b4ed4a Add Benoit to CONTRIBUTORS
Matt Mackall <mpm@selenic.com>
parents: 1310
     9
Benoit Boissinot <bboissin at gmail.com>
2947
2d865068f72e Add self to contributors
Brendan Cully <brendan@kublai.com>
parents: 2162
    10
Brendan Cully <brendan at kublai.com>
519
50768efaf6f2 Add a CONTRIBUTORS file
mpm@selenic.com
parents:
    11
Vincent Danjean <vdanjean.ml at free.fr>
    12
Jake Edge <jake at edge2.net>
    13
Michael Fetterman <michael.fetterman at intel.com>
    14
Edouard Gomez <ed.gomez at free.fr>
1231
effff847870f CONTRIBUTORS update
mpm@selenic.com
parents: 1080
    15
Eric Hopper <hopper at omnifarious.org>
756
5d79dfa5e98f Added new code contributors, fixed Vincent's name, added hint on encoding.
Thomas Arendsen Hein <thomas@intevation.de>
parents: 594
    16
Alecs King <alecsk at gmail.com>
1310
7e8a55c9ee5c Updated CONTRIBUTORS.
Thomas Arendsen Hein <thomas@intevation.de>
parents: 1231
    17
Volker Kleinfeld <Volker.Kleinfeld at gmx.de>
519
50768efaf6f2 Add a CONTRIBUTORS file
mpm@selenic.com
parents:
    18
Vadim Lebedev <vadim at mbdsys.com>
    19
Christopher Li <hg at chrisli.org>
    20
Chris Mason <mason at suse.com>
2162
dac432a521d8 Add self to CONTRIBUTORS
Colin McMillen <mcmillen@cs.cmu.edu>
parents: 2120
    21
Colin McMillen <mcmillen at cs.cmu.edu>
1080
253072f39205 Updated list of contributors.
Thomas Arendsen Hein <thomas@intevation.de>
parents: 896
    22
Wojciech Milkowski <wmilkowski at interia.pl>
756
5d79dfa5e98f Added new code contributors, fixed Vincent's name, added hint on encoding.
Thomas Arendsen Hein <thomas@intevation.de>
parents: 594
    23
Chad Netzer <chad.netzer at gmail.com>
519
50768efaf6f2 Add a CONTRIBUTORS file
mpm@selenic.com
parents:
    24
Bryan O'Sullivan <bos at serpentine.com>
756
5d79dfa5e98f Added new code contributors, fixed Vincent's name, added hint on encoding.
Thomas Arendsen Hein <thomas@intevation.de>
parents: 594
    25
Vicent Seguí Pascual <vseguip at gmail.com>
    26
Sean Perry <shaleh at speakeasy.net>
594
0a2ffc5c906b Update CONTRIBUTORS
mpm@selenic.com
parents: 519
    27
Nguyen Anh Quynh <aquynh at gmail.com>
1310
7e8a55c9ee5c Updated CONTRIBUTORS.
Thomas Arendsen Hein <thomas@intevation.de>
parents: 1231
    28
Ollivier Robert <roberto at keltia.freenix.fr>
2120
c0994047c5ff Added my name to the contributors list.
Alexander Schremmer <alex AT alexanderweb DOT de>
parents: 1450
    29
Alexander Schremmer <alex at alexanderweb.de>
519
50768efaf6f2 Add a CONTRIBUTORS file
mpm@selenic.com
parents:
    30
Arun Sharma <arun at sharma-home.net>
1231
effff847870f CONTRIBUTORS update
mpm@selenic.com
parents: 1080
    31
Josef "Jeff" Sipek <jeffpc at optonline.net>
1310
7e8a55c9ee5c Updated CONTRIBUTORS.
Thomas Arendsen Hein <thomas@intevation.de>
parents: 1231
    32
Kevin Smith <yarcs at qualitycode.com>
1231
effff847870f CONTRIBUTORS update
mpm@selenic.com
parents: 1080
    33
TK Soh <teekaysoh at yahoo.com>
519
50768efaf6f2 Add a CONTRIBUTORS file
mpm@selenic.com
parents:
    34
Radoslaw Szkodzinski <astralstorm at gorzow.mm.pl>
851
73a432c8040a Added Samuel Tardieu to contributors list.
Thomas Arendsen Hein <thomas@intevation.de>
parents: 760
    35
Samuel Tardieu <sam at rfc1149.net>
519
50768efaf6f2 Add a CONTRIBUTORS file
mpm@selenic.com
parents:
    36
K Thananchayan <thananck at yahoo.com>
    37
Andrew Thompson <andrewkt at aktzero.com>
    38
Michael S. Tsirkin <mst at mellanox.co.il>
    39
Rafael Villar Burke <pachi at mmn-arquitectos.com>
855
a107c64c76be Added Tristan Wibberley to contributors.
Thomas Arendsen Hein <thomas@intevation.de>
parents: 851
    40
Tristan Wibberley <tristan at wibberley.org>
756
5d79dfa5e98f Added new code contributors, fixed Vincent's name, added hint on encoding.
Thomas Arendsen Hein <thomas@intevation.de>
parents: 594
    41
Mark Williamson <mark.williamson at cl.cam.ac.uk>