mdiff: speed up showfunc for large diffs
This addresses the following issues with showfunc:
- Silly usage of regular expressions.
- Doing str.rstrip() needlessly in an inner loop.
- Doing catastrophic backtracking when trying to find a function line.
Finding the function text for all hunks of a file is now at worst
O(number of lines in the old file), and at best close to O(number of hunks).
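The bound above can be illustrated with a minimal sketch. It is not the code from this patch: the names `find_function_lines`, `old_lines`, `hunk_starts` and `width` are hypothetical, and the 40-character truncation is an assumption. The only detail taken from the profile below is that the candidate test is a cheap `str.isalnum()` call rather than a regular expression match. The idea shown is to scan backwards from each hunk only as far as the start of the previous hunk, and to call `rstrip()` only on the line that is actually kept.

```python
def find_function_lines(old_lines, hunk_starts, width=40):
    """Return one function-context string per hunk.

    old_lines   -- lines of the old file (hypothetical input)
    hunk_starts -- ascending 0-based indices where each hunk's context begins
    width       -- assumed truncation length for the reported line
    """
    results = []
    last_pos = 0    # lines below this index were already scanned
    last_func = ""  # most recent function line found so far
    for start in hunk_starts:
        # Walk backwards from the line just above the hunk, but stop at
        # the previous hunk's start: everything before it is already
        # summarised by last_func.
        for i in range(start - 1, last_pos - 1, -1):
            line = old_lines[i]
            # Cheap test on the first character only; no regex and no
            # rstrip() on lines that are going to be rejected anyway.
            if line[:1].isalnum():
                last_func = line.rstrip()[:width]
                break
        last_pos = start
        results.append(last_func)
    return results


if __name__ == "__main__":
    lines = [
        "def a():\n",
        "    x = 1\n",
        "    y = 2\n",
        "def b():\n",
        "    z = 3\n",
        "    w = 4\n",
    ]
    print(find_function_lines(lines, [2, 3, 5]))
```

Because every scan stops where the previous one started, no line of the old file is inspected more than once across all hunks, which gives the O(lines) worst case. When a function line sits right above each hunk, the work is roughly one check per hunk.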
Given a diff like this[1]:
src/main/antlr3/uk/ac/cam/ch/wwmm/pregenerated/ChemicalChunker.g | 4 +-
src/main/java/uk/ac/cam/ch/wwmm/pregenerated/ChemicalChunkerLexer.java | 2 +-
src/main/java/uk/ac/cam/ch/wwmm/pregenerated/ChemicalChunkerParser.java | 29189 +++++----
3 files changed, 14741 insertions(+), 14454 deletions(-)
[1]: https://bitbucket.org/wwmm/chemicaltagger/changeset/d2bfbaecd4fc/raw
Without this change, hg log --stat --config diff.showfunc=1 takes an
absurdly long time to complete:
CallCount Recursive Total(ms) Inline(ms) module:lineno(function)
32813 0 80.3546 40.6086 mercurial.mdiff:160(yieldhunk)
+65062746 0 25.7227 25.7227 +<method 'match' of '_sre.SRE_Pattern' objects>
+65062746 0 14.0221 14.0221 +<method 'rstrip' of 'str' objects>
+1809 0 0.0009 0.0009 +mercurial.mdiff:148(contextend)
+1809 0 0.0003 0.0003 +<len>
65062746 0 25.7227 25.7227 <method 'match' of '_sre.SRE_Pattern' objects>
65062763 0 14.0221 14.0221 <method 'rstrip' of 'str' objects>
543 0 0.1631 0.1631 <zlib.decompress>
3 0 0.0505 0.0505 <mercurial.bdiff.blocks>
31007 0 80.4564 0.0477 mercurial.mdiff:147(_unidiff)
+32813 0 80.3546 40.6086 +mercurial.mdiff:160(yieldhunk)
+3 0 0.0505 0.0505 +<mercurial.bdiff.blocks>
+3618 0 0.0022 0.0022 +mercurial.mdiff:154(contextstart)
+5427 0 0.0013 0.0013 +<len>
+3 0 0.0001 0.0000 +re:188(compile)
1 0 80.8381 0.0322 mercurial.patch:1777(diffstatdata)
+107499 0 0.0235 0.0235 +<method 'startswith' of 'str' objects>
+31014 0 80.7820 0.0071 +mercurial.util:1284(iterlines)
+3 0 0.0000 0.0000 +<method 'search' of '_sre.SRE_Pattern' objects>
+4 0 0.0000 0.0000 +mercurial.patch:1783(addresult)
+3 0 0.0000 0.0000 +<method 'group' of '_sre.SRE_Match' objects>
6 0 0.0444 0.0283 mercurial.mdiff:12(splitnewlines)
+6 0 0.0160 0.0160 +<method 'split' of 'str' objects>
32 0 0.0246 0.0246 <method 'update' of '_hashlib.HASH' objects>
11 0 0.0236 0.0236 <method 'read' of 'file' objects>
Time: real 80.880 secs (user 80.200+0.000 sys 0.380+0.000)
With this change, it's almost as fast as not using showfunc at all:
CallCount Recursive Total(ms) Inline(ms) module:lineno(function)
543 0 0.1699 0.1699 <zlib.decompress>
3 0 0.0501 0.0501 <mercurial.bdiff.blocks>
32813 0 0.0415 0.0348 mercurial.mdiff:161(yieldhunk)
+70837 0 0.0058 0.0058 +<method 'isalnum' of 'str' objects>
+1809 0 0.0006 0.0006 +mercurial.mdiff:148(contextend)
+1809 0 0.0002 0.0002 +<len>
1 0 0.4879 0.0310 mercurial.patch:1777(diffstatdata)
+107499 0 0.0230 0.0230 +<method 'startswith' of 'str' objects>
+31014 0 0.4335 0.0065 +mercurial.util:1284(iterlines)
+3 0 0.0000 0.0000 +<method 'search' of '_sre.SRE_Pattern' objects>
+4 0 0.0000 0.0000 +mercurial.patch:1783(addresult)
+1 0 0.0004 0.0000 +re:188(compile)
32 0 0.0293 0.0293 <method 'update' of '_hashlib.HASH' objects>
6 0 0.0427 0.0279 mercurial.mdiff:12(splitnewlines)
+6 0 0.0147 0.0147 +<method 'split' of 'str' objects>
31007 0 0.1169 0.0235 mercurial.mdiff:147(_unidiff)
+3 0 0.0501 0.0501 +<mercurial.bdiff.blocks>
+32813 0 0.0415 0.0348 +mercurial.mdiff:161(yieldhunk)
+3618 0 0.0012 0.0012 +mercurial.mdiff:154(contextstart)
+5427 0 0.0006 0.0006 +<len>
107597 0 0.0230 0.0230 <method 'startswith' of 'str' objects>
16 0 0.0213 0.0213 <mercurial.mpatch.patches>
194 0 0.0149 0.0149 <method 'split' of 'str' objects>
Time: real 0.530 secs (user 0.450+0.000 sys 0.070+0.000)
/*
* Styles for man pages, matching the look of http://mercurial.selenic.com/
*
* Color scheme & layout are borrowed from
* http://mercurial.selenic.com/css/styles.css
*
* Some styles are from html4css1.css from Docutils, which is in the
* public domain.
*/
body {
margin: 0;
padding: 0;
font-family: sans-serif;
}
.document {
position: relative; /* serve as the origin for absolute positioning */
margin: 1.5em 1.8em;
padding: 0;
line-height: 1.3;
}
/* layout: toc to right */
#contents {
position: absolute;
right: 0;
top: 0;
width: 26%;
}
/* layout: others to left */
h1.title, h2.subtitle, .section { width: 72%; }
.section .section { width: auto; }
table.docinfo { max-width: 72%; }
/* headings */
h1, h2, .topic-title, .admonition-title {
font-family: "MgOpen Cosmetica", "Lucida Sans Unicode", sans-serif;
font-weight: normal;
}
h1, h2, .topic-title, .admonition-title {
margin: 1em 0 0.5em;
}
h1.title { font-size: 300%; }
h2.subtitle, h1 { font-size: 200%; }
h2, .topic-title, .admonition-title { font-size: 140%; }
/* subtitle starts with lowercase in man pages, but not in HTML */
h2.subtitle:first-letter { text-transform: uppercase; }
/* override first/last margin */
.first, h1.title, h2.subtitle { margin-top: 0 !important; }
.last, .with-subtitle { margin-bottom: 0 !important; }
blockquote, pre, dd .option-list, .field-list {
margin: 0.2em 0 1em 2em;
}
kbd, tt, pre { font-family: monospace; }
dt { font-weight: bold; }
dd { margin-bottom: 0.5em; }
th, td { padding: 0.1em 0.2em; border: 0 none; }
th { font-weight: bold; text-align: left; }
a:link, a:visited { text-decoration: underline; }
a:hover, a:focus { text-decoration: none; }
a:link { color: #00b5f1; }
a:visited { color: #5c9caf; }
a:link.toc-backref, a:visited.toc-backref {
text-decoration: none;
color: inherit; /* NOTE: `inherit' is not supported by IE6 */
}
div.admonition, div.attention, div.caution,
div.danger, div.error, div.hint, div.important,
div.note, div.tip, div.warning {
border-top: 1px #ccc solid;
border-bottom: 1px #ccc solid;
padding: 0.3em 1em;
margin: 1em;
}
div.note {
border-color: #fcc200;
}
/*
* The following styles are from Docutils.
* Please refine if necessary.
*/
table.borderless td, table.borderless th {
/* Override padding for "table.docutils td" with "! important".
The right padding separates the table cells. */
padding: 0 0.5em 0 0 ! important;
}
.hidden {
display: none;
}
blockquote.epigraph {
margin: 2em 5em;
}
div.abstract {
margin: 2em 5em;
}
div.dedication {
margin: 2em 5em;
text-align: center;
font-style: italic;
}
div.figure {
margin-left: 2em;
margin-right: 2em;
}
div.footer, div.header {
clear: both;
font-size: smaller;
}
div.line-block {
display: block;
margin-top: 1em;
margin-bottom: 1em;
}
div.line-block div.line-block {
margin-top: 0;
margin-bottom: 0;
margin-left: 1.5em;
}
div.sidebar {
margin: 0 0 0.5em 1em;
border: medium outset;
padding: 1em;
background-color: #ffffee;
width: 40%;
float: right;
clear: right;
}
div.sidebar p.rubric {
font-family: sans-serif;
font-size: medium;
}
div.system-messages {
margin: 5em;
}
div.system-messages h1 {
color: red;
}
div.system-message {
border: medium outset;
padding: 1em;
}
div.system-message p.system-message-title {
color: red;
font-weight: bold;
}
h1.section-subtitle, h2.section-subtitle, h3.section-subtitle,
h4.section-subtitle, h5.section-subtitle, h6.section-subtitle {
margin-top: 0.4em;
}
hr.docutils {
width: 75%;
}
img.align-left {
clear: left;
}
img.align-right {
clear: right;
}
ol.simple, ul.simple {
margin-bottom: 1em;
}
ol.arabic {
list-style: decimal;
}
ol.loweralpha {
list-style: lower-alpha;
}
ol.upperalpha {
list-style: upper-alpha;
}
ol.lowerroman {
list-style: lower-roman;
}
ol.upperroman {
list-style: upper-roman;
}
p.attribution {
text-align: right;
margin-left: 50%;
}
p.caption {
font-style: italic;
}
p.credits {
font-style: italic;
font-size: smaller;
}
p.label {
white-space: nowrap;
}
p.rubric {
font-weight: bold;
font-size: larger;
color: maroon;
text-align: center;
}
pre.address {
margin-bottom: 0;
margin-top: 0;
font-family: serif;
font-size: 100%;
}
pre.literal-block, pre.doctest-block {
margin-left: 2em;
margin-right: 2em;
}
span.classifier {
font-family: sans-serif;
font-style: oblique;
}
span.classifier-delimiter {
font-family: sans-serif;
font-weight: bold;
}
span.interpreted {
font-family: sans-serif;
}
span.option {
white-space: nowrap;
}
span.pre {
white-space: pre;
}
span.problematic {
color: red;
}
span.section-subtitle {
/* font-size relative to parent (h1..h6 element) */
font-size: 80%;
}
table.citation {
border-left: solid 1px gray;
margin-left: 1px;
}
table.footnote {
border-left: solid 1px black;
margin-left: 1px;
}
h1 tt.docutils, h2 tt.docutils, h3 tt.docutils,
h4 tt.docutils, h5 tt.docutils, h6 tt.docutils {
font-size: 100%;
}
ul.auto-toc {
list-style-type: none;
}
div.contents.local {
-moz-column-width: 10em;
-moz-column-gap: 1em;
-webkit-column-width: 10em;
-webkit-column-gap: 1em;
column-width: 10em;
column-gap: 1em;
}