tests/test-minirst.py.out
author: Gregory Szorc <gregory.szorc@gmail.com>
date: Thu, 13 Oct 2016 12:50:27 +0200
changeset: 30155:b7a966ce89ed
parent: 27729:58f8b29c37ff
child: 31145:6582b3716ae0
permissions: -rw-r--r--
changelog: disable delta chains
This patch disables delta chains on changelogs. After this patch, new
entries on changelogs - including existing changelogs - will be stored
as the fulltext of that data (likely compressed). No delta computation
will be performed.
An overview of delta chains and data justifying this change follows.
Revlogs try to store entries as a delta against a previous entry (either
a parent revision in the case of generaldelta or the previous physical
revision when not using generaldelta). Most of the time this is the
correct thing to do: it frequently results in less CPU usage and smaller
storage.
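As a rough illustration of what a delta chain is, here is a toy model
in Python. It is only a sketch: the ToyRevlog class and its
(start, end, text) delta format are hypothetical and far simpler than
what a real revlog stores. It shows that rebuilding a revision means
reading its base fulltext and then applying every delta in the chain.

```python
from typing import List, Optional, Tuple

# Hypothetical toy delta: replace text[start:end] with new bytes.
Delta = Tuple[int, int, bytes]


class ToyRevlog:
    """Each entry is a fulltext (base is None) or a delta against a base."""

    def __init__(self) -> None:
        self.entries: List[Tuple[Optional[int], object]] = []

    def add_full(self, text: bytes) -> int:
        self.entries.append((None, text))
        return len(self.entries) - 1

    def add_delta(self, base: int, delta: Delta) -> int:
        self.entries.append((base, delta))
        return len(self.entries) - 1

    def chain(self, rev: int) -> List[int]:
        """Return the revisions that must be read to rebuild rev."""
        revs = [rev]
        while self.entries[revs[-1]][0] is not None:
            revs.append(self.entries[revs[-1]][0])
        revs.reverse()
        return revs

    def fulltext(self, rev: int) -> bytes:
        """Start from the chain's fulltext and apply each delta in order."""
        chain = self.chain(rev)
        text = self.entries[chain[0]][1]
        for r in chain[1:]:
            start, end, new = self.entries[r][1]
            text = text[:start] + new + text[end:]
        return text


log = ToyRevlog()
r0 = log.add_full(b"a.txt\nb.txt\n")
r1 = log.add_delta(r0, (6, 11, b"c.txt"))     # replace b.txt with c.txt
r2 = log.add_delta(r1, (12, 12, b"d.txt\n"))  # append d.txt
print(log.chain(r2), log.fulltext(r2))
```

In this model a chain of length 1 is just a stored fulltext, which is
the state this patch moves every new changelog entry to.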
Delta chains are most effective when the base revision that a delta is
computed against is similar to the current data. This tends to occur naturally
for manifests and file data, since only small parts of each tend to
change with each revision. Changelogs, however, are a different story.
Changelog entries represent changesets/commits. And unless commits in a
repository are homogeneous (same author, changing same files, similar
commit messages, etc), a delta from one entry to the next tends to be
relatively large compared to the size of the entry. This means that
delta chains tend to be short. How short? Here is the full vs delta
revision breakdown on some real world repos:
Repo              % Full   % Delta   Max Length
hg                  45.8      54.2            6
mozilla-central     42.4      57.6            8
mozilla-unified     42.5      57.5           17
pypy                46.1      53.9            6
python-zstandard    46.1      53.9            3
(I threw in python-zstandard as an example of a repo that is homogeneous.
It contains a small Python project with changes all from the same
author.)
Contrast this with the manifest revlog for these repos, where 99+% of
revisions are deltas and delta chains run into the thousands.
So delta chains aren't as useful on changelogs. But even a short delta
chain may provide benefits. Let's measure that.
Delta chains may require less CPU to read revisions if the CPU time
spent reading smaller deltas is less than the CPU time used to
decompress larger individual entries. We can measure this via
`hg perfrevlog -c -d 1` to iterate a revlog to resolve each revision's
fulltext. Here are the results of that command on a repo using delta
chains in its changelog (first line of each pair) and on a repo without
delta chains (second line):
hg (forward)
! wall 0.407008 comb 0.410000 user 0.410000 sys 0.000000 (best of 25)
! wall 0.390061 comb 0.390000 user 0.390000 sys 0.000000 (best of 26)
hg (reverse)
! wall 0.515221 comb 0.520000 user 0.520000 sys 0.000000 (best of 19)
! wall 0.400018 comb 0.400000 user 0.390000 sys 0.010000 (best of 25)
mozilla-central (forward)
! wall 4.508296 comb 4.490000 user 4.490000 sys 0.000000 (best of 3)
! wall 4.370222 comb 4.370000 user 4.350000 sys 0.020000 (best of 3)
mozilla-central (reverse)
! wall 5.758995 comb 5.760000 user 5.720000 sys 0.040000 (best of 3)
! wall 4.346503 comb 4.340000 user 4.320000 sys 0.020000 (best of 3)
mozilla-unified (forward)
! wall 4.957088 comb 4.950000 user 4.940000 sys 0.010000 (best of 3)
! wall 4.660528 comb 4.650000 user 4.630000 sys 0.020000 (best of 3)
mozilla-unified (reverse)
! wall 6.119827 comb 6.110000 user 6.090000 sys 0.020000 (best of 3)
! wall 4.675136 comb 4.670000 user 4.670000 sys 0.000000 (best of 3)
pypy (forward)
! wall 1.231122 comb 1.240000 user 1.230000 sys 0.010000 (best of 8)
! wall 1.164896 comb 1.160000 user 1.160000 sys 0.000000 (best of 9)
pypy (reverse)
! wall 1.467049 comb 1.460000 user 1.460000 sys 0.000000 (best of 7)
! wall 1.160200 comb 1.170000 user 1.160000 sys 0.010000 (best of 9)
The data clearly shows that it takes less wall and CPU time to resolve
revisions when there are no delta chains in the changelogs, regardless
of the direction of traversal. Furthermore, not using a delta chain
means that fulltext resolution in reverse is as fast as iterating
forward. So not using delta chains on the changelog is a clear CPU win
for reading operations.
An example of a user-visible operation showing this speed-up is revset
evaluation. Here are results for
`hg perfrevset 'author(gps) or author(mpm)'`:
hg
! wall 1.655506 comb 1.660000 user 1.650000 sys 0.010000 (best of 6)
! wall 1.612723 comb 1.610000 user 1.600000 sys 0.010000 (best of 7)
mozilla-central
! wall 17.629826 comb 17.640000 user 17.600000 sys 0.040000 (best of 3)
! wall 17.311033 comb 17.300000 user 17.260000 sys 0.040000 (best of 3)
What about 00changelog.i size?
Repo              Delta Chains   No Delta Chains
hg                   7,033,250         6,976,771
mozilla-central     82,978,748        81,574,623
mozilla-unified     88,112,349        86,702,162
pypy                20,740,699        20,659,741
The data shows that removing delta chains from the changelog makes the
changelog smaller.
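To put the size difference in relative terms, the short Python sketch
below recomputes the savings from the table above. The numbers are
copied from that table and are presumably file sizes in bytes; the
percent_saved helper is only an illustrative name.

```python
# 00changelog.i sizes (with delta chains, without delta chains),
# copied from the table above; presumably bytes.
sizes = {
    "hg": (7_033_250, 6_976_771),
    "mozilla-central": (82_978_748, 81_574_623),
    "mozilla-unified": (88_112_349, 86_702_162),
    "pypy": (20_740_699, 20_659_741),
}


def percent_saved(with_chains: int, without_chains: int) -> float:
    """Relative size reduction from dropping delta chains, in percent."""
    return 100.0 * (with_chains - without_chains) / with_chains


for repo, (with_chains, without_chains) in sizes.items():
    saved = with_chains - without_chains
    pct = percent_saved(with_chains, without_chains)
    print(f"{repo:16} {saved:>10,} saved  {pct:.2f}%")
```

By this calculation the reduction is small, between about 0.4% and
1.7% depending on the repo, but it is a reduction in every case.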
Delta chains are also used during changegroup generation. This
operation essentially converts a series of revisions to one large
delta chain. And changegroup generation is smart: if the delta in
the revlog matches what the changegroup is emitting, it will reuse
the delta instead of recalculating it. We can measure the impact
removing changelog delta chains has on changegroup generation via
`hg perfchangegroupchangelog`:
hg
! wall 1.589245 comb 1.590000 user 1.590000 sys 0.000000 (best of 7)
! wall 1.788060 comb 1.790000 user 1.790000 sys 0.000000 (best of 6)
mozilla-central
! wall 17.382585 comb 17.380000 user 17.340000 sys 0.040000 (best of 3)
! wall 20.161357 comb 20.160000 user 20.120000 sys 0.040000 (best of 3)
mozilla-unified
! wall 18.722839 comb 18.720000 user 18.680000 sys 0.040000 (best of 3)
! wall 21.168075 comb 21.170000 user 21.130000 sys 0.040000 (best of 3)
pypy
! wall 4.828317 comb 4.830000 user 4.820000 sys 0.010000 (best of 3)
! wall 5.415455 comb 5.420000 user 5.410000 sys 0.010000 (best of 3)
The data shows eliminating delta chains makes the changelog part of
changegroup generation slower. This is expected since we now have to
compute deltas for revisions where we could recycle the delta before.
It is worth putting this regression into context of overall changegroup
times. Here is the rough total CPU time spent in changegroup generation
for various repos while using delta chains on the changelog:
Repo              CPU Time (s)   CPU Time w/ compression
hg                        4.50                      7.05
mozilla-central          111.1                     222.0
pypy                     28.68                      75.5
Before compression, removing delta chains from the changelog adds
~4.4% overhead to hg changegroup generation, 1.3% to mozilla-central,
and 2.0% to pypy. When you factor in zlib compression, these percentages
are roughly divided by 2.
While the increased CPU usage for changegroup generation is unfortunate,
I think it is acceptable because the percentage is small, server
operators (those likely impacted most by this) have other mechanisms
to mitigate CPU consumption (namely reducing zlib compression level and
pre-generated clone bundles), and because there is room to optimize this
in the future. For example, we could use the nullid as the base revision,
effectively encoding the full revision for each entry in the changegroup.
When doing this, `hg perfchangegroupchangelog` nearly halves:
mozilla-unified
! wall 21.168075 comb 21.170000 user 21.130000 sys 0.040000 (best of 3)
! wall 11.196461 comb 11.200000 user 11.190000 sys 0.010000 (best of 3)
This looks very promising as a future optimization opportunity.
It is worth noting the changes in test-acl.t to the changegroup part size.
This is because revision 6 in the changegroup had a delta chain of
length 2 before, and after this patch the base revision is nullrev.
When the base revision is nullrev, cg2packer.deltaparent() hardcodes
the *previous* revision from the changegroup as the delta parent.
This caused the delta in the changegroup to switch base revisions,
the delta to change, and the size to change accordingly. While the
size increased in this case, I think sizes will remain the same
on average, as the delta base for changelog revisions doesn't matter
too much (as this patch shows). So, I don't consider this a regression.
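The delta parent rule described above can be sketched in Python as
follows. This is a simplified illustration, not Mercurial's actual
code: choose_delta_parent is a hypothetical name, it models only the
nullrev case mentioned here, and the revision numbers in the example
are made up.

```python
NULLREV = -1  # Mercurial's conventional "no revision" number


def choose_delta_parent(stored_base: int, prev: int) -> int:
    """Pick the revision a changegroup delta is computed against.

    Simplified: if the revlog stores a fulltext (base is nullrev),
    fall back to the previous revision sent in the changegroup.
    Otherwise keep the stored base so the stored delta can be reused.
    """
    if stored_base == NULLREV:
        return prev
    return stored_base


# Hypothetical numbers: before the patch a revision has a stored delta
# base; after the patch it is stored as a fulltext.
print(choose_delta_parent(stored_base=4, prev=5))        # keeps 4
print(choose_delta_parent(stored_base=NULLREV, prev=5))  # falls back to 5
```

When the chosen base changes like this, the emitted delta and its size
change with it, which is the effect seen in test-acl.t.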
== paragraphs ==
60 column format:
----------------------------------------------------------------------
This is some text in the first paragraph.
A small indented paragraph. It is followed by some lines
containing random whitespace.
The third and final paragraph.
----------------------------------------------------------------------
30 column format:
----------------------------------------------------------------------
This is some text in the first
paragraph.
A small indented paragraph.
It is followed by some lines
containing random
whitespace.
The third and final paragraph.
----------------------------------------------------------------------
html format:
----------------------------------------------------------------------
<p>
This is some text in the first paragraph.
</p>
<p>
A small indented paragraph.
It is followed by some lines
containing random whitespace.
</p>
<p>
The third and final paragraph.
</p>
----------------------------------------------------------------------
== definitions ==
60 column format:
----------------------------------------------------------------------
A Term
Definition. The indented lines make up the definition.
Another Term
Another definition. The final line in the definition
determines the indentation, so this will be indented
with four spaces.
A Nested/Indented Term
Definition.
----------------------------------------------------------------------
30 column format:
----------------------------------------------------------------------
A Term
Definition. The indented
lines make up the
definition.
Another Term
Another definition. The
final line in the
definition determines the
indentation, so this will
be indented with four
spaces.
A Nested/Indented Term
Definition.
----------------------------------------------------------------------
html format:
----------------------------------------------------------------------
<dl>
<dt>A Term
<dd>Definition. The indented lines make up the definition.
<dt>Another Term
<dd>Another definition. The final line in the definition determines the indentation, so this will be indented with four spaces.
<dt>A Nested/Indented Term
<dd>Definition.
</dl>
----------------------------------------------------------------------
== literals ==
60 column format:
----------------------------------------------------------------------
The fully minimized form is the most convenient form:
Hello
literal
world
In the partially minimized form a paragraph simply ends with
space-double-colon.
////////////////////////////////////////
long un-wrapped line in a literal block
\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\
This literal block is started with '::',
the so-called expanded form. The paragraph
with '::' disappears in the final output.
----------------------------------------------------------------------
30 column format:
----------------------------------------------------------------------
The fully minimized form is
the most convenient form:
Hello
literal
world
In the partially minimized
form a paragraph simply ends
with space-double-colon.
////////////////////////////////////////
long un-wrapped line in a literal block
\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\
This literal block is started with '::',
the so-called expanded form. The paragraph
with '::' disappears in the final output.
----------------------------------------------------------------------
html format:
----------------------------------------------------------------------
<p>
The fully minimized form is the most
convenient form:
</p>
<pre>
Hello
literal
world
</pre>
<p>
In the partially minimized form a paragraph
simply ends with space-double-colon.
</p>
<pre>
////////////////////////////////////////
long un-wrapped line in a literal block
\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\\
</pre>
<pre>
This literal block is started with '::',
the so-called expanded form. The paragraph
with '::' disappears in the final output.
</pre>
----------------------------------------------------------------------
== lists ==
60 column format:
----------------------------------------------------------------------
- This is the first list item.
Second paragraph in the first list item.
- List items need not be separated by a blank line.
- And will be rendered without one in any case.
We can have indented lists:
- This is an indented list item
- Another indented list item:
- A literal block in the middle
of an indented list.
(The above is not a list item since we are in the literal block.)
Literal block with no indentation (apart from
the two spaces added to all literal blocks).
1. This is an enumerated list (first item).
2. Continuing with the second item.
(1) foo
(2) bar
1) Another
2) List
Line blocks are also a form of list:
This is the first line. The line continues here.
This is the second line.
----------------------------------------------------------------------
30 column format:
----------------------------------------------------------------------
- This is the first list item.
Second paragraph in the
first list item.
- List items need not be
separated by a blank line.
- And will be rendered without
one in any case.
We can have indented lists:
- This is an indented list
item
- Another indented list
item:
- A literal block in the middle
of an indented list.
(The above is not a list item since we are in the literal block.)
Literal block with no indentation (apart from
the two spaces added to all literal blocks).
1. This is an enumerated list
(first item).
2. Continuing with the second
item.
(1) foo
(2) bar
1) Another
2) List
Line blocks are also a form of
list:
This is the first line. The
line continues here.
This is the second line.
----------------------------------------------------------------------
html format:
----------------------------------------------------------------------
<ul>
<li> This is the first list item.
<p>
Second paragraph in the first list item.
</p>
<li> List items need not be separated by a blank line.
<li> And will be rendered without one in any case.
</ul>
<p>
We can have indented lists:
</p>
<ul>
<li> This is an indented list item
<li> Another indented list item:
<pre>
- A literal block in the middle
of an indented list.
</pre>
<pre>
(The above is not a list item since we are in the literal block.)
</pre>
</ul>
<pre>
Literal block with no indentation (apart from
the two spaces added to all literal blocks).
</pre>
<ol>
<li> This is an enumerated list (first item).
<li> Continuing with the second item.
<li> foo
<li> bar
<li> Another
<li> List
</ol>
<p>
Line blocks are also a form of list:
</p>
<ol>
<li> This is the first line. The line continues here.
<li> This is the second line.
</ol>
----------------------------------------------------------------------
== options ==
60 column format:
----------------------------------------------------------------------
There is support for simple option lists, but only with long
options:
-X --exclude filter an option with a short and long option
with an argument
-I --include an option with both a short option and
a long option
--all Output all.
--both Output both (this description is quite
long).
--long Output all day long.
--par This option has two paragraphs in its
description. This is the first.
This is the second. Blank lines may
be omitted between options (as above)
or left in (as here).
The next paragraph looks like an option list, but lacks the
two-space marker after the option. It is treated as a normal
paragraph:
--foo bar baz
----------------------------------------------------------------------
30 column format:
----------------------------------------------------------------------
There is support for simple
option lists, but only with
long options:
-X --exclude filter an
option
with a
short
and
long
option
with an
argumen
t
-I --include an
option
with
both a
short
option
and a
long
option
--all Output
all.
--both Output
both
(this d
escript
ion is
quite
long).
--long Output
all day
long.
--par This
option
has two
paragra
phs in
its des
criptio
n. This
is the
first.
This is
the
second.
Blank
lines
may be
omitted
between
options
(as
above)
or left
in (as
here).
The next paragraph looks like
an option list, but lacks the
two-space marker after the
option. It is treated as a
normal paragraph:
--foo bar baz
----------------------------------------------------------------------
html format:
----------------------------------------------------------------------
<p>
There is support for simple option lists,
but only with long options:
</p>
<dl>
<dt>-X --exclude filter
<dd>an option with a short and long option with an argument
<dt>-I --include
<dd>an option with both a short option and a long option
<dt> --all
<dd>Output all.
<dt> --both
<dd>Output both (this description is quite long).
<dt> --long
<dd>Output all day long.
<dt> --par
<dd>This option has two paragraphs in its description. This is the first.
<p>
This is the second. Blank lines may be omitted between
options (as above) or left in (as here).
</p>
</dl>
<p>
The next paragraph looks like an option list, but lacks the two-space
marker after the option. It is treated as a normal paragraph:
</p>
<p>
--foo bar baz
</p>
----------------------------------------------------------------------
== fields ==
60 column format:
----------------------------------------------------------------------
a First item.
ab Second item. Indentation and wrapping is
handled automatically.
Next list:
small The larger key below triggers full indentation
here.
much too large
This key is big enough to get its own line.
----------------------------------------------------------------------
30 column format:
----------------------------------------------------------------------
a First item.
ab Second item.
Indentation and
wrapping is
handled
automatically.
Next list:
small The larger key
below triggers
full indentation
here.
much too large
This key is big
enough to get
its own line.
----------------------------------------------------------------------
html format:
----------------------------------------------------------------------
<dl>
<dt>a
<dd>First item.
<dt>ab
<dd>Second item. Indentation and wrapping is handled automatically.
</dl>
<p>
Next list:
</p>
<dl>
<dt>small
<dd>The larger key below triggers full indentation here.
<dt>much too large
<dd>This key is big enough to get its own line.
</dl>
----------------------------------------------------------------------
== containers (normal) ==
60 column format:
----------------------------------------------------------------------
Normal output.
----------------------------------------------------------------------
30 column format:
----------------------------------------------------------------------
Normal output.
----------------------------------------------------------------------
html format:
----------------------------------------------------------------------
<p>
Normal output.
</p>
----------------------------------------------------------------------
== containers (verbose) ==
60 column format:
----------------------------------------------------------------------
Normal output.
Verbose output.
----------------------------------------------------------------------
['debug', 'debug']
----------------------------------------------------------------------
30 column format:
----------------------------------------------------------------------
Normal output.
Verbose output.
----------------------------------------------------------------------
['debug', 'debug']
----------------------------------------------------------------------
html format:
----------------------------------------------------------------------
<p>
Normal output.
</p>
<p>
Verbose output.
</p>
----------------------------------------------------------------------
['debug', 'debug']
----------------------------------------------------------------------
== containers (debug) ==
60 column format:
----------------------------------------------------------------------
Normal output.
Initial debug output.
----------------------------------------------------------------------
['verbose']
----------------------------------------------------------------------
30 column format:
----------------------------------------------------------------------
Normal output.
Initial debug output.
----------------------------------------------------------------------
['verbose']
----------------------------------------------------------------------
html format:
----------------------------------------------------------------------
<p>
Normal output.
</p>
<p>
Initial debug output.
</p>
----------------------------------------------------------------------
['verbose']
----------------------------------------------------------------------
== containers (verbose debug) ==
60 column format:
----------------------------------------------------------------------
Normal output.
Initial debug output.
Verbose output.
Debug output.
----------------------------------------------------------------------
[]
----------------------------------------------------------------------
30 column format:
----------------------------------------------------------------------
Normal output.
Initial debug output.
Verbose output.
Debug output.
----------------------------------------------------------------------
[]
----------------------------------------------------------------------
html format:
----------------------------------------------------------------------
<p>
Normal output.
</p>
<p>
Initial debug output.
</p>
<p>
Verbose output.
</p>
<p>
Debug output.
</p>
----------------------------------------------------------------------
[]
----------------------------------------------------------------------
== roles ==
60 column format:
----------------------------------------------------------------------
Please see 'hg add'.
----------------------------------------------------------------------
30 column format:
----------------------------------------------------------------------
Please see 'hg add'.
----------------------------------------------------------------------
html format:
----------------------------------------------------------------------
<p>
Please see 'hg add'.
</p>
----------------------------------------------------------------------
== sections ==
60 column format:
----------------------------------------------------------------------
Title
=====
Section
-------
Subsection
''''''''''
Markup: "foo" and 'hg help'
---------------------------
----------------------------------------------------------------------
30 column format:
----------------------------------------------------------------------
Title
=====
Section
-------
Subsection
''''''''''
Markup: "foo" and 'hg help'
---------------------------
----------------------------------------------------------------------
html format:
----------------------------------------------------------------------
<h1>Title</h1>
<h2>Section</h2>
<h3>Subsection</h3>
<h2>Markup: "foo" and 'hg help'</h2>
----------------------------------------------------------------------
== admonitions ==
60 column format:
----------------------------------------------------------------------
Note:
This is a note
- Bullet 1
- Bullet 2
Warning!
This is a warning Second input line of warning
!Danger!
This is danger
----------------------------------------------------------------------
30 column format:
----------------------------------------------------------------------
Note:
This is a note
- Bullet 1
- Bullet 2
Warning!
This is a warning Second
input line of warning
!Danger!
This is danger
----------------------------------------------------------------------
html format:
----------------------------------------------------------------------
<p>
<b>Note:</b>
</p>
<p>
This is a note
</p>
<ul>
<li> Bullet 1
<li> Bullet 2
</ul>
<p>
<b>Warning!</b> This is a warning Second input line of warning
</p>
<p>
<b>!Danger!</b> This is danger
</p>
----------------------------------------------------------------------
== comments ==
60 column format:
----------------------------------------------------------------------
Some text.
Some indented text.
Empty comment above
----------------------------------------------------------------------
30 column format:
----------------------------------------------------------------------
Some text.
Some indented text.
Empty comment above
----------------------------------------------------------------------
html format:
----------------------------------------------------------------------
<p>
Some text.
</p>
<p>
Some indented text.
</p>
<p>
Empty comment above
</p>
----------------------------------------------------------------------
=== === ========================================
a b c
=== === ========================================
1 2 3
foo bar baz this list is very very very long man
=== === ========================================
== table ==
60 column format:
----------------------------------------------------------------------
a b c
------------------------------------------------
1 2 3
foo bar baz this list is very very very long man
----------------------------------------------------------------------
30 column format:
----------------------------------------------------------------------
a b c
------------------------------
1 2 3
foo bar baz this list is
very very very long
man
----------------------------------------------------------------------
html format:
----------------------------------------------------------------------
<table>
<tr><td>a</td>
<td>b</td>
<td>c</td></tr>
<tr><td>1</td>
<td>2</td>
<td>3</td></tr>
<tr><td>foo</td>
<td>bar</td>
<td>baz this list is very very very long man</td></tr>
</table>
----------------------------------------------------------------------
= ==== ======================================
s long line goes on here
xy tried to fix here by indenting
= ==== ======================================
== table+nl ==
60 column format:
----------------------------------------------------------------------
s long line goes on here
xy tried to fix here by indenting
----------------------------------------------------------------------
30 column format:
----------------------------------------------------------------------
s long line goes on here
xy tried to fix here by
indenting
----------------------------------------------------------------------
html format:
----------------------------------------------------------------------
<table>
<tr><td>s</td>
<td>long</td>
<td>line goes on here</td></tr>
<tr><td></td>
<td>xy</td>
<td>tried to fix here by indenting</td></tr>
</table>
----------------------------------------------------------------------