Gregory Szorc <gregory.szorc@gmail.com> [Mon, 06 Aug 2018 09:24:35 -0700] rev 38978
changegroup: key off changelogdone
We use self._changelogdone for similar checks. Let's make things
consistent.
Differential Revision: https://phab.mercurial-scm.org/D4135
Gregory Szorc <gregory.szorc@gmail.com> [Mon, 06 Aug 2018 10:43:05 -0700] rev 38977
perf: call _generatechangelog() instead of group()
Now that we have a separate function for generating just the changelog
bits, the perf command should call it so it gets more accurate
behavior.
This changes the results of this command on my hg repo significantly:
! wall 1.390502 comb 1.390000 user 1.370000 sys 0.020000 (best of 8)
! wall 1.768750 comb 1.760000 user 1.760000 sys 0.000000 (best of 6)
Profiling seems to reveal that ~20% of execution time is spent in
progress bar accounting and printing! If we run with
progress.disable=true:
! wall 1.639134 comb 1.650000 user 1.630000 sys 0.020000 (best of 7)
A nice speedup. But profiling still shows a good chunk of time being
spent in progress bar accounting code. The reason is that the
progress bar is conditionally enabled via an argument to
cgpacker.group(). The previous code in perf.py calling into group()
did not enable the progress bar but _generatechangelog() always does.
I think it is important for the perf* commands to capture real-world
use cases. And this code always runs with an active progress bar. So
the regression is acceptable.
That being said, terminal printing performance can vary substantially.
I don't think perf* commands should test terminal printing unless
explicitly desired. So I've disabled progress bar printing in this
command.
Differential Revision: https://phab.mercurial-scm.org/D4134
Gregory Szorc <gregory.szorc@gmail.com> [Fri, 03 Aug 2018 17:59:56 -0700] rev 38976
changegroup: factor changelog chunk generation into own function
We have separate functions for generating manifests and filelogs.
Let's split changelog into its own function so things are consistent.
As part of this, we refactor the code slightly. Before, the
changelog linknode callback was updating state on variables
inherited via a closure. Since the closure is now separate from
generate(), we need to a way pass state between generate() and
_generatechangelog(). The return value of _generatechangelog()
is a 2-tuple where the first item is a dict containing accumulated
state. We then alias some of its members into the scope of
generate() to reduce code churn.
I will be converting other functions to a similar pattern in future
commits.
Differential Revision: https://phab.mercurial-scm.org/D4133
Gregory Szorc <gregory.szorc@gmail.com> [Fri, 03 Aug 2018 14:16:14 -0700] rev 38975
changegroup: pass function to resolve delta parents into constructor
Previously, _deltaparent() encapsulated the logic for all 3
delta parent modes of operation. The choice of delta parent
is static for the lifetime of the packer and can be passed into
the packer as a callable. So do that.
Differential Revision: https://phab.mercurial-scm.org/D4132
Gregory Szorc <gregory.szorc@gmail.com> [Tue, 07 Aug 2018 10:24:49 -0700] rev 38974
changegroup: restore original behavior of _nextclrevtolocalrev
0548f696795b accidentally changed the behavior of cgpacker._close().
The old behavior moved _nextclrevtolocalrev to _clrevtolocalrev only
when _nextclrevtolocalrev was present and then removed
_nextclrevtolocalrev. The bad behavior performed this move
then cleared _clrevtolocalrev because it was the same object as
_nextclrevtolocalrev.
This commit restores the previous behavior.
Surprisingly, no tests changed as a result of this bad logic. I'm
not sure why.
Differential Revision: https://phab.mercurial-scm.org/D4155
Augie Fackler <augie@google.com> [Thu, 09 Aug 2018 12:03:39 -0400] rev 38973
py3: whitelist another test caught by the ratchet
Differential Revision: https://phab.mercurial-scm.org/D4173
Augie Fackler <augie@google.com> [Thu, 09 Aug 2018 11:56:24 -0400] rev 38972
debugcommands: force import of fileset in debugfileset
It looks like Python 3's lazy importer is better than Python 2's for
this command, and as a result we had no symbols in the filesetlang
symbol table, which resulted in some really mysterious test-fileset.t
failures around withstatus optimizations. Inserting this explicit
import and forcing its evaluation fixes the test failure.
Differential Revision: https://phab.mercurial-scm.org/D4172
Jun Wu <quark@fb.com> [Tue, 07 Aug 2018 17:22:33 -0700] rev 38971
linelog: optimize replacelines
The optimization to avoid calling `annotate` inside `replacelines` is significant
for practical use patterns.
Before this patch:
hg perflinelogedits
! wall 6.778478 comb 6.710000 user 6.700000 sys 0.010000 (best of 3)
After this patch:
hg perflinelogedits
! wall 0.136573 comb 0.140000 user 0.130000 sys 0.010000 (best of 63)
Differential Revision: https://phab.mercurial-scm.org/D4150
Jun Wu <quark@fb.com> [Tue, 07 Aug 2018 17:17:01 -0700] rev 38970
linelog: extract `len(self._program)` to a local function
This is a micro optimization prepared for following changes where
`len(self._program)` is used in a loop.
Differential Revision: https://phab.mercurial-scm.org/D4149
Jun Wu <quark@fb.com> [Mon, 06 Aug 2018 18:56:24 -0700] rev 38969
perf: add a command to benchmark linelog edits
The use pattern of creating a linelog is usually by calling "replacelines"
multiple times. Add a command to benchmark it.
Differential Revision: https://phab.mercurial-scm.org/D4148
Jun Wu <quark@fb.com> [Mon, 06 Aug 2018 18:56:24 -0700] rev 38968
linelog: update internal help text
This clarifies the details asked by @martinvonz on D3990.
Differential Revision: https://phab.mercurial-scm.org/D4147
Danny Hooper <hooper@google.com> [Tue, 07 Aug 2018 21:15:27 -0700] rev 38967
fix: determine fixer tool failure by exit code instead of stderr
This seems like the more natural thing, and it probably should have been this
way to beign with. It is more flexible because it allows tools to emit
diagnostic information while also modifying a file. An example would be an
automatic code formatter that also prints any remaining lint issues.
Differential Revision: https://phab.mercurial-scm.org/D4158
Pulkit Goyal <pulkit@yandex-team.ru> [Thu, 09 Aug 2018 13:13:09 +0300] rev 38966
status: advertise --abort instead of 'update -C .' to abort graft
Recent release got us a --abort flag for 'hg graft' command which is nice UI and
we should advertise that to stop the graft instead of 'update -C .' which is
kind of ugly.
Differential Revision: https://phab.mercurial-scm.org/D4169
Pulkit Goyal <pulkit@yandex-team.ru> [Thu, 09 Aug 2018 12:32:11 +0300] rev 38965
status: advertise --abort instead of 'update -C .' to abort a merge
status has a part where it shows the conflict information and how to continue or
abort. Couple of release ago, we got merge --abort and we should advertise that
instead of 'hg update -C .' which is kind of ugly.
I know we need to unify the logic here.
Differential Revision: https://phab.mercurial-scm.org/D4168
Pulkit Goyal <pulkit@yandex-team.ru> [Thu, 09 Aug 2018 12:20:28 +0300] rev 38964
narrow: add '()' to ellipsis in the revset help
ellipsis is a revset function and was missing () after it's name in the help
text. This might confuse users as they try `hg log -r 'ellipsis'`.
Differential Revision: https://phab.mercurial-scm.org/D4167
Augie Fackler <augie@google.com> [Thu, 09 Aug 2018 10:11:10 -0400] rev 38963
tests: make all the string constants in test-match.py be bytes
Done with
python3 contrib/byteify-strings.py tests/test-match.py -i
# skip-blame just bytes prefixes
Differential Revision: https://phab.mercurial-scm.org/D4171
Augie Fackler <augie@google.com> [Thu, 09 Aug 2018 10:10:09 -0400] rev 38962
linelog: fix bytes/str issue in exception raise on Python 3
Differential Revision: https://phab.mercurial-scm.org/D4170
David Demelier <markand@malikania.fr> [Thu, 09 Aug 2018 13:13:00 +0200] rev 38961
absorb: following UI conventions
https://www.mercurial-scm.org/wiki/UIGuideline#adding_new_options
Sangeet Kumar Mishra <mail2sangeetmishra@gmail.com> [Wed, 08 Aug 2018 19:29:02 +0530] rev 38960
grep: search all commits in allfiles mode
All the commits are added to the 'wanted' set when allfiles mode is enabled.
Differential Revision: https://phab.mercurial-scm.org/D4157
Kyle Lippincott <spectral@google.com> [Wed, 08 Aug 2018 17:07:27 -0700] rev 38959
dirstate: add comment on why we don't need to check if something is a dir/file
Differential Revision: https://phab.mercurial-scm.org/D4161
Kyle Lippincott <spectral@google.com> [Wed, 08 Aug 2018 17:03:05 -0700] rev 38958
match: add missing "return set()", add FIXME to test to doc a bug
These were both brought up during the codereview of D4130.
Differential Revision: https://phab.mercurial-scm.org/D4160
Kyle Lippincott <spectral@google.com> [Wed, 08 Aug 2018 16:53:17 -0700] rev 38957
match: correct doc for _rootsdirsandparents after
5a7df82de142
Differential Revision: https://phab.mercurial-scm.org/D4159
Kyle Lippincott <spectral@google.com> [Tue, 31 Jul 2018 16:47:43 -0700] rev 38956
dirstate: use visitchildrenset in traverse
This speeds up `hg status` a fair amount when there is a very large directory
and narrow is in use.
Timing numbers according to command:
hyperfine --warmup 1 'hg status'
HGRCPATH points to a file with the following contents:
[extensions]
narrow =
mozilla-unified (called m-u below) was at revision #468856.
regular hash:
eb39298e432d
treemanifests hash:
0553b7f29eaf
large-dir-repo (called l-d-r below) was generated with the following script:
#!/bin/bash
hg init large-dir-repo
mkdir -p large-dir-repo/third_party/rust/log
touch large-dir-repo/third_party/rust/log/foo.txt
for i in $(seq 1 30000); do
d=$(mktemp -d large-dir-repo/third_party/XXXXXXXXX)
touch $d/file.txt
done
hg -R large-dir-repo ci -Am 'rev0' --user test --date '0 0'
for repos that use narrow, the narrowspec was this:
[includes]
rootfilesin:third_party/rust/log
[excludes]
This narrowspec was chosen due to the size of the third_party/rust directory;
this directory was *not* modified in revision #468856 in mozilla-unified.
Importantly, when using narrow, these repos had everything checked out (in the
case of large-dir-repo, that means all 30,001 directories), *before* adding the
narrowspec. This is to simulate the behavior when using a virtual filesystem
that shows everything for the user even if they haven't added it to the
narrowspec yet. This is not a supported configuration, and `hg update` will not
really do the "correct" thing, but non-mutating commands should behave
correctly.
There are two repos below that do not follow the setup above, 'citc1' and
'citc2', which are using a virtual filesystem and can not be reproduced
upstream; these numbers are here mostly to indicate that these performance
improvements are not hypothetical, and show the benefits we're hoping to achieve
on our real workloads. 'citc1' is closest to large-dir-repo with one of our
pathological cases, 'citc2' is an arbitrary repo and closer to "average".
I'm not claiming anything less than a 5% speed win as improvements due to this
change; these are probably eiter measurement artifacts or constant time
improvements. The numbers that aren't changing are shown primarily to prove that
this doesn't make anything worse in any case I plan on testing during this
series.
'before' is hg from commit
c83ad576. 'N' indicates narrow in use, 'T' indicates
treemanifest in use.
hg status:
repo | N | T | before (mean +- stdev) | after (mean +- stdev) | % of before
------+---+---+------------------------+-----------------------+------------
m-u | | | 2.284 s +- 0.022 s | 2.274 s +- 0.021 s | 99.6%
m-u | | x | 2.289 s +- 0.008 s | 2.284 s +- 0.028 s | 99.8%
m-u | x | | 430.8 ms +- 3.1 ms | 424.5 ms +- 3.2 ms | 98.5%
m-u | x | x | 429.8 ms +- 2.5 ms | 425.8 ms +- 3.7 ms | 99.1%
l-d-r | | | 681.3 ms +- 5.5 ms | 689.6 ms +- 8.0 ms | 101.2%
l-d-r | | x | 666.8 ms +- 21.8 ms | 672.5 ms +- 14.9 ms | 100.9%
l-d-r | x | | 282.6 ms +- 1.8 ms | 203.0 ms +- 1.2 ms | 71.8% <--
l-d-r | x | x | 275.2 ms +- 3.9 ms | 199.3 ms +- 3.5 ms | 72.4% <--
citc1 | x | x | 1.023 s +- 0.011 s | 398.6 ms +- 9.2 ms | 39.0% <--
citc2 | x | x | 297.9 ms +- 4.4 ms | 289.6 ms +- 4.2 ms | 97.2%
hg status --change .:
repo | N | T | before (mean +- stdev) | after (mean +- stdev) | % of before
------+---+---+------------------------+-----------------------+------------
m-u | | | 478.2 ms +- 2.0 ms | 476.9 ms +- 3.7 ms | 99.7%
m-u | | x | 169.5 ms +- 2.7 ms | 169.5 ms +- 2.5 ms | 100.0%
m-u | x | | 477.0 ms +- 2.4 ms | 476.1 ms +- 1.4 ms | 99.8%
m-u | x | x | 124.7 ms +- 1.9 ms | 124.2 ms +- 3.3 ms | 99.6%
l-d-r | | | 97.4 ms +- 1.2 ms | 96.5 ms +- 1.2 ms | 99.1%
l-d-r | | x | 4.778 s +- 0.018 s | 4.774 s +- 0.011 s | 99.9%
l-d-r | x | | 99.9 ms +- 1.1 ms | 98.8 ms +- 1.3 ms | 98.9%
l-d-r | x | x | 848.7 ms +- 7.1 ms | 849.4 ms +- 6.5 ms | 100.1%
citc1 | x | x | 4.250 s +- 0.051 s | 4.283 s +- 0.042 s | 100.8%
citc2 | x | x | 341.5 ms +- 4.7 ms | 341.5 ms +- 4.1 ms | 100.0%
hg update $rev^; hg update $rev:
repo | N | T | before (mean +- stdev) | after (mean +- stdev) | % of before
------+---+---+------------------------+-----------------------+------------
m-u | | | 4.357 s +- 0.032 s | 4.312 s +- 0.093 s | 99.0%
m-u | | x | 3.599 s +- 0.061 s | 3.592 s +- 0.071 s | 99.8%
m-u | x | | 1.815 s +- 0.012 s | 1.816 s +- 0.013 s | 100.1%
m-u | x | x | 1.110 s +- 0.009 s | 1.106 s +- 0.005 s | 99.6%
l-d-r | | | 527.1 ms +- 7.8 ms | 523.3 ms +- 6.5 ms | 99.3%
l-d-r | | x | 8.835 s +- 0.067 s | 8.825 s +- 0.064 s | 99.9%
l-d-r | x | | 313.0 ms +- 2.2 ms | 312.1 ms +- 1.2 ms | 99.7%
l-d-r | x | x | 1.780 s +- 0.011 s | 1.799 s +- 0.013 s | 101.1%
citc1 | x | x | 6.825 s +- 0.262 s | 6.707 s +- 0.353 s | 98.3%
citc2 | x | x | 776.4 ms +- 4.5 ms | 781.3 ms +- 6.3 ms | 100.6%
hg diff:
repo | N | T | before (mean +- stdev) | after (mean +- stdev) | % of before
------+---+---+------------------------+-----------------------+------------
m-u | | | 1.519 s +- 0.015 s | 1.525 s +- 0.017 s | 100.4%
m-u | | x | 1.512 s +- 0.010 s | 1.517 s +- 0.027 s | 100.3%
m-u | x | | 420.0 ms +- 3.2 ms | 417.1 ms +- 1.9 ms | 99.3%
m-u | x | x | 415.0 ms +- 3.8 ms | 415.7 ms +- 2.7 ms | 100.2%
l-d-r | | | 220.8 ms +- 4.0 ms | 220.8 ms +- 3.7 ms | 100.0%
l-d-r | | x | 216.6 ms +- 7.5 ms | 211.4 ms +- 2.1 ms | 97.6%
l-d-r | x | | 111.9 ms +- 1.8 ms | 112.0 ms +- 1.5 ms | 100.1%
l-d-r | x | x | 111.4 ms +- 1.4 ms | 110.2 ms +- 1.0 ms | 98.9%
citc1 | x | x | 268.7 ms +- 2.3 ms | 269.6 ms +- 2.8 ms | 100.3%
citc2 | x | x | 273.5 ms +- 5.5 ms | 273.9 ms +- 3.7 ms | 100.1%
hg diff -c .:
repo | N | T | before (mean +- stdev) | after (mean +- stdev) | % of before
------+---+---+--------------------------+-----------------------+----------
m-u | | | 497.1 ms +- 1.4 ms | 500.1 ms +- 2.4 ms | 100.6%
m-u | | x | 195.3 ms +- 13.2 ms | 191.6 ms +- 3.0 ms | 98.1%
m-u | x | | 476.8 ms +- 1.9 ms | 476.7 ms +- 2.3 ms | 100.0%
m-u | x | x | 122.8 ms +- 2.1 ms | 122.9 ms +- 2.0 ms | 100.1%
l-d-r | | | 99.3 ms +- 2.3 ms | 98.8 ms +- 1.7 ms | 99.5%
l-d-r | | x | 4.875 s +- 0.041 s | 4.847 s +- 0.038 s | 99.4%
l-d-r | x | | 98.5 ms +- 1.2 ms | 98.9 ms +- 1.3 ms | 100.4%
l-d-r | x | x | 864.6 ms +- 7.4 ms | 855.4 ms +- 6.6 ms | 98.9%
citc1 | x | x | 4.505 s +- 0.060 s | 4.466 s +- 0.036 s | 99.1%
citc2 | x | x | 368.0 ms +- 4.0 ms | 365.5 ms +- 6.3 ms | 99.3%
Differential Revision: https://phab.mercurial-scm.org/D4131
spectral <spectral@google.com> [Mon, 06 Aug 2018 12:52:33 -0700] rev 38955
match: add visitchildrenset complement to visitdir
`visitdir(d)` lets a caller query whether the directory is part of the matcher.
It can receive a response of 'all' (yes, and all children, you can stop calling
visitdir now), False (no, and no children either), or True (yes, either
something in this directory or a child is part of the matcher).
`visitchildrenset(d)` augments that by instead of returning True, it returns a
list of items to actually investigate. With this, code can be modified from:
for f in self.all_items:
if match.visitdir(self.dir + '/' + f):
<do stuff>
to be:
for f in self.all_items.intersect(match.visitchildrenset(self.dir)):
<do stuff>
use of this function can provide significant performance improvements,
especially when using narrow (so that the matcher is much smaller than the stuff
we see on disk) and/or treemanifests (so that we can avoid loading manifests for
trees that aren't part of the matcher).
Differential Revision: https://phab.mercurial-scm.org/D4130