Gregory Szorc <gregory.szorc@gmail.com> [Wed, 08 Aug 2018 16:01:26 -0700] rev 39054
changegroup: differentiate between fulltext and diff based deltas
Previously, revisiondelta encoded a delta and an optional prefix
containing a delta header. The underlying code could populate
the delta with either a real delta or a fulltext revision.
Following the theme of wanting to defer serialization of revision
data to the changegroup format as long as possible, it seems
prudent for the revision delta instance to capture what type of
data is being represented. This could possibly allow us to
encode revision data differently in the future. But for the
short term, it makes the behavior of a revisiondelta more
explicit.
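As a rough illustration of the idea (not the actual changegroup code), a
revision delta record could carry either a fulltext or a diff and say which
one it holds. The class name, field names, and kind() helper below are all
hypothetical:

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class RevisionDelta:
    """Hypothetical stand-in for a revision delta record."""

    node: bytes
    basenode: bytes
    # Exactly one of the two fields below is expected to be set.
    revision: Optional[bytes] = None  # fulltext of the revision
    delta: Optional[bytes] = None  # diff against basenode

    def kind(self) -> str:
        """Report which representation this record carries."""
        if (self.revision is None) == (self.delta is None):
            raise ValueError("exactly one of revision/delta must be set")
        return "fulltext" if self.revision is not None else "delta"
```

A serializer can then branch on kind() instead of guessing what the payload
contains.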
Differential Revision: https://phab.mercurial-scm.org/D4213
Gregory Szorc <gregory.szorc@gmail.com> [Wed, 08 Aug 2018 15:28:22 -0700] rev 39053
changegroup: minor cleanups to deltagroup()
Differential Revision: https://phab.mercurial-scm.org/D4212
Gregory Szorc <gregory.szorc@gmail.com> [Wed, 08 Aug 2018 14:44:48 -0700] rev 39052
changegroup: emit revisiondelta instances from deltagroup()
By abstracting the concept of a delta group away from its
serialization (the changegroup format), we make it easier
to establish alternate serialization formats. We also make
it possible to move aspects of delta group generation into
the storage layer. This will allow storage to make decisions
about e.g. delta parent choices without the changegroup code
needing storage APIs to determine delta parents. We're still
a bit of a way from there. Future commits will work towards
that world.
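A minimal sketch of the separation described above, assuming a made-up Delta
record and a made-up length-prefixed framing (this is not the changegroup
wire format, and these are not the real function signatures):

```python
import struct
from typing import Iterable, Iterator, NamedTuple, Tuple


class Delta(NamedTuple):
    """Hypothetical data structure emitted by the generation step."""

    node: bytes
    payload: bytes


def deltagroup(revisions: Iterable[Tuple[bytes, bytes]]) -> Iterator[Delta]:
    """Emit data structures; no serialization decisions are made here."""
    for node, payload in revisions:
        yield Delta(node, payload)


def serialize(deltas: Iterable[Delta]) -> Iterator[bytes]:
    """One possible serialization: 4-byte big-endian length, then the body."""
    for d in deltas:
        body = d.node + d.payload
        yield struct.pack(">I", len(body)) + body
```

Because deltagroup() knows nothing about framing, a second serializer could
consume the same records without touching the generation code.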
Differential Revision: https://phab.mercurial-scm.org/D4211
Gregory Szorc <gregory.szorc@gmail.com> [Wed, 08 Aug 2018 14:33:33 -0700] rev 39051
changegroup: move file chunk emission to generate()
Same deal as manifests. We want to get to a point where we can
emit data structures from deltagroup() and derive the raw
changegroup data as late as possible.
Differential Revision: https://phab.mercurial-scm.org/D4210
Gregory Szorc <gregory.szorc@gmail.com> [Wed, 08 Aug 2018 15:14:59 -0700] rev 39050
changegroup: move manifest chunk emission to generate()
We want to get to a point where we can emit data structures from
deltagroup() and derive the raw changegroup data as late as possible.
Differential Revision: https://phab.mercurial-scm.org/D4209
Gregory Szorc <gregory.szorc@gmail.com> [Wed, 08 Aug 2018 15:09:12 -0700] rev 39049
changegroup: move size tracking and end of manifests to generate()
Preparing for all the generate* functions to emit data structures
instead of raw chunks.
Differential Revision: https://phab.mercurial-scm.org/D4208
Gregory Szorc <gregory.szorc@gmail.com> [Wed, 08 Aug 2018 14:15:33 -0700] rev 39048
changegroup: emit delta group close chunk outside of deltagroup()
I want to make deltagroup() emit data structures rather than
serialized deltas. Upcoming commits will demonstrate why.
Differential Revision: https://phab.mercurial-scm.org/D4207
Gregory Szorc <gregory.szorc@gmail.com> [Wed, 08 Aug 2018 14:19:02 -0700] rev 39047
changegroup: extract cgpacker.group() to standalone function
It doesn't need to be part of the packer class.
Differential Revision: https://phab.mercurial-scm.org/D4206
Gregory Szorc <gregory.szorc@gmail.com> [Wed, 08 Aug 2018 14:02:31 -0700] rev 39046
changegroup: pass all state into group()
This will allow us to split it into a standalone function.
Differential Revision: https://phab.mercurial-scm.org/D4205
Gregory Szorc <gregory.szorc@gmail.com> [Wed, 08 Aug 2018 13:50:54 -0700] rev 39045
changegroup: inline _prune() into call sites
The functionality is pretty simple. As a bonus, _prune() had special
code for the manifest case. We can now exclude this check from the
file call site.
Differential Revision: https://phab.mercurial-scm.org/D4199
Gregory Szorc <gregory.szorc@gmail.com> [Tue, 07 Aug 2018 15:31:03 -0700] rev 39044
changegroup: inline _packmanifests() into generatemanifests()
It is relatively small. Every other generate*() calls group()
directly. So the new code is consistent.
Differential Revision: https://phab.mercurial-scm.org/D4198
Gregory Szorc <gregory.szorc@gmail.com> [Tue, 07 Aug 2018 15:13:25 -0700] rev 39043
changegroup: invert conditional and dedent
I don't like seeing code that visually resembles the pyramid of
doom.
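For readers unfamiliar with the pattern, here is a generic sketch of
inverting a conditional to dedent code. The function and key names are
invented for illustration and are unrelated to the changegroup code:

```python
def describe_nested(item):
    """Nested form: each condition pushes the real work further right."""
    if item is not None:
        if item.get("enabled"):
            return "process " + item["name"]
    return "skip"


def describe_flat(item):
    """Inverted conditionals: bail out early, keep the main path dedented."""
    if item is None:
        return "skip"
    if not item.get("enabled"):
        return "skip"
    return "process " + item["name"]
```

Both functions return the same result for every input; only the shape of
the code differs.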
Differential Revision: https://phab.mercurial-scm.org/D4197
Gregory Szorc <gregory.szorc@gmail.com> [Tue, 07 Aug 2018 15:10:38 -0700] rev 39042
changegroup: make _revisiondeltanarrow() a standalone function
It doesn't require any state on the packer. Everything impacting
behavior is passed in as a function. So split it out, just like
what was done for _revisiondeltanormal().
Differential Revision: https://phab.mercurial-scm.org/D4196
Gregory Szorc <gregory.szorc@gmail.com> [Tue, 07 Aug 2018 15:08:29 -0700] rev 39041
changegroup: pass state into _revisiondeltanarrow
After this, the method no longer accesses self and can be split
into a standalone function.
Differential Revision: https://phab.mercurial-scm.org/D4195
Gregory Szorc <gregory.szorc@gmail.com> [Tue, 07 Aug 2018 14:53:42 -0700] rev 39040
changegroup: inline _close()
Now that it doesn't clear self._clrevtolocalrev on every invocation
and is a simple one-liner that calls another function, we can
do away with this method and inline its content into all call
sites.
Differential Revision: https://phab.mercurial-scm.org/D4194
Gregory Szorc <gregory.szorc@gmail.com> [Tue, 07 Aug 2018 14:52:00 -0700] rev 39039
changegroup: pass clrevtolocalrev to each group
clrevtolocalrev is a per-group mapping of revisions within a
changegroup, used to aid shallow clone.
Back when this functionality was implemented in an extension, this
dict was added to the packer instance so monkeypatched functions
could reference it there. Now that this code is part of core, we
can pass the dict to each consumer properly so it doesn't have to
live on the cgpacker instance. This commit does that.
Differential Revision: https://phab.mercurial-scm.org/D4193
Gregory Szorc <gregory.szorc@gmail.com> [Tue, 07 Aug 2018 12:44:56 -0700] rev 39038
changegroup: combine _generatefiles() into generatefiles()
These were split out in a06aab274aef as part of moving the
narrow code into core. They don't need to be separate
functions.
Differential Revision: https://phab.mercurial-scm.org/D4192
Gregory Szorc <gregory.szorc@gmail.com> [Tue, 07 Aug 2018 15:45:56 -0700] rev 39037
changegroup: define linknodes callbacks in generatefiles()
This is how it is done everywhere else.
But the logic here is a bit more complex because shallow clone
needs to reference the original linknode implementation. But
at least now all function implementations are defined in the
same place.
Differential Revision: https://phab.mercurial-scm.org/D4191
Gregory Szorc <gregory.szorc@gmail.com> [Tue, 07 Aug 2018 10:55:32 -0700] rev 39036
changegroup: track changelog to manifest revision map explicitly
Previously, self._nextclrevtolocalrev was only populated as part
of the changelog lookup callback. But cgpacker._close() was looking
at self._nextclrevtolocalrev on every invocation.
Since self._nextclrevtolocalrev is for communicating the mapping
of changelog revisions to manifest revisions, this commit refactors
the code to make that explicit.
The changelog state now stores this mapping. And after the changelog
group is emitted, we update self._clrevtolocalrev with that dict.
self._nextclrevtolocalrev is unused and has been deleted.
Differential Revision: https://phab.mercurial-scm.org/D4190
Gregory Szorc <gregory.szorc@gmail.com> [Tue, 07 Aug 2018 10:49:41 -0700] rev 39035
changegroup: remove _clnodetorev
cgpacker._clnodetorev is a glorified cache/index of changelog
nodes to revision numbers.
I'm not sure why it exists. Maybe performance? But its presence
is making refactoring of this code more complicated than it needs
to be.
This commit removes the cache and replaces it with direct lookups
against the changelog.
If this cache was for performance reasons, we should be able to
restore it easily enough... after the changegroup refactor is
complete.
Differential Revision: https://phab.mercurial-scm.org/D4189
Gregory Szorc <gregory.szorc@gmail.com> [Mon, 06 Aug 2018 15:44:33 -0700] rev 39034
changegroup: rename _fullnodes to _fullclnodes
So it is obvious which nodes we are talking about.
And sneak in a docs change to reflect that this variable is a set.
Differential Revision: https://phab.mercurial-scm.org/D4188
Gregory Szorc <gregory.szorc@gmail.com> [Mon, 06 Aug 2018 15:04:20 -0700] rev 39033
changegroup: move part of _revisiondeltanarrow into group()
Now all the logic for determining which delta generation code
is called lives in a single function.
Differential Revision: https://phab.mercurial-scm.org/D4187
Gregory Szorc <gregory.szorc@gmail.com> [Mon, 06 Aug 2018 14:56:37 -0700] rev 39032
changegroup: populate _clnodetorev as part of changelog linknode lookup
The thing that matters is that self._clnodetorev is populated with
changesets that are being sent. Back when this code was in an
extension, it wasn't possible to monkeypatch the changelog lookup
function. Now that the code is in core, we can move this code to
where it logically belongs.
Differential Revision: https://phab.mercurial-scm.org/D4186
Augie Fackler <augie@google.com> [Thu, 09 Aug 2018 13:08:29 -0400] rev 39031
tests: rename variables in revlog index parse test for clarity
Now it's unambiguous which one is the expected value. c_res_{1,2} was
also misleading a bit because in --pure mode we're testing the old
slow Python version against the modern optimized Python version.
Differential Revision: https://phab.mercurial-scm.org/D4180
Augie Fackler <augie@google.com> [Thu, 09 Aug 2018 13:06:50 -0400] rev 39030
tests: move assertion closer to want/got declarations in test-parseindex2.py
I find this easier to understand.
Differential Revision: https://phab.mercurial-scm.org/D4179
Augie Fackler <augie@google.com> [Thu, 09 Aug 2018 13:05:40 -0400] rev 39029
tests: move chunks of test-parseindex2.py to use unittest properly
This doesn't touch the version-detection tests yet, because those are
more involved.
Differential Revision: https://phab.mercurial-scm.org/D4178
Augie Fackler <augie@google.com> [Thu, 09 Aug 2018 12:59:23 -0400] rev 39028
tests: fix up indent width in test-parseindex2.py
Differential Revision: https://phab.mercurial-scm.org/D4177
Augie Fackler <augie@google.com> [Thu, 09 Aug 2018 12:58:25 -0400] rev 39027
tests: start moving test-parseindex2.py to a unittest
Using 2-space indents in this revision to make the code motion easier
to review. I'll fix it in the next commit.
Differential Revision: https://phab.mercurial-scm.org/D4176
Augie Fackler <augie@google.com> [Thu, 09 Aug 2018 12:10:34 -0400] rev 39026
tests: port test-absorb-filefixupstate to Python 3
Mostly b prefixes, but also some isinstance() checks and a couple of
maplist() instances. The test now passes on Python 3.
Differential Revision: https://phab.mercurial-scm.org/D4175
Augie Fackler <augie@google.com> [Thu, 09 Aug 2018 12:06:31 -0400] rev 39025
absorb: port partway to Python 3
Use pycompat.maplist() in the one place that matters and use the
default iterator of a dict instead of iterkeys().
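The two porting idioms mentioned above can be sketched as follows. The
maplist() defined here is a local stand-in written for this example, not
Mercurial's pycompat helper, and the sample data is invented:

```python
def maplist(func, *iterables):
    """Local stand-in: a map() that always returns a list.

    On Python 3, map() returns a lazy iterator, so code that indexes or
    reuses the result needs an explicit list.
    """
    return list(map(func, *iterables))


counts = {"a": 1, "b": 2}

# Iterating a dict directly yields its keys on both Python 2 and Python 3,
# whereas dict.iterkeys() only exists on Python 2.
keys = sorted(k for k in counts)

lengths = maplist(len, ["x", "yy"])
```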
Two new tests pass on Python 3.
Differential Revision: https://phab.mercurial-scm.org/D4174
Valentin Gatien-Baron <valentin.gatienbaron@gmail.com> [Sat, 04 Aug 2018 21:31:46 -0400] rev 39024
localrepo: better error when a repo exists but we lack permissions
Claiming "repository foo not found" when the repository does exist
causes confusion regularly ("where is the typo?").
Differential Revision: https://phab.mercurial-scm.org/D4122
Gregory Szorc <gregory.szorc@gmail.com> [Mon, 06 Aug 2018 11:32:16 -0700] rev 39023
changegroup: extract _revisiondeltanormal() to standalone function
It wasn't accessing anything important on the cgpacker that warranted
it being a method instead of a function.
Differential Revision: https://phab.mercurial-scm.org/D4142
Gregory Szorc <gregory.szorc@gmail.com> [Mon, 06 Aug 2018 11:13:25 -0700] rev 39022
changegroup: inline _revchunk() into group()
_revchunk() was pretty minimal. I think having all the code for
generating data composing the changegroup in one function makes
things easier to understand.
As part of the refactor, we now call the _revisiondelta* functions
explicitly. This paves the road to refactor their argument
signatures.
Differential Revision: https://phab.mercurial-scm.org/D4141
Gregory Szorc <gregory.szorc@gmail.com> [Mon, 06 Aug 2018 11:06:22 -0700] rev 39021
changegroup: pass mfdicts properly
With the narrow code part of core, the hacky pass-argument-via-
attribute-on-self can be accomplished with a regular function
argument.
Differential Revision: https://phab.mercurial-scm.org/D4140
Gregory Szorc <gregory.szorc@gmail.com> [Mon, 06 Aug 2018 11:33:05 -0700] rev 39020
changegroup: pass sorted revisions into group() (API)
Currently, group() receives a list of nodes and calls _sortgroup()
to sort them and turn them into revs. Since the sorting behavior
varies depending on the type of data being transferred, I think it
makes sense to perform the sorting before group() is invoked.
This commit extracts _sortgroup() to a pair of standalone functions.
It then moves the calling of these functions to the 3 call sites of
group(). group() now receives an iterable of revs instead of nodes.
Differential Revision: https://phab.mercurial-scm.org/D4139
Gregory Szorc <gregory.szorc@gmail.com> [Fri, 03 Aug 2018 18:40:41 -0700] rev 39019
changegroup: pull _fileheader out of cgpacker
It doesn't need any state from the packer.
Differential Revision: https://phab.mercurial-scm.org/D4138
Gregory Szorc <gregory.szorc@gmail.com> [Mon, 06 Aug 2018 09:26:02 -0700] rev 39018
changegroup: factor changelogdone into an argument
The variable was basically tracking whether the current operation
is being performed against the changelog or something else. So
let's just pass such a flag to everything that needs to access it.
I'm still not a huge fan of building changelog awareness into
low-level functions like revision delta generation. But passing
an argument is strictly better than state on the packer instance.
Differential Revision: https://phab.mercurial-scm.org/D4137
Gregory Szorc <gregory.szorc@gmail.com> [Fri, 03 Aug 2018 18:31:00 -0700] rev 39017
changegroup: record changelogdone after fully consuming its data
Setting this as a side-effect of calling _close() is wonky. There's
only one group for changelog data. So we can wait until after all
data has been emitted before recording it.
Differential Revision: https://phab.mercurial-scm.org/D4136
Gregory Szorc <gregory.szorc@gmail.com> [Mon, 06 Aug 2018 09:24:35 -0700] rev 39016
changegroup: key off changelogdone
We use self._changelogdone for similar checks. Let's make things
consistent.
Differential Revision: https://phab.mercurial-scm.org/D4135
Gregory Szorc <gregory.szorc@gmail.com> [Mon, 06 Aug 2018 10:43:05 -0700] rev 39015
perf: call _generatechangelog() instead of group()
Now that we have a separate function for generating just the changelog
bits, the perf command should call it so it gets more accurate
behavior.
This changes the results of this command on my hg repo significantly:
! wall 1.390502 comb 1.390000 user 1.370000 sys 0.020000 (best of 8)
! wall 1.768750 comb 1.760000 user 1.760000 sys 0.000000 (best of 6)
Profiling seems to reveal that ~20% of execution time is spent in
progress bar accounting and printing! If we run with
progress.disable=true:
! wall 1.639134 comb 1.650000 user 1.630000 sys 0.020000 (best of 7)
A nice speedup. But profiling still shows a good chunk of time being
spent in progress bar accounting code. The reason is that the
progress bar is conditionally enabled via an argument to
cgpacker.group(). The previous code in perf.py calling into group()
did not enable the progress bar but _generatechangelog() always does.
I think it is important for the perf* commands to capture real-world
use cases. And this code always runs with an active progress bar. So
the regression is acceptable.
That being said, terminal printing performance can vary substantially.
I don't think perf* commands should test terminal printing unless
explicitly desired. So I've disabled progress bar printing in this
command.
Differential Revision: https://phab.mercurial-scm.org/D4134
Gregory Szorc <gregory.szorc@gmail.com> [Fri, 03 Aug 2018 17:59:56 -0700] rev 39014
changegroup: factor changelog chunk generation into own function
We have separate functions for generating manifests and filelogs.
Let's split changelog into its own function so things are consistent.
As part of this, we refactor the code slightly. Before, the
changelog linknode callback was updating state on variables
inherited via a closure. Since the closure is now separate from
generate(), we need a way to pass state between generate() and
_generatechangelog(). The return value of _generatechangelog()
is a 2-tuple where the first item is a dict containing accumulated
state. We then alias some of its members into the scope of
generate() to reduce code churn.
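A simplified sketch of this pattern follows. Only the "dict of accumulated
state as the first tuple item" detail comes from the description above; the
second item being a chunk iterator, the state key, and the chunk contents
are assumptions made for the example:

```python
def generate_changelog(nodes):
    """Return (state, chunks); state fills in as chunks are consumed."""
    state = {"seen": []}

    def chunks():
        for node in nodes:
            # The callback-like side effect now writes into the shared dict
            # instead of a variable captured from an enclosing generate().
            state["seen"].append(node)
            yield b"chunk:" + node

    return state, chunks()


def generate(nodes):
    state, chunks = generate_changelog(nodes)
    for chunk in chunks:
        yield chunk
    # Alias a state member into the local scope once the data is consumed.
    seen = state["seen"]
    yield b"count:%d" % len(seen)
```

Note that the state dict is only fully populated after the chunk iterator
has been exhausted.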
I will be converting other functions to a similar pattern in future
commits.
Differential Revision: https://phab.mercurial-scm.org/D4133
Gregory Szorc <gregory.szorc@gmail.com> [Fri, 03 Aug 2018 14:16:14 -0700] rev 39013
changegroup: pass function to resolve delta parents into constructor
Previously, _deltaparent() encapsulated the logic for all 3
delta parent modes of operation. The choice of delta parent
is static for the lifetime of the packer and can be passed into
the packer as a callable. So do that.
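The shape of this change can be sketched as below. The Packer class, the
callable's signature, and the two strategies shown are hypothetical and do
not correspond to the three real delta parent modes:

```python
class Packer:
    """Hypothetical packer that receives its delta parent policy up front."""

    def __init__(self, deltaparentfn):
        # The choice is fixed for the lifetime of the packer.
        self._deltaparentfn = deltaparentfn

    def deltabase(self, rev, p1, prev):
        return self._deltaparentfn(rev, p1, prev)


def delta_against_prev(rev, p1, prev):
    """Illustrative policy: always delta against the previously sent rev."""
    return prev


def delta_against_p1(rev, p1, prev):
    """Illustrative policy: always delta against the first parent."""
    return p1
```

The caller picks a policy once, e.g. Packer(delta_against_p1), so the packer
no longer needs a method that switches on its own mode.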
Differential Revision: https://phab.mercurial-scm.org/D4132
Gregory Szorc <gregory.szorc@gmail.com> [Tue, 07 Aug 2018 10:24:49 -0700] rev 39012
changegroup: restore original behavior of _nextclrevtolocalrev
0548f696795b accidentally changed the behavior of cgpacker._close().
The old behavior moved _nextclrevtolocalrev to _clrevtolocalrev only
when _nextclrevtolocalrev was present and then removed
_nextclrevtolocalrev. The bad behavior performed this move
then cleared _clrevtolocalrev because it was the same object as
_nextclrevtolocalrev.
This commit restores the previous behavior.
Surprisingly, no tests changed as a result of this bad logic. I'm
not sure why.
Differential Revision: https://phab.mercurial-scm.org/D4155
Augie Fackler <augie@google.com> [Thu, 09 Aug 2018 12:03:39 -0400] rev 39011
py3: whitelist another test caught by the ratchet
Differential Revision: https://phab.mercurial-scm.org/D4173
Augie Fackler <augie@google.com> [Thu, 09 Aug 2018 11:56:24 -0400] rev 39010
debugcommands: force import of fileset in debugfileset
It looks like Python 3's lazy importer is better than Python 2's for
this command, and as a result we had no symbols in the filesetlang
symbol table, which resulted in some really mysterious test-fileset.t
failures around withstatus optimizations. Inserting this explicit
import and forcing its evaluation fixes the test failure.
Differential Revision: https://phab.mercurial-scm.org/D4172
Jun Wu <quark@fb.com> [Tue, 07 Aug 2018 17:22:33 -0700] rev 39009
linelog: optimize replacelines
The optimization to avoid calling `annotate` inside `replacelines` is significant
for practical use patterns.
Before this patch:
hg perflinelogedits
! wall 6.778478 comb 6.710000 user 6.700000 sys 0.010000 (best of 3)
After this patch:
hg perflinelogedits
! wall 0.136573 comb 0.140000 user 0.130000 sys 0.010000 (best of 63)
Differential Revision: https://phab.mercurial-scm.org/D4150
Jun Wu <quark@fb.com> [Tue, 07 Aug 2018 17:17:01 -0700] rev 39008
linelog: extract `len(self._program)` to a local function
This is a micro optimization prepared for following changes where
`len(self._program)` is used in a loop.
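A generic sketch of this kind of micro optimization, assuming a made-up
class around a list named _program (the real linelog code is not shown
here):

```python
class Program:
    """Hypothetical container; only the _program name comes from the commit."""

    def __init__(self):
        self._program = [0]

    def pad_to(self, size):
        # Bind the length lookup and append once, outside the loop, so each
        # iteration avoids repeated attribute lookups on self.
        proglen = self._program.__len__
        append = self._program.append
        while proglen() < size:
            append(proglen())
        return self._program
```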
Differential Revision: https://phab.mercurial-scm.org/D4149
Jun Wu <quark@fb.com> [Mon, 06 Aug 2018 18:56:24 -0700] rev 39007
perf: add a command to benchmark linelog edits
A linelog is usually created by calling "replacelines" multiple times.
Add a command to benchmark that pattern.
Differential Revision: https://phab.mercurial-scm.org/D4148
Jun Wu <quark@fb.com> [Mon, 06 Aug 2018 18:56:24 -0700] rev 39006
linelog: update internal help text
This clarifies the details asked by @martinvonz on D3990.
Differential Revision: https://phab.mercurial-scm.org/D4147
Danny Hooper <hooper@google.com> [Tue, 07 Aug 2018 21:15:27 -0700] rev 39005
fix: determine fixer tool failure by exit code instead of stderr
This seems like the more natural thing, and it probably should have been this
way to begin with. It is more flexible because it allows tools to emit
diagnostic information while also modifying a file. An example would be an
automatic code formatter that also prints any remaining lint issues.
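The difference can be sketched with a small helper. run_fixer() and its
return convention are invented for this example and are not the fix
extension's API:

```python
import subprocess
import sys


def run_fixer(command, data):
    """Run a fixer tool on data; return (fixed_or_None, diagnostics).

    Failure is decided by the exit code alone, so a tool may write
    diagnostics to stderr and still have its output accepted.
    """
    proc = subprocess.run(
        command,
        input=data,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
    )
    if proc.returncode != 0:
        return None, proc.stderr
    return proc.stdout, proc.stderr
```

With this convention, a formatter that rewrites the file and also prints
remaining lint issues exits 0 and both outputs are kept.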
Differential Revision: https://phab.mercurial-scm.org/D4158
Pulkit Goyal <pulkit@yandex-team.ru> [Thu, 09 Aug 2018 13:13:09 +0300] rev 39004
status: advertise --abort instead of 'update -C .' to abort graft
A recent release added an --abort flag to the 'hg graft' command, which is a
nicer UI. We should advertise that to stop the graft instead of
'update -C .', which is kind of ugly.
Differential Revision: https://phab.mercurial-scm.org/D4169
Pulkit Goyal <pulkit@yandex-team.ru> [Thu, 09 Aug 2018 12:32:11 +0300] rev 39003
status: advertise --abort instead of 'update -C .' to abort a merge
status has a part where it shows the conflict information and how to continue
or abort. A couple of releases ago, we got merge --abort, and we should
advertise that instead of 'hg update -C .', which is kind of ugly.
I know we need to unify the logic here.
Differential Revision: https://phab.mercurial-scm.org/D4168
Pulkit Goyal <pulkit@yandex-team.ru> [Thu, 09 Aug 2018 12:20:28 +0300] rev 39002
narrow: add '()' to ellipsis in the revset help
ellipsis is a revset function and was missing () after its name in the help
text. This might confuse users as they try `hg log -r 'ellipsis'`.
Differential Revision: https://phab.mercurial-scm.org/D4167
Augie Fackler <augie@google.com> [Thu, 09 Aug 2018 10:11:10 -0400] rev 39001
tests: make all the string constants in test-match.py be bytes
Done with
python3 contrib/byteify-strings.py tests/test-match.py -i
# skip-blame just bytes prefixes
Differential Revision: https://phab.mercurial-scm.org/D4171
Augie Fackler <augie@google.com> [Thu, 09 Aug 2018 10:10:09 -0400] rev 39000
linelog: fix bytes/str issue in exception raise on Python 3
Differential Revision: https://phab.mercurial-scm.org/D4170
David Demelier <markand@malikania.fr> [Thu, 09 Aug 2018 13:13:00 +0200] rev 38999
absorb: following UI conventions
https://www.mercurial-scm.org/wiki/UIGuideline#adding_new_options
Sangeet Kumar Mishra <mail2sangeetmishra@gmail.com> [Wed, 08 Aug 2018 19:29:02 +0530] rev 38998
grep: search all commits in allfiles mode
All the commits are added to the 'wanted' set when allfiles mode is enabled.
Differential Revision: https://phab.mercurial-scm.org/D4157
Kyle Lippincott <spectral@google.com> [Wed, 08 Aug 2018 17:07:27 -0700] rev 38997
dirstate: add comment on why we don't need to check if something is a dir/file
Differential Revision: https://phab.mercurial-scm.org/D4161
Kyle Lippincott <spectral@google.com> [Wed, 08 Aug 2018 17:03:05 -0700] rev 38996
match: add missing "return set()", add FIXME to test to doc a bug
These were both brought up during the codereview of D4130.
Differential Revision: https://phab.mercurial-scm.org/D4160
Kyle Lippincott <spectral@google.com> [Wed, 08 Aug 2018 16:53:17 -0700] rev 38995
match: correct doc for _rootsdirsandparents after 5a7df82de142
Differential Revision: https://phab.mercurial-scm.org/D4159
Kyle Lippincott <spectral@google.com> [Tue, 31 Jul 2018 16:47:43 -0700] rev 38994
dirstate: use visitchildrenset in traverse
This speeds up `hg status` a fair amount when there is a very large directory
and narrow is in use.
Timing numbers according to command:
hyperfine --warmup 1 'hg status'
HGRCPATH points to a file with the following contents:
[extensions]
narrow =
mozilla-unified (called m-u below) was at revision #468856.
regular hash: eb39298e432d
treemanifests hash: 0553b7f29eaf
large-dir-repo (called l-d-r below) was generated with the following script:
#!/bin/bash
hg init large-dir-repo
mkdir -p large-dir-repo/third_party/rust/log
touch large-dir-repo/third_party/rust/log/foo.txt
for i in $(seq 1 30000); do
  d=$(mktemp -d large-dir-repo/third_party/XXXXXXXXX)
  touch $d/file.txt
done
hg -R large-dir-repo ci -Am 'rev0' --user test --date '0 0'
For repos that use narrow, the narrowspec was this:
[includes]
rootfilesin:third_party/rust/log
[excludes]
This narrowspec was chosen due to the size of the third_party/rust directory;
this directory was *not* modified in revision #468856 in mozilla-unified.
Importantly, when using narrow, these repos had everything checked out (in the
case of large-dir-repo, that means all 30,001 directories), *before* adding the
narrowspec. This is to simulate the behavior when using a virtual filesystem
that shows everything for the user even if they haven't added it to the
narrowspec yet. This is not a supported configuration, and `hg update` will not
really do the "correct" thing, but non-mutating commands should behave
correctly.
There are two repos below that do not follow the setup above, 'citc1' and
'citc2', which are using a virtual filesystem and can not be reproduced
upstream; these numbers are here mostly to indicate that these performance
improvements are not hypothetical, and show the benefits we're hoping to achieve
on our real workloads. 'citc1' is closest to large-dir-repo with one of our
pathological cases, 'citc2' is an arbitrary repo and closer to "average".
I'm not claiming anything less than a 5% speed win as improvements due to this
change; these are probably either measurement artifacts or constant time
improvements. The numbers that aren't changing are shown primarily to prove that
this doesn't make anything worse in any case I plan on testing during this
series.
'before' is hg from commit c83ad576. 'N' indicates narrow in use, 'T' indicates
treemanifest in use.
hg status:
repo | N | T | before (mean +- stdev) | after (mean +- stdev) | % of before
------+---+---+------------------------+-----------------------+------------
m-u | | | 2.284 s +- 0.022 s | 2.274 s +- 0.021 s | 99.6%
m-u | | x | 2.289 s +- 0.008 s | 2.284 s +- 0.028 s | 99.8%
m-u | x | | 430.8 ms +- 3.1 ms | 424.5 ms +- 3.2 ms | 98.5%
m-u | x | x | 429.8 ms +- 2.5 ms | 425.8 ms +- 3.7 ms | 99.1%
l-d-r | | | 681.3 ms +- 5.5 ms | 689.6 ms +- 8.0 ms | 101.2%
l-d-r | | x | 666.8 ms +- 21.8 ms | 672.5 ms +- 14.9 ms | 100.9%
l-d-r | x | | 282.6 ms +- 1.8 ms | 203.0 ms +- 1.2 ms | 71.8% <--
l-d-r | x | x | 275.2 ms +- 3.9 ms | 199.3 ms +- 3.5 ms | 72.4% <--
citc1 | x | x | 1.023 s +- 0.011 s | 398.6 ms +- 9.2 ms | 39.0% <--
citc2 | x | x | 297.9 ms +- 4.4 ms | 289.6 ms +- 4.2 ms | 97.2%
hg status --change .:
repo | N | T | before (mean +- stdev) | after (mean +- stdev) | % of before
------+---+---+------------------------+-----------------------+------------
m-u | | | 478.2 ms +- 2.0 ms | 476.9 ms +- 3.7 ms | 99.7%
m-u | | x | 169.5 ms +- 2.7 ms | 169.5 ms +- 2.5 ms | 100.0%
m-u | x | | 477.0 ms +- 2.4 ms | 476.1 ms +- 1.4 ms | 99.8%
m-u | x | x | 124.7 ms +- 1.9 ms | 124.2 ms +- 3.3 ms | 99.6%
l-d-r | | | 97.4 ms +- 1.2 ms | 96.5 ms +- 1.2 ms | 99.1%
l-d-r | | x | 4.778 s +- 0.018 s | 4.774 s +- 0.011 s | 99.9%
l-d-r | x | | 99.9 ms +- 1.1 ms | 98.8 ms +- 1.3 ms | 98.9%
l-d-r | x | x | 848.7 ms +- 7.1 ms | 849.4 ms +- 6.5 ms | 100.1%
citc1 | x | x | 4.250 s +- 0.051 s | 4.283 s +- 0.042 s | 100.8%
citc2 | x | x | 341.5 ms +- 4.7 ms | 341.5 ms +- 4.1 ms | 100.0%
hg update $rev^; hg update $rev:
repo | N | T | before (mean +- stdev) | after (mean +- stdev) | % of before
------+---+---+------------------------+-----------------------+------------
m-u | | | 4.357 s +- 0.032 s | 4.312 s +- 0.093 s | 99.0%
m-u | | x | 3.599 s +- 0.061 s | 3.592 s +- 0.071 s | 99.8%
m-u | x | | 1.815 s +- 0.012 s | 1.816 s +- 0.013 s | 100.1%
m-u | x | x | 1.110 s +- 0.009 s | 1.106 s +- 0.005 s | 99.6%
l-d-r | | | 527.1 ms +- 7.8 ms | 523.3 ms +- 6.5 ms | 99.3%
l-d-r | | x | 8.835 s +- 0.067 s | 8.825 s +- 0.064 s | 99.9%
l-d-r | x | | 313.0 ms +- 2.2 ms | 312.1 ms +- 1.2 ms | 99.7%
l-d-r | x | x | 1.780 s +- 0.011 s | 1.799 s +- 0.013 s | 101.1%
citc1 | x | x | 6.825 s +- 0.262 s | 6.707 s +- 0.353 s | 98.3%
citc2 | x | x | 776.4 ms +- 4.5 ms | 781.3 ms +- 6.3 ms | 100.6%
hg diff:
repo | N | T | before (mean +- stdev) | after (mean +- stdev) | % of before
------+---+---+------------------------+-----------------------+------------
m-u | | | 1.519 s +- 0.015 s | 1.525 s +- 0.017 s | 100.4%
m-u | | x | 1.512 s +- 0.010 s | 1.517 s +- 0.027 s | 100.3%
m-u | x | | 420.0 ms +- 3.2 ms | 417.1 ms +- 1.9 ms | 99.3%
m-u | x | x | 415.0 ms +- 3.8 ms | 415.7 ms +- 2.7 ms | 100.2%
l-d-r | | | 220.8 ms +- 4.0 ms | 220.8 ms +- 3.7 ms | 100.0%
l-d-r | | x | 216.6 ms +- 7.5 ms | 211.4 ms +- 2.1 ms | 97.6%
l-d-r | x | | 111.9 ms +- 1.8 ms | 112.0 ms +- 1.5 ms | 100.1%
l-d-r | x | x | 111.4 ms +- 1.4 ms | 110.2 ms +- 1.0 ms | 98.9%
citc1 | x | x | 268.7 ms +- 2.3 ms | 269.6 ms +- 2.8 ms | 100.3%
citc2 | x | x | 273.5 ms +- 5.5 ms | 273.9 ms +- 3.7 ms | 100.1%
hg diff -c .:
repo | N | T | before (mean +- stdev) | after (mean +- stdev) | % of before
------+---+---+------------------------+-----------------------+------------
m-u | | | 497.1 ms +- 1.4 ms | 500.1 ms +- 2.4 ms | 100.6%
m-u | | x | 195.3 ms +- 13.2 ms | 191.6 ms +- 3.0 ms | 98.1%
m-u | x | | 476.8 ms +- 1.9 ms | 476.7 ms +- 2.3 ms | 100.0%
m-u | x | x | 122.8 ms +- 2.1 ms | 122.9 ms +- 2.0 ms | 100.1%
l-d-r | | | 99.3 ms +- 2.3 ms | 98.8 ms +- 1.7 ms | 99.5%
l-d-r | | x | 4.875 s +- 0.041 s | 4.847 s +- 0.038 s | 99.4%
l-d-r | x | | 98.5 ms +- 1.2 ms | 98.9 ms +- 1.3 ms | 100.4%
l-d-r | x | x | 864.6 ms +- 7.4 ms | 855.4 ms +- 6.6 ms | 98.9%
citc1 | x | x | 4.505 s +- 0.060 s | 4.466 s +- 0.036 s | 99.1%
citc2 | x | x | 368.0 ms +- 4.0 ms | 365.5 ms +- 6.3 ms | 99.3%
Differential Revision: https://phab.mercurial-scm.org/D4131
spectral <spectral@google.com> [Mon, 06 Aug 2018 12:52:33 -0700] rev 38993
match: add visitchildrenset complement to visitdir
`visitdir(d)` lets a caller query whether the directory is part of the matcher.
It can receive a response of 'all' (yes, and all children, you can stop calling
visitdir now), False (no, and no children either), or True (yes, either
something in this directory or a child is part of the matcher).
`visitchildrenset(d)` augments that: instead of returning True, it returns a
list of items to actually investigate. With this, code can be modified from:
  for f in self.all_items:
    if match.visitdir(self.dir + '/' + f):
      <do stuff>
to be:
  for f in self.all_items.intersect(match.visitchildrenset(self.dir)):
    <do stuff>
Use of this function can provide significant performance improvements,
especially when using narrow (so that the matcher is much smaller than the stuff
we see on disk) and/or treemanifests (so that we can avoid loading manifests for
trees that aren't part of the matcher).
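To make the call-count difference concrete, here is a toy sketch. ToyMatcher
is a hypothetical stand-in that only mimics the two method names described
above (and only their True/False and set-of-children results); it is not
Mercurial's matcher:

```python
class ToyMatcher:
    """Hypothetical matcher that wants a fixed set of children per directory."""

    def __init__(self, wanted):
        self._wanted = wanted  # dict: directory -> set of child names
        self.calls = 0

    def visitdir(self, path):
        self.calls += 1
        parent, _, name = path.rpartition("/")
        return name in self._wanted.get(parent, set())

    def visitchildrenset(self, d):
        self.calls += 1
        return set(self._wanted.get(d, set()))


def walk_with_visitdir(match, d, children):
    # One matcher call per child on disk.
    return sorted(c for c in children if match.visitdir(d + "/" + c))


def walk_with_childrenset(match, d, children):
    # One matcher call in total, then a set intersection.
    return sorted(children & match.visitchildrenset(d))
```

With 1,001 children on disk and one wanted child, the first walk makes 1,001
matcher calls and the second makes one, while both return the same result.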
Differential Revision: https://phab.mercurial-scm.org/D4130
spectral <spectral@google.com> [Mon, 06 Aug 2018 12:52:22 -0700] rev 38992
includematcher: separate "parents" from "dirs"
A future patch will make use of this separation so that we can make more
intelligent decisions about what to investigate/load when the matcher is in use.
Currently, even with this patch, we typically use the 'visitdir' call to identify if
we can skip some directory, something along the lines of:
  for f in all_items:
    if match.visitdir(f):
      <do stuff>
This can be slower than we'd like if there are a lot of items; it requires N
calls to match.visitdir in the best case. Commonly, especially with 'narrow',
we have a situation where we do some work for the directory, possibly just
loading it from disk (when using treemanifests) and then check if we should be
interacting with it at all, which can be a huge slowdown in some pathological
cases.
Differential Revision: https://phab.mercurial-scm.org/D4129
spectral <spectral@google.com> [Sun, 05 Aug 2018 18:31:19 -0700] rev 38991
match: add tests for visitdir functionality
There are a few cases that we could have done better with some additional logic;
I tried to annotate these when I noticed them, but may have missed some. The
tests are not exhaustive; there are certainly some patterns that I didn't test
well, and many that I didn't test at all.
The primary motivation was to get coverage on visitdir so that I can cover
identical cases in a similar method I'm working on, to make sure that this new
method behaves the same (or better).
Differential Revision: https://phab.mercurial-scm.org/D4128
Martin von Zweigbergk <martinvonz@google.com> [Mon, 23 Jul 2018 22:51:53 -0700] rev 38990
mergetool: warn if ui.merge points to nonexistent tool
This adds a warning when ui.merge is configured but points to an
executable that doesn't exist. It gets printed once per fail, but that
seems to be how our other warnings about merge tools are reported.
Differential Revision: https://phab.mercurial-scm.org/D3975
Martin von Zweigbergk <martinvonz@google.com> [Mon, 23 Jul 2018 22:51:50 -0700] rev 38989
tests: demonstrate that no requested merge tool is ignored if missing
If you explicitly configure a merge tool, it seems wrong that we don't
even warn if we can't find it. This patch adds a test case that
demonstrates that.
Differential Revision: https://phab.mercurial-scm.org/D3974
Danny Hooper <hooper@google.com> [Mon, 06 Aug 2018 16:00:00 -0700] rev 38988
fix: correctly set wdirwritten given that the dict item is deleted
Differential Revision: https://phab.mercurial-scm.org/D4146
Danny Hooper <hooper@google.com> [Mon, 06 Aug 2018 14:30:27 -0700] rev 38987
fix: pull out flag definitions to make them re-usable from extensions
This makes it cleaner to implement fix-related commands with additional
functionality while sharing some flags with the core implementation.
Differential Revision: https://phab.mercurial-scm.org/D4145
Yuya Nishihara <yuya@tcha.org> [Tue, 24 Jul 2018 22:13:21 +0900] rev 38986
templatekw: copy {author} to {user} and document {author} as an alias
In other places including "log -Tjson" and revset, "user" is the canonical
name. Let's standardize it.
This is a part of the name unification of the Generic Templating Plan.
https://www.mercurial-scm.org/wiki/GenericTemplatingPlan#Dictionary
Yuya Nishihara <yuya@tcha.org> [Tue, 24 Jul 2018 22:33:08 +0900] rev 38985
templates: rename "user" to "luser" defined in default map file (API)
"user" will be shadowed by the {user} keyword to be added by the next
patch.
I think the naming of template fields is a sort of an internal API, so
this patch is flagged as an API change.
.. api::
   Rewrite ``{user}`` to ``{luser}`` in log templates inherited from
   map-cmdline.default.
Sangeet Kumar Mishra <mail2sangeetmishra@gmail.com> [Wed, 25 Jul 2018 12:50:31 +0530] rev 38984
grep: add MULTIREV support to --allfiles flag
This patch facilitates passing multiple revisions with all-files flag.
It's assumed that if you are passing multiple revisions to --allfiles,
you want hits from all of them.
Differential Revision: https://phab.mercurial-scm.org/D3976
Cédric Krier <ced@b2ck.com> [Wed, 25 Jul 2018 10:34:31 +0200] rev 38983
phabricator: convert description into local
The description from conduit is a unicode string.
Differential Revision: https://phab.mercurial-scm.org/D3980
Martin von Zweigbergk <martinvonz@google.com> [Thu, 19 Jul 2018 23:15:21 -0700] rev 38982
index: move index_clearcaches() further down
I want to add a call from it to a new function (nt_dealloc) that will
be inserted below its current position.
Differential Revision: https://phab.mercurial-scm.org/D4117
Martin von Zweigbergk <martinvonz@google.com> [Thu, 19 Jul 2018 11:08:30 -0700] rev 38981
index: move all "nt_*" functions to one place
Differential Revision: https://phab.mercurial-scm.org/D4116
Martin von Zweigbergk <martinvonz@google.com> [Thu, 19 Jul 2018 00:03:45 -0700] rev 38980
index: rename "nt_*(indexObject *self,...)" functions to "index_*"
These functions do something with the nodetree, but they're less
generic and won't make sense as methods on the nodetree when it
becomes a Python type.
Differential Revision: https://phab.mercurial-scm.org/D4115
Martin von Zweigbergk <martinvonz@google.com> [Wed, 18 Jul 2018 12:03:31 -0700] rev 38979
index: split up nt_init() in two
I'd like to make nt_init() take a pointer to a nodetree to initialize,
but it currently also allocates the nodetree. This patch prepares for
that change by making nt_init() be about initializing an existing node
tree and by creating a new index_init_nt() that creates the nodetree.
Differential Revision: https://phab.mercurial-scm.org/D4114
Martin von Zweigbergk <martinvonz@google.com> [Wed, 18 Jul 2018 23:20:01 -0700] rev 38978
index: make most "nt_*" functions take a nodetree
Now that the nodetree has a pointer to the index, we can pass the
nodetree instead of the index. There are a few "nt_*" functions left
after this. I'll deal with them soon.
Differential Revision: https://phab.mercurial-scm.org/D4113
Martin von Zweigbergk <martinvonz@google.com> [Wed, 18 Jul 2018 23:07:08 -0700] rev 38977
index: add pointer from nodetree back to index
This is always a cycle right now, but it will not be for the nodetree
instances I'm planning to add later (see earlier patch).
Differential Revision: https://phab.mercurial-scm.org/D4112
Martin von Zweigbergk <martinvonz@google.com> [Mon, 06 Aug 2018 09:59:51 -0700] rev 38976
index: remove side-effect from failed nt_new()
As pointed out by Yuya in the review of D4108, if realloc() fails, we
would end up with an invalid nodetree instance (with nt->nodes set to
NULL), which means that if it was later accessed again it would likely
segfault. It's probably unlikely that much else happens in the process
if it ran out of memory, but we should of course do our best to handle
it. This patch makes it so we don't update the nodetree in this case.
Differential Revision: https://phab.mercurial-scm.org/D4154
Martin von Zweigbergk <martinvonz@google.com> [Mon, 06 Aug 2018 22:34:37 -0700] rev 38975
index: remove side-effect from failed nt_init()
As pointed out by Yuya in the review of D4108, if we run into the
"overflow in nt_init" case (which I think normally happens only in
repos with at least 2^26=64Mi revisions), we would leave the node tree
half-initialized.
Differential Revision: https://phab.mercurial-scm.org/D4153
Martin von Zweigbergk <martinvonz@google.com> [Mon, 06 Aug 2018 22:24:34 -0700] rev 38974
index: use PyMem_Free() to free nodetree instance
As Yuya pointed out in the review of D4108, PyMem_Malloc() and
PyMem_Free() should be paired. IIUC, PyMem_Malloc() may use a
different allocator than malloc(), so using free() with a pointer from
PyMem_Malloc() may be very wrong.
Differential Revision: https://phab.mercurial-scm.org/D4152
Jun Wu <quark@fb.com> [Mon, 06 Aug 2018 22:24:00 -0700] rev 38973
linelog: fix infinite loop vulnerability
Checking `len(lines)` is not a great way of detecting infinite loops, as
demonstrated in the added test. Therefore check instruction count instead.
The original C implementation does not have this problem. There are a few
other places where the C implementation enforces stricter checks, like
`a1 <= a2`, `b1 <= b2`, `rev > 0`. But they are optional.
Test Plan:
Add a test. The old code forces the test to time out.
Differential Revision: https://phab.mercurial-scm.org/D4151
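The reasoning in the linelog message above (count executed instructions rather than emitted lines) can be illustrated with a toy interpreter. This is a minimal sketch under stated assumptions: the instruction set (`line`, `jump`, `end`), the `run` function, and `ProgramLoopError` are hypothetical and are not linelog's actual format or API.

```python
class ProgramLoopError(Exception):
    """Raised when the toy program runs longer than any terminating run could."""


def run(program, max_steps=None):
    """Execute a toy program and return the lines it emits.

    Instructions are tuples: ("line", text), ("jump", target), ("end",).
    """
    if max_steps is None:
        # With only unconditional jumps, a terminating run visits each
        # instruction at most once, so len(program) steps always suffice.
        max_steps = len(program)
    lines = []
    pc = 0
    for _ in range(max_steps):
        if not 0 <= pc < len(program):
            raise ValueError("program counter out of range: %d" % pc)
        op = program[pc]
        if op[0] == "end":
            return lines
        if op[0] == "line":
            lines.append(op[1])
            pc += 1
        elif op[0] == "jump":
            pc = op[1]
        else:
            raise ValueError("unknown instruction: %r" % (op,))
    raise ProgramLoopError("instruction budget exhausted")


good = [("line", "a"), ("jump", 3), ("line", "skipped"), ("line", "b"), ("end",)]
good_lines = run(good)

# Two jumps that point at each other never emit a line, so a check based on
# len(lines) would never fire; the step budget still catches the loop.
bad = [("jump", 1), ("jump", 0), ("end",)]
try:
    run(bad)
    loop_detected = False
except ProgramLoopError:
    loop_detected = True
```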
Augie Fackler <augie@google.com> [Mon, 06 Aug 2018 17:19:33 -0400] rev 38972
tests: fix bytes/str issues in run-tests.py caught by python3
Differential Revision: https://phab.mercurial-scm.org/D4143
Gregory Szorc <gregory.szorc@gmail.com> [Mon, 06 Aug 2018 16:45:25 -0700] rev 38971
changegroup: assign to proper attribute
0548f696795b accidentally assigned to self.clrevtolocalrev instead of
self._clrevtolocalrev. Surprisingly, no tests failed as a result of
this mistake. Curious.
Differential Revision: https://phab.mercurial-scm.org/D4144
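One reason a mistake like the one fixed above can go unnoticed is that Python accepts assignment to an attribute name that was never declared. The sketch below shows the effect with a hypothetical `Packer` class; only the two attribute names come from the commit message, and the class is not Mercurial's changegroup code.

```python
class Packer:
    """Hypothetical stand-in; not Mercurial's changegroup packer."""

    def __init__(self):
        self._clrevtolocalrev = {}

    def buggy_update(self, mapping):
        # Missing underscore: this creates a new public attribute and leaves
        # the private one untouched. Python raises no error.
        self.clrevtolocalrev = mapping

    def fixed_update(self, mapping):
        self._clrevtolocalrev = mapping


p = Packer()
p.buggy_update({1: 10})
stale = dict(p._clrevtolocalrev)  # the private attribute is still empty

p.fixed_update({1: 10})
current = dict(p._clrevtolocalrev)
```

Code that only reads `_clrevtolocalrev` keeps seeing the old value after the buggy assignment, so a failure appears only if a test depends on the updated mapping.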
Gregory Szorc <gregory.szorc@gmail.com> [Mon, 06 Aug 2018 09:00:26 -0700] rev 38970
absorb: remove sf alias for command
I'm not even sure what it is supposed to stand for.
Differential Revision: https://phab.mercurial-scm.org/D4126
Yuya Nishihara <yuya@tcha.org> [Sun, 25 Feb 2018 21:04:33 +0900] rev 38969
templatekw: deprecate old-style template keyword function (API)
.. api::
   `f(**kwargs)` style template keyword function is deprecated. Switch to
   new `(context, mapping)` API by declaring resource requirements.
   The new-style API will be the default in Mercurial 4.9. See
   registrar.templatekeyword for details.
Yuya Nishihara <yuya@tcha.org> [Sat, 28 Jul 2018 21:19:24 +0900] rev 38968
hgweb: mark all lambda template keywords as new-style function
This is just a temporary workaround, and will be removed in Mercurial 4.9.
Yuya Nishihara <yuya@tcha.org> [Sat, 28 Jul 2018 21:02:05 +0900] rev 38967
hgweb: use registrar to add "motd" template keyword
This prepares for deprecation of old-style keyword functions.
Yuya Nishihara <yuya@tcha.org> [Sun, 22 Jul 2018 11:21:43 +0900] rev 38966
fileset: load core predicates directly to symbols dict
We no longer have any side effect in loadpredicate().
Yuya Nishihara <yuya@tcha.org> [Sun, 22 Jul 2018 11:49:12 +0900] rev 38965
fileset: turn on listclean conditionally
This is just a micro optimization.
Martin von Zweigbergk <martinvonz@google.com> [Mon, 06 Aug 2018 00:33:15 -0700] rev 38964
changegroup: always use the treemanifest-enabled version of _packmanifests()
It works for flat manifests too. We just cannot use cg1 or cg2 if we
have subdirectory manifests.
Differential Revision: https://phab.mercurial-scm.org/D4124
Augie Fackler <augie@google.com> [Mon, 30 Jul 2018 23:52:15 -0400] rev 38963
linelog: add replacelines_vec for fastannotate
# no-check-commit because we're conforming to an existing interface
Differential Revision: https://phab.mercurial-scm.org/D3993
Augie Fackler <augie@google.com> [Tue, 31 Jul 2018 11:29:25 -0400] rev 38962
absorb: drop wrapper around the amend command
We can reinstate this later if we want.
Differential Revision: https://phab.mercurial-scm.org/D3992
Augie Fackler <augie@google.com> [Wed, 01 Aug 2018 18:30:10 -0400] rev 38961
absorb: note some TODOs from the code review
Differential Revision: https://phab.mercurial-scm.org/D4047
Augie Fackler <augie@google.com> [Wed, 01 Aug 2018 18:29:57 -0400] rev 38960
absorb: use ui.debug() instead of open-coding it
Differential Revision: https://phab.mercurial-scm.org/D4046
Augie Fackler <augie@google.com> [Wed, 01 Aug 2018 18:24:43 -0400] rev 38959
absorb: use pycompat to get xrange
Differential Revision: https://phab.mercurial-scm.org/D4045