Kyle Lippincott <spectral@google.com> [Wed, 08 Aug 2018 17:03:05 -0700] rev 38958
match: add missing "return set()", add FIXME to test to doc a bug
These were both brought up during the codereview of D4130.
Differential Revision: https://phab.mercurial-scm.org/D4160
Kyle Lippincott <spectral@google.com> [Wed, 08 Aug 2018 16:53:17 -0700] rev 38957
match: correct doc for _rootsdirsandparents after 5a7df82de142
Differential Revision: https://phab.mercurial-scm.org/D4159
Kyle Lippincott <spectral@google.com> [Tue, 31 Jul 2018 16:47:43 -0700] rev 38956
dirstate: use visitchildrenset in traverse
This speeds up `hg status` a fair amount when there is a very large directory
and narrow is in use.
Timing numbers according to command:
hyperfine --warmup 1 'hg status'
HGRCPATH points to a file with the following contents:
[extensions]
narrow =
mozilla-unified (called m-u below) was at revision #468856.
regular hash:
eb39298e432d
treemanifests hash:
0553b7f29eaf
large-dir-repo (called l-d-r below) was generated with the following script:
#!/bin/bash
hg init large-dir-repo
mkdir -p large-dir-repo/third_party/rust/log
touch large-dir-repo/third_party/rust/log/foo.txt
for i in $(seq 1 30000); do
  d=$(mktemp -d large-dir-repo/third_party/XXXXXXXXX)
  touch $d/file.txt
done
hg -R large-dir-repo ci -Am 'rev0' --user test --date '0 0'
For repos that use narrow, the narrowspec was this:
[includes]
rootfilesin:third_party/rust/log
[excludes]
This narrowspec was chosen due to the size of the third_party/rust directory;
this directory was *not* modified in revision #468856 in mozilla-unified.
Importantly, when using narrow, these repos had everything checked out (in the
case of large-dir-repo, that means all 30,001 directories), *before* adding the
narrowspec. This is to simulate the behavior when using a virtual filesystem
that shows everything for the user even if they haven't added it to the
narrowspec yet. This is not a supported configuration, and `hg update` will not
really do the "correct" thing, but non-mutating commands should behave
correctly.
There are two repos below that do not follow the setup above, 'citc1' and
'citc2', which are using a virtual filesystem and can not be reproduced
upstream; these numbers are here mostly to indicate that these performance
improvements are not hypothetical, and show the benefits we're hoping to achieve
on our real workloads. 'citc1' is closest to large-dir-repo with one of our
pathological cases, 'citc2' is an arbitrary repo and closer to "average".
I'm not claiming anything less than a 5% speed win as improvements due to this
change; these are probably either measurement artifacts or constant time
improvements. The numbers that aren't changing are shown primarily to prove that
this doesn't make anything worse in any case I plan on testing during this
series.
'before' is hg from commit c83ad576. 'N' indicates narrow in use, 'T' indicates
treemanifest in use.
hg status:
repo | N | T | before (mean +- stdev) | after (mean +- stdev) | % of before
------+---+---+------------------------+-----------------------+------------
m-u | | | 2.284 s +- 0.022 s | 2.274 s +- 0.021 s | 99.6%
m-u | | x | 2.289 s +- 0.008 s | 2.284 s +- 0.028 s | 99.8%
m-u | x | | 430.8 ms +- 3.1 ms | 424.5 ms +- 3.2 ms | 98.5%
m-u | x | x | 429.8 ms +- 2.5 ms | 425.8 ms +- 3.7 ms | 99.1%
l-d-r | | | 681.3 ms +- 5.5 ms | 689.6 ms +- 8.0 ms | 101.2%
l-d-r | | x | 666.8 ms +- 21.8 ms | 672.5 ms +- 14.9 ms | 100.9%
l-d-r | x | | 282.6 ms +- 1.8 ms | 203.0 ms +- 1.2 ms | 71.8% <--
l-d-r | x | x | 275.2 ms +- 3.9 ms | 199.3 ms +- 3.5 ms | 72.4% <--
citc1 | x | x | 1.023 s +- 0.011 s | 398.6 ms +- 9.2 ms | 39.0% <--
citc2 | x | x | 297.9 ms +- 4.4 ms | 289.6 ms +- 4.2 ms | 97.2%
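For readers checking the last column, a minimal sketch of the arithmetic follows. The function names are illustrative and not part of the benchmark setup; the 95% cutoff is just the "5% speed win" threshold from the message restated.

```python
def percent_of_before(before_ms, after_ms):
    """Return the 'after' mean as a percentage of the 'before' mean."""
    return round(100.0 * after_ms / before_ms, 1)


def is_claimed_win(before_ms, after_ms):
    """Only wins of 5% or more are claimed as improvements."""
    return percent_of_before(before_ms, after_ms) <= 95.0


# l-d-r with narrow, no treemanifest: 282.6 ms before, 203.0 ms after
ldr = percent_of_before(282.6, 203.0)
# citc1: 1.023 s before, 398.6 ms after (both expressed in ms)
citc1 = percent_of_before(1023.0, 398.6)
# citc2: 297.9 ms before, 289.6 ms after; below the 5% threshold
citc2_claimed = is_claimed_win(297.9, 289.6)
```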
hg status --change .:
repo | N | T | before (mean +- stdev) | after (mean +- stdev) | % of before
------+---+---+------------------------+-----------------------+------------
m-u | | | 478.2 ms +- 2.0 ms | 476.9 ms +- 3.7 ms | 99.7%
m-u | | x | 169.5 ms +- 2.7 ms | 169.5 ms +- 2.5 ms | 100.0%
m-u | x | | 477.0 ms +- 2.4 ms | 476.1 ms +- 1.4 ms | 99.8%
m-u | x | x | 124.7 ms +- 1.9 ms | 124.2 ms +- 3.3 ms | 99.6%
l-d-r | | | 97.4 ms +- 1.2 ms | 96.5 ms +- 1.2 ms | 99.1%
l-d-r | | x | 4.778 s +- 0.018 s | 4.774 s +- 0.011 s | 99.9%
l-d-r | x | | 99.9 ms +- 1.1 ms | 98.8 ms +- 1.3 ms | 98.9%
l-d-r | x | x | 848.7 ms +- 7.1 ms | 849.4 ms +- 6.5 ms | 100.1%
citc1 | x | x | 4.250 s +- 0.051 s | 4.283 s +- 0.042 s | 100.8%
citc2 | x | x | 341.5 ms +- 4.7 ms | 341.5 ms +- 4.1 ms | 100.0%
hg update $rev^; hg update $rev:
repo | N | T | before (mean +- stdev) | after (mean +- stdev) | % of before
------+---+---+------------------------+-----------------------+------------
m-u | | | 4.357 s +- 0.032 s | 4.312 s +- 0.093 s | 99.0%
m-u | | x | 3.599 s +- 0.061 s | 3.592 s +- 0.071 s | 99.8%
m-u | x | | 1.815 s +- 0.012 s | 1.816 s +- 0.013 s | 100.1%
m-u | x | x | 1.110 s +- 0.009 s | 1.106 s +- 0.005 s | 99.6%
l-d-r | | | 527.1 ms +- 7.8 ms | 523.3 ms +- 6.5 ms | 99.3%
l-d-r | | x | 8.835 s +- 0.067 s | 8.825 s +- 0.064 s | 99.9%
l-d-r | x | | 313.0 ms +- 2.2 ms | 312.1 ms +- 1.2 ms | 99.7%
l-d-r | x | x | 1.780 s +- 0.011 s | 1.799 s +- 0.013 s | 101.1%
citc1 | x | x | 6.825 s +- 0.262 s | 6.707 s +- 0.353 s | 98.3%
citc2 | x | x | 776.4 ms +- 4.5 ms | 781.3 ms +- 6.3 ms | 100.6%
hg diff:
repo | N | T | before (mean +- stdev) | after (mean +- stdev) | % of before
------+---+---+------------------------+-----------------------+------------
m-u | | | 1.519 s +- 0.015 s | 1.525 s +- 0.017 s | 100.4%
m-u | | x | 1.512 s +- 0.010 s | 1.517 s +- 0.027 s | 100.3%
m-u | x | | 420.0 ms +- 3.2 ms | 417.1 ms +- 1.9 ms | 99.3%
m-u | x | x | 415.0 ms +- 3.8 ms | 415.7 ms +- 2.7 ms | 100.2%
l-d-r | | | 220.8 ms +- 4.0 ms | 220.8 ms +- 3.7 ms | 100.0%
l-d-r | | x | 216.6 ms +- 7.5 ms | 211.4 ms +- 2.1 ms | 97.6%
l-d-r | x | | 111.9 ms +- 1.8 ms | 112.0 ms +- 1.5 ms | 100.1%
l-d-r | x | x | 111.4 ms +- 1.4 ms | 110.2 ms +- 1.0 ms | 98.9%
citc1 | x | x | 268.7 ms +- 2.3 ms | 269.6 ms +- 2.8 ms | 100.3%
citc2 | x | x | 273.5 ms +- 5.5 ms | 273.9 ms +- 3.7 ms | 100.1%
hg diff -c .:
repo | N | T | before (mean +- stdev) | after (mean +- stdev) | % of before
------+---+---+--------------------------+-----------------------+----------
m-u | | | 497.1 ms +- 1.4 ms | 500.1 ms +- 2.4 ms | 100.6%
m-u | | x | 195.3 ms +- 13.2 ms | 191.6 ms +- 3.0 ms | 98.1%
m-u | x | | 476.8 ms +- 1.9 ms | 476.7 ms +- 2.3 ms | 100.0%
m-u | x | x | 122.8 ms +- 2.1 ms | 122.9 ms +- 2.0 ms | 100.1%
l-d-r | | | 99.3 ms +- 2.3 ms | 98.8 ms +- 1.7 ms | 99.5%
l-d-r | | x | 4.875 s +- 0.041 s | 4.847 s +- 0.038 s | 99.4%
l-d-r | x | | 98.5 ms +- 1.2 ms | 98.9 ms +- 1.3 ms | 100.4%
l-d-r | x | x | 864.6 ms +- 7.4 ms | 855.4 ms +- 6.6 ms | 98.9%
citc1 | x | x | 4.505 s +- 0.060 s | 4.466 s +- 0.036 s | 99.1%
citc2 | x | x | 368.0 ms +- 4.0 ms | 365.5 ms +- 6.3 ms | 99.3%
Differential Revision: https://phab.mercurial-scm.org/D4131
spectral <spectral@google.com> [Mon, 06 Aug 2018 12:52:33 -0700] rev 38955
match: add visitchildrenset complement to visitdir
`visitdir(d)` lets a caller query whether the directory is part of the matcher.
It can receive a response of 'all' (yes, and all children, you can stop calling
visitdir now), False (no, and no children either), or True (yes, either
something in this directory or a child is part of the matcher).
`visitchildrenset(d)` augments that: instead of returning True, it returns a
list of items to actually investigate. With this, code can be modified from:
  for f in self.all_items:
    if match.visitdir(self.dir + '/' + f):
      <do stuff>
to be:
  for f in self.all_items.intersect(match.visitchildrenset(self.dir)):
    <do stuff>
Use of this function can provide significant performance improvements,
especially when using narrow (so that the matcher is much smaller than the stuff
we see on disk) and/or treemanifests (so that we can avoid loading manifests for
trees that aren't part of the matcher).
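The difference can be sketched with a toy matcher. This is an illustration only: `ToyIncludeMatcher` is a hypothetical stand-in, not Mercurial's matcher, and it ignores any special return values the real methods have. Note that Python sets spell the intersection as `&` or `.intersection()`.

```python
class ToyIncludeMatcher:
    """Hypothetical matcher: includes files directly inside given dirs."""

    def __init__(self, dirs):
        self._dirs = set(dirs)
        # Ancestors of included directories must be walked to reach them.
        self._parents = set()
        for d in self._dirs:
            parts = d.split('/')
            for i in range(len(parts)):
                self._parents.add('/'.join(parts[:i]))

    def visitdir(self, d):
        """True if `d` or something below it may match."""
        return d in self._dirs or d in self._parents

    def visitchildrenset(self, d):
        """Names of the direct children of `d` that are worth visiting."""
        prefix = d + '/' if d else ''
        children = set()
        for p in self._dirs | self._parents:
            if p and p.startswith(prefix):
                rest = p[len(prefix):]
                if '/' not in rest:
                    children.add(rest)
        return children


m = ToyIncludeMatcher(['third_party/rust/log'])
# Simulate a very large directory: 30,000 siblings plus the one we want.
on_disk = {'rust'} | {'dir%05d' % i for i in range(30000)}

# Old style: one visitdir() call per on-disk entry (30,001 calls).
old = {f for f in on_disk if m.visitdir('third_party/' + f)}
# New style: a single call followed by a set intersection.
new = on_disk & m.visitchildrenset('third_party')
```

Both approaches select the same entries here; the second does so without asking the matcher about every sibling directory.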
Differential Revision: https://phab.mercurial-scm.org/D4130
spectral <spectral@google.com> [Mon, 06 Aug 2018 12:52:22 -0700] rev 38954
includematcher: separate "parents" from "dirs"
A future patch will make use of this separation so that we can make more
intelligent decisions about what to investigate/load when the matcher is in use.
Currently, even with this patch, we typically use the 'visitdir' call to identify if
we can skip some directory, something along the lines of:
  for f in all_items:
    if match.visitdir(f):
      <do stuff>
This can be slower than we'd like if there are a lot of items; it requires N
calls to match.visitdir in the best case. Commonly, especially with 'narrow',
we have a situation where we do some work for the directory, possibly just
loading it from disk (when using treemanifests) and then check if we should be
interacting with it at all, which can be a huge slowdown in some pathological
cases.
Differential Revision: https://phab.mercurial-scm.org/D4129
spectral <spectral@google.com> [Sun, 05 Aug 2018 18:31:19 -0700] rev 38953
match: add tests for visitdir functionality
There are a few cases that we could have done better with some additional logic;
I tried to annotate these when I noticed them, but may have missed some. The
tests are not exhaustive; there are certainly some patterns that I didn't test
well, and many that I didn't test at all.
The primary motivation was to get coverage on visitdir so that I can cover
identical cases in a similar method I'm working on, to make sure that this new
method behaves the same (or better).
Differential Revision: https://phab.mercurial-scm.org/D4128
Martin von Zweigbergk <martinvonz@google.com> [Mon, 23 Jul 2018 22:51:53 -0700] rev 38952
mergetool: warn if ui.merge points to nonexistent tool
This adds a warning when ui.merge is configured but points to an
executable that doesn't exist. It gets printed once per fail, but that
seems to be how our other warnings about merge tools are reported.
Differential Revision: https://phab.mercurial-scm.org/D3975
Martin von Zweigbergk <martinvonz@google.com> [Mon, 23 Jul 2018 22:51:50 -0700] rev 38951
tests: demonstrate that no requested merge tool is ignored if missing
If you explicitly configure a merge tool, it seems wrong that we don't
even warn if we can't find it. This patch adds a test case that
demonstrates that.
Differential Revision: https://phab.mercurial-scm.org/D3974
Danny Hooper <hooper@google.com> [Mon, 06 Aug 2018 16:00:00 -0700] rev 38950
fix: correctly set wdirwritten given that the dict item is deleted
Differential Revision: https://phab.mercurial-scm.org/D4146
Danny Hooper <hooper@google.com> [Mon, 06 Aug 2018 14:30:27 -0700] rev 38949
fix: pull out flag definitions to make them re-usable from extensions
This makes it cleaner to implement fix-related commands with additional
functionality while sharing some flags with the core implementation.
Differential Revision: https://phab.mercurial-scm.org/D4145
Yuya Nishihara <yuya@tcha.org> [Tue, 24 Jul 2018 22:13:21 +0900] rev 38948
templatekw: copy {author} to {user} and document {author} as an alias
In other places including "log -Tjson" and revset, "user" is the canonical
name. Let's standardize it.
This is a part of the name unification of the Generic Templating Plan.
https://www.mercurial-scm.org/wiki/GenericTemplatingPlan#Dictionary
Yuya Nishihara <yuya@tcha.org> [Tue, 24 Jul 2018 22:33:08 +0900] rev 38947
templates: rename "user" to "luser" defined in default map file (API)
"user" will be shadowed by the {user} keyword to be added by the next
patch.
I think the naming of template fields is a sort of an internal API, so
this patch is flagged as an API change.
.. api::
Rewrite ``{user}`` to ``{luser}`` in log templates inherited from
map-cmdline.default.
Sangeet Kumar Mishra <mail2sangeetmishra@gmail.com> [Wed, 25 Jul 2018 12:50:31 +0530] rev 38946
grep: add MULTIREV support to --allfiles flag
This patch facilitates passing multiple revisions with all-files flag.
It's assumed that if you are passing multiple revisions to --allfiles,
you want hits from all of them.
Differential Revision: https://phab.mercurial-scm.org/D3976
Cédric Krier <ced@b2ck.com> [Wed, 25 Jul 2018 10:34:31 +0200] rev 38945
phabricator: convert description into local
The description from conduit is a unicode.
Differential Revision: https://phab.mercurial-scm.org/D3980
Martin von Zweigbergk <martinvonz@google.com> [Thu, 19 Jul 2018 23:15:21 -0700] rev 38944
index: move index_clearcaches() further down
I want to add a call from it to a new function (nt_dealloc) that will
be inserted below its current position.
Differential Revision: https://phab.mercurial-scm.org/D4117
Martin von Zweigbergk <martinvonz@google.com> [Thu, 19 Jul 2018 11:08:30 -0700] rev 38943
index: move all "nt_*" functions to one place
Differential Revision: https://phab.mercurial-scm.org/D4116
Martin von Zweigbergk <martinvonz@google.com> [Thu, 19 Jul 2018 00:03:45 -0700] rev 38942
index: rename "nt_*(indexObject *self,...)" functions to "index_*"
These functions do something with the nodetree, but they're less
generic and won't make sense as methods on the nodetree when it
becomes a Python type.
Differential Revision: https://phab.mercurial-scm.org/D4115
Martin von Zweigbergk <martinvonz@google.com> [Wed, 18 Jul 2018 12:03:31 -0700] rev 38941
index: split up nt_init() in two
I'd like to make nt_init() take a pointer to a nodetree to initialize,
but it currently also allocates the nodetree. This patch prepares for
that change by making nt_init() be about initializing an existing node
tree and by creating a new index_init_nt() that creates the nodetree.
Differential Revision: https://phab.mercurial-scm.org/D4114
Martin von Zweigbergk <martinvonz@google.com> [Wed, 18 Jul 2018 23:20:01 -0700] rev 38940
index: make most "nt_*" functions take a nodetree
Now that the nodetree has a pointer to the index, we can pass the
nodetree instead of the index. There are few "nt_*" functions left
after this. I'll deal with them soon.
Differential Revision: https://phab.mercurial-scm.org/D4113
Martin von Zweigbergk <martinvonz@google.com> [Wed, 18 Jul 2018 23:07:08 -0700] rev 38939
index: add pointer from nodetree back to index
This is always a cycle right now, but it will not be for the nodetree
instances I'm planning to add later (see earlier patch).
Differential Revision: https://phab.mercurial-scm.org/D4112
Martin von Zweigbergk <martinvonz@google.com> [Mon, 06 Aug 2018 09:59:51 -0700] rev 38938
index: remove side-effect from failed nt_new()
As pointed out by Yuya in the review of D4108, if realloc() fails, we
would end up with an invalid nodetree instance (with nt->nodes set to
NULL), which means that if it was later accessed again it would likely
segfault. It's probably unlikely that much else happens in the process
if it ran out of memory, but we should of course do our best to handle
it. This patch makes it so we don't update the nodetree in this case.
Differential Revision: https://phab.mercurial-scm.org/D4154
Martin von Zweigbergk <martinvonz@google.com> [Mon, 06 Aug 2018 22:34:37 -0700] rev 38937
index: remove side-effect from failed nt_init()
As pointed out by Yuya in the review of D4108, if we run into the
"overflow in nt_init" case (which I think normally happens only in
repos with at least 2^26=64Mi revisions), we would leave the node tree
half-initialized.
Differential Revision: https://phab.mercurial-scm.org/D4153
Martin von Zweigbergk <martinvonz@google.com> [Mon, 06 Aug 2018 22:24:34 -0700] rev 38936
index: use PyMem_Free() to free nodetree instance
As Yuya pointed out in the review of D4108, PyMem_Malloc() and
PyMem_Free() should be paired. IIUC, PyMem_Malloc() may use a
different allocator than malloc(), so using free() with a pointer from
PyMem_Malloc() may be very wrong.
Differential Revision: https://phab.mercurial-scm.org/D4152
Jun Wu <quark@fb.com> [Mon, 06 Aug 2018 22:24:00 -0700] rev 38935
linelog: fix infinite loop vulnerability
Checking `len(lines)` is not a great way of detecting infinite loops, as
demonstrated in the added test. Therefore check instruction count instead.
The original C implementation does not have this problem. There are a few
other places where the C implementation is stricter, such as requiring
`a1 <= a2`, `b1 <= b2`, `rev > 0`. But those checks are optional.
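A minimal sketch of the idea, assuming a made-up three-instruction program format (this is not linelog's real instruction set or API): a loop that never emits a line keeps `len(lines)` constant, so only a cap on executed instructions catches it.

```python
class ProgramError(Exception):
    """Raised when a toy program looks malformed."""


def run(program, max_steps=None):
    """Execute a toy program of ('LINE', text), ('JUMP', pc), ('END', None).

    The step limit defaults to len(program): a program whose jumps only go
    forward executes each instruction at most once.
    """
    if max_steps is None:
        max_steps = len(program)
    lines = []
    pc = 0
    steps = 0
    while True:
        steps += 1
        if steps > max_steps:
            raise ProgramError('instruction limit exceeded (infinite loop?)')
        op, arg = program[pc]
        if op == 'END':
            return lines
        elif op == 'LINE':
            lines.append(arg)
            pc += 1
        elif op == 'JUMP':
            pc = arg
        else:
            raise ProgramError('unknown instruction %r' % (op,))


good = [('LINE', 'a'), ('JUMP', 3), ('LINE', 'skipped'),
        ('LINE', 'b'), ('END', None)]
# Jumps to itself forever and never appends a line.
bad = [('JUMP', 0), ('END', None)]

good_lines = run(good)
try:
    run(bad)
    bad_detected = False
except ProgramError:
    bad_detected = True
```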
Test Plan:
Add a test. The old code forces the test to time out.
Differential Revision: https://phab.mercurial-scm.org/D4151
Augie Fackler <augie@google.com> [Mon, 06 Aug 2018 17:19:33 -0400] rev 38934
tests: fix bytes/str issues in run-tests.py caught by python3
Differential Revision: https://phab.mercurial-scm.org/D4143
Gregory Szorc <gregory.szorc@gmail.com> [Mon, 06 Aug 2018 16:45:25 -0700] rev 38933
changegroup: assign to proper attribute
0548f696795b accidentally assigned to self.clrevtolocalrev instead of
self._clrevtolocalrev. Surprisingly, no tests failed as a result of
this mistake. Curious.
Differential Revision: https://phab.mercurial-scm.org/D4144
Gregory Szorc <gregory.szorc@gmail.com> [Mon, 06 Aug 2018 09:00:26 -0700] rev 38932
absorb: remove sf alias for command
I'm not even sure what it is supposed to stand for.
Differential Revision: https://phab.mercurial-scm.org/D4126
Anton Shestakov <av6@dwimlabs.net> [Thu, 09 Aug 2018 13:04:52 +0800] rev 38931
hgweb: catch ParseError that's raised by revset.match()
Some queries, like the demonstrated "first(::)", fail earlier than we call
mfunc(), and that results in a "500 Internal Server Error". To prevent it,
revset.match() also needs to be in a try-except block.
Yuya Nishihara <yuya@tcha.org> [Sun, 25 Feb 2018 21:04:33 +0900] rev 38930
templatekw: deprecate old-style template keyword function (API)
.. api::
`f(**kwargs)` style template keyword function is deprecated. Switch to
new `(context, mapping)` API by declaring resource requirements.
The new-style API will be the default in Mercurial 4.9. See
registrar.templatekeyword for details.
Yuya Nishihara <yuya@tcha.org> [Sat, 28 Jul 2018 21:19:24 +0900] rev 38929
hgweb: mark all lambda template keywords as new-style function
This is just a temporary workaround, and will be removed in Mercurial 4.9.
Yuya Nishihara <yuya@tcha.org> [Sat, 28 Jul 2018 21:02:05 +0900] rev 38928
hgweb: use registrar to add "motd" template keyword
This prepares for deprecation of old-style keyword functions.
Yuya Nishihara <yuya@tcha.org> [Sun, 22 Jul 2018 11:21:43 +0900] rev 38927
fileset: load core predicates directly to symbols dict
We no longer have any side effect in loadpredicate().
Yuya Nishihara <yuya@tcha.org> [Sun, 22 Jul 2018 11:49:12 +0900] rev 38926
fileset: turn on listclean conditionally
This is just a micro optimization.
Martin von Zweigbergk <martinvonz@google.com> [Mon, 06 Aug 2018 00:33:15 -0700] rev 38925
changegroup: always use the treemanifest-enabled version of _packmanifests()
It works for flat manifests too. We just cannot use cg1 or cg2 if we
have subdirectory manifests.
Differential Revision: https://phab.mercurial-scm.org/D4124
Augie Fackler <augie@google.com> [Mon, 30 Jul 2018 23:52:15 -0400] rev 38924
linelog: add replacelines_vec for fastannotate
# no-check-commit because we're conforming to an existing interface
Differential Revision: https://phab.mercurial-scm.org/D3993
Augie Fackler <augie@google.com> [Tue, 31 Jul 2018 11:29:25 -0400] rev 38923
absorb: drop wrapper around the amend command
We can reinstate this later if we want.
Differential Revision: https://phab.mercurial-scm.org/D3992
Augie Fackler <augie@google.com> [Wed, 01 Aug 2018 18:30:10 -0400] rev 38922
absorb: note some TODOs from the code review
Differential Revision: https://phab.mercurial-scm.org/D4047
Augie Fackler <augie@google.com> [Wed, 01 Aug 2018 18:29:57 -0400] rev 38921
absorb: use ui.debug() instead of open-coding it
Differential Revision: https://phab.mercurial-scm.org/D4046
Augie Fackler <augie@google.com> [Wed, 01 Aug 2018 18:24:43 -0400] rev 38920
absorb: use pycompat to get xrange
Differential Revision: https://phab.mercurial-scm.org/D4045
Augie Fackler <augie@google.com> [Wed, 01 Aug 2018 18:23:42 -0400] rev 38919
absorb: use set literal to avoid intermediate list
Differential Revision: https://phab.mercurial-scm.org/D4044
Augie Fackler <augie@google.com> [Wed, 01 Aug 2018 18:23:28 -0400] rev 38918
absorb: avoid mutable default arg
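The commit message does not show the code involved; as a generic illustration of why mutable default arguments are avoided in Python (function names here are hypothetical):

```python
def buggy_append(item, bucket=[]):
    # The default list is created once, at function definition time,
    # and is shared by every call that omits `bucket`.
    bucket.append(item)
    return bucket


def fixed_append(item, bucket=None):
    # Use None as a sentinel and build a fresh list per call.
    if bucket is None:
        bucket = []
    bucket.append(item)
    return bucket


first_buggy = list(buggy_append(1))
second_buggy = list(buggy_append(2))   # state leaked from the first call
first_fixed = fixed_append(1)
second_fixed = fixed_append(2)
```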
Differential Revision: https://phab.mercurial-scm.org/D4043
Augie Fackler <augie@google.com> [Mon, 30 Jul 2018 14:05:56 -0400] rev 38917
absorb: import extension from Facebook's hg-experimental
absorb is a wicked-fast command to use blame information to
automatically amend edits to the correct draft revision. Originally
written by Jun Wu, this import is hgext3rd/absorb/__init__.py with:
* the `testedwith` value changed
* the linelog import updated
* some missing configitems registered
* some imports reordered per check-code.py
* some missing __future__ imports added per check-code.py
Differential Revision: https://phab.mercurial-scm.org/D3991
Sushil khanchi <sushilkhanchi97@gmail.com> [Mon, 06 Aug 2018 10:03:57 +0530] rev 38916
resolve: organize 'if confirm' conditionals
Differential Revision: https://phab.mercurial-scm.org/D4123
Martin von Zweigbergk <martinvonz@google.com> [Wed, 16 May 2018 15:14:37 -0700] rev 38915
index: pass only nodetree to nt_new()
The function now only depends on the nodetree, not the index.
Differential Revision: https://phab.mercurial-scm.org/D4111
Martin von Zweigbergk <martinvonz@google.com> [Wed, 16 May 2018 13:57:28 -0700] rev 38914
index: drop now-redundant "nt" prefix of fields in nodetree struct
Differential Revision: https://phab.mercurial-scm.org/D4110
Martin von Zweigbergk <martinvonz@google.com> [Wed, 18 Jul 2018 22:27:57 -0700] rev 38913
index: move more fields onto nodetree type
The fields moved are the ones that are not related to how the nodetree
is used in the index and that will make sense for the new nodetree
instance for a subset of the index that I'll add later.
Differential Revision: https://phab.mercurial-scm.org/D4109
Martin von Zweigbergk <martinvonz@google.com> [Wed, 16 May 2018 13:15:36 -0700] rev 38912
index: extract a type for the nodetree
This is a first step towards exposing the nodetree as a Python type.
Differential Revision: https://phab.mercurial-scm.org/D4108
Martin von Zweigbergk <martinvonz@google.com> [Wed, 18 Jul 2018 17:37:06 -0700] rev 38911
index: make "nt_*" functions work on an initialized nodetree
I want to be able to reuse these functions with another nodetree
instance later (for disambiguating node prefix within a revset). That
other nodetree instance won't want to be fully populated from the
index, so this commit moves that part to the callers.
Differential Revision: https://phab.mercurial-scm.org/D4107
Gregory Szorc <gregory.szorc@gmail.com> [Fri, 03 Aug 2018 14:03:31 -0700] rev 38910
changegroup: inline _packellipsischangegroup
It now does nothing special. The logic is simple enough to inline
in the 2 callers in narrow that need it.
The changegroup generation APIs could probably be a bit simpler.
But that's for another time.
Differential Revision: https://phab.mercurial-scm.org/D4092
Gregory Szorc <gregory.szorc@gmail.com> [Fri, 03 Aug 2018 13:43:55 -0700] rev 38909
changegroup: move fullnodes into cgpacker
And with this change, the narrow packer no longer defines
any additional attributes on packer instances!
Differential Revision: https://phab.mercurial-scm.org/D4091
Gregory Szorc <gregory.szorc@gmail.com> [Fri, 03 Aug 2018 14:00:18 -0700] rev 38908
changegroup: specify ellipses mode explicitly
Currently, code throughout changegroup relies on the presence
of self._full_nodes to enable ellipses mode. This is a very tenuous
check. And the check may be wrong once we move _full_nodes into
cgpacker.
Let's capture the enabling of ellipses mode explicitly as a constructor
argument and as an instance variable.
We could probably derive ellipses mode by presence of other
variables. But for now, this explicit approach seems simplest
since it is most compatible with existing code.
Differential Revision: https://phab.mercurial-scm.org/D4090
Gregory Szorc <gregory.szorc@gmail.com> [Fri, 03 Aug 2018 13:15:28 -0700] rev 38907
changegroup: pass ellipsis roots into cgpacker constructor
And rename the internal variable to conform with naming conventions.
Differential Revision: https://phab.mercurial-scm.org/D4089
Gregory Szorc <gregory.szorc@gmail.com> [Fri, 03 Aug 2018 13:11:13 -0700] rev 38906
changegroup: move revision maps to cgpacker
And remove the underscores so the variables conform to our
naming convention.
The logic in _close() should be the only thing warranting scrutiny
during review.
Differential Revision: https://phab.mercurial-scm.org/D4088
Gregory Szorc <gregory.szorc@gmail.com> [Fri, 03 Aug 2018 13:01:53 -0700] rev 38905
changegroup: move changelogdone into cgpacker
Looking at what it is used for, it feels like there is a better
way to implement all this. So recording a TODO to track that.
Differential Revision: https://phab.mercurial-scm.org/D4087
Gregory Szorc <gregory.szorc@gmail.com> [Fri, 03 Aug 2018 12:57:11 -0700] rev 38904
changegroup: declare shallow flag in constructor
Thus begins the process of better formalizing ellipses and shallow
changegroup generation mode so it is tracked by cgpacker at
construction time instead of bolted on after the fact by a
wrapper function.
Differential Revision: https://phab.mercurial-scm.org/D4086
Gregory Szorc <gregory.szorc@gmail.com> [Fri, 03 Aug 2018 12:47:15 -0700] rev 38903
changegroup: make some packer attributes private
These methods and attributes are low level and should not be
called or accessed outside of instances. Indicate as such through
naming.
Differential Revision: https://phab.mercurial-scm.org/D4085
Gregory Szorc <gregory.szorc@gmail.com> [Fri, 03 Aug 2018 10:35:10 -0700] rev 38902
changegroup: rename cg1packer to cgpacker
There is now only a single class. We don't need to encode the
version in its name since the version is a lie.
Differential Revision: https://phab.mercurial-scm.org/D4084
Gregory Szorc <gregory.szorc@gmail.com> [Fri, 03 Aug 2018 10:35:02 -0700] rev 38901
changegroup: control delta parent behavior via constructor
The last remaining override on cg2packer related to parent delta
computation. We pass a parameter to the constructor to control
whether to delta against the previous revision and we inline all
parent delta logic into a single function.
With this change, cg2packer is empty, so it has been deleted.
Differential Revision: https://phab.mercurial-scm.org/D4083
Gregory Szorc <gregory.szorc@gmail.com> [Fri, 03 Aug 2018 10:01:58 -0700] rev 38900
changegroup: control reordering via constructor argument
cg2packer.__init__ exists just to override self._reorder. Let's
parameterize this behavior via an argument to the parent's
__init__.
The logic for self._reorder is kinda wonky. None is used as a
special value and the value should be None in some situations.
It is probably worth rewriting this logic to make behavior more
explicit. This will likely happen as part of future work to
control the delta generation process that I have planned.
Differential Revision: https://phab.mercurial-scm.org/D4082
Gregory Szorc <gregory.szorc@gmail.com> [Fri, 03 Aug 2018 09:44:30 -0700] rev 38899
changegroup: consolidate tree manifests sending into cg1packer
Previously, we overrode a method to control how manifests were
serialized. This method was redefined on cg3packer to send tree
manifests.
This commit moves the tree manifests sending variation to cg1packer
and teaches the cgpacker constructor to control which version to
use.
After these changes, cg3packer was empty. So it has been removed.
Differential Revision: https://phab.mercurial-scm.org/D4081
Gregory Szorc <gregory.szorc@gmail.com> [Thu, 02 Aug 2018 18:04:51 -0700] rev 38898
changegroup: pass end of manifests marker into constructor
cg3 inserts a custom marker in the stream once all manifests
have been transferred. This is currently abstracted out by
overriding a method.
Let's pass the end of manifests marker in as an argument to avoid
the extra method.
Differential Revision: https://phab.mercurial-scm.org/D4080
Gregory Szorc <gregory.szorc@gmail.com> [Thu, 02 Aug 2018 17:44:56 -0700] rev 38897
changegroup: pass function to build delta header into constructor
Previously, the delta header struct format was defined on each
class and each class had a separate function for building the
delta header.
We replace both of these with an argument to __init__ containing
a callable that can format a delta header given a revisiondelta
instance.
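The shape of this change can be sketched as follows. Everything here is an assumption made for illustration: the `Packer` class, the `RevisionDelta` fields, the factory names, and the header layouts are simplified stand-ins rather than the actual changegroup code.

```python
import struct
from collections import namedtuple

# Hypothetical stand-in for a revision delta object.
RevisionDelta = namedtuple(
    'RevisionDelta', 'node p1node p2node basenode linknode')

# Illustrative header layouts; the real formats may differ.
_HEADER_A = struct.Struct('>20s20s20s20s')
_HEADER_B = struct.Struct('>20s20s20s20s20s')


class Packer:
    """One class; version-specific behavior arrives as constructor args."""

    def __init__(self, version, builddeltaheader):
        self.version = version
        self._builddeltaheader = builddeltaheader

    def header(self, delta):
        return self._builddeltaheader(delta)


def make_packer_a():
    return Packer(b'01', lambda d: _HEADER_A.pack(
        d.node, d.p1node, d.p2node, d.linknode))


def make_packer_b():
    return Packer(b'02', lambda d: _HEADER_B.pack(
        d.node, d.p1node, d.p2node, d.basenode, d.linknode))


delta = RevisionDelta(b'\x01' * 20, b'\x02' * 20, b'\x03' * 20,
                      b'\x04' * 20, b'\x05' * 20)
header_a = make_packer_a().header(delta)
header_b = make_packer_b().header(delta)
```

Passing a callable keeps the per-version difference in one place, so no subclass has to override a method just to change the header layout.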
Differential Revision: https://phab.mercurial-scm.org/D4079
Gregory Szorc <gregory.szorc@gmail.com> [Thu, 02 Aug 2018 17:52:21 -0700] rev 38896
changegroup: make delta header struct formatters actual structs
Why we weren't using compiled Struct instances, I don't know. They
make code simpler. In theory they are faster. Although I don't
believe I was able to measure any meaningful change. That could be
because this code is often dominated by compression, deltafication,
and function call overhead.
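For reference, a small sketch of the two styles using Python's standard `struct` module. The format string is illustrative, not the actual delta header format.

```python
import struct

# Illustrative layout: three 20-byte fields plus a 16-bit flags field.
FORMAT = '>20s20s20sH'

node = b'\x01' * 20
p1 = b'\x02' * 20
p2 = b'\x03' * 20

# Module-level function: the format string is passed on every call.
packed_with_function = struct.pack(FORMAT, node, p1, p2, 0)

# Compiled Struct: the format is compiled once and reused.
header = struct.Struct(FORMAT)
packed_with_struct = header.pack(node, p1, p2, 0)

# The instance also exposes the record size and a matching unpack().
unpacked = header.unpack(packed_with_struct)
```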
Differential Revision: https://phab.mercurial-scm.org/D4078
Gregory Szorc <gregory.szorc@gmail.com> [Thu, 02 Aug 2018 17:33:23 -0700] rev 38895
changegroup: pass version into constructor
Currently, the version is an attribute on each class. Passing
the argument into the constructor gets us one step closer to
eliminating cg2packer and cg3packer.
Differential Revision: https://phab.mercurial-scm.org/D4077
Gregory Szorc <gregory.szorc@gmail.com> [Thu, 02 Aug 2018 17:29:53 -0700] rev 38894
changegroup: define functions for creating changegroup packers
Currently, we have 3 classes for changegroup generation. Each class
handles a specific changegroup format. And each subsequent version's
class inherits from the previous one.
The interface for the classes is not very well defined and a lot of
version-specific behavior is behind overloaded functions. This
approach adds complexity and makes changegroup generation difficult
to reason about.
Upcoming commits will be consolidating these 3 classes so differences
between changegroup versions and changegroup generation are controlled
by parameters to a single constructor / type rather than by
overriding class attributes via inheritance.
We begin this process by building dedicated functions for creating
each changegroup packer instance. Currently they just call the
constructor on the appropriate class. This will soon change.
Differential Revision: https://phab.mercurial-scm.org/D4076
Gregory Szorc <gregory.szorc@gmail.com> [Fri, 03 Aug 2018 10:05:26 -0700] rev 38893
changegroup: capture revision delta in a data structure
The current changegroup generation code is tightly coupled to
the revlog API. This tight coupling makes it difficult to implement
alternate storage backends without requiring a large surface area
of the revlog API to be exposed. This is not desirable.
In order to support changegroup generation with non-revlog storage,
we'll need to abstract the concept of delta generation.
This commit is the first step down that road. We introduce a
data structure for representing a delta in a changegroup.
The API still leaves a lot to be desired. But at least we now
have separation between data and actions performed on it.
As part of this, we tweak behavior slightly: we no longer
concatenate the delta prefix with the metadata header. Instead,
we track and emit the prefix as a separate chunk. This shouldn't
have any meaningful impact since all the chunks just get sent to
the wire, the compressor, etc.
Because we're introducing a new object, this does add some
overhead to changegroup execution. `hg perfchangegroupchangelog`
on my clone of the Mercurial repo (~40,000 visible revisions in
the changelog) slows down a bit:
! wall 1.268600 comb 1.270000 user 1.270000 sys 0.000000 (best of 8)
! wall 1.419479 comb 1.410000 user 1.410000 sys 0.000000 (best of 8)
And for `hg bundle -t none-v2 -a /dev/null`:
before: real 6.610 secs (user 6.460+0.000 sys 0.140+0.000)
after: real 7.210 secs (user 7.060+0.000 sys 0.140+0.000)
I plan to claw back this regression in future commits. And I may
even do away with this data structure once the refactor is complete.
For now, it makes things easier to comprehend.
Differential Revision: https://phab.mercurial-scm.org/D4075
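As a rough illustration of separating data from the actions performed on
it, the sketch below uses a plain container plus a function that emits the
header and each delta chunk as separate pieces instead of concatenating
them. The class name, field names and `emit_chunks` are hypothetical and
are not taken from the changegroup module.

```python
from dataclasses import dataclass
from typing import Iterator, List


@dataclass(frozen=True)
class RevisionDelta:
    """Hypothetical container describing one delta in a changegroup."""
    node: bytes
    p1node: bytes
    p2node: bytes
    basenode: bytes
    linknode: bytes
    deltachunks: List[bytes]  # optional prefix chunk, then the delta data


def emit_chunks(delta: RevisionDelta, header: bytes) -> Iterator[bytes]:
    """Yield the metadata header, then every delta chunk, unconcatenated."""
    yield header
    for chunk in delta.deltachunks:
        yield chunk
```

Because the consumer only iterates over chunks, it should not matter
whether the prefix arrives fused to the header or as its own chunk.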
Gregory Szorc <gregory.szorc@gmail.com> [Thu, 02 Aug 2018 16:36:40 -0700] rev 38892
changegroup: inline ellipsisdata()
There's only one caller of it. I don't think it needs to exist as
a standalone function.
Differential Revision: https://phab.mercurial-scm.org/D4074
Gregory Szorc <gregory.szorc@gmail.com> [Thu, 02 Aug 2018 17:05:11 -0700] rev 38891
changegroup: rename "revlog" variables
"revlog" shadows the module import. But more importantly, changegroup
generation should be storage agnostic and not assume the existence
of revlogs. Let's rename the thing providing revision storage to
"store" to reflect this ideal property.
Differential Revision: https://phab.mercurial-scm.org/D4073
Gregory Szorc <gregory.szorc@gmail.com> [Thu, 02 Aug 2018 14:15:10 -0700] rev 38890
changegroup: move generate() modifications from narrow
Narrow had a custom version of generate() that was essentially a copy
of generate() with inline additions to facilitate ellipses serving.
This commit inlines those modifications into generate().
Differential Revision: https://phab.mercurial-scm.org/D4067
Gregory Szorc <gregory.szorc@gmail.com> [Thu, 02 Aug 2018 12:18:35 -0700] rev 38889
changegroup: move generatefiles() from narrow
The code is a bit ugly in that it overrides the linknodes
function that is passed in as a function. I'd like to think
that the caller of generatefiles() would pass in the appropriate
function. We can clean this up later.
Differential Revision: https://phab.mercurial-scm.org/D4066
Gregory Szorc <gregory.szorc@gmail.com> [Thu, 02 Aug 2018 12:12:12 -0700] rev 38888
changegroup: move _sortgroup() from narrow
Differential Revision: https://phab.mercurial-scm.org/D4065
Gregory Szorc <gregory.szorc@gmail.com> [Thu, 02 Aug 2018 09:52:01 -0700] rev 38887
changegroup: move close() from narrow
More of the same.
Differential Revision: https://phab.mercurial-scm.org/D4064
Gregory Szorc <gregory.szorc@gmail.com> [Thu, 02 Aug 2018 09:53:22 -0700] rev 38886
changegroup: move revchunk() from narrow
The monkeypatched revchunk for ellipses serving is a
completely independent implementation. We model it as such
in the changegroup code. revchunk() is now a simple proxy
function.
Again, I wish we had better APIs here. Especially since this
narrow code is part of cg1packer and cg1packer can't be used
with narrow. Class inheritance is wonky. And I will definitely
be making changes to changegroup code for delta generation.
As part of the code move, `node.nullrev` was replaced by
`nullrev`, and a reference to `orig` was replaced by a direct call to
`self._revchunknormal`.
Differential Revision: https://phab.mercurial-scm.org/D4063
Gregory Szorc <gregory.szorc@gmail.com> [Thu, 02 Aug 2018 09:40:18 -0700] rev 38885
changegroup: move deltaparent() from narrow
I'm not keen on performing the attribute sniff to test for
presence of ellipses mode: I'd rather we use a separate packer
instance that was ellipses mode specific. But I've tried to
formalize a better API without narrow in core and I can't
make sense of all the monkeypatching. My goal is to inline
as much of the monkeypatching as possible then refactor the
changegroup generation API.
We add this code to the cg2packer because narrow doesn't work
with cg1.
Differential Revision: https://phab.mercurial-scm.org/D4062
Gregory Szorc <gregory.szorc@gmail.com> [Sat, 28 Jul 2018 17:59:37 -0700] rev 38884
changegroup: move _packellipsischangegroup() from narrow
The behavior here is not ideal, as the function constructs a
packer then adds attributes to it. This will be cleaned up in
subsequent commits. Moving this code is necessary to move the
remainder of the bundle2-level changegroup part generation code
into core.
Differential Revision: https://phab.mercurial-scm.org/D4061
Gregory Szorc <gregory.szorc@gmail.com> [Sat, 28 Jul 2018 17:52:21 -0700] rev 38883
changegroup: move ellipsisdata() from narrow
This is a pretty straightforward copy of the function.
Differential Revision: https://phab.mercurial-scm.org/D4060
Yuya Nishihara <yuya@tcha.org> [Sun, 22 Jul 2018 19:48:50 +0900] rev 38882
fileset: narrow status computation by left-hand-side of 'and' node
Timing with warm disk cache:
$ hg status --cwd mozilla-central 'set:path:build/ and unknown()' --time
(orig) time: real 1.970 secs (user 1.560+0.000 sys 0.410+0.000)
(new) time: real 0.330 secs (user 0.310+0.000 sys 0.020+0.000)
Yuya Nishihara <yuya@tcha.org> [Sun, 22 Jul 2018 19:43:57 +0900] rev 38881
fileset: move copy constructor of matchctx near __init__
Yuya Nishihara <yuya@tcha.org> [Sun, 22 Jul 2018 11:20:48 +0900] rev 38880
fileset: build status according to 'withstatus' hint
_switchcallers is no longer needed since the 'withstatus' node is reinserted
for arguments of functions like revs().
A new matchctx instance is created per 'withstatus' node to make sure that
the status tuple is available only for children of that node.
Yuya Nishihara <yuya@tcha.org> [Sat, 21 Jul 2018 20:27:53 +0900] rev 38879
fileset: insert hints where status should be computed
This will allow us to compute status against a narrowed set of files.
For example, "path:build/ & (unknown() + missing())" is rewritten as
"path:build/ & <withstatus>(unknown() + missing(), 'unknown missing')",
and the status call can be narrowed by the left-hand-side matcher,
"path:build/".
mctx.buildstatus() calls will be solely processed by getmatchwithstatus().
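The rewrite in the example above can be sketched on a toy tuple tree. This
is not the fileset module's implementation: the node shapes (`'func'`,
`'and'`, `'or'`), the `STATUS_FUNCS` set and both helper functions are
assumptions chosen to reproduce the example.

```python
# Hypothetical set of predicates that need file status.
STATUS_FUNCS = {'modified', 'added', 'removed', 'unknown', 'missing'}


def _merge(a, b):
    """Concatenate two key lists, dropping duplicates but keeping order."""
    return a + [k for k in b if k not in a]


def _analyze(tree):
    """Return (status keys not yet covered by a hint, rewritten tree)."""
    op = tree[0]
    if op == 'func':
        keys = [tree[1]] if tree[1] in STATUS_FUNCS else []
        return keys, tree
    if op in ('and', 'or'):
        ka, ta = _analyze(tree[1])
        kb, tb = _analyze(tree[2])
        if op == 'and' and not ka and kb:
            # Only the right side needs status: put the hint there so the
            # left-hand matcher can narrow the status call.
            return [], (op, ta, ('withstatus', tb, ' '.join(kb)))
        return _merge(ka, kb), (op, ta, tb)
    return [], tree  # plain pattern or any other leaf


def insert_hints(tree):
    """Wrap the whole tree if status is needed and no narrower spot exists."""
    keys, tree = _analyze(tree)
    return ('withstatus', tree, ' '.join(keys)) if keys else tree
```

For the tree of "path:build/ & (unknown() + missing())", the hint lands on
the right-hand operand of the 'and' node, which is the position that lets
the left-hand pattern restrict the files for which status is computed.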
Yuya Nishihara <yuya@tcha.org> [Sun, 22 Jul 2018 11:12:55 +0900] rev 38878
fileset: move buildstatus() to matchctx method
In future patches, file status will be computed while evaluating a parsed
tree. This patch provides a matchctx interface to build status.
Yuya Nishihara <yuya@tcha.org> [Sun, 22 Jul 2018 10:58:32 +0900] rev 38877
fileset: keep basectx by matchctx
Yuya Nishihara <yuya@tcha.org> [Sun, 22 Jul 2018 10:55:38 +0900] rev 38876
fileset: pass in basectx to _buildstatus()
I'll make matchctx remember both ctx and basectx so that file status between
them can be computed later. This prepares for the change.
Sushil khanchi <sushilkhanchi97@gmail.com> [Sat, 04 Aug 2018 12:58:08 +0530] rev 38875
resolve: update commands.resolve.confirm help text
Mention --mark and --unmark in the help text of the
commands.resolve.confirm config option.
Differential Revision: https://phab.mercurial-scm.org/D4103
Sushil khanchi <sushilkhanchi97@gmail.com> [Sat, 04 Aug 2018 12:43:41 +0530] rev 38874
resolve: support confirm config option with --unmark flag
commands.resolve.confirm now also respects the --unmark option and asks
for confirmation before marking all resolved files as unresolved.
As with --mark, it asks only when no file patterns are passed, because
in that case the default is to mark every resolved file as unresolved.
If the user has passed file patterns, I think there is no need to ask for
confirmation.
Differential Revision: https://phab.mercurial-scm.org/D4102
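The rule described in this message can be summarized as a small predicate.
This is a sketch of the described behaviour only: `should_confirm` and its
parameters are hypothetical names, and the real command may take other
cases into account.

```python
def should_confirm(confirm_enabled, mark, unmark, pats):
    """Ask only if the option is on, --mark or --unmark was given,
    and no file patterns narrow the operation."""
    return bool(confirm_enabled and (mark or unmark) and not pats)
```

With the option enabled, `--unmark` without patterns would prompt, while
`--unmark some/file` would not.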
Kyle Lippincott <spectral@google.com> [Sun, 05 Aug 2018 00:53:55 -0700] rev 38873
resolve: correct behavior of mark-check=none to match docs
Differential Revision: https://phab.mercurial-scm.org/D4121
Martin von Zweigbergk <martinvonz@google.com> [Thu, 02 Aug 2018 14:57:20 -0700] rev 38872
narrow: move .hg/narrowspec to .hg/store/narrowspec (BC)
The narrowspec is more closely related to the store than to the
working copy. For example, if the narrowspec changes, the set of
revlogs also needs to change (the working copy may change, but that
depends on which commit is checked out). Also, when using the share
extension, the narrowspec needs to be shared along with the
store. This patch therefore moves the narrowspec into the store/
directory.
This is clearly a breaking change, but I haven't bothered trying to
fall back to reading the narrowspec from the old location (.hg/),
because there are very few users of narrow out there. (We'll add a
temporary hack to our Google-internal extension to handle the
migration.)
Differential Revision: https://phab.mercurial-scm.org/D4099
Martin von Zweigbergk <martinvonz@google.com> [Fri, 03 Aug 2018 13:53:02 -0700] rev 38871
narrow: drop checkambig=True when restoring backup
IIUC, checkambig is about updating timestamps of the file while
renaming. That's important for the dirstate, but we never check the
timestamp of the narrowspec file. We can therefore avoid
passing checkambig=True.
Differential Revision: https://phab.mercurial-scm.org/D4098
Martin von Zweigbergk <martinvonz@google.com> [Thu, 02 Aug 2018 14:30:40 -0700] rev 38870
narrow: remove a repo file-cache invalidation
It's unclear why this was needed. All tests pass without it. I asked
Kyle Lippincott (who added the check) and he also doesn't remember
what it was for.
Differential Revision: https://phab.mercurial-scm.org/D4097
Martin von Zweigbergk <martinvonz@google.com> [Fri, 03 Aug 2018 11:09:41 -0700] rev 38869
narrow: call narrowspec.{save,restore,clear}backup directly
I want to move .hg/narrowspec to .hg/store/narrowspec and we need to
decouple the narrowspec update from the dirstate update for that. This
patch lets the callers call the narrowspec backup functions directly,
in addition to the dirstate backup functions. The narrowspec methods
are made to check if narrowing is enabled. For that, a repo instance
was needed, which all the callers luckily already had available.
Differential Revision: https://phab.mercurial-scm.org/D4096
Martin von Zweigbergk <martinvonz@google.com> [Sat, 04 Aug 2018 23:15:06 -0700] rev 38868
index: don't add 1 to length variables
A lot of "+ 1" and "-1" were mechanically added to ease the transition
in 781b2720d2ac (index: don't include nullid in len(), 2018-07-20).
Let's clean it up now.
Differential Revision: https://phab.mercurial-scm.org/D4106
Martin von Zweigbergk <martinvonz@google.com> [Sat, 04 Aug 2018 22:48:25 -0700] rev 38867
index: drop support for nullid at position len(index) in index_node
I think no callers exist since at least a3dacabd476b (index: don't
allow index[len(index)] to mean nullid, 2018-07-20).
Differential Revision: https://phab.mercurial-scm.org/D4105
Martin von Zweigbergk <martinvonz@google.com> [Sat, 04 Aug 2018 23:15:03 -0700] rev 38866
index: return False for "len(index) in index"
Since we no longer accept index[len(index)], we should clearly make
"len(index) in index" return False. This should have been part of
a3dacabd476b (index: don't allow index[len(index)] to mean nullid,
2018-07-20).
Differential Revision: https://phab.mercurial-scm.org/D4104
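The consistency argued for here, that membership should agree with what
indexing accepts, can be shown with a toy class. `ToyIndex` is a
hypothetical stand-in and not the real index implementation; handling of
the null revision is deliberately left out.

```python
class ToyIndex:
    """Toy sequence whose `in` test agrees with what `[]` accepts."""

    def __init__(self, entries):
        self._entries = list(entries)

    def __len__(self):
        return len(self._entries)

    def __getitem__(self, rev):
        if not 0 <= rev < len(self._entries):
            raise IndexError('revision %d out of range' % rev)
        return self._entries[rev]

    def __contains__(self, rev):
        # len(index) is one past the last valid revision, so it is excluded.
        return isinstance(rev, int) and 0 <= rev < len(self._entries)
```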
Yuya Nishihara <yuya@tcha.org> [Sat, 21 Jul 2018 17:19:12 +0900] rev 38865
fileset: combine union of basic patterns into single matcher
This appears to improve query performance in a big repository more than I
expected. The less Python we run in a hot loop, the faster the computation.
$ hg files --cwd mozilla-central --time 'set:a* + b* + c* + d* + e*'
(orig) time: real 0.670 secs (user 0.640+0.000 sys 0.030+0.000)
(new) time: real 0.210 secs (user 0.180+0.000 sys 0.020+0.000)
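One way to picture the optimization is as a pass over the operands of a
union that folds all basic patterns into a single node, so that one
matcher is built instead of one per pattern. The node shapes and the
function name below are hypothetical and serve only as illustration.

```python
def combine_patterns(operands):
    """Fold all ('pattern', s) operands into one ('patterns', [...]) node.

    Other operands are kept after the combined node; a union does not
    depend on operand order.
    """
    patterns = [op[1] for op in operands if op[0] == 'pattern']
    others = [op for op in operands if op[0] != 'pattern']
    if len(patterns) > 1:
        return [('patterns', patterns)] + others
    return list(operands)
```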
Yuya Nishihara <yuya@tcha.org> [Sat, 21 Jul 2018 17:13:34 +0900] rev 38864
fileset: reorder 'or' expression by weight
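A sketch of what reordering by weight could look like: operands of an 'or'
are sorted so that cheap checks run first, which lets a union stop early
when a cheap operand already matches. Only the name WEIGHT_CHECK_FILENAME
appears in this log; the other constant names, all numeric values and
`reorder_or` are assumptions.

```python
# Illustrative weights; the values are assumptions.
WEIGHT_CHECK_FILENAME = 0.5
WEIGHT_STATUS = 10          # hypothetical name and value
WEIGHT_READ_CONTENTS = 30   # hypothetical name and value


def reorder_or(operands):
    """Sort (weight, node) pairs so the cheapest operand comes first.

    sorted() is stable, so operands of equal weight keep their order.
    """
    return sorted(operands, key=lambda pair: pair[0])
```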
Yuya Nishihara <yuya@tcha.org> [Sat, 04 Aug 2018 17:08:33 +0900] rev 38863
fileset: introduce weight constants for readability
These constants are defined in the filesetlang module since it's the
bottommost module depending on WEIGHT_CHECK_FILENAME, and extensions
will be likely to import it to process function arguments.
Credit for the naming goes to Augie Fackler.