Martin von Zweigbergk <martinvonz@google.com> [Fri, 31 May 2019 15:28:31 -0700] rev 42411
narrowspec: replace one recursion-avoidance hack with another
When updating the working copy narrowspec, we call context.walk() in
order to find which files to update the working copy
with. context.walk() calls repo.narrowmatch(). In order to avoid
infinite recursion in this case, we have a hack that assigns the new
values for repo.narrowpats and repo._narrowmatch. However, doing that
of course breaks future invalidation of those properties (they're
@storecache'd). Let's instead avoid the infinite recursion by setting
a flag on the repo instance when we're updating the working copy.
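A minimal sketch of the flag-based approach described above, not the actual Mercurial code. The class `FakeRepo`, the attribute `_updating`, and the `stale` marker are hypothetical names chosen for illustration; only the idea (set a flag while updating the working copy, and skip the re-entrant update when the flag is set) comes from the message.

```python
class FakeRepo:
    """Hypothetical stand-in for a repo object (not Mercurial's class)."""

    def __init__(self, files):
        self._files = files
        self._updating = False  # hypothetical flag, set only during an update
        self.stale = True       # pretend the working copy spec is out of date
        self.updates = 0

    def narrowmatch(self):
        # Without the flag check, update_working_copy() -> walk() ->
        # narrowmatch() -> update_working_copy() would recurse forever.
        if self.stale and not self._updating:
            self.update_working_copy()
        return lambda path: path.startswith("src/")

    def walk(self):
        match = self.narrowmatch()
        return [f for f in self._files if match(f)]

    def update_working_copy(self):
        self._updating = True
        try:
            self.updates += 1
            selected = self.walk()  # re-enters narrowmatch(), guarded by the flag
            self.stale = False
            return selected
        finally:
            # Always clear the flag so later calls behave normally.
            self._updating = False
```

Because the flag lives next to the cached properties rather than replacing them, nothing is assigned over the cached values, so their invalidation is left alone.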
Differential Revision: https://phab.mercurial-scm.org/D6468
Martin von Zweigbergk <martinvonz@google.com> [Sat, 09 Mar 2019 22:13:06 -0800] rev 42410
merge: simplify initialization of "pas"
Differential Revision: https://phab.mercurial-scm.org/D6472
Martin von Zweigbergk <martinvonz@google.com> [Sat, 09 Mar 2019 22:11:27 -0800] rev 42409
merge: reorder some initialization to make more sense
This puts the closely related definitions of "pl", "p1", "p2", "pas"
close together, and moves the definition of "overwrite" away and
closer to where it's first used.
Differential Revision: https://phab.mercurial-scm.org/D6471
Georges Racinet <georges.racinet@octobus.net> [Wed, 22 May 2019 08:27:02 +0000] rev 42408
rust-dirstate: architecture independence fix
Apparently, c_char is u8 on ppc64le and i8 on amd64
Differential Revision: https://phab.mercurial-scm.org/D6473
Martin von Zweigbergk <martinvonz@google.com> [Tue, 14 May 2019 22:20:10 -0700] rev 42407
context: get filesadded() and filesremoved() from changeset if configured
This adds the read side for getting the sets of added and removed
files from the changeset extras. I timed this command on the hg repo:
hg log -T '{rev}\n {files}\n %:{file_mods}\n +{file_adds}\n -{file_dels}\n'
It took 1m21s before and 6.4s after. I also used that command to check
that the result didn't change compared to calculating the values from
the manifests on the fly (it didn't change).
In the mozilla-unified repo, the same command run on
FIREFOX_BETA_58_END::FIREFOX_BETA_59_END went from 29s to 0.67s.
Differential Revision: https://phab.mercurial-scm.org/D6417
Martin von Zweigbergk <martinvonz@google.com> [Tue, 14 May 2019 22:19:51 -0700] rev 42406
changelog: optionally store added and removed files in changeset extras
As mentioned in an earlier patch, copies._chain() is used a lot in the
changeset-centric version of pathcopies(). It is expensive because it
needs to look at the manifest in order to filter out copies whose
target file has since been removed. I want to store the sets of added
and removed files in the changeset in order to speed that up. This
patch does the writing part of that. It could easily be a separate
config, but it's currently tied to experimental.copies.write-to since
that's the only real use case (it will also make the {file_*} template
keywords faster, but I doubt that anyone cares enough about those to
write extra metadata for them).
The new information is stored in the changeset extras. Since they're
always subsets of the changeset's "files" list, they're stored as
indexes into that list. I've stored the indexes as stringified ints
separated by NUL bytes. The size of 00changelog.d for the hg repo
increased by 0.28% (compared to the size with only
copy information in the changesets, which in turn is 0.17% larger than
without copy information). We could store only the delta between the
indexes and we could store them in binary, but the chosen format is
more readable.
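As a rough illustration of the format described above (stringified indexes into the "files" list, separated by NUL bytes), here is a small sketch. The function names are hypothetical and this is not the code from the patch; it only shows the encoding idea.

```python
def encode_file_indexes(files, subset):
    """Encode `subset` as NUL-separated decimal indexes into `files`."""
    wanted = set(subset)
    missing = wanted - set(files)
    if missing:
        raise ValueError("not in files list: %r" % sorted(missing))
    # Indexes are emitted in the order the files appear in `files`.
    return b"\0".join(b"%d" % i for i, f in enumerate(files) if f in wanted)


def decode_file_indexes(files, data):
    """Inverse of encode_file_indexes(); an empty value means no files."""
    if not data:
        return []
    return [files[int(part)] for part in data.split(b"\0")]
```

For example, with `files = ["a", "b", "c", "d"]`, the subset `["d", "b"]` would be encoded as `b"1\x003"` and decoded back to `["b", "d"]`.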
We could also have implemented this as a cache outside the
changelog. One advantage of doing it that way is that we would get the
speedups from the {file_*} template keywords also on old
repos. Another advantage is that we can rewrite the cache if we
find a bug in how we calculate the set of files. A disadvantage is
that it would be more complex. Another is that it would surely use
more space. We already write the copy information to the changeset
extras, so it seems like a small step to also write these file sets.
Differential Revision: https://phab.mercurial-scm.org/D6416
Martin von Zweigbergk <martinvonz@google.com> [Thu, 18 Apr 2019 13:35:02 -0700] rev 42405
templatekw: make {file_*} compare to both merge parents (issue4292)
This redefines the {file_adds}, {file_dels}, {file_mods} template
keywords by getting the lists from the recently introduced context
methods instead of getting them from status compared to p1. As
mentioned before, these are better defined on merge commits. The total
number of files from the three lists now always adds up to the number
of files in {files}.
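The invariant stated above can be sketched as follows. This assumes, for illustration only, that the added and removed sets are disjoint subsets of the files list and that every remaining file counts as modified; the function name `split_files` is hypothetical.

```python
def split_files(files, added, removed):
    """Partition `files` into (mods, adds, dels), preserving order."""
    added, removed = set(added), set(removed)
    file_adds = [f for f in files if f in added]
    file_dels = [f for f in files if f in removed]
    # Assumption: anything neither added nor removed is "modified".
    file_mods = [f for f in files if f not in added and f not in removed]
    return file_mods, file_adds, file_dels
```

Under those assumptions, `len(file_mods) + len(file_adds) + len(file_dels)` equals `len(files)` for every changeset, including merges.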
I timed this command:
hg log -r 4.0::5.0 -T '{rev}\n {file_mods}\n {file_adds}\n {file_dels}\n'
It went from 7.6s to 5.6s with this patch. So it's actually faster
than before.
Note that the "files:" field in the bazaar test log output was using
"{file_mods}" (not "{files}" as one might think based on the label).
Differential Revision: https://phab.mercurial-scm.org/D6369
Martin von Zweigbergk <martinvonz@google.com> [Fri, 31 May 2019 09:25:51 -0700] rev 42404
narrowspec: use vfs.tryread() instead of reimplementing
Note that parseconfig() works well with empty strings.
Differential Revision: https://phab.mercurial-scm.org/D6465
Martin von Zweigbergk <martinvonz@google.com> [Fri, 31 May 2019 13:25:28 -0700] rev 42403
help: remove a superfluous "the" in revlogs text
Differential Revision: https://phab.mercurial-scm.org/D6466
Martin von Zweigbergk <martinvonz@google.com> [Thu, 08 Mar 2018 11:08:24 -0800] rev 42402
setdiscovery: make progress on most connected groups each roundtrip
Consider history like this:
o
| o
| |
| o
| |
| o
|/
o
| o
| |
| o
| |
| o
|/
o
| o
| |
| o
| |
| o
|/
o
~
Assume the left mainline is available in the remote repo and the other
commits are only in the local repo. Also imagine that instead of 3
local branches with 3 commits on each, there are 1000 branches (the
number of commits on each doesn't matter much here). In such a
scenario, the current setdiscovery code will pick a sample size of 200
among these branches and ask the remote which of them it has. However,
the discovery for each such branch is completely independent of the
discovery for the others -- knowing whether the remote has a commit in
one branch doesn't give us any information about the other
branches. The discovery will therefore take at least 5 roundtrips
(maybe more depending on which commit in each linear chain was
sampled). Since the discovery for each branch is independent, there is
no reason to let one branch wait for another, so this patch makes it
so we sample at least as many commits as there are branches. It may
still happen (it's very likely, even) that we get multiple samples
from one branch and none from another, but that will even out over a
few rounds and I think this is still a big improvement.
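The reasoning above can be sketched numerically. The helpers below are hypothetical (they are not the setdiscovery code, and how the number of branches is counted is left as an assumption); they only illustrate "sample at least as many commits as there are branches" and the lower bound on roundtrips.

```python
import math


def choose_sample_size(default_size, branch_count, httppostargs):
    """Grow the sample to cover every independent branch when allowed."""
    if not httppostargs:
        # Old behavior: keep the fixed size because of HTTP header limits.
        return default_size
    return max(default_size, branch_count)


def min_roundtrips(branch_count, sample_size):
    """Lower bound: each roundtrip can probe at most `sample_size` branches."""
    return math.ceil(branch_count / sample_size)
```

With 1000 branches and a sample size of 200, `min_roundtrips(1000, 200)` gives 5, matching the "at least 5 roundtrips" above; once the sample size is raised to 1000, the bound drops to 1, although in practice a few more rounds are needed because samples are not spread evenly across branches.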
Because of http header size limits, we still use the old behavior
unless experimental.httppostargs=true.
I've timed this by running `hg debugdiscovery mozilla-unified --debug` in the
mozilla-try repo. Both repos were local. Before this patch, the last part
of the output was:
2249 total queries in 5276.4859s
elapsed time: 5276.652634 seconds
heads summary:
  total common heads: 13
    also local heads: 4
    also remote heads: 8
    both: 4
  local heads: 28317
    common: 4
    missing: 28313
  remote heads: 12
    common: 8
    unknown: 4
local changesets: 2014901
  common: 530373
  missing: 1484528
common heads:
1dad417c28ad 4a108e94d3e2 4d7ef530fffb 5350524bb654 777e60ca8853 7d97fafba271 9cd2ab4d0029 a55ce37217da d38398e5144e dcc6d7a0dc00 e09297892ada e24ec6070d7b fd559328eaf3
After this patch, the output was (including all the samples, since
there were so few now):
taking initial sample
query 2; still undecided: 1599476, sample size is: 108195
sampling from both directions
query 3; still undecided: 810922, sample size is: 194158
sampling from both directions
query 4; still undecided: 325882, sample size is: 137302
sampling from both directions
query 5; still undecided: 111459, sample size is: 74586
sampling from both directions
query 6; still undecided: 26805, sample size is: 23960
sampling from both directions
query 7; still undecided: 2549, sample size is: 2528
sampling from both directions
query 8; still undecided: 21, sample size is: 21
8 total queries in 24.5064s
elapsed time: 24.670051 seconds
heads summary:
  total common heads: 13
    also local heads: 4
    also remote heads: 8
    both: 4
  local heads: 28317
    common: 4
    missing: 28313
  remote heads: 12
    common: 8
    unknown: 4
local changesets: 2014901
  common: 530373
  missing: 1484528
common heads:
1dad417c28ad 4a108e94d3e2 4d7ef530fffb 5350524bb654 777e60ca8853 7d97fafba271 9cd2ab4d0029 a55ce37217da d38398e5144e dcc6d7a0dc00 e09297892ada e24ec6070d7b fd559328eaf3
Differential Revision: https://phab.mercurial-scm.org/D2647