Georges Racinet <georges.racinet@octobus.net> [Tue, 18 Feb 2020 19:11:17 +0100] rev 44388
rust-nodemap: core implementation for shortest
In this implementation, we just make `lookup()` return also the number
of steps that have been needed to come to a conclusion from the
nodetree data, and `validate_candidate()` takes care of the special
cases related to `NULL_NODE`.
This way of doing minimizes code duplication, but it means that
the comparatively slower finding of first non zero nybble will run
for all calls to `find()` where it is not needed.
Still running on the file generated for the mozilla-central repository,
it seems indeed that we now get more ofter 320 ns than 310. The odds that
this could have a significant impact on real life Mercurial performance
are still looking low. Let's wait for actual benchmark runs to see if
an optimization is needed here.
Differential Revision: https://phab.mercurial-scm.org/D7819
Georges Racinet <georges.racinet@octobus.net> [Tue, 18 Feb 2020 19:11:16 +0100] rev 44387
rust-nodemap: special case for prefixes of NULL_NODE
We have to behave as though NULL_NODE was stored in the node tree,
although we don't store it.
Differential Revision: https://phab.mercurial-scm.org/D7798
Georges Racinet <georges.racinet@octobus.net> [Tue, 18 Feb 2020 19:11:15 +0100] rev 44386
rust-nodemap: pure Rust example
To run, use `cargo run --release --example nodemap`
This demonstrates that simple scenarios entirely written
in Rust can content themselves with `NodeTree<T>`.
The example mmaps both the nodemap file and the changelog index.
We had of course to include an implementation of `RevlogIndex`
directly, which isn't much at this stage. It felt a bit
prematurate to include it in the lib.
Here are some first performance measurements, obtained with
this example, on a clone of mozilla-central with 440000
changesets:
(create) Nodemap constructed in RAM in 153.638305ms
(query
CAE63161B68962) found in 22.362us: Ok(Some(269489))
(bench) Did 3 queries in 36.418µs (mean 12.139µs)
(bench) Did 50 queries in 184.318µs (mean 3.686µs)
(bench) Did 100000 queries in 31.053461ms (mean 310ns)
To be fair, even between bench runs, results tend to depend whether
the file is still in kernel caches, and it's not so easy to
get back to a real cold start. The worst we've seen was in the
50us ballpark.
In any busy server setting, the pages would always be in RAM.
We hope it's good enough not to be significantly slower on any
concrete Mercurial operation than the C nodetree when fully in RAM,
and of course this implementation has the serious headstart advantage
of persistence.
Differential Revision: https://phab.mercurial-scm.org/D7797
Georges Racinet <georges.racinet@octobus.net> [Tue, 18 Feb 2020 19:11:15 +0100] rev 44385
rust-nodemap: input/output primitives
These allow to initiate a `NodeTree` from an immutable opaque
sequence of bytes, which could be passed over from Python
(extracted from a `PyBuffer`) or directly mmapped from a file.
Conversely, we can consume
a `NodeTree`, extracting the bytes that express what
has been added to the immutable part, together with the
original immutable part.
This gives callers the choice to start a new Nodetree.
After writing to disk, some would prefer to reread for
best guarantees (very cheap if mmapping), some others will
find it more convenient to grow the memory that was considered
immutable in the `NodeTree` and continue from there.
This is enough to build examples running on real data and
start gathering performance hints.
Differential Revision: https://phab.mercurial-scm.org/D7796
Martin von Zweigbergk <martinvonz@google.com> [Thu, 13 Feb 2020 15:33:36 -0800] rev 44384
pyoxidizer: allow extensions to be loaded from the file system
It seems that setting this config is all that's needed to be able to
load extensions from the file system (which we clearly want). Thanks
for making this work, Gregory Szorc!.
Differential Revision: https://phab.mercurial-scm.org/D8122
Valentin Gatien-Baron <valentin.gatienbaron@gmail.com> [Mon, 17 Feb 2020 20:30:03 -0500] rev 44383
graft: always allow hg graft --base . (
issue6248)
`hg graft --base . -r abc` is rejected before this change with a
"nothing to merge" error, if `abc` does not descend from `.`.
This looks like an artifact of the implementation rather than intended
behavior. It makes perfect sense to apply the diff between `.` and
`abc` to the working copy (i.e. degenerate into `hg revert`),
regardless of what `abc` is.
Differential Revision: https://phab.mercurial-scm.org/D8127
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 19 Feb 2020 17:30:04 +0100] rev 44382
revlog-compression: update the config to be a list
format.revlog-compression is now a list of engine, the first supported one is to
be used. Doing this have several benefits:
1) this is fully backward compatible, config using a single entry will be read
as a single item list, not changing any behavior.
2) This open the way to use zstd by default without impacting platform were it
is not available. This will be done in a later changesets.
Using zstd provide a significant performance boost explained in :
bb271ec2fbfb.
However zstd is not available in some cases, A notable example is the `--pure`
version of Mercurial which doesn't come with zstd support.
Differential Revision: https://phab.mercurial-scm.org/D8148
Pulkit Goyal <7895pulkit@gmail.com> [Wed, 19 Feb 2020 13:39:00 +0530] rev 44381
remotefilelog: add 'changelog' arg to shallowcg1packer.generate (
issue6269)
This cause traceback on widening using narrow extension when remotefilelog
is enabled.
Differential Revision: https://phab.mercurial-scm.org/D8134
Martin von Zweigbergk <martinvonz@google.com> [Tue, 25 Feb 2020 12:41:35 -0800] rev 44380
drawdag: abide by new createmarkers() API
The `obsolete.createmarkers()` API was changed in
6335c0de80fa
(obsolete: allow multiple predecessors in createmarkers, 2018-09-22)
to prefer its precursors input to be a tuple instead of a single
precursor. Let's fix `drawdag.py` to comply.
Differential Revision: https://phab.mercurial-scm.org/D8149
Matt Harbison <matt_harbison@yahoo.com> [Mon, 24 Feb 2020 14:52:46 -0500] rev 44379
lfutil: provide a hint if the largefiles/lfs cache path cannot be determined
A coworker hit this error using an LFS repo in a stripped down environment, and
didn't know how to resolve it.
The final conditional is a bit fast and loose, but there is currently no 'posix'
test in hghave, and it doesn't seem like it's worth adding for this since I
think Windows is the only non-POSIX platform we run tests on.
Differential Revision: https://phab.mercurial-scm.org/D8145
Matt Harbison <matt_harbison@yahoo.com> [Mon, 24 Feb 2020 00:20:47 -0500] rev 44378
setup: exclude the __index__ module from itself when generating
This module is generated on Windows to hold all of the extension names and the
help summaries, so that they are discoverable inside the py2exe zipfile. The
problem is this file is generated by dumping the disabled list, and that list
comes from walking the filesystem. So once an install from source into a
virtualenv created this module, then next build from source from that virtualenv
would also see __index__.py in the filesystem, and include it. Clearly that's
wrong because this isn't a real extension, so just filter it from the list when
generating it.
The Mercurial installer was unaffected by this, but the TortoiseHg package was.
In the final package, `hg help -v extensions` and the panel of extensions both
showed it.
Differential Revision: https://phab.mercurial-scm.org/D8142
Matt Harbison <matt_harbison@yahoo.com> [Mon, 24 Feb 2020 16:33:10 -0500] rev 44377
tests: stabilize test-amend.t on Windows
If $TESTTMP isn't quoted in this context, it ends up like
`C:Temphgtests.pikkoxchild1test-amend.t-obsstore-off`.
Differential Revision: https://phab.mercurial-scm.org/D8144
Matt Harbison <matt_harbison@yahoo.com> [Mon, 24 Feb 2020 17:43:34 -0500] rev 44376
tests: replace truncate(1) with inline python
MSYS doesn't have truncate(1) installed by default, and FreeBSD looked unhappy
with the arguments provided.
Differential Revision: https://phab.mercurial-scm.org/D8147
Matt Harbison <matt_harbison@yahoo.com> [Mon, 24 Feb 2020 16:59:35 -0500] rev 44375
tests: stabilize test-rename-merge2.t on Windows
I have no idea why, but this shifted in
b4057d001760.
Differential Revision: https://phab.mercurial-scm.org/D8146
Augie Fackler <augie@google.com> [Mon, 24 Feb 2020 13:50:55 -0500] rev 44374
merge with stable
Yuya Nishihara <yuya@tcha.org> [Mon, 24 Feb 2020 13:28:49 +0900] rev 44373
py3: fix EOL detection in commandserver.channeledinput
This breaks TortoiseHg's email preview which sends b'\n' while readline
request is issued and the loop never ends. Spotted by Matt Harbison.
Valentin Gatien-Baron <valentin.gatienbaron@gmail.com> [Thu, 13 Feb 2020 22:51:17 -0500] rev 44372
bookmarks: prevent pushes of divergent bookmarks (foo@remote)
Before this change, such bookmarks are write-only: a client can push
them but not pull/read them. And because these bookmark can't be read,
even pushes are limited (for instance trying to delete such a bookmark
fails with a vanilla client because the client thinks the bookmark is
neither on the local nor the remote).
This change makes the server refuses such bookmarks, and for earlier
errors, makes the client refuse to send them.
I think the change of behavior is acceptable because I think this is a
bug in push/pull, and I don't think we change the behavior of `hg
unbundle`, because it doesn't seem that `hg bundle` ever store
bookmarks (and even if it did, it would seem weird anyway to try to
send divergent bookmarks).
Differential Revision: https://phab.mercurial-scm.org/D8117
Valentin Gatien-Baron <valentin.gatienbaron@gmail.com> [Thu, 13 Feb 2020 22:06:57 -0500] rev 44371
bookmarks: refactor in preparation for next commit
Differential Revision: https://phab.mercurial-scm.org/D8116
Valentin Gatien-Baron <valentin.gatienbaron@gmail.com> [Sat, 15 Feb 2020 14:51:33 -0500] rev 44370
bookmarks: avoid traceback when two pushes race to delete the same bookmark
`hg push -f -B remote-only-bookmark` can raise server-side in
`bookmarks._del` (specifically in `self._refmap.pop(mark)`), if the
remote-only bookmark got deleted concurrently.
Fix this by simply not deleting the non-existent bookmark in that
case.
For avoidance of doubt, refusing to delete a bookmark that doesn't
exist when the push starts is taking care of elsewhere; no change of
behavior there.
Differential Revision: https://phab.mercurial-scm.org/D8124
Valentin Gatien-Baron <valentin.gatienbaron@gmail.com> [Sat, 15 Feb 2020 15:06:41 -0500] rev 44369
relnotes: add entry about previous `hg recover` change
Differential Revision: https://phab.mercurial-scm.org/D8123
Kyle Lippincott <spectral@google.com> [Thu, 20 Feb 2020 15:15:23 -0800] rev 44368
darwin: add another preemptive gui() call when using chg
Changeset
a89381e04c58 added this gui() call before background forks, and
Google's extensions do background forks on essentially every invocation for
logging purposes. The crash is reliably (though not 100%) reproducible without
this change when running `HGPLAIN=1 chg status` in one of our repos. With this
fix, I haven't been able to trigger the crash anymore.
Differential Revision: https://phab.mercurial-scm.org/D8141
Martin von Zweigbergk <martinvonz@google.com> [Fri, 20 Dec 2019 13:24:46 -0800] rev 44367
copy: add experimental support for marking committed copies
The simplest way I'm aware of to mark a file as copied/moved after
committing is this:
hg uncommit --keep <src> <dest> # <src> needed for move, but not copy
hg mv --after <src> <dest>
hg amend
This patch teaches `hg copy` a `--at-rev` argument to simplify that
into:
hg copy --after --at-rev . <src> <dest>
In addition to being simpler, it doesn't touch the working copy, so it
can easily be used even if the destination file has been modified in
the working copy.
Differential Revision: https://phab.mercurial-scm.org/D8035
Martin von Zweigbergk <martinvonz@google.com> [Thu, 26 Dec 2019 14:02:50 -0800] rev 44366
copy: move argument validation a little earlier
Argument validation is usually done early and I will want it done
before some code that I'm about to add.
Differential Revision: https://phab.mercurial-scm.org/D8033
Martin von Zweigbergk <martinvonz@google.com> [Tue, 28 Jan 2020 14:07:57 -0800] rev 44365
copy: add experimetal support for unmarking committed copies
The simplest way I'm aware of to unmark a file as copied after
committing is this:
hg uncommit --keep <dest>
hg forget <dest>
hg add <dest>
hg amend
This patch teaches `hg copy --forget` a `-r` argument to simplify that into:
hg copy --forget --at-rev . <dest>
In addition to being simpler, it doesn't touch the working copy, so it
can easily be used even if the destination file has been modified in
the working copy.
I'll teach `hg copy` without `--forget` to work with `--at-rev` next.
Differential Revision: https://phab.mercurial-scm.org/D8030
Martin von Zweigbergk <martinvonz@google.com> [Fri, 20 Dec 2019 15:50:13 -0800] rev 44364
copy: add option to unmark file as copied
To unmark a file as copied, the user currently has to do this:
hg forget <dest>
hg add <dest>
The new command simplifies that to:
hg copy --forget <dest>
That's not a very big improvement, but I'm planning to also teach `hg
copy [--forget]` a `--at-rev` argument for marking/unmarking copies
after commit (usually with `--at-rev .`).
Differential Revision: https://phab.mercurial-scm.org/D8029
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 11 Feb 2020 11:18:52 +0100] rev 44363
nodemap: introduce an option to use mmap to read the nodemap mapping
The performance and memory benefit is much greater if we don't have to copy all
the data in memory for each information. So we introduce an option (on by
default) to read the data using mmap.
This changeset is the last one definition the API for index support nodemap
data. (they have to be able to use the mmaping).
Below are some benchmark comparing the best we currently have in 5.3 with the
final step of this series (using the persistent nodemap implementation in
Rust). The benchmark run `hg perfindex` with various revset and the following
variants:
Before:
* do not use the persistent nodemap
* use the CPython implementation of the index for nodemap
* use mmapping of the changelog index
After:
* use the MixedIndex Rust code, with the NodeTree object for nodemap access
(still in review)
* use the persistent nodemap data from disk
* access the persistent nodemap data through mmap
* use mmapping of the changelog index
The persistent nodemap greatly speed up most operation on very large
repositories. Some of the previously very fast lookup end up a bit slower because
the persistent nodemap has to be setup. However the absolute slowdown is very
small and won't matters in the big picture.
Here are some numbers (in seconds) for the reference copy of mozilla-try:
Revset Before After abs-change speedup
-10000: 0.004622 0.005532 0.000910 × 0.83
-10: 0.000050 0.000132 0.000082 × 0.37
tip 0.000052 0.000085 0.000033 × 0.61
0 + (-10000:) 0.028222 0.005337 -0.022885 × 5.29
0 0.023521 0.000084 -0.023437 × 280.01
(-10000:) + 0 0.235539 0.005308 -0.230231 × 44.37
(-10:) + :9 0.232883 0.000180 -0.232703 ×1293.79
(-10000:) + (:99) 0.238735 0.005358 -0.233377 × 44.55
:99 + (-10000:) 0.317942 0.005593 -0.312349 × 56.84
:9 + (-10:) 0.313372 0.000179 -0.313193 ×1750.68
:9 0.316450 0.000143 -0.316307 ×2212.93
On smaller repositories, the cost of nodemap related operation is not as big, so
the win is much more modest. Yet it helps shaving a handful of millisecond here
and there.
Here are some numbers (in seconds) for the reference copy of mercurial:
Revset Before After abs-change speedup
-10: 0.000065 0.000097 0.000032 × 0.67
tip 0.000063 0.000078 0.000015 × 0.80
0 0.000561 0.000079 -0.000482 × 7.10
-10000: 0.004609 0.003648 -0.000961 × 1.26
0 + (-10000:) 0.005023 0.003715 -0.001307 × 1.35
(-10:) + :9 0.002187 0.000108 -0.002079 ×20.25
(-10000:) + 0 0.006252 0.003716 -0.002536 × 1.68
(-10000:) + (:99) 0.006367 0.003707 -0.002660 × 1.71
:9 + (-10:) 0.003846 0.000110 -0.003736 ×34.96
:9 0.003854 0.000099 -0.003755 ×38.92
:99 + (-10000:) 0.007644 0.003778 -0.003866 × 2.02
Differential Revision: https://phab.mercurial-scm.org/D7894
Raphaël Gomès <rgomes@octobus.net> [Fri, 14 Feb 2020 15:03:26 +0100] rev 44362
rust-dirstatemap: directly return `non_normal` and `other_entries`
This cleans up the interface which I previously thought needed to be uglier
than in reality. No performance difference, simple refactoring.
Differential Revision: https://phab.mercurial-scm.org/D8121
Martin von Zweigbergk <martinvonz@google.com> [Thu, 26 Dec 2019 14:12:45 -0800] rev 44361
copy: rename `wctx` to `ctx` since it will not necessarily be working copy
Differential Revision: https://phab.mercurial-scm.org/D8032
Martin von Zweigbergk <martinvonz@google.com> [Fri, 20 Dec 2019 14:03:12 -0800] rev 44360
copy: rewrite walkpat() to depend less on dirstate
I want to add a `hg cp/mv -r <rev>` option to mark files as
copied/moved in an existing commit (amending that commit). The code
needs to not depend on the dirstate for that.
Differential Revision: https://phab.mercurial-scm.org/D8031
Martin von Zweigbergk <martinvonz@google.com> [Thu, 13 Feb 2020 10:12:12 -0800] rev 44359
merge with stable