Georges Racinet <georges.racinet@octobus.net> [Tue, 18 Feb 2020 19:11:15 +0100] rev 44386
rust-nodemap: pure Rust example
To run, use `cargo run --release --example nodemap`
This demonstrates that simple scenarios entirely written
in Rust can content themselves with `NodeTree<T>`.
The example mmaps both the nodemap file and the changelog index.
We had of course to include an implementation of `RevlogIndex`
directly, which isn't much at this stage. It felt a bit
prematurate to include it in the lib.
Here are some first performance measurements, obtained with
this example, on a clone of mozilla-central with 440000
changesets:
(create) Nodemap constructed in RAM in 153.638305ms
(query
CAE63161B68962) found in 22.362us: Ok(Some(269489))
(bench) Did 3 queries in 36.418µs (mean 12.139µs)
(bench) Did 50 queries in 184.318µs (mean 3.686µs)
(bench) Did 100000 queries in 31.053461ms (mean 310ns)
To be fair, even between bench runs, results tend to depend whether
the file is still in kernel caches, and it's not so easy to
get back to a real cold start. The worst we've seen was in the
50us ballpark.
In any busy server setting, the pages would always be in RAM.
We hope it's good enough not to be significantly slower on any
concrete Mercurial operation than the C nodetree when fully in RAM,
and of course this implementation has the serious headstart advantage
of persistence.
Differential Revision: https://phab.mercurial-scm.org/D7797
Georges Racinet <georges.racinet@octobus.net> [Tue, 18 Feb 2020 19:11:15 +0100] rev 44385
rust-nodemap: input/output primitives
These allow to initiate a `NodeTree` from an immutable opaque
sequence of bytes, which could be passed over from Python
(extracted from a `PyBuffer`) or directly mmapped from a file.
Conversely, we can consume
a `NodeTree`, extracting the bytes that express what
has been added to the immutable part, together with the
original immutable part.
This gives callers the choice to start a new Nodetree.
After writing to disk, some would prefer to reread for
best guarantees (very cheap if mmapping), some others will
find it more convenient to grow the memory that was considered
immutable in the `NodeTree` and continue from there.
This is enough to build examples running on real data and
start gathering performance hints.
Differential Revision: https://phab.mercurial-scm.org/D7796
Martin von Zweigbergk <martinvonz@google.com> [Thu, 13 Feb 2020 15:33:36 -0800] rev 44384
pyoxidizer: allow extensions to be loaded from the file system
It seems that setting this config is all that's needed to be able to
load extensions from the file system (which we clearly want). Thanks
for making this work, Gregory Szorc!.
Differential Revision: https://phab.mercurial-scm.org/D8122
Valentin Gatien-Baron <valentin.gatienbaron@gmail.com> [Mon, 17 Feb 2020 20:30:03 -0500] rev 44383
graft: always allow hg graft --base . (
issue6248)
`hg graft --base . -r abc` is rejected before this change with a
"nothing to merge" error, if `abc` does not descend from `.`.
This looks like an artifact of the implementation rather than intended
behavior. It makes perfect sense to apply the diff between `.` and
`abc` to the working copy (i.e. degenerate into `hg revert`),
regardless of what `abc` is.
Differential Revision: https://phab.mercurial-scm.org/D8127
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 19 Feb 2020 17:30:04 +0100] rev 44382
revlog-compression: update the config to be a list
format.revlog-compression is now a list of engine, the first supported one is to
be used. Doing this have several benefits:
1) this is fully backward compatible, config using a single entry will be read
as a single item list, not changing any behavior.
2) This open the way to use zstd by default without impacting platform were it
is not available. This will be done in a later changesets.
Using zstd provide a significant performance boost explained in :
bb271ec2fbfb.
However zstd is not available in some cases, A notable example is the `--pure`
version of Mercurial which doesn't come with zstd support.
Differential Revision: https://phab.mercurial-scm.org/D8148
Pulkit Goyal <7895pulkit@gmail.com> [Wed, 19 Feb 2020 13:39:00 +0530] rev 44381
remotefilelog: add 'changelog' arg to shallowcg1packer.generate (
issue6269)
This cause traceback on widening using narrow extension when remotefilelog
is enabled.
Differential Revision: https://phab.mercurial-scm.org/D8134
Martin von Zweigbergk <martinvonz@google.com> [Tue, 25 Feb 2020 12:41:35 -0800] rev 44380
drawdag: abide by new createmarkers() API
The `obsolete.createmarkers()` API was changed in
6335c0de80fa
(obsolete: allow multiple predecessors in createmarkers, 2018-09-22)
to prefer its precursors input to be a tuple instead of a single
precursor. Let's fix `drawdag.py` to comply.
Differential Revision: https://phab.mercurial-scm.org/D8149
Matt Harbison <matt_harbison@yahoo.com> [Mon, 24 Feb 2020 14:52:46 -0500] rev 44379
lfutil: provide a hint if the largefiles/lfs cache path cannot be determined
A coworker hit this error using an LFS repo in a stripped down environment, and
didn't know how to resolve it.
The final conditional is a bit fast and loose, but there is currently no 'posix'
test in hghave, and it doesn't seem like it's worth adding for this since I
think Windows is the only non-POSIX platform we run tests on.
Differential Revision: https://phab.mercurial-scm.org/D8145
Matt Harbison <matt_harbison@yahoo.com> [Mon, 24 Feb 2020 00:20:47 -0500] rev 44378
setup: exclude the __index__ module from itself when generating
This module is generated on Windows to hold all of the extension names and the
help summaries, so that they are discoverable inside the py2exe zipfile. The
problem is this file is generated by dumping the disabled list, and that list
comes from walking the filesystem. So once an install from source into a
virtualenv created this module, then next build from source from that virtualenv
would also see __index__.py in the filesystem, and include it. Clearly that's
wrong because this isn't a real extension, so just filter it from the list when
generating it.
The Mercurial installer was unaffected by this, but the TortoiseHg package was.
In the final package, `hg help -v extensions` and the panel of extensions both
showed it.
Differential Revision: https://phab.mercurial-scm.org/D8142
Matt Harbison <matt_harbison@yahoo.com> [Mon, 24 Feb 2020 16:33:10 -0500] rev 44377
tests: stabilize test-amend.t on Windows
If $TESTTMP isn't quoted in this context, it ends up like
`C:Temphgtests.pikkoxchild1test-amend.t-obsstore-off`.
Differential Revision: https://phab.mercurial-scm.org/D8144
Matt Harbison <matt_harbison@yahoo.com> [Mon, 24 Feb 2020 17:43:34 -0500] rev 44376
tests: replace truncate(1) with inline python
MSYS doesn't have truncate(1) installed by default, and FreeBSD looked unhappy
with the arguments provided.
Differential Revision: https://phab.mercurial-scm.org/D8147
Matt Harbison <matt_harbison@yahoo.com> [Mon, 24 Feb 2020 16:59:35 -0500] rev 44375
tests: stabilize test-rename-merge2.t on Windows
I have no idea why, but this shifted in
b4057d001760.
Differential Revision: https://phab.mercurial-scm.org/D8146
Augie Fackler <augie@google.com> [Mon, 24 Feb 2020 13:50:55 -0500] rev 44374
merge with stable
Yuya Nishihara <yuya@tcha.org> [Mon, 24 Feb 2020 13:28:49 +0900] rev 44373
py3: fix EOL detection in commandserver.channeledinput
This breaks TortoiseHg's email preview which sends b'\n' while readline
request is issued and the loop never ends. Spotted by Matt Harbison.
Valentin Gatien-Baron <valentin.gatienbaron@gmail.com> [Thu, 13 Feb 2020 22:51:17 -0500] rev 44372
bookmarks: prevent pushes of divergent bookmarks (foo@remote)
Before this change, such bookmarks are write-only: a client can push
them but not pull/read them. And because these bookmark can't be read,
even pushes are limited (for instance trying to delete such a bookmark
fails with a vanilla client because the client thinks the bookmark is
neither on the local nor the remote).
This change makes the server refuses such bookmarks, and for earlier
errors, makes the client refuse to send them.
I think the change of behavior is acceptable because I think this is a
bug in push/pull, and I don't think we change the behavior of `hg
unbundle`, because it doesn't seem that `hg bundle` ever store
bookmarks (and even if it did, it would seem weird anyway to try to
send divergent bookmarks).
Differential Revision: https://phab.mercurial-scm.org/D8117
Valentin Gatien-Baron <valentin.gatienbaron@gmail.com> [Thu, 13 Feb 2020 22:06:57 -0500] rev 44371
bookmarks: refactor in preparation for next commit
Differential Revision: https://phab.mercurial-scm.org/D8116
Valentin Gatien-Baron <valentin.gatienbaron@gmail.com> [Sat, 15 Feb 2020 14:51:33 -0500] rev 44370
bookmarks: avoid traceback when two pushes race to delete the same bookmark
`hg push -f -B remote-only-bookmark` can raise server-side in
`bookmarks._del` (specifically in `self._refmap.pop(mark)`), if the
remote-only bookmark got deleted concurrently.
Fix this by simply not deleting the non-existent bookmark in that
case.
For avoidance of doubt, refusing to delete a bookmark that doesn't
exist when the push starts is taking care of elsewhere; no change of
behavior there.
Differential Revision: https://phab.mercurial-scm.org/D8124
Valentin Gatien-Baron <valentin.gatienbaron@gmail.com> [Sat, 15 Feb 2020 15:06:41 -0500] rev 44369
relnotes: add entry about previous `hg recover` change
Differential Revision: https://phab.mercurial-scm.org/D8123
Kyle Lippincott <spectral@google.com> [Thu, 20 Feb 2020 15:15:23 -0800] rev 44368
darwin: add another preemptive gui() call when using chg
Changeset
a89381e04c58 added this gui() call before background forks, and
Google's extensions do background forks on essentially every invocation for
logging purposes. The crash is reliably (though not 100%) reproducible without
this change when running `HGPLAIN=1 chg status` in one of our repos. With this
fix, I haven't been able to trigger the crash anymore.
Differential Revision: https://phab.mercurial-scm.org/D8141
Martin von Zweigbergk <martinvonz@google.com> [Fri, 20 Dec 2019 13:24:46 -0800] rev 44367
copy: add experimental support for marking committed copies
The simplest way I'm aware of to mark a file as copied/moved after
committing is this:
hg uncommit --keep <src> <dest> # <src> needed for move, but not copy
hg mv --after <src> <dest>
hg amend
This patch teaches `hg copy` a `--at-rev` argument to simplify that
into:
hg copy --after --at-rev . <src> <dest>
In addition to being simpler, it doesn't touch the working copy, so it
can easily be used even if the destination file has been modified in
the working copy.
Differential Revision: https://phab.mercurial-scm.org/D8035
Martin von Zweigbergk <martinvonz@google.com> [Thu, 26 Dec 2019 14:02:50 -0800] rev 44366
copy: move argument validation a little earlier
Argument validation is usually done early and I will want it done
before some code that I'm about to add.
Differential Revision: https://phab.mercurial-scm.org/D8033
Martin von Zweigbergk <martinvonz@google.com> [Tue, 28 Jan 2020 14:07:57 -0800] rev 44365
copy: add experimetal support for unmarking committed copies
The simplest way I'm aware of to unmark a file as copied after
committing is this:
hg uncommit --keep <dest>
hg forget <dest>
hg add <dest>
hg amend
This patch teaches `hg copy --forget` a `-r` argument to simplify that into:
hg copy --forget --at-rev . <dest>
In addition to being simpler, it doesn't touch the working copy, so it
can easily be used even if the destination file has been modified in
the working copy.
I'll teach `hg copy` without `--forget` to work with `--at-rev` next.
Differential Revision: https://phab.mercurial-scm.org/D8030
Martin von Zweigbergk <martinvonz@google.com> [Fri, 20 Dec 2019 15:50:13 -0800] rev 44364
copy: add option to unmark file as copied
To unmark a file as copied, the user currently has to do this:
hg forget <dest>
hg add <dest>
The new command simplifies that to:
hg copy --forget <dest>
That's not a very big improvement, but I'm planning to also teach `hg
copy [--forget]` a `--at-rev` argument for marking/unmarking copies
after commit (usually with `--at-rev .`).
Differential Revision: https://phab.mercurial-scm.org/D8029
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 11 Feb 2020 11:18:52 +0100] rev 44363
nodemap: introduce an option to use mmap to read the nodemap mapping
The performance and memory benefit is much greater if we don't have to copy all
the data in memory for each information. So we introduce an option (on by
default) to read the data using mmap.
This changeset is the last one definition the API for index support nodemap
data. (they have to be able to use the mmaping).
Below are some benchmark comparing the best we currently have in 5.3 with the
final step of this series (using the persistent nodemap implementation in
Rust). The benchmark run `hg perfindex` with various revset and the following
variants:
Before:
* do not use the persistent nodemap
* use the CPython implementation of the index for nodemap
* use mmapping of the changelog index
After:
* use the MixedIndex Rust code, with the NodeTree object for nodemap access
(still in review)
* use the persistent nodemap data from disk
* access the persistent nodemap data through mmap
* use mmapping of the changelog index
The persistent nodemap greatly speed up most operation on very large
repositories. Some of the previously very fast lookup end up a bit slower because
the persistent nodemap has to be setup. However the absolute slowdown is very
small and won't matters in the big picture.
Here are some numbers (in seconds) for the reference copy of mozilla-try:
Revset Before After abs-change speedup
-10000: 0.004622 0.005532 0.000910 × 0.83
-10: 0.000050 0.000132 0.000082 × 0.37
tip 0.000052 0.000085 0.000033 × 0.61
0 + (-10000:) 0.028222 0.005337 -0.022885 × 5.29
0 0.023521 0.000084 -0.023437 × 280.01
(-10000:) + 0 0.235539 0.005308 -0.230231 × 44.37
(-10:) + :9 0.232883 0.000180 -0.232703 ×1293.79
(-10000:) + (:99) 0.238735 0.005358 -0.233377 × 44.55
:99 + (-10000:) 0.317942 0.005593 -0.312349 × 56.84
:9 + (-10:) 0.313372 0.000179 -0.313193 ×1750.68
:9 0.316450 0.000143 -0.316307 ×2212.93
On smaller repositories, the cost of nodemap related operation is not as big, so
the win is much more modest. Yet it helps shaving a handful of millisecond here
and there.
Here are some numbers (in seconds) for the reference copy of mercurial:
Revset Before After abs-change speedup
-10: 0.000065 0.000097 0.000032 × 0.67
tip 0.000063 0.000078 0.000015 × 0.80
0 0.000561 0.000079 -0.000482 × 7.10
-10000: 0.004609 0.003648 -0.000961 × 1.26
0 + (-10000:) 0.005023 0.003715 -0.001307 × 1.35
(-10:) + :9 0.002187 0.000108 -0.002079 ×20.25
(-10000:) + 0 0.006252 0.003716 -0.002536 × 1.68
(-10000:) + (:99) 0.006367 0.003707 -0.002660 × 1.71
:9 + (-10:) 0.003846 0.000110 -0.003736 ×34.96
:9 0.003854 0.000099 -0.003755 ×38.92
:99 + (-10000:) 0.007644 0.003778 -0.003866 × 2.02
Differential Revision: https://phab.mercurial-scm.org/D7894
Raphaël Gomès <rgomes@octobus.net> [Fri, 14 Feb 2020 15:03:26 +0100] rev 44362
rust-dirstatemap: directly return `non_normal` and `other_entries`
This cleans up the interface which I previously thought needed to be uglier
than in reality. No performance difference, simple refactoring.
Differential Revision: https://phab.mercurial-scm.org/D8121
Martin von Zweigbergk <martinvonz@google.com> [Thu, 26 Dec 2019 14:12:45 -0800] rev 44361
copy: rename `wctx` to `ctx` since it will not necessarily be working copy
Differential Revision: https://phab.mercurial-scm.org/D8032
Martin von Zweigbergk <martinvonz@google.com> [Fri, 20 Dec 2019 14:03:12 -0800] rev 44360
copy: rewrite walkpat() to depend less on dirstate
I want to add a `hg cp/mv -r <rev>` option to mark files as
copied/moved in an existing commit (amending that commit). The code
needs to not depend on the dirstate for that.
Differential Revision: https://phab.mercurial-scm.org/D8031
Martin von Zweigbergk <martinvonz@google.com> [Thu, 13 Feb 2020 10:12:12 -0800] rev 44359
merge with stable
Yuya Nishihara <yuya@tcha.org> [Sat, 01 Feb 2020 12:57:32 +0900] rev 44358
pathutil: resurrect comment about path auditing order
It was removed at
51c86c6167c1, but expensive symlink traversal isn't the
only reason we should walk path components from the root.
Raphaël Gomès <rgomes@octobus.net> [Wed, 16 Oct 2019 14:12:48 +0200] rev 44357
rust-dirstatemap: remove additional lookup in dirstate.matches
We use the same trick as the Python implementation
Differential Revision: https://phab.mercurial-scm.org/D7119
Georges Racinet <georges.racinet@octobus.net> [Tue, 31 Dec 2019 12:43:57 +0100] rev 44356
rust-nodemap: insert method
In this implementation, we are in direct competition
with the C version: this Rust version will have a clear
startup advantage because it will read the data from disk,
but the insertion happens all in memory for both.
Differential Revision: https://phab.mercurial-scm.org/D7795
Valentin Gatien-Baron <vgatien-baron@janestreet.com> [Wed, 22 Jan 2020 14:21:34 -0500] rev 44355
recover: don't verify by default
The reason is:
- it's not that hard to trigger interrupted transactions: just run out
of disk space
- it takes forever to verify on large repos. Before --no-verify, I
told people to C-c hg recover when the progress bar showed up. Now I
tell them to pass --no-verify.
- I don't remember a single case where the verification step was
useful
This is technically a change of behavior. Perhaps this would be better
suited for tweakdefaults?
Differential Revision: https://phab.mercurial-scm.org/D7972
Augie Fackler <augie@google.com> [Tue, 11 Feb 2020 00:08:28 -0500] rev 44354
context: use manifest.find() instead of two separate calls
I noticed this while debugging an extension that's implementing the manifest
interface. Always nice to save a function call.
Differential Revision: https://phab.mercurial-scm.org/D8109
Raphaël Gomès <rgomes@octobus.net> [Thu, 16 Jan 2020 23:06:01 +0100] rev 44353
rust-matchers: implement `visit_children_set` for `FileMatcher`
As per the removed inline comment, this will become useful in a future patch
in this series as the `IncludeMatcher` is introduced.
Differential Revision: https://phab.mercurial-scm.org/D7914
Augie Fackler <augie@google.com> [Wed, 05 Feb 2020 17:13:51 -0500] rev 44352
manifest: move matches method to be outside the interface
In order to adequately smoke out any legacy consumers of the method, we rename
it to _matches so it's clear that it's class-private. To my amazement, all
consumers of this method really only wanted matching filenames, not a full
filtered manifest.
Differential Revision: https://phab.mercurial-scm.org/D8085
Augie Fackler <augie@google.com> [Mon, 10 Feb 2020 21:02:22 -0500] rev 44351
tags: use modern // operator for division
Fixes a test on Python 3.
# skip-blame only correcting a division operator, not a substantive change
Differential Revision: https://phab.mercurial-scm.org/D8108
Augie Fackler <augie@google.com> [Mon, 10 Feb 2020 20:47:19 -0500] rev 44350
tags: fix some type confusion exposed in python 3
# skip-blame just b-prefix and %-format cleanup, no meaningful change
Differential Revision: https://phab.mercurial-scm.org/D8107
Martin von Zweigbergk <martinvonz@google.com> [Fri, 10 Jan 2020 17:20:12 -0800] rev 44349
rebase: remove some now-unused parent arguments
Differential Revision: https://phab.mercurial-scm.org/D7829
Martin von Zweigbergk <martinvonz@google.com> [Fri, 10 Jan 2020 21:40:01 -0800] rev 44348
rebase: remove some redundant setting of dirstate parents
Since we're now setting the dirstate parents to its correct values
from the beginning (right after `merge.update()`), we usually don't
need to set them again before committing. The only case we need to
care about is when committing collapsed commits. So we can remove the
`setparents()` calls just before committing and add one only for the
collapse case.
Differential Revision: https://phab.mercurial-scm.org/D7828
Martin von Zweigbergk <martinvonz@google.com> [Fri, 10 Jan 2020 14:22:20 -0800] rev 44347
rebase: don't use rebased node as dirstate p2 (BC)
When rebasing a node, we currently use the rebased node as p2 in the
dirstate until just before we commit it (we then change to the desired
parents). This p2 is visible to the user when the rebase gets
interrupted because of merge conflicts. That can be useful to the user
as a reminder of which commit is currently being rebased, but I
believe it's incorrect for a few reasons:
* I think the dirstate parents should be the ones that will be set
when the commit is created.
* I think having two parents means that you're merging those two
commits, but when rebasing, you're generally grafting, not merging.
* When rebasing a merge commit, we should use the two desired parents
as dirstate parents (and we clearly can't have the rebased node as
a third dirstate parent).
* `hg graft` (and `hg update --merge`) sets only one parent and `hg
rebase` should be consistent with that.
I realize that this is a somewhat large user-visible change, but I
think it's worth it because it will simplify things quite a bit.
Differential Revision: https://phab.mercurial-scm.org/D7827
Martin von Zweigbergk <martinvonz@google.com> [Fri, 10 Jan 2020 14:17:56 -0800] rev 44346
rebase: stop relying on having two parents to resume rebase
I'm about to make it so we don't have two parents when a rebase is
interrupted (unless we're just rebasing on a merge commit). The code
for detecting if we're resuming a rebase relied on having two parents,
so this patch rewrites that to instead set a boolean when we resume.
Note that `self.resume` in the new condition implies `not
self.inmemory` (rebase cannot be resumed in memory), so that's why
that part can be omitted.
Differential Revision: https://phab.mercurial-scm.org/D7826
Martin von Zweigbergk <martinvonz@google.com> [Tue, 28 Jan 2020 21:49:50 -0800] rev 44345
graphlog: use '%' for other context in merge conflict
This lets the user more easily find the commit that is involved in the
conflict, such as the source of `hg update -m` or the commit being
grafted by `hg graft`.
Differential Revision: https://phab.mercurial-scm.org/D8043
Martin von Zweigbergk <martinvonz@google.com> [Wed, 29 Jan 2020 14:42:54 -0800] rev 44344
tests: add `hg log -G` output when there are merge conflicts
The next commit will change the behavior for these. I've used slightly
different commands in the different tests to match the surrounding
style.
Differential Revision: https://phab.mercurial-scm.org/D8042
Martin von Zweigbergk <martinvonz@google.com> [Wed, 29 Jan 2020 11:30:35 -0800] rev 44343
revset: add a revset for parents in merge state
This may be particularly useful soon, when I'm going to change how `hg
rebase` sets its parents during conflict resolution.
Differential Revision: https://phab.mercurial-scm.org/D8041
Martin von Zweigbergk <martinvonz@google.com> [Fri, 10 Jan 2020 17:46:10 -0800] rev 44342
tests: add test of rebase with conflict in merge commit
It doesn't seem like we had any tests of this. I think it's pretty
weird that the two parents we're merging are not the working copy
parents during the conflict resolution.
Differential Revision: https://phab.mercurial-scm.org/D7824
Martin von Zweigbergk <martinvonz@google.com> [Thu, 16 Jan 2020 00:03:19 -0800] rev 44341
rebase: always be graft-like, not merge-like, also for merges
Rebase works by updating to a commit and then grafting changes on
top. However, before this patch, it would actually merge in changes
instead of grafting them in in some cases. That is, it would use the
common ancestor as base instead of using one of the parents. That
seems wrong to me, so I'm changing it so `defineparents()` always
returns a value for `base`.
This fixes the bad behavior in test-rebase-newancestor.t, which was
introduced in
65f215ea3e8e (tests: add test for rebasing merges with
ancestors of the rebase destination, 2014-11-30).
The difference in test-rebase-dest.t is because the files in the tip
revision were A, D, E, F before this patch and A, D, F, G after it. I
think both files should ideally be there.
Differential Revision: https://phab.mercurial-scm.org/D7907
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 15 Jan 2020 15:51:01 +0100] rev 44340
nodemap: update the index with the newly written data (when appropriate)
If we are to use mmap to read the nodemap data, and if the python code is
responsible for the IO, we need to refresh the mmap after each write and provide
it back to the index.
We start this dance without the mmap first.
Differential Revision: https://phab.mercurial-scm.org/D7893
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 15 Jan 2020 15:50:52 +0100] rev 44339
nodemap: never read more than the expected data amount
Since we are tracking this number we can use it to detect corrupted rawdata file
and to only read the correct amount of data when possible.
Differential Revision: https://phab.mercurial-scm.org/D7892
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 15 Jan 2020 15:50:43 +0100] rev 44338
nodemap: write new data from the expected current data length
If the amount of data in the file exceed the expect amount, we will overwrite
the extra data. This is a simple way to be safer.
Differential Revision: https://phab.mercurial-scm.org/D7891
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 15 Jan 2020 15:50:33 +0100] rev 44337
nodemap: double check the source docket when doing incremental update
In theory, the index will have the information we expect it to have. However by
security, it seems safer to double check that the incremental data are generated
from the data currently on disk.
Differential Revision: https://phab.mercurial-scm.org/D7890
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 15 Jan 2020 15:50:24 +0100] rev 44336
nodemap: track the total and unused amount of data in the rawdata file
We need to keep that information around:
* total data will allow transaction to start appending new information without
confusing other reader.
* unused data will allow to detect when we should regenerate new rawdata file.
Differential Revision: https://phab.mercurial-scm.org/D7889
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 15 Jan 2020 15:50:14 +0100] rev 44335
nodemap: track the maximum revision tracked in the nodemap
We need a simple way to detect when the on disk data contains less revision
than the index we read from disk. The docket file is meant for this, we just had
to start tracking that data.
We should also try to detect strip operation, but we will deal with this in
later changesets. Right now we are focusing on defining the API for index
supporting persistent nodemap.
Differential Revision: https://phab.mercurial-scm.org/D7888
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 15 Jan 2020 15:50:04 +0100] rev 44334
nodemap: add a flag to dump the details of the docket
We are about to add more information to the docket. We first introduce a way to
debug its content.
Differential Revision: https://phab.mercurial-scm.org/D7887
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 15 Jan 2020 15:49:54 +0100] rev 44333
nodemap: introduce append-only incremental update of the persistent data
Rewriting the full nodemap for each transaction has a cost we would like to
avoid. We introduce a new way to write persistent nodemap data by adding new
information at the end for file. Any new and updated block as added at the end
of the file. The last block is the new root node.
With this method, some of the block already on disk get "dereferenced" and
become dead data. In later changesets, We'll start tracking the amount of dead
data to eventually re-generate a full nodemap.
Differential Revision: https://phab.mercurial-scm.org/D7886
Kyle Lippincott <spectral@google.com> [Thu, 20 Feb 2020 16:21:00 -0800] rev 44332
shelve: fix ordering of merge labels
Differential Revision: https://phab.mercurial-scm.org/D8140
Kyle Lippincott <spectral@google.com> [Thu, 20 Feb 2020 17:06:01 -0800] rev 44331
shelve: add test clearly demonstrating that the conflict labels are backwards
Differential Revision: https://phab.mercurial-scm.org/D8139
Matt Harbison <matt_harbison@yahoo.com> [Sun, 16 Feb 2020 17:05:18 -0500] rev 44330
import: don't ignore `--secret` when `--bypass` is specified
Differential Revision: https://phab.mercurial-scm.org/D8126
Matt Harbison <matt_harbison@yahoo.com> [Tue, 18 Feb 2020 13:46:10 -0500] rev 44329
phabricator: fix a phabsend crash when processing a renamed binary
This was a trivial fix, and some more tests are added to cover binary files.
Since the old filecontext is passed in, the old name is still available. But I
noticed some weirdness around what it marked as binary and not, and what is
viewable in Phabricator. Those things have been flagged, and will probably take
some digging.
Differential Revision: https://phab.mercurial-scm.org/D8133
Pierre-Yves David <pierre-yves.david@octobus.net> [Fri, 13 Dec 2019 10:37:45 +0100] rev 44328
test: pin the number of CPU for
issue4074 tests
On machine with an hundreds of CPUs, the "user" CPU time reported can be
inflated by the status steps. Since the test especially focus on the diff
computation, we restrict the number of CPU to avoid potential issues.
Differential Revision: https://phab.mercurial-scm.org/D8112
Raphaël Gomès <rgomes@octobus.net> [Wed, 12 Feb 2020 23:23:59 +0100] rev 44327
rust-dirstatemap: add `NonNormalEntries` class
This fix introduces the same encapsulation as the `copymap`. There is no easy
way of doing this any better for now.
`hg up -r null && time HGRCPATH= HGMODULEPOLICY=rust+c hg up tip` on Mozilla
Central, (not super recent, but it doesn't matter):
Before: 7:44,08 total
After: 1:03,23 total
Pretty brutal regression!
This is a graft on stable of
cf1f8660e568
Differential Revision: https://phab.mercurial-scm.org/D8111
Raphaël Gomès <rgomes@octobus.net> [Thu, 30 Jan 2020 14:57:02 +0100] rev 44326
rust-dirstatemap: cache non normal and other parent set
Performance of `hg update` was significantly worse since the introduction of
the Rust `dirstatemap`. This regression was noticed by Valentin Gatien-Baron
when working on a large repository, as it goes unnoticed for smaller
repositories like Mercurial itself.
This fix introduces the same getter/setter mechanism at `hg-core` level as
for `set/get_dirs`.
While this technique is, as previously discussed, quite suboptimal, it fixes an
important enough problem. Refactoring `hg-core` to use the typestate
pattern could be a good approach to improving code quality in a future patch.
This is a graft of stable of
83b2b829c94e
Differential Revision: https://phab.mercurial-scm.org/D8110
Yuya Nishihara <yuya@tcha.org> [Tue, 11 Feb 2020 19:53:56 +0900] rev 44325
chgserver: spawn new process if schemes change
The schemes extension updates hg.schemes table. It's technically possible
for hg.repository() to look for e.g. ui.schemes instead of depending on
module-local table, but I don't think the change would make much sense
since [schemes] is usually specified in ~/.hgrc and thus it can be considered
static data.
Martin von Zweigbergk <martinvonz@google.com> [Mon, 10 Feb 2020 15:52:52 -0800] rev 44324
tests: accept new bzr message about switching branches
The new version apparently prints "Switched to branch at " instead of
"Switched to branch: ".
Differential Revision: https://phab.mercurial-scm.org/D8106
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 15 Jan 2020 15:49:45 +0100] rev 44323
nodemap: keep track of the docket for loaded data
To perform incremental update of the on disk data, we need to keep tracks of
some aspect of that data.
Differential Revision: https://phab.mercurial-scm.org/D7885
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 15 Jan 2020 15:49:35 +0100] rev 44322
nodemap: introduce an explicit class/object for the docket
We are about to add more information to this docket, having a clear location to
stock them in memory will help.
Differential Revision: https://phab.mercurial-scm.org/D7884
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 15 Jan 2020 15:49:26 +0100] rev 44321
nodemap: keep track of the ondisk id of nodemap blocks
If we are to incrementally update the files, we need to keep some details about
the data we read.
Differential Revision: https://phab.mercurial-scm.org/D7883
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 15 Jan 2020 15:49:16 +0100] rev 44320
nodemap: provide the on disk data to indexes who support it
Time to start defining the API and prepare the rust index support. We provide
a method to do so. We use a distinct method instead of passing them in the
constructor because we will need this method anyway later (to refresh the mmap
once we update the data on disk).
Differential Revision: https://phab.mercurial-scm.org/D7847
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 15 Jan 2020 15:49:06 +0100] rev 44319
nodemap: all check that revision and nodes match in the nodemap
More check is always useful.
Differential Revision: https://phab.mercurial-scm.org/D7846
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 15 Jan 2020 15:48:57 +0100] rev 44318
nodemap: add basic checking of the on disk nodemap content
The simplest check it so verify we have all the revision we needs, and nothing
more.
Differential Revision: https://phab.mercurial-scm.org/D7845
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 15 Jan 2020 15:48:47 +0100] rev 44317
nodemap: code to parse the persistent binary nodemap data
We now have code to read back what we persisted. This will be put to use in
later changesets.
Differential Revision: https://phab.mercurial-scm.org/D7844
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 15 Jan 2020 15:48:38 +0100] rev 44316
nodemap: move the iteratio inside the Block object
Having the iteration inside the serialization function does not help
readability. Now that we have a `Block` object, let us move that code there.
Differential Revision: https://phab.mercurial-scm.org/D7843
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 15 Jan 2020 15:48:28 +0100] rev 44315
nodemap: use an explicit "Block" object in the reference implementation
This will help us to introduce some test around the data currently written on
disk.
Differential Revision: https://phab.mercurial-scm.org/D7842
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 15 Jan 2020 15:48:19 +0100] rev 44314
nodemap: add a optional `nodemap_add_full` method on indexes
This method can be used to obtains persistent data for a full nodemap. The end
goal is for some index implementation to managed the nodemap serialization them
selves (eg: the rust implementation)
Differential Revision: https://phab.mercurial-scm.org/D7841
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 15 Jan 2020 15:48:09 +0100] rev 44313
nodemap: add a (python) index class for persistent nodemap testing
Using the persistent nodemap require a compeling performance boost and an
existing implementation. The benefit of the persistent nodemap for pure python
code is unclear and we don't have a C implementation for it. Yet we would like
to actually start testing it in more details and define an API for using that
persistent nodemap.
We introduce a new `devel` config option to use an index class dedicated to
Nodemap Testing. This feature is "pure" only because having using a pure-python
index with the `cext` policy proved more difficult than I would like.
There is nothing going on in that class for now, but the coming changeset will
change that.
Differential Revision: https://phab.mercurial-scm.org/D7840
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 15 Jan 2020 15:47:59 +0100] rev 44312
nodemap: delete older raw data file when creating a new ones
When we write new full files, it replace an older one with a different name. We
add the associated cleanup for the older file to be removed after the
transaction.
We delete all file matching the expected pattern to give use extra chance to
delete orphan files we might have failed to delete earlier.
Note: eventually we won't rewrite all data for each transaction. This is coming
in later changesets.
Differential Revision: https://phab.mercurial-scm.org/D7839
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 15 Jan 2020 15:47:50 +0100] rev 44311
nodemap: use an intermediate "docket" file to carry small metadata
This intermediate file will make mmapping, transaction and content validation
easier. (Most of this usefulness will arrive gradually in later changeset). In
particular it will become very useful to append new data are the end of raw file
instead of rewriting on the file on each transaction.
See in code comments for details.
Differential Revision: https://phab.mercurial-scm.org/D7838
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 15 Jan 2020 15:47:40 +0100] rev 44310
nodemap: only use persistent nodemap for non-inlined revlog
Revlog are inlined while they are small (to avoid having too many file to deal
with). The persistent nodemap will only provides a significant boost for large
enough revlog index. So it does not make sens to add an extra file to store
nodemap for small revlog.
We could consider inclining the nodemap data inside the revlog itself, but the
benefit is unclear so let it be an adventure for another time.
Differential Revision: https://phab.mercurial-scm.org/D7837
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 15 Jan 2020 15:47:31 +0100] rev 44309
nodemap: add a function to read the data from disk
This changeset is small and mostly an excuse to introduce an API function
reading the data from disk.
Differential Revision: https://phab.mercurial-scm.org/D7836
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 15 Jan 2020 15:47:21 +0100] rev 44308
nodemap: write nodemap data on disk
Let us start writing data on disk (so that we can read it from there later).
This series of changeset is going to focus first on having data on disk and
updating it.
Right now the data is written right next to the revlog data, in the store. We
might move it to cache (with proper cache validation mechanism) later, but for
now revlog have a storevfs instance and it is simpler to us it. The right
location for this data is not the focus of this series.
Differential Revision: https://phab.mercurial-scm.org/D7835
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 15 Jan 2020 15:47:12 +0100] rev 44307
nodemap: have some python code writing a nodemap in persistent binary form
This python code aims to be as "simple" as possible. It is a reference
implementation of the data we are going to write on disk (and possibly,
later a way for pure python install to make sure the on disk data are up to
date).
It is not optimized for performance and rebuild the full data structure from
the index every time.
This is a stepping stone toward a persistent nodemap on disk.
Differential Revision: https://phab.mercurial-scm.org/D7834
Augie Fackler <augie@google.com> [Mon, 10 Feb 2020 17:31:05 -0500] rev 44306
cleanup: re-run black on the codebase
Looks like a few patches have landed without having been blackened. I
strongly suspect I should write a patch for baymax that blackens
things on the way in...
# skip-blame automatic formatting
Differential Revision: https://phab.mercurial-scm.org/D8104
Raphaël Gomès <rgomes@octobus.net> [Thu, 16 Jan 2020 13:34:04 +0100] rev 44305
rust-re2: add wrapper for calling Re2 from Rust
This assumes that Re2 is installed following Google's guide. I am not sure
how we want to integrate it in the project, but I think a follow-up patch would
be more appropriate for such work.
As it stands, *not* having Re2 installed results in a compilation error, which
is a problem as it breaks install compatibility. Hence, this is gated behind
a non-default `with-re2` compilation feature.
Differential Revision: https://phab.mercurial-scm.org/D7910
Raphaël Gomès <rgomes@octobus.net> [Thu, 16 Jan 2020 11:27:12 +0100] rev 44304
rust-filepatterns: add support for `include` and `subinclude` patterns
This prepares a future patch for `IncludeMatcher` on the road to bare
`hg status` support.
Differential Revision: https://phab.mercurial-scm.org/D7909
Raphaël Gomès <rgomes@octobus.net> [Thu, 16 Jan 2020 10:28:40 +0100] rev 44303
rust-filepatterns: improve API and robustness for pattern files parsing
Within the next few patches we will be using this new API.
Differential Revision: https://phab.mercurial-scm.org/D7908
Martin von Zweigbergk <martinvonz@google.com> [Mon, 10 Feb 2020 15:50:26 -0800] rev 44302
tests: add workaround for bzr bug
This started failing for me today. I guess my bzr was upgraded.
Differential Revision: https://phab.mercurial-scm.org/D8105
Raphaël Gomès <rgomes@octobus.net> [Tue, 14 Jan 2020 17:10:20 +0100] rev 44301
rust-utils: add util for canonical path
Differential Revision: https://phab.mercurial-scm.org/D7871
Pierre-Yves David <pierre-yves.david@octobus.net> [Sat, 01 Feb 2020 09:14:36 +0100] rev 44300
test: simplify test-amend.t to avoid race condition
Insted on relying on sleep, we could simply have the editor do the file change.
This remove the reliance on "sleep" and avoid test failing on heavy load
machine.
To test this, I reverted the code change in
5558e3437872 and the test started
failing again.
This is a graft on stable of
141ceec06b55 which should have targeted for stable.
Differential Revision: https://phab.mercurial-scm.org/D8103
Pierre-Yves David <pierre-yves.david@octobus.net> [Sun, 09 Feb 2020 01:34:37 +0100] rev 44299
remotefilelog-test: glob some flaky output line (
issue6083)
The two following lines are flaky underload, yet the final result is correct.
The command involves background pre-check of output, these are not stable
probably because they run in parallel in multiple process.
I spent a couple of hours trying to understand the pattern and gave up. The
documented intend of these tests is safely guaranteed by checking the cache
content after the command.
If it become useful to start testing precise internal details of the, they will
have to be tested in a more appropriate framework than `.t` tests.
Differential Revision: https://phab.mercurial-scm.org/D8102
Matt Harbison <matt_harbison@yahoo.com> [Tue, 21 Jan 2020 10:24:32 -0500] rev 44298
httpconnection: allow `httpsendfile` subclasses to suppress the progressbar
This will be neccessary for LFS, which manages the progressbar outside of the
file.
Differential Revision: https://phab.mercurial-scm.org/D7960
Raphaël Gomès <rgomes@octobus.net> [Mon, 10 Feb 2020 21:54:12 +0100] rev 44297
rust-dirstatemap: add `NonNormalEntries` class
This fix introduces the same encapsulation as the `copymap`. There is no easy
way of doing this any better for now.
`hg up -r null && time HGRCPATH= HGMODULEPOLICY=rust+c hg up tip` on Mozilla
Central, (not super recent, but it doesn't matter):
Before: 7:44,08 total
After: 1:03,23 total
Pretty brutal regression!
Differential Revision: https://phab.mercurial-scm.org/D8049
Valentin Gatien-Baron <valentin.gatienbaron@gmail.com> [Sun, 09 Feb 2020 16:18:26 -0500] rev 44296
help: when possible, indicate flags implied by tweakdefaults
Differential Revision: https://phab.mercurial-scm.org/D8101
Valentin Gatien-Baron <valentin.gatienbaron@gmail.com> [Sun, 09 Feb 2020 15:50:36 -0500] rev 44295
help: add a mechanism to change flags' help depending on config
It seems reasonable to have a similar mechanism for the rest of the
help, but no such thing is implemented.
The goal is to make the help of commands clearer in the presence of
significant default changes, like tweakdefaults or with company-wide
hgrcs. In these cases, a user looking at the help of a command doesn't
exactly know what his hgrc is doing.
Apply to this to the --git option of commands that display diffs, as
this option in particular causes confusion for some reason.
Differential Revision: https://phab.mercurial-scm.org/D8100
Matt Harbison <matt_harbison@yahoo.com> [Sat, 08 Feb 2020 23:39:55 -0500] rev 44294
lfs: use str for the open() mode when opening a blob for py3
The other fix for this was to leave the mode as bytes, and import
`pycompat.open()` like a bunch of other modules do. But I think it's confusing
to still use bytes at the python boundary, and obviously error prone. Grepping
for ` open\(.+, ['"][a-z]+['"]\)` and ` open\(.+, b['"][a-z]+['"]\)` outside of
`tests`, there are 51 and 87 uses respectively, so it's not like this is a rare
direct usage.
Differential Revision: https://phab.mercurial-scm.org/D8099
Raphaël Gomès <rgomes@octobus.net> [Thu, 30 Jan 2020 14:57:02 +0100] rev 44293
rust-dirstatemap: cache non normal and other parent set
Performance of `hg update` was significantly worse since the introduction of
the Rust `dirstatemap`. This regression was noticed by Valentin Gatien-Baron
when working on a large repository, as it goes unnoticed for smaller
repositories like Mercurial itself.
This fix introduces the same getter/setter mechanism at `hg-core` level as
for `set/get_dirs`.
While this technique is, as previously discussed, quite suboptimal, it fixes an
important enough problem. Refactoring `hg-core` to use the typestate
pattern could be a good approach to improving code quality in a future patch.
Differential Revision: https://phab.mercurial-scm.org/D8048
Valentin Gatien-Baron <vgatien-baron@janestreet.com> [Fri, 07 Feb 2020 16:01:32 -0500] rev 44292
tags: behave better if a tags cache entry is partially written
This is done by discarding any partial cache entry, instead of
filling the partial cache entry with 0xff before.
Differential Revision: https://phab.mercurial-scm.org/D8095
Valentin Gatien-Baron <vgatien-baron@janestreet.com> [Fri, 07 Feb 2020 15:55:26 -0500] rev 44291
tags: show how hg behaves if a tags cache entry is truncated
I'm seeing an error of this form in production on the order of once a
month. I'm not sure how it happens, but I suspect interrupting a pull
might result in half written cache entries.
Differential Revision: https://phab.mercurial-scm.org/D8094