Raphaël Gomès <rgomes@octobus.net> [Thu, 30 Jan 2020 14:57:02 +0100] rev 44293
rust-dirstatemap: cache non normal and other parent set
Performance of `hg update` was significantly worse since the introduction of
the Rust `dirstatemap`. This regression was noticed by Valentin Gatien-Baron
when working on a large repository, as it goes unnoticed for smaller
repositories like Mercurial itself.
This fix introduces the same getter/setter mechanism at `hg-core` level as
for `set/get_dirs`.
While this technique is, as previously discussed, quite suboptimal, it fixes an
important enough problem. Refactoring `hg-core` to use the typestate
pattern could be a good approach to improving code quality in a future patch.
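A minimal Python sketch of the caching idea, for illustration only: the actual change is in Rust at the `hg-core` level, and the class name, the entry layout, and the "non normal" predicate used here are assumptions of this sketch rather than the real API.

```python
class DirstateMapSketch:
    """Toy stand-in for a dirstate map; not Mercurial's real class."""

    def __init__(self, entries):
        # entries: filename -> (state, mtime); this tuple layout is hypothetical.
        self._entries = dict(entries)
        self._non_normal = None  # cached set, computed on first request

    def get_non_normal(self):
        # Getter: compute the set once, then reuse the cached value.
        if self._non_normal is None:
            self._non_normal = {
                name
                for name, (state, mtime) in self._entries.items()
                if state != "n" or mtime == -1
            }
        return self._non_normal

    def set_entry(self, filename, state, mtime):
        # Any mutation drops the cache so the next getter call recomputes it.
        self._entries[filename] = (state, mtime)
        self._non_normal = None
```

The point of the sketch is that repeated calls to the getter no longer rescan every entry, which is where a large repository would pay the cost.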
Differential Revision: https://phab.mercurial-scm.org/D8048
Valentin Gatien-Baron <vgatien-baron@janestreet.com> [Fri, 07 Feb 2020 16:01:32 -0500] rev 44292
tags: behave better if a tags cache entry is partially written
This is done by discarding any partial cache entry, instead of
filling the partial cache entry with 0xff as was done before.
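A rough sketch of the "discard the partial entry" approach for a cache made of fixed-size records. The record width of 24 bytes and the function name are assumptions made for this example, not taken from the patch.

```python
RECORD_SIZE = 24  # assumed record width for this sketch


def load_complete_records(raw: bytes, record_size: int = RECORD_SIZE) -> bytearray:
    """Return only the whole records of `raw`, dropping a truncated tail."""
    usable = len(raw) - (len(raw) % record_size)
    return bytearray(raw[:usable])
```

Dropping the tail means the truncated entry is simply treated as missing and can be recomputed later, rather than being kept in a half-valid state.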
Differential Revision: https://phab.mercurial-scm.org/D8095
Valentin Gatien-Baron <vgatien-baron@janestreet.com> [Fri, 07 Feb 2020 15:55:26 -0500] rev 44291
tags: show how hg behaves if a tags cache entry is truncated
I'm seeing an error of this form in production on the order of once a
month. I'm not sure how it happens, but I suspect interrupting a pull
might result in half written cache entries.
Differential Revision: https://phab.mercurial-scm.org/D8094
Valentin Gatien-Baron <vgatien-baron@janestreet.com> [Fri, 07 Feb 2020 13:54:09 -0500] rev 44290
tags: add a debug command to display .hg/cache/hgtagsfnodes1
Differential Revision: https://phab.mercurial-scm.org/D8093
Valentin Gatien-Baron <valentin.gatienbaron@gmail.com> [Sat, 08 Feb 2020 10:22:47 -0500] rev 44289
purge: add -i flag to delete ignored files instead of untracked files
It's convenient for deleting build artifacts. Using --all instead
would delete other things too.
Differential Revision: https://phab.mercurial-scm.org/D8096
Matt Harbison <matt_harbison@yahoo.com> [Thu, 30 Jan 2020 19:50:43 -0500] rev 44288
pyoxidizer: use `legacy_windows_stdio` on Windows
The C executable sets this too, otherwise no output shows up (when paging?).
There is also `legacy_windows_fs_encoding`, but I'm not setting that for now
because the C executable doesn't either.
Differential Revision: https://phab.mercurial-scm.org/D8053
Augie Fackler <augie@google.com> [Wed, 05 Feb 2020 17:12:39 -0500] rev 44287
merge: use manifestdict.walk() instead of manifestdict.matches()
As with other patches in this series, this avoids making a
potentially-expensive copy of a manifest.
Differential Revision: https://phab.mercurial-scm.org/D8084
Augie Fackler <augie@google.com> [Wed, 05 Feb 2020 16:58:50 -0500] rev 44286
manifest: rewrite filesnotin to not make superfluous manifest copies
This also skips using diff() when all we care about is the filenames. I'm
expecting the built-in set logic to be plenty fast. For really large manifests
with a matcher in play this should copy substantially less data around.
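As an illustration of the set-based approach, here is a small sketch that treats manifests as plain dicts mapping filename to node. The function name and the `match` callable are hypothetical and do not mirror the real `manifestdict` signature.

```python
def files_not_in(ours, theirs, match=None):
    """Return the filenames present in `ours` but absent from `theirs`.

    Only the keys are compared; no manifest is copied and no node
    values are diffed.
    """
    files = set(ours) - set(theirs)
    if match is not None:
        files = {f for f in files if match(f)}
    return files
```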
Differential Revision: https://phab.mercurial-scm.org/D8082
Pulkit Goyal <7895pulkit@gmail.com> [Sat, 08 Feb 2020 03:13:45 +0530] rev 44285
merge with stable
Augie Fackler <augie@google.com> [Thu, 06 Feb 2020 16:55:39 -0500] rev 44284
archival: use walk() instead of matches() on manifest
All we care about is the filepaths, so this avoids a pointless copy of the
manifest that we only used to extract matching filenames.
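The difference can be sketched with plain dicts standing in for manifests. Both functions below are simplified stand-ins written for this example, not the real `manifestdict` methods.

```python
def matches(manifest, match):
    """Build a filtered copy of the manifest (filenames and nodes)."""
    return {f: node for f, node in manifest.items() if match(f)}


def walk(manifest, match):
    """Lazily yield matching filenames in sorted order, copying no nodes."""
    for f in sorted(manifest):
        if match(f):
            yield f
```

When only the paths are needed, iterating over `walk()` gives the same filenames without materializing a second mapping.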
Differential Revision: https://phab.mercurial-scm.org/D8090
Raphaël Gomès <rgomes@octobus.net> [Fri, 24 Jan 2020 11:10:07 +0100] rev 44283
rust-dirs-multiset: improve temporary error message
While we wait on a future patch that could verify that the paths passed to
`DirsMultiset` have been audited, we still need to handle this error.
This patch makes it easier to bubble up and makes the error clearer.
Also, this patch introduces the `subslice_index` function that could be useful
for other - albeit niche - purposes.
Differential Revision: https://phab.mercurial-scm.org/D7921
Matt Harbison <matt_harbison@yahoo.com> [Wed, 22 Jan 2020 12:11:35 -0500] rev 44282
exchange: check the `ui.clonebundleprefers` form while processing (issue6257)
Otherwise the clone command will emit a long stacktrace if there is no `=`
character.
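A sketch of the kind of check described, assuming each configured item is expected to look like `key=value`. The function, the exception class, and the error wording are hypothetical; they only show how a malformed item can be reported cleanly instead of surfacing as a traceback.

```python
class ConfigError(ValueError):
    """Raised for a malformed configuration item (hypothetical class)."""


def parse_prefers(values):
    """Split each 'key=value' item into a (key, value) pair."""
    prefers = []
    for value in values:
        if "=" not in value:
            raise ConfigError(
                "invalid ui.clonebundleprefers item: %r (expected key=value)" % value
            )
        key, val = value.split("=", 1)
        prefers.append((key, val))
    return prefers
```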
Differential Revision: https://phab.mercurial-scm.org/D7969
Pierre-Yves David <pierre-yves.david@octobus.net> [Fri, 13 Dec 2019 16:49:05 +0100] rev 44281
copies: add a new test dedicated to testing chain of changeset with merge
The copies tests we currently have usually focus on simple cases that do not dive
too much into longer chains involving merges. This new test file focuses on
extensive testing of these cases to validate their behavior and make sure the
various copies algorithms have the same behavior.
And… actually these tests are currently broken for the changeset-centric
algorithm since 99ebde4fec99, but it went undetected because these cases were
not tested.
Differential Revision: https://phab.mercurial-scm.org/D8078
Joerg Sonnenberger <joerg@bec.de> [Wed, 18 Sep 2019 06:07:09 +0200] rev 44280
hgext: initial version of fastexport extension
Differential Revision: https://phab.mercurial-scm.org/D7733
Julien Cristau <jcristau@mozilla.com> [Fri, 07 Feb 2020 15:55:21 +0100] rev 44279
hghave: cache the result of gethgversion
hghave --test-features calls it 90 times, each call running hg --version,
which takes a tenth of a second on my workstation. Caching the result adds
up to a win of about 10s on test-hghave.t.
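The effect of caching can be sketched with `functools.lru_cache`. This is not necessarily how the patch implements it, and `_query_version` is a stand-in that records calls instead of spawning `hg --version`.

```python
import functools

calls = []  # records each time the expensive query actually runs


def _query_version():
    """Stand-in for running `hg --version` (hypothetical helper)."""
    calls.append(1)
    return (5, 3)


@functools.lru_cache(maxsize=None)
def gethgversion():
    """Return the version tuple, running the expensive query only once."""
    return _query_version()
```

Calling `gethgversion()` any number of times after the first returns the stored tuple without invoking the query again.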
Fixes https://bugs.debian.org/939756
Differential Revision: https://phab.mercurial-scm.org/D8092
Augie Fackler <augie@google.com> [Mon, 03 Feb 2020 11:56:02 -0500] rev 44278
resourceutil: blacken
Martin von Zweigbergk <martinvonz@google.com> [Fri, 24 Jan 2020 14:11:43 -0800] rev 44277
clean: delete obsolete unlinking of .hg/graftstate
The responsibility for clearing it is now in
`cmdutil.clearunfinished()`, so we shouldn't have to unlink it in
`hg.clean()`.
Differential Revision: https://phab.mercurial-scm.org/D7992
Martin von Zweigbergk <martinvonz@google.com> [Tue, 04 Feb 2020 10:16:30 -0800] rev 44276
copies: avoid filtering by short-circuit dirstate-only copies earlier
The call to `y.ancestor(x)` triggered repo filtering, which we'd like
to avoid in the simple `hg status --copies` case.
Differential Revision: https://phab.mercurial-scm.org/D8071
Martin von Zweigbergk <martinvonz@google.com> [Tue, 04 Feb 2020 10:14:44 -0800] rev 44275
tests: add test showing that repo filter is calculated for `hg st --copies`
Differential Revision: https://phab.mercurial-scm.org/D8070
Matt Harbison <matt_harbison@yahoo.com> [Tue, 21 Jan 2020 11:40:15 -0500] rev 44274
lfs: enable workers by default
With the stall issue seemingly fixed, there's no reason not to use workers. The
setting is left for now to keep the test output deterministic, and in case other
issues come up. If none do, this can be converted to a developer setting for
usage with the tests.
Differential Revision: https://phab.mercurial-scm.org/D7963
Matt Harbison <matt_harbison@yahoo.com> [Tue, 21 Jan 2020 11:32:33 -0500] rev 44273
lfs: fix the stall and corruption issue when concurrently uploading blobs
We've avoided the issue up to this point by gating worker usage with an
experimental config. See 10e62d5efa73, and the thread linked there for some of
the initial diagnosis, but essentially some data was being read from the blob
before an error occurred and `keepalive` retried, but didn't rewind the file
pointer. So the leading data was lost from the blob on the server, and the
connection stalled, trying to send more data than available.
In trying to recreate this, I was unable to do so uploading from Windows to
CentOS 7. But it reproduced every time going from CentOS 7 to another CentOS 7
over https.
I found recent fixes in the Facebook repo to address this[1][2]. The commit
message for the first is:
    The KeepAlive HTTP implementation is bugged in its retry logic: it supports
    reading from a file pointer, but doesn't support rewinding of the seek cursor
    when it performs a retry. So it can happen that an upload fails for whatever
    reason and will then 'hang' on the retry event.
    The sequence of events that gets triggered is:
    - Upload file A, goes OK. Keep-Alive caches connection.
    - Upload file B, fails due to (for example) failing Keep-Alive, but LFS file
      pointer has been consumed for the upload and fd has been closed.
    - Retry for file B starts, sets the Content-Length properly to the expected
      file size, but since file pointer has been consumed no data will be uploaded,
      causing the server to wait for the uploaded data until either client or
      server reaches a timeout, making it seem as if our Mercurial process hangs.
    This is just a stop-gap measure to prevent this behavior from blocking Mercurial
    (LFS has retry logic). A proper solution needs to be built on top of this
    stop-gap measure: for upload from file pointers, we should support fseek() on
    the interface. Since we always expect to consume the whole file anyway, this
    should be safe. This way we can seek back to the beginning on a retry.
I ported those two patches, and it works. But I see that `url._sendfile()` does
a rewind on `httpsendfile` objects[3], so maybe it's better to keep this all in
one place and avoid a second seek. We may still want the first Facebook patch
as extra protection for this problem in general. The other two uses of
`httpsendfile` are in the wire protocol to upload bundles, and to upload
largefiles. Neither of these appears to use a worker, and I'm not sure why
workers seem to trigger this, or if this could have happened without a worker.
Since `httpsendfile` already has a `close()` method, that is dropped. That
class also explicitly says there's no `__len__` attribute, so that is removed
too. The override for `read()` is necessary to avoid the progressbar usage per
file.
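The rewind-before-retry idea discussed above can be sketched as follows. This is a simplified illustration: `send_with_retry` and its `send` callable are hypothetical and do not correspond to the `keepalive` or `httpsendfile` code.

```python
import io


def send_with_retry(fp, send, attempts=2):
    """Call send(fp), rewinding fp before every attempt.

    Without the seek(0), a retry would start from wherever the failed
    attempt stopped reading, so the leading bytes would never be sent.
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    last_error = None
    for _ in range(attempts):
        fp.seek(0)  # start each attempt from the beginning of the blob
        try:
            return send(fp)
        except ConnectionError as err:
            last_error = err
    raise last_error
```

For example, passing an `io.BytesIO` object and a `send` callable that fails once after reading a few bytes still results in the complete payload being sent on the second attempt.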
[1] https://github.com/facebookexperimental/eden/commit/c350d6536d90c044c837abdd3675185644481469
[2] https://github.com/facebookexperimental/eden/commit/77f0d3fd0415e81b63e317e457af9c55c46103ee
[3] https://www.mercurial-scm.org/repo/hg/file/5.2.2/mercurial/url.py#l176
Differential Revision: https://phab.mercurial-scm.org/D7962
Matt Harbison <matt_harbison@yahoo.com> [Tue, 21 Jan 2020 10:34:15 -0500] rev 44272
lfs: add a method to the local blobstore to convert OIDs to file paths
This is less ugly than passing an open callback to the `httpsendfile`
constructor.
Differential Revision: https://phab.mercurial-scm.org/D7961
Martin von Zweigbergk <martinvonz@google.com> [Wed, 15 Jan 2020 14:47:38 -0800] rev 44271
merge: introduce a revert_to() for that use-case
In the same vein as the previous patch.
Differential Revision: https://phab.mercurial-scm.org/D7901
Martin von Zweigbergk <martinvonz@google.com> [Wed, 15 Jan 2020 15:30:25 -0800] rev 44270
merge: introduce a clean_update() for that use-case
I find it hard to understand what value to pass for all the arguments
to `merge.update()`. I would like to introduce functions that are more
specific to each use-case. We already have `graft()`. This patch
introduces a `clean_update()` and uses it in some places to show that
it works.
Differential Revision: https://phab.mercurial-scm.org/D7902