Martin von Zweigbergk <martinvonz@google.com> [Tue, 28 Jan 2020 21:49:50 -0800] rev 44345
graphlog: use '%' for other context in merge conflict
This lets the user more easily find the commit that is involved in the
conflict, such as the source of `hg update -m` or the commit being
grafted by `hg graft`.
Differential Revision: https://phab.mercurial-scm.org/D8043
Martin von Zweigbergk <martinvonz@google.com> [Wed, 29 Jan 2020 14:42:54 -0800] rev 44344
tests: add `hg log -G` output when there are merge conflicts
The next commit will change the behavior for these. I've used slightly
different commands in the different tests to match the surrounding
style.
Differential Revision: https://phab.mercurial-scm.org/D8042
Martin von Zweigbergk <martinvonz@google.com> [Wed, 29 Jan 2020 11:30:35 -0800] rev 44343
revset: add a revset for parents in merge state
This may be particularly useful soon, when I'm going to change how `hg
rebase` sets its parents during conflict resolution.
Differential Revision: https://phab.mercurial-scm.org/D8041
Martin von Zweigbergk <martinvonz@google.com> [Fri, 10 Jan 2020 17:46:10 -0800] rev 44342
tests: add test of rebase with conflict in merge commit
It doesn't seem like we had any tests of this. I think it's pretty
weird that the two parents we're merging are not the working copy
parents during the conflict resolution.
Differential Revision: https://phab.mercurial-scm.org/D7824
Martin von Zweigbergk <martinvonz@google.com> [Thu, 16 Jan 2020 00:03:19 -0800] rev 44341
rebase: always be graft-like, not merge-like, also for merges
Rebase works by updating to a commit and then grafting changes on
top. However, before this patch, it would in some cases actually merge
in changes instead of grafting them in. That is, it would use the
common ancestor as base instead of using one of the parents. That
seems wrong to me, so I'm changing it so `defineparents()` always
returns a value for `base`.
This fixes the bad behavior in test-rebase-newancestor.t, which was
introduced in 65f215ea3e8e (tests: add test for rebasing merges with
ancestors of the rebase destination, 2014-11-30).
The difference in test-rebase-dest.t is because the files in the tip
revision were A, D, E, F before this patch and A, D, F, G after it. I
think both files should ideally be there.
Differential Revision: https://phab.mercurial-scm.org/D7907
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 15 Jan 2020 15:51:01 +0100] rev 44340
nodemap: update the index with the newly written data (when appropriate)
If we are to use mmap to read the nodemap data, and if the python code is
responsible for the IO, we need to refresh the mmap after each write and provide
it back to the index.
We start this dance without the mmap first.
Differential Revision: https://phab.mercurial-scm.org/D7893
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 15 Jan 2020 15:50:52 +0100] rev 44339
nodemap: never read more than the expected data amount
Since we are tracking this number, we can use it to detect a corrupted rawdata
file and to read only the correct amount of data when possible.
Differential Revision: https://phab.mercurial-scm.org/D7892
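The bounded read described in rev 44339 can be sketched as follows. This is a minimal illustration, not the actual nodemap code: `read_expected` and its `data_length` parameter are hypothetical names, and the file is modelled with an in-memory buffer.

```python
import io


def read_expected(fp, data_length):
    """Read exactly `data_length` bytes from `fp`.

    Returns None when the file holds fewer bytes than expected, which the
    caller can treat as a corrupted rawdata file. (Hypothetical helper.)
    """
    data = fp.read(data_length)
    if len(data) < data_length:
        return None  # shorter than the tracked length: do not trust it
    return data


# Usage: only the first 4 bytes are considered valid here.
valid = read_expected(io.BytesIO(b"abcdefgh"), 4)
truncated = read_expected(io.BytesIO(b"ab"), 4)
```

Anything past the tracked length is never read, so trailing bytes left by an interrupted writer are ignored.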
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 15 Jan 2020 15:50:43 +0100] rev 44338
nodemap: write new data from the expected current data length
If the amount of data in the file exceeds the expected amount, we will overwrite
the extra data. This is a simple way to be safer.
Differential Revision: https://phab.mercurial-scm.org/D7891
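A minimal sketch of writing from the expected length (rev 44338), assuming the expected length is known to the writer. The function name `append_at` is hypothetical and the file is modelled with `io.BytesIO`.

```python
import io


def append_at(fp, expected_length, new_data):
    """Write `new_data` starting at `expected_length`, not at the real EOF.

    Any unexpected bytes past `expected_length` get overwritten. Returns the
    new expected length. (Hypothetical helper for illustration.)
    """
    fp.seek(expected_length)
    fp.write(new_data)
    return expected_length + len(new_data)


# Usage: the file has 4 valid bytes followed by 4 bytes of stray data.
buf = io.BytesIO(b"AAAAjunk")
new_length = append_at(buf, 4, b"BB")
```

In this sketch, bytes beyond the returned length may still exist in the file; a reader that only reads the tracked amount never sees them.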
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 15 Jan 2020 15:50:33 +0100] rev 44337
nodemap: double check the source docket when doing incremental update
In theory, the index will have the information we expect it to have. However,
for safety, it seems better to double check that the incremental data is
generated from the data currently on disk.
Differential Revision: https://phab.mercurial-scm.org/D7890
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 15 Jan 2020 15:50:24 +0100] rev 44336
nodemap: track the total and unused amount of data in the rawdata file
We need to keep that information around:
* total data will allow a transaction to start appending new information without
confusing other readers.
* unused data will allow us to detect when we should regenerate a new rawdata file.
Differential Revision: https://phab.mercurial-scm.org/D7889
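One way the two counters from rev 44336 could drive a regeneration decision is sketched below. The function name and the 50% threshold are assumptions made for illustration; the commit message does not state the policy Mercurial uses.

```python
def should_regenerate(data_length, data_unused, max_unused_ratio=0.5):
    """Return True when too much of the rawdata file is dead data.

    `max_unused_ratio` is a hypothetical threshold, not a value taken from
    the Mercurial source.
    """
    if data_length == 0:
        return False  # nothing on disk yet, nothing to compact
    return data_unused / data_length > max_unused_ratio


# Usage: 600 of 1000 bytes are dereferenced blocks.
mostly_dead = should_regenerate(1000, 600)
mostly_live = should_regenerate(1000, 100)
```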
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 15 Jan 2020 15:50:14 +0100] rev 44335
nodemap: track the maximum revision tracked in the nodemap
We need a simple way to detect when the on-disk data contains fewer revisions
than the index we read from disk. The docket file is meant for this; we just had
to start tracking that data.
We should also try to detect strip operations, but we will deal with this in
later changesets. Right now we are focusing on defining the API for indexes
supporting a persistent nodemap.
Differential Revision: https://phab.mercurial-scm.org/D7888
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 15 Jan 2020 15:50:04 +0100] rev 44334
nodemap: add a flag to dump the details of the docket
We are about to add more information to the docket. We first introduce a way to
debug its content.
Differential Revision: https://phab.mercurial-scm.org/D7887
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 15 Jan 2020 15:49:54 +0100] rev 44333
nodemap: introduce append-only incremental update of the persistent data
Rewriting the full nodemap for each transaction has a cost we would like to
avoid. We introduce a new way to write persistent nodemap data by adding new
information at the end of the file. Any new or updated block is added at the end
of the file. The last block is the new root node.
With this method, some of the blocks already on disk get "dereferenced" and
become dead data. In later changesets, we'll start tracking the amount of dead
data to eventually regenerate a full nodemap.
Differential Revision: https://phab.mercurial-scm.org/D7886
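The append-only scheme from rev 44333 can be illustrated with a toy model. This is a sketch under simplifying assumptions, not the real on-disk format: the file is a Python list of blocks, each block is a dict mapping a hex digit to either `("block", index)` or `("rev", number)`, and every intermediate block on the path is assumed to exist already. The name `append_update` is hypothetical.

```python
def append_update(blocks, path, rev):
    """Record `rev` under the hex-digit `path` without touching old blocks.

    Updated copies of every block along the path are appended, so the last
    block is always the new root. Returns how many old blocks became dead.
    """
    # Walk down from the current root (the last block), remembering the path.
    chain = []
    idx = len(blocks) - 1
    for digit in path[:-1]:
        chain.append(idx)
        kind, idx = blocks[idx][digit]
        assert kind == "block"
    chain.append(idx)

    # Rewrite bottom-up: each modified block is copied and appended.
    entry = ("rev", rev)
    for old_idx, digit in zip(reversed(chain), reversed(path)):
        copy = dict(blocks[old_idx])
        copy[digit] = entry
        blocks.append(copy)
        entry = ("block", len(blocks) - 1)
    return len(chain)


# Usage: block 0 is a leaf-level block, block 1 is the root pointing at it.
blocks = [{"3": ("rev", 0)}, {"a": ("block", 0)}]
dead = append_update(blocks, "a7", 5)
```

After the call, blocks 0 and 1 are still present but no longer reachable from the new root, which is the "dead data" the commit message refers to.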
Kyle Lippincott <spectral@google.com> [Thu, 20 Feb 2020 16:21:00 -0800] rev 44332
shelve: fix ordering of merge labels
Differential Revision: https://phab.mercurial-scm.org/D8140
Kyle Lippincott <spectral@google.com> [Thu, 20 Feb 2020 17:06:01 -0800] rev 44331
shelve: add test clearly demonstrating that the conflict labels are backwards
Differential Revision: https://phab.mercurial-scm.org/D8139
Matt Harbison <matt_harbison@yahoo.com> [Sun, 16 Feb 2020 17:05:18 -0500] rev 44330
import: don't ignore `--secret` when `--bypass` is specified
Differential Revision: https://phab.mercurial-scm.org/D8126
Matt Harbison <matt_harbison@yahoo.com> [Tue, 18 Feb 2020 13:46:10 -0500] rev 44329
phabricator: fix a phabsend crash when processing a renamed binary
This was a trivial fix, and some more tests are added to cover binary files.
Since the old filecontext is passed in, the old name is still available. But I
noticed some weirdness around what it marked as binary and not, and what is
viewable in Phabricator. Those things have been flagged, and will probably take
some digging.
Differential Revision: https://phab.mercurial-scm.org/D8133
Pierre-Yves David <pierre-yves.david@octobus.net> [Fri, 13 Dec 2019 10:37:45 +0100] rev 44328
test: pin the number of CPU for issue4074 tests
On machines with hundreds of CPUs, the "user" CPU time reported can be
inflated by the status steps. Since the test focuses specifically on the diff
computation, we restrict the number of CPUs to avoid potential issues.
Differential Revision: https://phab.mercurial-scm.org/D8112
Raphaël Gomès <rgomes@octobus.net> [Wed, 12 Feb 2020 23:23:59 +0100] rev 44327
rust-dirstatemap: add `NonNormalEntries` class
This fix introduces the same encapsulation as the `copymap`. There is no easy
way of doing this any better for now.
`hg up -r null && time HGRCPATH= HGMODULEPOLICY=rust+c hg up tip` on Mozilla
Central, (not super recent, but it doesn't matter):
Before: 7:44,08 total
After: 1:03,23 total
Pretty brutal regression!
This is a graft on stable of cf1f8660e568
Differential Revision: https://phab.mercurial-scm.org/D8111
Raphaël Gomès <rgomes@octobus.net> [Thu, 30 Jan 2020 14:57:02 +0100] rev 44326
rust-dirstatemap: cache non normal and other parent set
Performance of `hg update` was significantly worse since the introduction of
the Rust `dirstatemap`. This regression was noticed by Valentin Gatien-Baron
when working on a large repository, as it goes unnoticed for smaller
repositories like Mercurial itself.
This fix introduces the same getter/setter mechanism at `hg-core` level as
for `set/get_dirs`.
While this technique is, as previously discussed, quite suboptimal, it fixes an
important enough problem. Refactoring `hg-core` to use the typestate
pattern could be a good approach to improving code quality in a future patch.
This is a graft on stable of 83b2b829c94e
Differential Revision: https://phab.mercurial-scm.org/D8110
Yuya Nishihara <yuya@tcha.org> [Tue, 11 Feb 2020 19:53:56 +0900] rev 44325
chgserver: spawn new process if schemes change
The schemes extension updates hg.schemes table. It's technically possible
for hg.repository() to look for e.g. ui.schemes instead of depending on
module-local table, but I don't think the change would make much sense
since [schemes] is usually specified in ~/.hgrc and thus it can be considered
static data.
Martin von Zweigbergk <martinvonz@google.com> [Mon, 10 Feb 2020 15:52:52 -0800] rev 44324
tests: accept new bzr message about switching branches
The new version apparently prints "Switched to branch at " instead of
"Switched to branch: ".
Differential Revision: https://phab.mercurial-scm.org/D8106
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 15 Jan 2020 15:49:45 +0100] rev 44323
nodemap: keep track of the docket for loaded data
To perform incremental updates of the on-disk data, we need to keep track of
some aspects of that data.
Differential Revision: https://phab.mercurial-scm.org/D7885
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 15 Jan 2020 15:49:35 +0100] rev 44322
nodemap: introduce an explicit class/object for the docket
We are about to add more information to this docket; having a clear location to
store it in memory will help.
Differential Revision: https://phab.mercurial-scm.org/D7884
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 15 Jan 2020 15:49:26 +0100] rev 44321
nodemap: keep track of the ondisk id of nodemap blocks
If we are to incrementally update the files, we need to keep some details about
the data we read.
Differential Revision: https://phab.mercurial-scm.org/D7883
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 15 Jan 2020 15:49:16 +0100] rev 44320
nodemap: provide the on disk data to indexes that support it
Time to start defining the API and prepare the rust index support. We provide
a method to do so. We use a distinct method instead of passing them in the
constructor because we will need this method anyway later (to refresh the mmap
once we update the data on disk).
Differential Revision: https://phab.mercurial-scm.org/D7847
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 15 Jan 2020 15:49:06 +0100] rev 44319
nodemap: also check that revision and nodes match in the nodemap
More checks are always useful.
Differential Revision: https://phab.mercurial-scm.org/D7846
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 15 Jan 2020 15:48:57 +0100] rev 44318
nodemap: add basic checking of the on disk nodemap content
The simplest check is to verify that we have all the revisions we need, and
nothing more.
Differential Revision: https://phab.mercurial-scm.org/D7845
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 15 Jan 2020 15:48:47 +0100] rev 44317
nodemap: code to parse the persistent binary nodemap data
We now have code to read back what we persisted. This will be put to use in
later changesets.
Differential Revision: https://phab.mercurial-scm.org/D7844
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 15 Jan 2020 15:48:38 +0100] rev 44316
nodemap: move the iteration inside the Block object
Having the iteration inside the serialization function does not help
readability. Now that we have a `Block` object, let us move that code there.
Differential Revision: https://phab.mercurial-scm.org/D7843
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 15 Jan 2020 15:48:28 +0100] rev 44315
nodemap: use an explicit "Block" object in the reference implementation
This will help us introduce some tests around the data currently written on
disk.
Differential Revision: https://phab.mercurial-scm.org/D7842
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 15 Jan 2020 15:48:19 +0100] rev 44314
nodemap: add an optional `nodemap_add_full` method on indexes
This method can be used to obtain persistent data for a full nodemap. The end
goal is for some index implementations to manage the nodemap serialization
themselves (e.g. the Rust implementation).
Differential Revision: https://phab.mercurial-scm.org/D7841
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 15 Jan 2020 15:48:09 +0100] rev 44313
nodemap: add a (python) index class for persistent nodemap testing
Using the persistent nodemap requires a compelling performance boost and an
existing implementation. The benefit of the persistent nodemap for pure Python
code is unclear and we don't have a C implementation for it. Yet we would like
to actually start testing it in more detail and define an API for using that
persistent nodemap.
We introduce a new `devel` config option to use an index class dedicated to
nodemap testing. This feature is "pure" only because using a pure-Python
index with the `cext` policy proved more difficult than I would like.
There is nothing going on in that class for now, but the coming changesets will
change that.
Differential Revision: https://phab.mercurial-scm.org/D7840
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 15 Jan 2020 15:47:59 +0100] rev 44312
nodemap: delete older raw data file when creating a new one
When we write a new full file, it replaces an older one with a different name. We
add the associated cleanup for the older file to be removed after the
transaction.
We delete all files matching the expected pattern to give us an extra chance to
delete orphan files we might have failed to delete earlier.
Note: eventually we won't rewrite all data for each transaction. This is coming
in later changesets.
Differential Revision: https://phab.mercurial-scm.org/D7839
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 15 Jan 2020 15:47:50 +0100] rev 44311
nodemap: use an intermediate "docket" file to carry small metadata
This intermediate file will make mmapping, transaction and content validation
easier. (Most of this usefulness will arrive gradually in later changesets.) In
particular it will become very useful to append new data at the end of the raw
file instead of rewriting the file on each transaction.
See the in-code comments for details.
Differential Revision: https://phab.mercurial-scm.org/D7838
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 15 Jan 2020 15:47:40 +0100] rev 44310
nodemap: only use persistent nodemap for non-inlined revlog
Revlogs are inlined while they are small (to avoid having too many files to deal
with). The persistent nodemap will only provide a significant boost for a large
enough revlog index. So it does not make sense to add an extra file to store the
nodemap for a small revlog.
We could consider inlining the nodemap data inside the revlog itself, but the
benefit is unclear so let it be an adventure for another time.
Differential Revision: https://phab.mercurial-scm.org/D7837
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 15 Jan 2020 15:47:31 +0100] rev 44309
nodemap: add a function to read the data from disk
This changeset is small and mostly an excuse to introduce an API function
reading the data from disk.
Differential Revision: https://phab.mercurial-scm.org/D7836
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 15 Jan 2020 15:47:21 +0100] rev 44308
nodemap: write nodemap data on disk
Let us start writing data on disk (so that we can read it from there later).
This series of changesets is going to focus first on having data on disk and
updating it.
Right now the data is written right next to the revlog data, in the store. We
might move it to cache (with a proper cache validation mechanism) later, but for
now revlogs have a storevfs instance and it is simpler to use it. The right
location for this data is not the focus of this series.
Differential Revision: https://phab.mercurial-scm.org/D7835
Pierre-Yves David <pierre-yves.david@octobus.net> [Wed, 15 Jan 2020 15:47:12 +0100] rev 44307
nodemap: have some python code writing a nodemap in persistent binary form
This Python code aims to be as "simple" as possible. It is a reference
implementation of the data we are going to write on disk (and possibly,
later, a way for pure-Python installs to make sure the on-disk data is up to
date).
It is not optimized for performance and rebuilds the full data structure from
the index every time.
This is a stepping stone toward a persistent nodemap on disk.
Differential Revision: https://phab.mercurial-scm.org/D7834
Augie Fackler <augie@google.com> [Mon, 10 Feb 2020 17:31:05 -0500] rev 44306
cleanup: re-run black on the codebase
Looks like a few patches have landed without having been blackened. I
strongly suspect I should write a patch for baymax that blackens
things on the way in...
# skip-blame automatic formatting
Differential Revision: https://phab.mercurial-scm.org/D8104
Raphaël Gomès <rgomes@octobus.net> [Thu, 16 Jan 2020 13:34:04 +0100] rev 44305
rust-re2: add wrapper for calling Re2 from Rust
This assumes that Re2 is installed following Google's guide. I am not sure
how we want to integrate it in the project, but I think a follow-up patch would
be more appropriate for such work.
As it stands, *not* having Re2 installed results in a compilation error, which
is a problem as it breaks install compatibility. Hence, this is gated behind
a non-default `with-re2` compilation feature.
Differential Revision: https://phab.mercurial-scm.org/D7910
Raphaël Gomès <rgomes@octobus.net> [Thu, 16 Jan 2020 11:27:12 +0100] rev 44304
rust-filepatterns: add support for `include` and `subinclude` patterns
This prepares a future patch for `IncludeMatcher` on the road to bare
`hg status` support.
Differential Revision: https://phab.mercurial-scm.org/D7909
Raphaël Gomès <rgomes@octobus.net> [Thu, 16 Jan 2020 10:28:40 +0100] rev 44303
rust-filepatterns: improve API and robustness for pattern files parsing
Within the next few patches we will be using this new API.
Differential Revision: https://phab.mercurial-scm.org/D7908
Martin von Zweigbergk <martinvonz@google.com> [Mon, 10 Feb 2020 15:50:26 -0800] rev 44302
tests: add workaround for bzr bug
This started failing for me today. I guess my bzr was upgraded.
Differential Revision: https://phab.mercurial-scm.org/D8105
Raphaël Gomès <rgomes@octobus.net> [Tue, 14 Jan 2020 17:10:20 +0100] rev 44301
rust-utils: add util for canonical path
Differential Revision: https://phab.mercurial-scm.org/D7871
Pierre-Yves David <pierre-yves.david@octobus.net> [Sat, 01 Feb 2020 09:14:36 +0100] rev 44300
test: simplify test-amend.t to avoid race condition
Instead of relying on sleep, we can simply have the editor do the file change.
This removes the reliance on "sleep" and avoids the test failing on heavily
loaded machines.
To test this, I reverted the code change in 5558e3437872 and the test started
failing again.
This is a graft on stable of 141ceec06b55, which should have targeted stable.
Differential Revision: https://phab.mercurial-scm.org/D8103
Pierre-Yves David <pierre-yves.david@octobus.net> [Sun, 09 Feb 2020 01:34:37 +0100] rev 44299
remotefilelog-test: glob some flaky output line (issue6083)
The following two lines are flaky under load, yet the final result is correct.
The command involves a background pre-check of output; these lines are probably
not stable because they run in parallel in multiple processes.
I spent a couple of hours trying to understand the pattern and gave up. The
documented intent of these tests is safely guaranteed by checking the cache
content after the command.
If it becomes useful to start testing precise internal details, they will
have to be tested in a more appropriate framework than `.t` tests.
Differential Revision: https://phab.mercurial-scm.org/D8102
Matt Harbison <matt_harbison@yahoo.com> [Tue, 21 Jan 2020 10:24:32 -0500] rev 44298
httpconnection: allow `httpsendfile` subclasses to suppress the progressbar
This will be necessary for LFS, which manages the progressbar outside of the
file.
Differential Revision: https://phab.mercurial-scm.org/D7960
Raphaël Gomès <rgomes@octobus.net> [Mon, 10 Feb 2020 21:54:12 +0100] rev 44297
rust-dirstatemap: add `NonNormalEntries` class
This fix introduces the same encapsulation as the `copymap`. There is no easy
way of doing this any better for now.
`hg up -r null && time HGRCPATH= HGMODULEPOLICY=rust+c hg up tip` on Mozilla
Central, (not super recent, but it doesn't matter):
Before: 7:44,08 total
After: 1:03,23 total
Pretty brutal regression!
Differential Revision: https://phab.mercurial-scm.org/D8049
Valentin Gatien-Baron <valentin.gatienbaron@gmail.com> [Sun, 09 Feb 2020 16:18:26 -0500] rev 44296
help: when possible, indicate flags implied by tweakdefaults
Differential Revision: https://phab.mercurial-scm.org/D8101
Valentin Gatien-Baron <valentin.gatienbaron@gmail.com> [Sun, 09 Feb 2020 15:50:36 -0500] rev 44295
help: add a mechanism to change flags' help depending on config
It seems reasonable to have a similar mechanism for the rest of the
help, but no such thing is implemented.
The goal is to make the help of commands clearer in the presence of
significant default changes, like tweakdefaults or with company-wide
hgrcs. In these cases, a user looking at the help of a command doesn't
exactly know what his hgrc is doing.
Apply this to the --git option of commands that display diffs, as
this option in particular causes confusion for some reason.
Differential Revision: https://phab.mercurial-scm.org/D8100
Matt Harbison <matt_harbison@yahoo.com> [Sat, 08 Feb 2020 23:39:55 -0500] rev 44294
lfs: use str for the open() mode when opening a blob for py3
The other fix for this was to leave the mode as bytes, and import
`pycompat.open()` like a bunch of other modules do. But I think it's confusing
to still use bytes at the python boundary, and obviously error prone. Grepping
for ` open\(.+, ['"][a-z]+['"]\)` and ` open\(.+, b['"][a-z]+['"]\)` outside of
`tests`, there are 51 and 87 uses respectively, so it's not like this is a rare
direct usage.
Differential Revision: https://phab.mercurial-scm.org/D8099
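The bytes-versus-str mode issue behind rev 44294 can be reproduced with the builtin `open()` alone. The helpers below are hypothetical and exist only for this demonstration; they do not use `pycompat.open()` or any Mercurial module.

```python
import os
import tempfile


def make_blob(data):
    """Write `data` to a temporary file and return its path."""
    fd, path = tempfile.mkstemp()
    with os.fdopen(fd, "wb") as fp:
        fp.write(data)
    return path


def read_blob(path):
    """Open with a str mode, which is what Python 3's open() expects."""
    with open(path, "rb") as fp:
        return fp.read()


def bytes_mode_fails(path):
    """Return True if a bytes mode makes the builtin open() raise TypeError."""
    try:
        open(path, b"rb").close()
    except TypeError:
        return True
    return False


# Usage
blob_path = make_blob(b"blob")
content = read_blob(blob_path)
rejected = bytes_mode_fails(blob_path)
os.remove(blob_path)
```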
Raphaël Gomès <rgomes@octobus.net> [Thu, 30 Jan 2020 14:57:02 +0100] rev 44293
rust-dirstatemap: cache non normal and other parent set
Performance of `hg update` was significantly worse since the introduction of
the Rust `dirstatemap`. This regression was noticed by Valentin Gatien-Baron
when working on a large repository, as it goes unnoticed for smaller
repositories like Mercurial itself.
This fix introduces the same getter/setter mechanism at `hg-core` level as
for `set/get_dirs`.
While this technique is, as previously discussed, quite suboptimal, it fixes an
important enough problem. Refactoring `hg-core` to use the typestate
pattern could be a good approach to improving code quality in a future patch.
Differential Revision: https://phab.mercurial-scm.org/D8048
Valentin Gatien-Baron <vgatien-baron@janestreet.com> [Fri, 07 Feb 2020 16:01:32 -0500] rev 44292
tags: behave better if a tags cache entry is partially written
This is done by discarding any partial cache entry, instead of
filling the partial cache entry with 0xff before.
Differential Revision: https://phab.mercurial-scm.org/D8095
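Discarding a partially written trailing entry (rev 44292) can be sketched for any cache made of fixed-size records. The record size of 24 bytes and the function name are assumptions made for this illustration, not values quoted from the commit.

```python
RECORD_SIZE = 24  # assumed fixed record size, for illustration only


def drop_partial_record(raw, record_size=RECORD_SIZE):
    """Return `raw` cut down to a whole number of records.

    A trailing fragment shorter than one record is discarded rather than
    padded, so it can never be mistaken for real data.
    """
    usable = len(raw) - (len(raw) % record_size)
    return raw[:usable]


# Usage: two full records followed by a 2-byte fragment.
cleaned = drop_partial_record(b"x" * 50)
```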
Valentin Gatien-Baron <vgatien-baron@janestreet.com> [Fri, 07 Feb 2020 15:55:26 -0500] rev 44291
tags: show how hg behaves if a tags cache entry is truncated
I'm seeing an error of this form in production on the order of once a
month. I'm not sure how it happens, but I suspect interrupting a pull
might result in half written cache entries.
Differential Revision: https://phab.mercurial-scm.org/D8094
Valentin Gatien-Baron <vgatien-baron@janestreet.com> [Fri, 07 Feb 2020 13:54:09 -0500] rev 44290
tags: add a debug command to display .hg/cache/hgtagsfnodes1
Differential Revision: https://phab.mercurial-scm.org/D8093
Valentin Gatien-Baron <valentin.gatienbaron@gmail.com> [Sat, 08 Feb 2020 10:22:47 -0500] rev 44289
purge: add -i flag to delete ignored files instead of untracked files
It's convenient for deleting build artifacts. Using --all instead
would delete other things too.
Differential Revision: https://phab.mercurial-scm.org/D8096
Matt Harbison <matt_harbison@yahoo.com> [Thu, 30 Jan 2020 19:50:43 -0500] rev 44288
pyoxidizer: use `legacy_windows_stdio` on Windows
The C executable sets this too, otherwise no output shows up (when paging?).
There is also `legacy_windows_fs_encoding`, but I'm not setting that for now
because the C executable doesn't either.
Differential Revision: https://phab.mercurial-scm.org/D8053
Augie Fackler <augie@google.com> [Wed, 05 Feb 2020 17:12:39 -0500] rev 44287
merge: use manifestdict.walk() instead of manifestdict.matches()
As with other patches in this series, this avoids making a
potentially-expensive copy of a manifest.
Differential Revision: https://phab.mercurial-scm.org/D8084
Augie Fackler <augie@google.com> [Wed, 05 Feb 2020 16:58:50 -0500] rev 44286
manifest: rewrite filesnotin to not make superfluous manifest copies
This also skips using diff() when all we care about is the filenames. I'm
expecting the built in set logic to be plenty fast. For really large manifests
with a matcher in play this should copy substantially less data around.
Differential Revision: https://phab.mercurial-scm.org/D8082
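The set-based approach mentioned in rev 44286 can be sketched with plain collections of filenames. This is a simplified stand-in: the real method works on manifest objects, and the `match` callable here is a hypothetical predicate rather than a Mercurial matcher.

```python
def filesnotin(m1_files, m2_files, match=None):
    """Return the set of names in `m1_files` that are absent from `m2_files`.

    Only filenames are compared; no manifest copy and no diff is built.
    `match`, when given, is a predicate restricting which names are considered.
    """
    other = set(m2_files)
    return {
        f for f in m1_files
        if f not in other and (match is None or match(f))
    }


# Usage
missing = filesnotin(["a", "b", "c"], ["b"])
missing_filtered = filesnotin(["a", "b", "c"], ["b"], match=lambda f: f != "c")
```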
Pulkit Goyal <7895pulkit@gmail.com> [Sat, 08 Feb 2020 03:13:45 +0530] rev 44285
merge with stable
Augie Fackler <augie@google.com> [Thu, 06 Feb 2020 16:55:39 -0500] rev 44284
archival: use walk() instead of matches() on manifest
All we care about is the filepaths, so this avoids a pointless copy of the
manifest that we only used to extract matching filenames.
Differential Revision: https://phab.mercurial-scm.org/D8090
Raphaël Gomès <rgomes@octobus.net> [Fri, 24 Jan 2020 11:10:07 +0100] rev 44283
rust-dirs-multiset: improve temporary error message
While we wait on a future patch that could verify that the paths passed to
`DirsMultiset` have been audited, we still need to handle this error.
This patch makes it easier to bubble up and makes the error clearer.
Also, this patch introduces the `subslice_index` function that could be useful
for other - albeit niche - purposes.
Differential Revision: https://phab.mercurial-scm.org/D7921
Matt Harbison <matt_harbison@yahoo.com> [Wed, 22 Jan 2020 12:11:35 -0500] rev 44282
exchange: check the `ui.clonebundleprefers` form while processing (issue6257)
Otherwise the clone command will emit a long stacktrace if there is no `=`
character.
Differential Revision: https://phab.mercurial-scm.org/D7969
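Validating the `KEY=VALUE` form while processing (rev 44282) might look like the sketch below. The function name and the use of `ValueError` are assumptions for illustration; the commit message does not show which error Mercurial raises.

```python
def parse_prefers(values):
    """Split each `KEY=VALUE` preference, rejecting entries without `=`.

    Raising a clear error here avoids a confusing failure later when the
    value is unpacked. (Hypothetical helper.)
    """
    prefers = []
    for value in values:
        if "=" not in value:
            raise ValueError("invalid clonebundleprefers item: %s" % value)
        key, val = value.split("=", 1)
        prefers.append((key, val))
    return prefers


# Usage
parsed = parse_prefers(["VERSION=v2", "COMPRESSION=gzip"])
```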
Pierre-Yves David <pierre-yves.david@octobus.net> [Fri, 13 Dec 2019 16:49:05 +0100] rev 44281
copies: add a new test dedicated to testing chain of changeset with merge
The copies tests we currently have usually focus on simple cases that do not dive
too much into longer chains involving merges. This new test file focuses on
extensive testing of these cases to validate their behavior and make sure the
various copies algorithms have the same behavior.
And… actually these tests are currently broken for the changeset-centric
algorithm since 99ebde4fec99, but it went undetected because these cases were
not tested.
Differential Revision: https://phab.mercurial-scm.org/D8078
Joerg Sonnenberger <joerg@bec.de> [Wed, 18 Sep 2019 06:07:09 +0200] rev 44280
hgext: initial version of fastexport extension
Differential Revision: https://phab.mercurial-scm.org/D7733
Julien Cristau <jcristau@mozilla.com> [Fri, 07 Feb 2020 15:55:21 +0100] rev 44279
hghave: cache the result of gethgversion
hghave --test-features calls it 90 times, each one calling hg --version
which takes a tenth of a second on my workstation, adding up to about
10s win on test-hghave.t.
Fixes https://bugs.debian.org/939756
Differential Revision: https://phab.mercurial-scm.org/D8092
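Caching the version lookup (rev 44279) amounts to memoizing a function that would otherwise spawn a process on every call. The sketch below uses `functools.lru_cache` and a stand-in for the `hg --version` call; both the stand-in and the returned version tuple are assumptions, and the real change may use a different caching mechanism.

```python
import functools

calls = []


def run_hg_version():
    """Stand-in for spawning `hg --version`; records each invocation."""
    calls.append(1)
    return (5, 3)  # placeholder version tuple


@functools.lru_cache(maxsize=None)
def gethgversion():
    """Compute the version once and reuse the cached result afterwards."""
    return run_hg_version()


# Usage: 90 lookups trigger a single underlying call.
for _ in range(90):
    version = gethgversion()
```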
Augie Fackler <augie@google.com> [Mon, 03 Feb 2020 11:56:02 -0500] rev 44278
resourceutil: blacken
Martin von Zweigbergk <martinvonz@google.com> [Fri, 24 Jan 2020 14:11:43 -0800] rev 44277
clean: delete obsolete unlinking of .hg/graftstate
The responsibility for clearing it is now in
`cmdutil.clearunfinished()`, so we shouldn't have to unlink it in
`hg.clean()`.
Differential Revision: https://phab.mercurial-scm.org/D7992
Martin von Zweigbergk <martinvonz@google.com> [Tue, 04 Feb 2020 10:16:30 -0800] rev 44276
copies: avoid filtering by short-circuiting dirstate-only copies earlier
The call to `y.ancestor(x)` triggered repo filtering, which we'd like
to avoid in the simple `hg status --copies` case.
Differential Revision: https://phab.mercurial-scm.org/D8071
Martin von Zweigbergk <martinvonz@google.com> [Tue, 04 Feb 2020 10:14:44 -0800] rev 44275
tests: add test showing that repo filter is calculated for `hg st --copies`
Differential Revision: https://phab.mercurial-scm.org/D8070
Matt Harbison <matt_harbison@yahoo.com> [Tue, 21 Jan 2020 11:40:15 -0500] rev 44274
lfs: enable workers by default
With the stall issue seemingly fixed, there's no reason not to use workers. The
setting is left for now to keep the test output deterministic, and in case other
issues come up. If none do, this can be converted to a developer setting for
usage with the tests.
Differential Revision: https://phab.mercurial-scm.org/D7963
Matt Harbison <matt_harbison@yahoo.com> [Tue, 21 Jan 2020 11:32:33 -0500] rev 44273
lfs: fix the stall and corruption issue when concurrently uploading blobs
We've avoided the issue up to this point by gating worker usage with an
experimental config. See 10e62d5efa73, and the thread linked there for some of
the initial diagnosis, but essentially some data was being read from the blob
before an error occurred and `keepalive` retried, but didn't rewind the file
pointer. So the leading data was lost from the blob on the server, and the
connection stalled, trying to send more data than available.
In trying to recreate this, I was unable to do so uploading from Windows to
CentOS 7. But it reproduced every time going from CentOS 7 to another CentOS 7
over https.
I found recent fixes in the Facebook repo to address this[1][2]. The commit
message for the first is:
The KeepAlive HTTP implementation is bugged in it's retry logic, it supports
reading from a file pointer, but doesn't support rewinding of the seek cursor
when it performs a retry. So it can happen that an upload fails for whatever
reason and will then 'hang' on the retry event.
The sequence of events that get triggered are:
- Upload file A, goes OK. Keep-Alive caches connection.
- Upload file B, fails due to (for example) failing Keep-Alive, but LFS file
pointer has been consumed for the upload and fd has been closed.
- Retry for file B starts, sets the Content-Length properly to the expected
file size, but since file pointer has been consumed no data will be uploaded,
causing the server to wait for the uploaded data until either client or
server reaches a timeout, making it seem as our mercurial process hangs.
This is just a stop-gap measure to prevent this behavior from blocking Mercurial
(LFS has retry logic). A proper solutions need to be build on top of this
stop-gap measure: for upload from file pointers, we should support fseek() on
the interface. Since we expect to consume the whole file always anyways, this
should be safe. This way we can seek back to the beginning on a retry.
I ported those two patches, and it works. But I see that `url._sendfile()` does
a rewind on `httpsendfile` objects[3], so maybe it's better to keep this all in
one place and avoid a second seek. We may still want the first Facebook patch
as extra protection for this problem in general. The other two uses of
`httpsendfile` are in the wire protocol to upload bundles, and to upload
largefiles. Neither of these appears to use a worker, and I'm not sure why
workers seem to trigger this, or if this could have happened without a worker.
Since `httpsendfile` already has a `close()` method, that is dropped. That
class also explicitly says there's no `__len__` attribute, so that is removed
too. The override for `read()` is necessary to avoid the progressbar usage per
file.
[1] https://github.com/facebookexperimental/eden/commit/c350d6536d90c044c837abdd3675185644481469
[2] https://github.com/facebookexperimental/eden/commit/77f0d3fd0415e81b63e317e457af9c55c46103ee
[3] https://www.mercurial-scm.org/repo/hg/file/5.2.2/mercurial/url.py#l176
Differential Revision: https://phab.mercurial-scm.org/D7962
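The rewind-on-retry idea from the lfs change above can be illustrated with a minimal sketch. It is not the actual `keepalive` or `httpsendfile` code: `send_with_retry` and `FlakySender` are hypothetical names, and the "upload" is just a callable that reads from a file object.

```python
import io


def send_with_retry(send, body, attempts=2):
    """Call send(body), seeking body back to offset 0 before every attempt.

    `send` is a hypothetical callable standing in for the HTTP layer; it may
    consume part of `body` before raising IOError.
    """
    last_error = None
    for _ in range(attempts):
        body.seek(0)  # rewind so a retry does not resume mid-file
        try:
            return send(body)
        except IOError as err:
            last_error = err
    raise last_error


class FlakySender(object):
    """Hypothetical sender that fails once after a partial read."""

    def __init__(self):
        self.calls = 0

    def __call__(self, body):
        self.calls += 1
        if self.calls == 1:
            body.read(4)  # part of the blob is consumed, then the link drops
            raise IOError("connection reset")
        return body.read()


blob = io.BytesIO(b"0123456789")
received = send_with_retry(FlakySender(), blob)
```

Without the `seek(0)`, the second attempt would start at offset 4 and deliver only the tail of the blob, which matches the "leading data was lost" symptom described in the message.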
Matt Harbison <matt_harbison@yahoo.com> [Tue, 21 Jan 2020 10:34:15 -0500] rev 44272
lfs: add a method to the local blobstore to convert OIDs to file paths
This is less ugly than passing an open callback to the `httpsendfile`
constructor.
Differential Revision: https://phab.mercurial-scm.org/D7961
Martin von Zweigbergk <martinvonz@google.com> [Wed, 15 Jan 2020 14:47:38 -0800] rev 44271
merge: introduce a revert_to() for that use-case
In the same vein as the previous patch.
Differential Revision: https://phab.mercurial-scm.org/D7901
Martin von Zweigbergk <martinvonz@google.com> [Wed, 15 Jan 2020 15:30:25 -0800] rev 44270
merge: introduce a clean_update() for that use-case
I find it hard to understand what value to pass for all the arguments
to `merge.update()`. I would like to introduce functions that are more
specific to each use-case. We already have `graft()`. This patch
introduces a `clean_update()` and uses it in some places to show that
it works.
Differential Revision: https://phab.mercurial-scm.org/D7902
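The motivation for use-case-specific wrappers can be shown with a small sketch. The `update()` below is a simplified stand-in, not the real `merge.update()` signature, and the flag values chosen in `clean_update()` are assumptions made for illustration.

```python
def update(repo, node, branchmerge, force, wc=None):
    """Simplified stand-in for a general-purpose update entry point.

    Not Mercurial's real signature; it only records how it was called.
    """
    return {
        "repo": repo,
        "node": node,
        "branchmerge": branchmerge,
        "force": force,
        "wc": wc,
    }


def clean_update(repo, node, wc=None):
    """Use-case wrapper: callers no longer pick the flag combination."""
    # Assumed flag values for a "clean" update in this sketch.
    return update(repo, node, branchmerge=False, force=True, wc=wc)


call = clean_update("repo", "abc123")
```

The caller states the intent (`clean_update`) and the wrapper owns the argument combination, which is the readability gain the message describes.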
Augie Fackler <augie@google.com> [Wed, 05 Feb 2020 16:16:15 -0500] rev 44269
manifest: fix _very_ subtle bug with exact matchers passed to walk()
Prior to this fix, manifestdict.walk() with an exact matcher would blindly
list the files in the matcher, even if they weren't in the manifest. This was
exposed by my next patch where I rewrite filesnotin() to use walk() instead of
matches().
Differential Revision: https://phab.mercurial-scm.org/D8081
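A minimal sketch of the behavior described above, using a plain dict as a stand-in for a manifest and a list as a stand-in for an exact matcher's file list. `walk_buggy` and `walk_fixed` are hypothetical names, not the real `manifestdict.walk()` code.

```python
def walk_buggy(manifest, exact_files):
    """Pre-fix behavior as described: list the matcher's files blindly."""
    for path in sorted(exact_files):
        yield path


def walk_fixed(manifest, exact_files):
    """Only yield matcher files that are actually present in the manifest."""
    for path in sorted(exact_files):
        if path in manifest:
            yield path


manifest = {"a.txt": "node1", "b.txt": "node2"}
exact = ["b.txt", "missing.txt"]

buggy = list(walk_buggy(manifest, exact))
fixed = list(walk_fixed(manifest, exact))
```

Here `buggy` still reports `missing.txt`, while `fixed` reports only `b.txt`.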
Raphaël Gomès <rgomes@octobus.net> [Tue, 14 Jan 2020 17:08:45 +0100] rev 44268
rust-utils: add `Escaped` trait
This will be used as a general interface for displaying things to the user.
The upcoming `IncludeMatcher` will use it to store its patterns in a
user-displayable string.
Differential Revision: https://phab.mercurial-scm.org/D7870
Raphaël Gomès <rgomes@octobus.net> [Tue, 14 Jan 2020 17:04:32 +0100] rev 44267
rust-dirs-multiset: add `DirsChildrenMultiset`
In a future patch, this structure will be needed to store information needed by
the (also upcoming) `IgnoreMatcher`.
Differential Revision: https://phab.mercurial-scm.org/D7869
Raphaël Gomès <rgomes@octobus.net> [Tue, 14 Jan 2020 16:50:35 +0100] rev 44266
rust-hg-path: add useful methods to `HgPath`
This changeset introduces the use of the `pretty_assertions` crate for
easier-to-read test output.
Differential Revision: https://phab.mercurial-scm.org/D7867
Raphaël Gomès <rgomes@octobus.net> [Wed, 05 Feb 2020 17:05:37 +0100] rev 44265
rust-pathauditor: add Rust implementation of the `pathauditor`
It does not offer the same flexibility as the Python implementation, but
should check incoming paths just as well.
Differential Revision: https://phab.mercurial-scm.org/D7866
Pulkit Goyal <7895pulkit@gmail.com> [Wed, 22 Jan 2020 03:17:06 +0530] rev 44264
py3: catch AttributeError too with ImportError
Looks like py3 raises AttributeError instead of ImportError. This was caught on
Windows.
Differential Revision: https://phab.mercurial-scm.org/D7965
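The general pattern of treating both exceptions as "feature unavailable" can be sketched as follows. The message does not say which import is affected, so `optional_attr` is a hypothetical helper and the module names are only examples.

```python
import importlib


def optional_attr(module_name, attr_name):
    """Return module.attr, or None if the module or attribute is unavailable."""
    try:
        module = importlib.import_module(module_name)
        return getattr(module, attr_name)
    except (ImportError, AttributeError):
        # Either failure mode means the optional feature cannot be used.
        return None


present = optional_attr("os", "getcwd")
no_module = optional_attr("no_such_module_for_this_sketch", "anything")
no_attr = optional_attr("os", "no_such_attribute_for_this_sketch")
```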
Augie Fackler <augie@google.com> [Wed, 05 Feb 2020 15:15:18 -0500] rev 44263
context: use manifest.walk() instead of manifest.match() to get file list
The former doesn't create a whole extra manifest in order to produce the
matching file list, which is all we actually cared about here. Sigh.
Differential Revision: https://phab.mercurial-scm.org/D8080
Augie Fackler <augie@google.com> [Wed, 05 Feb 2020 15:01:22 -0500] rev 44262
manifest: remove `.new()` from the interface
Nothing used it.
Differential Revision: https://phab.mercurial-scm.org/D8079
Kyle Lippincott <spectral@google.com> [Wed, 29 Jan 2020 13:39:50 -0800] rev 44261
chg: force-set LC_CTYPE on server start to actual value from the environment
Python 3.7+ will "coerce" the LC_CTYPE variable in many instances, and this can
cause issues with chg being able to start up. D7550 attempted to fix this, but a
combination of a misreading of the way that python3.7 does the coercion and an
untested state (LC_CTYPE being set to an invalid value) meant that this was
still not quite working.
This change will cause differences between chg and hg: hg will have the LC_CTYPE
environment variable coerced, while chg will not. This is unlikely to cause any
detectable behavior differences in what Mercurial itself outputs, but it does
have two known effects:
- When using hg, the coerced LC_CTYPE will be passed to subprocesses, even
non-python ones. Using chg will remove the coercion, and this will not
happen. This is arguably more correct behavior on chg's part.
- On macOS, if you set your region to Brazil but your language to English,
this isn't representable in locale strings, so macOS sets LC_CTYPE=UTF-8. If
this value is passed along when ssh'ing to a non-macOS machine, some
functions (such as locale.setlocale()) may raise an exception due to an
unsupported locale setting. This is most easily encountered when doing an
interactive commit/split/etc. when using ui.interface=curses.
Differential Revision: https://phab.mercurial-scm.org/D8039
Pierre-Yves David <pierre-yves.david@octobus.net> [Mon, 03 Feb 2020 09:00:05 +0100] rev 44260
perf: fix list formatting in perfindex documentation
Differential Revision: https://phab.mercurial-scm.org/D8067
Pierre-Yves David <pierre-yves.david@octobus.net> [Sat, 01 Feb 2020 09:14:36 +0100] rev 44259
test: simplify test-amend.t to avoid race condition
Instead of relying on sleep, we can simply have the editor do the file change.
This removes the reliance on "sleep" and avoids the test failing on heavily
loaded machines.
To test this, I reverted the code change in 5558e3437872 and the test started
failing again.
Differential Revision: https://phab.mercurial-scm.org/D8065
Pierre-Yves David <pierre-yves.david@octobus.net> [Fri, 13 Dec 2019 11:32:36 +0100] rev 44258
test: document test-copy-move-merge.t
Differential Revision: https://phab.mercurial-scm.org/D8077
Augie Fackler <augie@google.com> [Mon, 03 Feb 2020 22:16:36 -0500] rev 44257
manifest: remove optional default= argument on flags(path)
It had only one caller inside manifest.py, and treemanifest was
actually incorrectly implemented. treemanifest is still missing the
fastdelta() method from the interface (and so doesn't yet conform),
but this is at least progress.
Differential Revision: https://phab.mercurial-scm.org/D8069
Kyle Lippincott <spectral@google.com> [Thu, 06 Feb 2020 15:46:55 -0800] rev 44256
py3: fully fix bundlepart.__repr__ to return str not bytes
My previous fix did not fully fix the issue: it would attempt to use
%-formatting to combine two strs into a bytes, which won't work. Let's just
switch the entire function to operating in strs. This can cause a small output
difference that will likely not be noticed since no one noticed that the method
wasn't working at all before: if `id` or `type` are not-None, they'll be shown
as `b'val'` instead of `val`. Since this is a debugging aid and these strings
shouldn't be shown to the user, slightly rough output is most likely fine and
it's likely not worthwhile to add the necessary conditionals to marginally
improve it.
Differential Revision: https://phab.mercurial-scm.org/D8091
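The str-versus-bytes behavior described above can be reproduced on Python 3 with a small sketch. `Part` is a hypothetical stand-in for the real class, and the format string is an assumption rather than the one used in Mercurial.

```python
class Part(object):
    """Hypothetical stand-in with bytes-valued `id` and `type` attributes."""

    def __init__(self, part_id, part_type):
        self.id = part_id
        self.type = part_type

    def __repr__(self):
        # Work entirely in str so that __repr__ returns str on Python 3.
        cls = "%s.%s" % (self.__class__.__module__, self.__class__.__name__)
        return "<%s object at %x; id: %s; type: %s>" % (
            cls,
            id(self),
            self.id,
            self.type,
        )


text = repr(Part(b"7", b"changegroup"))

# Formatting a str into a bytes template is rejected on Python 3.
try:
    b"<%s>" % ("str-value",)
    mixed_ok = True
except TypeError:
    mixed_ok = False
```

On Python 3 the bytes values show up as `b'7'` and `b'changegroup'` in `text`, which is the slightly rough output the message accepts for a debugging aid.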
Pierre-Yves David <pierre-yves.david@octobus.net> [Sun, 17 Nov 2019 01:18:14 +0100] rev 44255
heptapod-ci: add a job to test the rust version of Mercurial
The rust version of Mercurial is not currently tested by anything else, so it
is quite important that developers run it.
Differential Revision: https://phab.mercurial-scm.org/D8017
Pierre-Yves David <pierre-yves.david@octobus.net> [Sat, 16 Nov 2019 12:26:54 +0100] rev 44254
heptapod-ci: run the --pure test too
These are rarely run by individual developers because they are slow.
However it is important that they stay happy.
Differential Revision: https://phab.mercurial-scm.org/D8016
Pierre-Yves David <pierre-yves.david@octobus.net> [Sat, 25 Jan 2020 14:56:36 +0100] rev 44253
heptapod-ci: run the normal test suite
The usual tests should be run too. We skip the "tests-check*.t" ones because
they are already covered by another CI step.
Differential Revision: https://phab.mercurial-scm.org/D8015
Pierre-Yves David <pierre-yves.david@octobus.net> [Mon, 18 Nov 2019 09:38:40 +0100] rev 44252
heptapod-ci: also run the dedicated rust test for the rust code
The Rust code has various standard Rust tests that are fast to run. So let's run them.
Differential Revision: https://phab.mercurial-scm.org/D8014
Pierre-Yves David <pierre-yves.david@octobus.net> [Sat, 16 Nov 2019 12:25:53 +0100] rev 44251
heptapod-ci: run test with python3 too
Python3 is the future^W present, it is important to run tests with it too.
Differential Revision: https://phab.mercurial-scm.org/D8013
Pierre-Yves David <pierre-yves.david@octobus.net> [Fri, 24 Jan 2020 23:22:29 +0100] rev 44250
heptapod-ci: colorize output
The run results are nicer to read with color.
Differential Revision: https://phab.mercurial-scm.org/D8012
Pierre-Yves David <pierre-yves.david@octobus.net> [Sat, 25 Jan 2020 17:57:40 +0100] rev 44249
heptapod-ci: add a basic file to be able to run tests with heptapod
Having this yaml file somewhere in the main mercurial repository makes it
trivial for contributors using heptapod to run CI on their in-progress work.
There are a lot of different combinations (python2/python3 pure/cext/rust/pypy)
to be tested and making sure all of them are covered manually is cumbersome.
Automatic CI running on drafts really helps in that matter. We start small, but
later changesets will add more steps testing more of the variants.
The series is targeted at stable to make it available to the widest range of
contributions possible.
The definition of the docker files used for this are available here:
https://dev.heptapod.net/octobus/ci-dockerfiles
Differential Revision: https://phab.mercurial-scm.org/D8011
Jan Alexander Steffens (heftig) <jan.steffens@gmail.com> [Tue, 04 Feb 2020 22:07:36 +0100] rev 44248
worker: manually buffer reads from pickle stream
My previous fix (D8051, cb52e619c99e, which added Python's built-in buffering
to the pickle stream) has the problem that the selector will ignore the buffer.
When multiple pickled objects are read from the pipe into the buffer at once,
only one object will be loaded.
This can repeat until the buffer is full and delays the processing of completed
items until the worker exits, at which point the pipe is always considered
readable and all remaining items are processed.
This changeset reverts D8051, removing the buffer again. Instead, on Python 3
only, we use a wrapper to modify the "read" provided to the Unpickler to behave
more like a buffered read. We never read more bytes from the pipe than the
Unpickler requests, so the selector behaves as expected.
Also add a test case for the "pickle data was truncated" issue.
https://phab.mercurial-scm.org/D8051#119193
Differential Revision: https://phab.mercurial-scm.org/D8076
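The wrapper idea described above (loop on short reads, but never ask the pipe for more than the Unpickler requested) can be sketched as follows. This is an illustration under assumptions, not the code from the changeset: `BlockingReader` and `ShortReads` are hypothetical names, and `ShortReads` simulates a pipe that returns at most three bytes per read.

```python
import io
import pickle


class BlockingReader(object):
    """Wrap an unbuffered stream so read(n) returns n bytes unless EOF is hit."""

    def __init__(self, raw):
        self._raw = raw

    def readline(self):
        return self._raw.readline()

    def read(self, size=-1):
        if size < 0:
            return self._raw.readall()
        chunks = []
        remaining = size
        while remaining > 0:
            # Never request more than the caller still needs.
            chunk = self._raw.read(remaining)
            if not chunk:  # EOF
                break
            chunks.append(chunk)
            remaining -= len(chunk)
        return b"".join(chunks)


class ShortReads(io.RawIOBase):
    """Hypothetical raw stream that hands out at most 3 bytes per read."""

    def __init__(self, data):
        super(ShortReads, self).__init__()
        self._buf = io.BytesIO(data)

    def readable(self):
        return True

    def readinto(self, b):
        chunk = self._buf.read(min(3, len(b)))
        b[: len(chunk)] = chunk
        return len(chunk)


objs = [{"a": 1, "b": 2}, list(range(50))]
data = b"".join(pickle.dumps(o) for o in objs)

reader = BlockingReader(ShortReads(data))
loaded = [pickle.load(reader), pickle.load(reader)]
```

Because the wrapper holds no read-ahead buffer of its own, nothing is consumed from the pipe beyond what was requested, so a selector watching the underlying descriptor still sees every pending object.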
Kyle Lippincott <spectral@google.com> [Thu, 02 Jan 2020 11:04:18 -0800] rev 44247
py3: __repr__ needs to return str, not bytes
Differential Revision: https://phab.mercurial-scm.org/D8089
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 04 Feb 2020 12:07:37 +0100] rev 44246
config: also respect HGRCSKIPREPO in the zeroconf extension
Differential Revision: https://phab.mercurial-scm.org/D8075
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 04 Feb 2020 12:07:42 +0100] rev 44245
config: also respect HGRCSKIPREPO in hgwebdir_mod
Differential Revision: https://phab.mercurial-scm.org/D8074
Pierre-Yves David <pierre-yves.david@octobus.net> [Mon, 03 Feb 2020 20:41:11 +0100] rev 44244
config: also respect HGRCSKIPREPO in `dispatch._getlocal`
For some reason, we are also reading the local config in that function.
Differential Revision: https://phab.mercurial-scm.org/D8073
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 04 Feb 2020 12:31:19 +0100] rev 44243
config: add a function in `rcutil` to abstract HGRCSKIPREPO
We will need to respect this environment variable in more places.
Differential Revision: https://phab.mercurial-scm.org/D8072
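A minimal sketch of such a helper, assuming that the mere presence of `HGRCSKIPREPO` in the environment is what disables reading the repository config. The function name and the optional `environ` parameter are illustrative choices, not taken from the changeset.

```python
import os


def use_repo_hgrc(environ=None):
    """Return True unless HGRCSKIPREPO is present in the environment."""
    if environ is None:
        environ = os.environ
    # Assumption: presence alone matters, regardless of the value.
    return "HGRCSKIPREPO" not in environ


default_case = use_repo_hgrc({})
skip_case = use_repo_hgrc({"HGRCSKIPREPO": "1"})
```

Centralizing the check means each caller asks one question instead of repeating the environment lookup.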
Matt Harbison <matt_harbison@yahoo.com> [Mon, 03 Feb 2020 20:12:47 -0500] rev 44242
packaging: make the path to Win32 requirements absolute when building WiX
Otherwise this broke automation when not launched from `contrib/packaging`.
Differential Revision: https://phab.mercurial-scm.org/D8068
Augie Fackler <augie@google.com> [Mon, 03 Feb 2020 11:56:02 -0500] rev 44241
resourceutil: blacken
Augie Fackler <augie@google.com> [Mon, 03 Feb 2020 11:51:52 -0500] rev 44240
merge with stable
Martin von Zweigbergk <martinvonz@google.com> [Fri, 31 Jan 2020 10:53:50 -0800] rev 44239
rebase: abort if the user tries to rebase the working copy
I think it's more correct to treat `hg rebase -r 'wdir()' -d foo`
as `hg co -m foo`, but I'm instead making it error out. That's partly
because it's probably what the user wanted (in the case I heard from a
user, they had done `hg rebase -s f` where `f` resolved to `wdir()`)
and partly because I don't want to think about more complicated cases
where the user specifies the working copy together with other commits.
Differential Revision: https://phab.mercurial-scm.org/D8057
Martin von Zweigbergk <martinvonz@google.com> [Fri, 31 Jan 2020 10:41:50 -0800] rev 44238
tests: add tests for rebasing wdir() revision
Differential Revision: https://phab.mercurial-scm.org/D8056
Martin von Zweigbergk <martinvonz@google.com> [Wed, 22 Jan 2020 13:29:26 -0800] rev 44237
merge: when rename was made on both sides, use ancestor as merge base
When both sides of a merge have renamed a file to the same place, we
would treat that as a "both created" action in merge.py. That means
that we'd use an empty diffbase. It seems better to use the copy
source as diffbase. That can be done by simply dropping code that
prevented us from doing that. I think I did it that way in
57203e0210f8 (copies: calculate mergecopies() based on pathcopies(),
2019-04-11) only to preserve the existing behavior. I also suspect it
was just an accident that it behaved that way before that commit.
Note that until fa9ad1da2e77 (merge: start using the per-side copy
dicts, 2020-01-23), it was non-deterministic (depending on iteration
order of the `allsources` set in `copies._fullcopytracing()`) which
source was used in the affected test case in test-rename-merge1.t. We
could easily have fixed that by sorting them, but now we can instead
detect the case (the TODO added in the previous patch).
Differential Revision: https://phab.mercurial-scm.org/D7974
Martin von Zweigbergk <martinvonz@google.com> [Fri, 31 Jan 2020 08:47:32 -0800] rev 44236
absorb: graduate -i flag from experimental
The interactive mode seems to work well. I have previously thought
that `-i` should be what `-e` does, but the current behavior matches
what other `-i` flags do (select a subset of the hunks), so I think
that is what we want.
Differential Revision: https://phab.mercurial-scm.org/D8055
Yuya Nishihara <yuya@tcha.org> [Sat, 25 Jan 2020 17:30:24 +0900] rev 44235
rust-cpython: remove PySharedRefCell and its companion structs
Also updates py_shared_iterator!() documentation accordingly.
Yuya Nishihara <yuya@tcha.org> [Sat, 25 Jan 2020 17:26:23 +0900] rev 44234
rust-cpython: switch to upstreamed version of PySharedRefCell
Our PyLeaked is identical to cpython::UnsafePyLeaked. I've renamed it because
it provides mostly unsafe functions.
Yuya Nishihara <yuya@tcha.org> [Sat, 25 Jan 2020 17:21:06 +0900] rev 44233
rust-cpython: rename inner_shared() to inner()
The "shared" accessor will be automatically generated, and will have the
same name as the data itself.
Yuya Nishihara <yuya@tcha.org> [Fri, 31 Jan 2020 00:08:30 +0900] rev 44232
rust-cpython: use PyList.insert() instead of .insert_item()
Silences the deprecated warning.
https://github.com/dgrunwald/rust-cpython/commit/e8cbe864841714c5555db8c90e057bd11e360c7f
Yuya Nishihara <yuya@tcha.org> [Fri, 31 Jan 2020 00:01:29 +0900] rev 44231
rust-cpython: bump cpython to 0.4 to switch to upstreamed PySharedRef
Yuya Nishihara <yuya@tcha.org> [Thu, 30 Jan 2020 23:57:19 +0900] rev 44230
rust: update dependencies
For no particular reason, but just because I'll bump the rust-cpython version.
Augie Fackler <raf@durin42.com> [Mon, 03 Feb 2020 11:07:34 -0500] rev 44229
Added signature for changeset 7f5410dfc8a6
Augie Fackler <raf@durin42.com> [Mon, 03 Feb 2020 11:07:33 -0500] rev 44228
Added tag 5.3 for changeset 7f5410dfc8a6
Raphaël Gomès <rgomes@octobus.net> [Wed, 29 Jan 2020 11:11:18 +0100] rev 44227
rust-dirstatemap: add missing @propertycache
While investigating a regression on `hg update` performance introduced by the
Rust `dirstatemap`, two missing `@propertycache` were identified when comparing
against the Python implementation. This adds back the first one, that has
no observable impact on behavior. The second one (`nonnormalset`) is going to
be more involved, as the caching has to be done from the Rust side of things.
Differential Revision: https://phab.mercurial-scm.org/D8047
Jan Alexander Steffens (heftig) <jan.steffens@gmail.com> [Thu, 30 Jan 2020 19:16:12 +0100] rev 44226
worker: Use buffered input from the pickle stream
On Python 3, "pickle.load" will raise an exception ("_pickle.UnpicklingError:
pickle data was truncated") when it gets a short read, i.e. it receives fewer
bytes than it requested.
On our build machine, Mercurial seems to frequently hit this problem while
updating a mozilla-central clone iff it gets scheduled in batch mode. It is easy
to trigger with:
    # wipe the workdir
    rm -rf *
    hg update null
    chrt -b 0 hg update default
I've also written the following program, which demonstrates the core problem:
    from __future__ import print_function
    import io
    import os
    import pickle
    import time
    obj = {"a": 1, "b": 2}
    obj_data = pickle.dumps(obj)
    assert len(obj_data) > 10
    rfd, wfd = os.pipe()
    pid = os.fork()
    if pid == 0:
        os.close(rfd)
        for _ in range(4):
            time.sleep(0.5)
            print("First write")
            os.write(wfd, obj_data[:10])
            time.sleep(0.5)
            print("Second write")
            os.write(wfd, obj_data[10:])
        os._exit(0)
    try:
        os.close(wfd)
        rfile = os.fdopen(rfd, "rb", 0)
        print("Reading")
        while True:
            try:
                obj_copy = pickle.load(rfile)
                assert obj == obj_copy
            except EOFError:
                break
        print("Success")
    finally:
        os.kill(pid, 15)
The program reliably fails with Python 3.8 and succeeds with Python 2.7.
Providing the unpickler with a buffered reader fixes the issue, so let
"os.fdopen" create one.
https://bugzilla.mozilla.org/show_bug.cgi?id=1604486
Differential Revision: https://phab.mercurial-scm.org/D8051
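The buffered-reader fix described above can be sketched with a thread as the writer instead of a forked child, which is an adaptation for brevity and not the setup used in the message. Opening the read end with default buffering gives an `io.BufferedReader`, whose `read(n)` keeps reading until `n` bytes arrive or EOF is reached.

```python
import os
import pickle
import threading
import time

obj = {"a": 1, "b": 2}
obj_data = pickle.dumps(obj)

rfd, wfd = os.pipe()


def writer():
    # Split one pickled object across two writes to provoke a short read.
    os.write(wfd, obj_data[:10])
    time.sleep(0.1)
    os.write(wfd, obj_data[10:])
    os.close(wfd)


thread = threading.Thread(target=writer)
thread.start()

# Default buffering (no third argument) yields a buffered reader.
with os.fdopen(rfd, "rb") as rfile:
    obj_copy = pickle.load(rfile)

thread.join()
```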