Aay Jay Chan <aayjaychan@itopia.com.hk> [Thu, 16 Jan 2020 00:30:08 +0800] rev 44088
rust-core: fix typo in comment
Differential Revision: https://phab.mercurial-scm.org/D7895
Gregory Szorc <gregory.szorc@gmail.com> [Tue, 14 Jan 2020 18:59:49 -0800] rev 44087
sha1dc: use buffer protocol when parsing arguments
Without this, functions won't accept bytearray, memoryview,
or other types that can be exposed as bytes to the C API.
The most resilient way to obtain a bytes-like object from
the C API is using the Py_buffer interface.
This commit converts use of s#/y# to s*/y* and uses
Py_buffer for accessing the underlying bytes array.
I checked how hashlib is implemented in CPython and the
implementation agrees with its use of the Py_buffer
interface as well as using BufferError in cases of bad
buffer types. Sadly, there's no good way to test for
ndim > 1 without writing our own C-backed Python type.
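As an illustration of the behaviour this commit aims for, the sketch below uses CPython's own `hashlib.sha1` (not the sha1dc extension itself) to show a hasher that accepts any object exposing the buffer protocol. `digest_of` is a hypothetical helper introduced only for this example.

```python
import hashlib


def digest_of(data):
    """Hash any bytes-like object (bytes, bytearray, memoryview, ...)."""
    hasher = hashlib.sha1()
    hasher.update(data)  # hashlib reads the data through the buffer protocol
    return hasher.hexdigest()


payload = b"mercurial"
digests = {
    digest_of(payload),
    digest_of(bytearray(payload)),
    digest_of(memoryview(payload)),
}
# All three inputs expose the same bytes, so the set collapses to one digest.
```

A `str` argument is still rejected with `TypeError`, since text does not expose its contents as bytes.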
Differential Revision: https://phab.mercurial-scm.org/D7879
Matt Harbison <matt_harbison@yahoo.com> [Tue, 14 Jan 2020 20:05:37 -0500] rev 44086
lfs: avoid quadratic performance in processing server responses
This is also adapted from the Facebook repo[1]. Unlike there, we were already
reading the download stream in chunks and immediately writing it to disk, so we
basically avoided the problem on download. There shouldn't be a lot of data to
read on upload, but it's better to get rid of this pattern.
[1] https://github.com/facebookexperimental/eden/commit/82df66ffe97e21f3ee73dfec093c87500fc1f6a7
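The quadratic pattern in question is presumably a bytes object grown with `+=` inside a read loop, which copies everything read so far on each iteration. A minimal sketch of the linear alternative follows; `read_all` and `chunk_size` are illustrative names, not the actual lfs code.

```python
import io


def read_all(stream, chunk_size=65536):
    """Read a stream to the end without repeatedly re-copying the data."""
    chunks = []
    while True:
        chunk = stream.read(chunk_size)
        if not chunk:
            break
        chunks.append(chunk)  # amortized O(1); earlier data is not copied
    return b"".join(chunks)  # one linear-time concatenation at the end


body = read_all(io.BytesIO(b"x" * 200000), chunk_size=4096)
```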
Differential Revision: https://phab.mercurial-scm.org/D7882
Matt Harbison <matt_harbison@yahoo.com> [Tue, 14 Jan 2020 19:42:24 -0500] rev 44085
lfs: check content length after downloading content
Adapted from the Facebook repo[1]. The intent is to distinguish between the
connection dying and getting served a corrupt blob.
The original message:
HTTP makes no provision to tell your client that you failed halfway through
producing your response and won't have the answer they're looking for. So, if a
LFS server fails while producing a response, then we'll report an OID mismatch.
We can do a little better and disambiguate between "the server sent us the
wrong blob" (very scary) and "the server crashed" (merely annoying) by looking
at the content length of the response we got back. If it's not what was
advertised, we can reasonably safely assume the server crashed.
[1] https://github.com/facebookexperimental/eden/commit/2a4a6fab4e882ed89b948bfc1e7d56d7c3c99dd2
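The disambiguation described above could be sketched roughly as follows. This is not the code from either repository: `classify_download` and its parameters are hypothetical, and the only assumption about LFS is that a blob's OID is its hex SHA-256.

```python
import hashlib


def classify_download(data, advertised_length, expected_oid):
    """Return 'ok', 'truncated', or 'corrupt' for a downloaded blob."""
    if advertised_length is not None and len(data) != advertised_length:
        # Byte count differs from what was advertised: the transfer most
        # likely died partway through.
        return "truncated"
    if hashlib.sha256(data).hexdigest() != expected_oid:
        # Right size, wrong content: the server sent the wrong blob.
        return "corrupt"
    return "ok"


blob = b"hello lfs"
oid = hashlib.sha256(blob).hexdigest()
```

Checking the length first means a dead connection is reported as such, and the scarier OID mismatch is reserved for responses that arrived complete.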
Differential Revision: https://phab.mercurial-scm.org/D7881
Matt Harbison <matt_harbison@yahoo.com> [Tue, 14 Jan 2020 18:02:20 -0500] rev 44084
lfs: rename a variable to clarify its use
This is the response object, not a request.
Differential Revision: https://phab.mercurial-scm.org/D7880
Gregory Szorc <gregory.szorc@gmail.com> [Tue, 14 Jan 2020 17:53:43 -0800] rev 44083
sha1dc: use proper string functions on Python 2/3
PyString_FromStringAndSize doesn't exist on Python 3: we need
to use PyUnicode_FromStringAndSize.
The extension now compiles without warnings on Python 2 and 3.
Differential Revision: https://phab.mercurial-scm.org/D7878
Gregory Szorc <gregory.szorc@gmail.com> [Tue, 14 Jan 2020 17:39:12 -0800] rev 44082
sha1dc: declare all variables at beginning of block
This is required to appease ancient C language standards, which
msvc 2008 still requires for Python 2.7 on Windows.
Differential Revision: https://phab.mercurial-scm.org/D7877
Gregory Szorc <gregory.szorc@gmail.com> [Tue, 14 Jan 2020 17:37:04 -0800] rev 44081
sha1dc: manually define integer types on msvc 2008
Python 2.7 on Windows builds with MSVC 2008, which
doesn't include stdint.h. So we need to check for the
compiler version and manually define missing types when it
is ancient.
Differential Revision: https://phab.mercurial-scm.org/D7876
Martin von Zweigbergk <martinvonz@google.com> [Tue, 14 Jan 2020 14:18:11 -0800] rev 44080
packaging: leverage os.path.relpath() in setup.py
`os.path.relpath()` has existed since Python 2.6, so we can safely use
it. This fixes a bug in the current code when the common prefix is "/"
(in which case `uplevel` would be one less than it should).
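For reference, a small example of the case mentioned above, where the two paths share nothing but the root. The paths are made up, and `posixpath` is used so the result is the same on every platform (`os.path.relpath()` behaves identically on POSIX systems).

```python
import posixpath

# Only "/" is common to both paths, so every component of the start
# directory has to be climbed out of before descending again.
rel_root_only = posixpath.relpath("/usr/lib/python/site-packages", "/opt/hg/bin")

# With a longer shared prefix, fewer ".." components are needed.
rel_shared = posixpath.relpath("/usr/lib/hg", "/usr/bin")
```

Here `rel_root_only` is `../../../usr/lib/python/site-packages` and `rel_shared` is `../lib/hg`.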
Differential Revision: https://phab.mercurial-scm.org/D7875
Raphaël Gomès <rgomes@octobus.net> [Tue, 14 Jan 2020 18:00:05 +0100] rev 44079
rust-utils: add util to find a slice in another slice
Differential Revision: https://phab.mercurial-scm.org/D7863
Raphaël Gomès <rgomes@octobus.net> [Tue, 14 Jan 2020 16:00:57 +0100] rev 44078
dirstate: move rust fast-path calling code to its own method
This logic is about to get bigger; moving it will make it easier to read and
keep it from polluting the main Python logic.
Differential Revision: https://phab.mercurial-scm.org/D7862
Matt Harbison <matt_harbison@yahoo.com> [Tue, 14 Jan 2020 00:52:53 -0500] rev 44077
lfs: add "bytes" as the unit to the upload/download progress bar
Facebook also passes `util.bytecount()` as a pretty formatter here, but our
progress bar doesn't support that.
Differential Revision: https://phab.mercurial-scm.org/D7872
Matt Harbison <matt_harbison@yahoo.com> [Tue, 14 Jan 2020 16:37:45 -0500] rev 44076
phabricator: post revisions in ascending topological order (issue6241)
The parent in phabricator ends up being the last revision posted, so sorting the
user input into ascending order should be enough to preserve the proper
relationships.
Differential Revision: https://phab.mercurial-scm.org/D7874
Matt Harbison <matt_harbison@yahoo.com> [Tue, 14 Jan 2020 16:29:03 -0500] rev 44075
doc: fix references to `revset.abstractsmartset`
Differential Revision: https://phab.mercurial-scm.org/D7873
Gregory Szorc <gregory.szorc@gmail.com> [Mon, 13 Jan 2020 20:09:32 -0800] rev 44074
fsmonitor: properly handle str ex.msg
ex.msg is always a str, since pywatchman uses str for exception messages.
This commit removes a b'' from a string compare to avoid types
mismatch and adds a coercion to bytes before stuffing the exception
message on our local exception type, which uses bytes for the message
elsewhere in this file.
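A rough sketch of the two changes, written for Python 3. The `Unavailable` class, the message strings, and the UTF-8 encoding are all assumptions made for illustration; the real code in fsmonitor differs.

```python
class Unavailable(Exception):
    """Hypothetical stand-in for a local exception type that stores bytes."""

    def __init__(self, msg):
        super().__init__(msg)
        self.msg = msg


def to_bytes(text):
    """Coerce a native str message to bytes (UTF-8 is an assumption here)."""
    return text.encode("utf-8")


def wrap_error(ex_msg):
    # Compare str with str: on Python 3 a str never equals a bytes object,
    # so a b'' literal here would make the branch unreachable.
    if ex_msg == "timed out":
        return Unavailable(b"timed out waiting for response")
    # Convert before storing, so the local exception always carries bytes.
    return Unavailable(to_bytes(ex_msg))
```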
Differential Revision: https://phab.mercurial-scm.org/D7855
Matt Harbison <matt_harbison@yahoo.com> [Mon, 23 Dec 2019 01:12:20 -0500] rev 44073
verify: allow the storage to signal when renames can be tested on `skipread`
This applies the new marker in the lfs handler to show it in action, and adds
the test mentioned at the beginning of the series to show that fulltext isn't
necessary in the LFS case.
The existing `skipread` isn't enough, because it is also set if an error occurs
reading the revlog data, or the data is censored. It could probably be cleared,
but then it technically violates the interface contract. That wouldn't matter
for the existing verify algorithm, but it isn't clear how that will change as
alternate storage support is added.
The flag is probably pretty revlog specific, given the comments in verify.py.
But there's already filelog specific stuff in there and I'm not sure what future
storage will bring, so I don't want to over-engineer this. Likewise, I'm not
sure that we want the verify method for each storage type to completely drive
the bus when it comes to detecting renames, so I don't want to go down the
rabbithole of having verifyintegrity() return metadata hints at this point.
Differential Revision: https://phab.mercurial-scm.org/D7713
Matt Harbison <matt_harbison@yahoo.com> [Sun, 22 Dec 2019 23:50:19 -0500] rev 44072
lfs: don't skip locally available blobs when verifying
The `skipflags` config was introduced in a2ab9ebcd85b, which specifically calls
out downloading and storing all blobs as potentially too expensive. But I don't
see any reason to skip blobs that are already available locally. Hashing the
blob is the only way to indirectly verify the rawdata content stored in the
revlog.
(The note in that commit about skipping renames is still correct, but the reason
given about needing fulltext isn't.)
Differential Revision: https://phab.mercurial-scm.org/D7712
Matt Harbison <matt_harbison@yahoo.com> [Fri, 20 Dec 2019 01:11:35 -0500] rev 44071
lfs: add a switch to `hg verify` to ignore the content of blobs
Trying to validate the fulltext of an external revision causes missing blobs to
be downloaded and cached. Since the downloads aren't batch prefetched[1] and
aren't compressed, this can be expensive both in terms of time and space.
I made this a tri-state instead of a simple bool because there's an existing
(undocumented) config to handle this, and it would be weird if `hg verify` were
to suddenly start ignoring that config but an `hg recover` initiated verify
honors it. Since this uses the same config setting, it too will skip
rename verification (which requires fulltext, but not for LFS).
[1] https://www.mercurial-scm.org/pipermail/mercurial-devel/2018-April/116118.html
Differential Revision: https://phab.mercurial-scm.org/D7708
Augie Fackler <augie@google.com> [Wed, 08 Jan 2020 14:37:54 -0500] rev 44070
revlog: run rustfmt nightly
I'm a little nervous about folding this back (might be nightly rustfmt
mismatches?) so I want someone to review this.
Differential Revision: https://phab.mercurial-scm.org/D7813
Augie Fackler <augie@google.com> [Wed, 08 Jan 2020 14:37:01 -0500] rev 44069
examples: specify rustfmt nightly using a $() construct
This is ugly, but it's how we have to configure rustfmt for now as we
require nightly rustfmt.
Differential Revision: https://phab.mercurial-scm.org/D7812
Gregory Szorc <gregory.szorc@gmail.com> [Sat, 07 Dec 2019 13:13:48 -0800] rev 44068
hg-core: rustfmt path.rs
The file as vendored does not conform to our source formatting
conventions. Let's reformat it so it does.
# skip-blame automated code reformatting
Differential Revision: https://phab.mercurial-scm.org/D7580
Gregory Szorc <gregory.szorc@gmail.com> [Sat, 07 Dec 2019 10:26:28 -0800] rev 44067
hg-core: vendor Facebook's path utils module
The added file was imported from
https://github.com/facebookexperimental/eden/blob/d1d8fb939a39aa331ae7f162c39cbcaa511d474b/eden/scm/lib/util/src/path.rs
without modifications. The file is not yet integrated into our project. This
will be done in subsequent commits.
Differential Revision: https://phab.mercurial-scm.org/D7573
Georges Racinet <georges.racinet@octobus.net> [Tue, 14 Jan 2020 12:04:12 +0100] rev 44066
revlog-native: introduced ABI version in capsule
Concerns that an inconsistency could arise between the actual contents
of the capsule in revlog.c and the Rust consumer have been raised after
the switch to the array of data and function pointers in f384d68d8ea8.
It has been suggested that the `version` from parsers.c could be used for
this. In this change, we introduce instead a separate ABI version number,
which should have the following advantages:
- no need to change the consuming Rust code for changes that have nothing
to do with the contents of the capsule
- the version number in parsers.c is not explicitly flagged as ABI. It's
not obvious to me whether an ABI change that would be invisible to Python
would warrant an increment
The drawback is that developers now have to consider two version numbers.
We expect the added cost of the check to be negligible because it occurs
at instantiation of `CIndex` only, which in turn is tied to instantiation
of Python objects such as `LazyAncestors` and `MixedIndex`. Frequent calls
to `Cindex::new` should also probably hit the CPU branch predictor.
Differential Revision: https://phab.mercurial-scm.org/D7856
Rodrigo Damazio Bovendorp <rdamazio@google.com> [Mon, 13 Jan 2020 19:11:44 -0800] rev 44065
phases: make phasecache._phasesets immutable
Previously, some code paths would mutate the cache itself, which
could give weird results if multiple revsets got evaluated through
that path.
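The hazard and one way to avoid it can be shown with a toy model. This is not the actual `phasecache` code: the class below is a simplified stand-in, and the phase names and revision numbers are invented.

```python
class PhaseCache:
    """Toy cache mapping a phase name to a set of revision numbers."""

    def __init__(self, phasesets):
        # frozenset has no add()/update(), so accidental in-place mutation
        # of a cached entry fails loudly instead of corrupting the cache.
        self._phasesets = {p: frozenset(revs) for p, revs in phasesets.items()}

    def getrevset(self, phases):
        # Build a fresh set on every call instead of updating a cached one.
        result = set()
        for phase in phases:
            result |= self._phasesets[phase]
        return result


cache = PhaseCache({"draft": {3, 4}, "secret": {5}})
first = cache.getrevset(["draft", "secret"])
first.add(99)  # the caller may mutate its own copy freely
second = cache.getrevset(["draft"])
```

Because each call returns a new set, the second query is unaffected by whatever the first caller did with its result.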
Differential Revision: https://phab.mercurial-scm.org/D7854
Rodrigo Damazio Bovendorp <rdamazio@google.com> [Mon, 13 Jan 2020 19:06:36 -0800] rev 44064
phases: reduce code duplication in phasecache.getrevset
This is a functional NOP other than reducing some of the duplication
in that method.
Differential Revision: https://phab.mercurial-scm.org/D7853
Matt Harbison <matt_harbison@yahoo.com> [Mon, 13 Jan 2020 17:18:03 -0500] rev 44063
scmutil: fix an unbound variable with progressbar debug enabled
Differential Revision: https://phab.mercurial-scm.org/D7852
Augie Fackler <augie@google.com> [Mon, 13 Jan 2020 14:12:31 -0500] rev 44062
hgext: replace references to hashlib.sha1 with hashutil.sha1
When in a non-pure build of Mercurial, this will provide protections
against SHA1 collision attacks.
Differential Revision: https://phab.mercurial-scm.org/D7851
Augie Fackler <augie@google.com> [Mon, 13 Jan 2020 17:16:54 -0500] rev 44061
sslutil: migrate to hashutil.sha1 instead of hashlib.sha1
This is a straight-line replacement like the others, but I split it
out since it's used in a network context and I'm not sure this is
appropriate (we should probably drop support for sha1
fingerprints over TLS) and wanted this to be easily dropped.
Differential Revision: https://phab.mercurial-scm.org/D7850
Augie Fackler <augie@google.com> [Mon, 13 Jan 2020 17:15:14 -0500] rev 44060
core: migrate uses of hashlib.sha1 to hashutil.sha1
Differential Revision: https://phab.mercurial-scm.org/D7849
Augie Fackler <augie@google.com> [Mon, 13 Jan 2020 17:14:19 -0500] rev 44059
hashutil: new package for hashing-related features
Right now this just tries to use our sha1dc and if it's missing (e.g. a
--pure build) we fall back to hashlib. I imagine in the future we'll
want some other things in here for detecting what hasher is in use as
we transition off sha1.
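The fallback described here might look roughly like the sketch below. The top-level module name `sha1dc` is an assumption for the sake of a self-contained example; the import path inside Mercurial may differ.

```python
import hashlib

try:
    # Assumption: a collision-detecting extension importable as `sha1dc`.
    import sha1dc

    sha1 = sha1dc.sha1
except (ImportError, AttributeError):
    # Pure-Python build or extension missing: use the standard library.
    sha1 = hashlib.sha1


digest = sha1(b"abc").hexdigest()
```

Callers then use `sha1` without caring which implementation is behind it.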
Differential Revision: https://phab.mercurial-scm.org/D7848
Augie Fackler <augie@google.com> [Wed, 08 Jan 2020 15:59:52 -0500] rev 44058
sha1dc: initial implementation of Python extension
A future change will use this when available to avoid sha1 collision
issues until we can get moved to something else.
Differential Revision: https://phab.mercurial-scm.org/D7815
Augie Fackler <augie@google.com> [Wed, 08 Jan 2020 15:09:01 -0500] rev 44057
sha1dc: import latest version from github
After the recent SHA1 news, the attacks are serious enough we should
be more proactive. This code will at least allow detection of attacks
early. It's already widely deployed in Git.
This is git revision 855827c583bc30645ba427885caa40c5b81764d2 of the
sha1collisiondetection repo[0], with most of the files omitted. A
follow-up change will introduce Python bindings for this code.
0: https://github.com/cr-marcstevens/sha1collisiondetection
Differential Revision: https://phab.mercurial-scm.org/D7814
Pierre-Yves David <pierre-yves.david@octobus.net> [Sat, 11 Jan 2020 05:44:58 +0100] rev 44056
transaction: add a `hasfinalize` method
The method allows code to check whether a callback already exists. It allows callers to
skip potentially expensive setup for a callback.
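A toy model of how such a check could be used. The `Transaction` class below is a simplified stand-in written for this example, not Mercurial's transaction object, and the category name is invented.

```python
class Transaction:
    """Toy transaction holding finalize callbacks keyed by category."""

    def __init__(self):
        self._finalizecallback = {}

    def addfinalize(self, category, callback):
        # Registering the same category again replaces the earlier callback.
        self._finalizecallback[category] = callback

    def hasfinalize(self, category):
        """Report whether a callback is already registered for `category`."""
        return category in self._finalizecallback

    def close(self):
        for category in sorted(self._finalizecallback):
            self._finalizecallback[category](self)


calls = []
tr = Transaction()
for _ in range(2):
    if not tr.hasfinalize("cache-update"):
        # Any expensive setup for the callback would happen only here.
        tr.addfinalize("cache-update", lambda tr: calls.append("cache-update"))
tr.close()
```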
Differential Revision: https://phab.mercurial-scm.org/D7832
Pierre-Yves David <pierre-yves.david@octobus.net> [Sat, 11 Jan 2020 04:57:29 +0100] rev 44055
changelog: fix the diverted opener to accept more kwargs
The current code prevents the use of `atomictemp` files with the changelog
opener. I do not see a good reason for this limitation.
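One way to lift such a limitation is to forward arbitrary keyword arguments through the wrapper. The sketch below is hypothetical: `make_diverted_opener`, the `.a` suffix handling, and `fake_opener` are written for illustration and are not the changelog code.

```python
def make_diverted_opener(opener, target):
    """Wrap `opener` so that opening `target` is redirected to `target + '.a'`."""

    def _divert(name, mode="r", **kwargs):
        if name == target:
            name = target + ".a"
        # Extra keyword arguments such as `atomictemp` pass through untouched.
        return opener(name, mode, **kwargs)

    return _divert


seen = []


def fake_opener(name, mode="r", **kwargs):
    # Records the call instead of touching the filesystem.
    seen.append((name, mode, kwargs))


divert = make_diverted_opener(fake_opener, "00changelog.i")
divert("00changelog.i", "w", atomictemp=True)
divert("other", "r")
```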
Differential Revision: https://phab.mercurial-scm.org/D7831
Pierre-Yves David <pierre-yves.david@octobus.net> [Mon, 06 Jan 2020 08:08:06 +0100] rev 44054
revlog: reorder a conditional about revlogio
If we are using REVLOGV0, we will not use a Rust-based index. This small line
movement makes that clearer.
Differential Revision: https://phab.mercurial-scm.org/D7830
Martin von Zweigbergk <martinvonz@google.com> [Fri, 10 Jan 2020 15:47:39 -0800] rev 44053
rebase: delete seemingly unnecessary needupdate()
This seemed to be about checking that the user hasn't updated away
when we asked them to resolve merge conflicts. These days we call
`cmdutil.checkunfinished()` and refuse to update, so the user
shouldn't be able to get into this state.
`test-rebase-interruptions.t` actually has some tests where it
disables the rebase extension in order to be allowed to do some of
these updates. That still passes, but I wouldn't personally have
cared if that failed.
Differential Revision: https://phab.mercurial-scm.org/D7825
Martin von Zweigbergk <martinvonz@google.com> [Fri, 10 Jan 2020 13:24:25 -0800] rev 44052
workingctx: move setparents() logic from localrepo to mirror overlayworkingctx
It would be nice to later be able to call `wctx.setparents()` whether
`wctx` is a `workingctx` or an `overlayworkingctx`.
Differential Revision: https://phab.mercurial-scm.org/D7823
Martin von Zweigbergk <martinvonz@google.com> [Fri, 10 Jan 2020 21:41:28 -0800] rev 44051
overlayworkingctx: implement a setparents() to mirror dirstate.setparents()
This lets us make the in-memory and on-disk code a bit more
similar. I'll soon also implement setparents() on the regular
workingctx.
Differential Revision: https://phab.mercurial-scm.org/D7822
Martin von Zweigbergk <martinvonz@google.com> [Fri, 10 Jan 2020 17:03:23 -0800] rev 44050
overlayworkingctx: default branch to base context's branch
This matches what the dirstate does (it reuses working copy parent's
branch unless told otherwise). By moving the default out of
`rebase.commitmemorynode()`, it will let us clean that up better
later.
Differential Revision: https://phab.mercurial-scm.org/D7821
Martin von Zweigbergk <martinvonz@google.com> [Thu, 09 Jan 2020 15:41:40 -0800] rev 44049
grep: speed up `hg grep --all-files some/path` by using ctx.matches(match)
ctx.matches(match) avoids walking uninteresting parts of the tree when
using tree manifests. That can make a very big difference when
grepping in a small subset of the tree (2.0s -> 0.7s in my case, but
can of course be made more extreme by picking a smaller subset of
files).
Differential Revision: https://phab.mercurial-scm.org/D7820
timeless <timeless@mozdev.org> [Thu, 09 Jan 2020 14:19:20 -0500] rev 44048
fix: fix grammar/typos in hg help -e fix
Matt Harbison <matt_harbison@yahoo.com> [Thu, 09 Jan 2020 10:17:10 -0500] rev 44047
py3: byteify the opener option to use `rust.index` to allow Rust revlogs
It looks like this corresponds to the byteified key written in localrepo in 8042856c90b6.
Differential Revision: https://phab.mercurial-scm.org/D7818
Martin von Zweigbergk <martinvonz@google.com> [Fri, 27 Dec 2019 21:11:36 -0800] rev 44046
graft: use revset for intersecting with ancestor set
This addresses a TODO added in a1381eea7c7d (graft: do not use
`.remove` on a smart set (regression), 2014-04-28).
Differential Revision: https://phab.mercurial-scm.org/D7806
Martin von Zweigbergk <martinvonz@google.com> [Fri, 27 Dec 2019 21:11:33 -0800] rev 44045
graft: don't remove from a list in a loop
This addresses a TODO added in a1381eea7c7d (graft: do not use
`.remove` on a smart set (regression), 2014-04-28). I couldn't measure
any speedup.
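The general concern with removing from a list inside a loop is that each `list.remove()` call scans the list, so the total cost grows with the product of the two sizes. A sketch of the two shapes follows; the function names are illustrative and neither is the graft code.

```python
def drop_with_remove(revs, unwanted):
    """list.remove() rescans the list each time: O(len(revs) * len(unwanted))."""
    revs = list(revs)
    for rev in unwanted:
        if rev in revs:
            revs.remove(rev)
    return revs


def drop_with_filter(revs, unwanted):
    """One pass with set lookups: O(len(revs) + len(unwanted))."""
    unwanted = set(unwanted)
    return [rev for rev in revs if rev not in unwanted]


slow = drop_with_remove([5, 6, 7, 8], [6, 8])
fast = drop_with_filter([5, 6, 7, 8], [6, 8])
```

For the small revision lists typical of a graft, the difference is unlikely to be measurable, which is consistent with the note above.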
Differential Revision: https://phab.mercurial-scm.org/D7805
Martin von Zweigbergk <martinvonz@google.com> [Fri, 27 Dec 2019 22:40:52 -0800] rev 44044
tests: avoid grafting the same change over and over
The test case added in a1381eea7c7d (graft: do not use `.remove` on a
smart set (regression), 2014-04-28) grafted the
same change (renaming 'a' to 'b') three times over. It had the description
"graft works on complex revset", but AFAICT, all that it cared about
was that some ancestor of the working copy was in the set of revisions
to graft. So this patch changes the test to do that instead.
(I plan to later make it so that grafting these renames on top of each
other won't create the empty commits they currently create.)
Differential Revision: https://phab.mercurial-scm.org/D7804
Matt Harbison <matt_harbison@yahoo.com> [Wed, 08 Jan 2020 20:23:24 -0500] rev 44043
py3: byteify some `ui.configbool()` parameters
This popped up in 8042856c90b6.
Differential Revision: https://phab.mercurial-scm.org/D7817
Georges Racinet <georges.racinet@octobus.net> [Mon, 23 Dec 2019 17:47:31 +0100] rev 44042
rust-discovery: type alias for random generator seed
It just makes our life easier.
Differential Revision: https://phab.mercurial-scm.org/D7715
Martin von Zweigbergk <martinvonz@google.com> [Fri, 27 Dec 2019 15:53:16 -0800] rev 44041
tests: split out another ~1/2 of test-graft.t
The tests involving renames were also quite independent from the rest.
Differential Revision: https://phab.mercurial-scm.org/D7803