Boris Feld <boris.feld@octobus.net> [Mon, 01 Oct 2018 17:37:38 +0200] rev 40142
perf: accept formatter option for perfmanifest
Boris Feld <boris.feld@octobus.net> [Mon, 01 Oct 2018 17:53:47 +0200] rev 40141
perf: fix -T json
The previous code was mixing formatting and data, breaking `-T json` with
unexpected data. We fix the issue and add a test to prevent future regression.
Boris Feld <boris.feld@octobus.net> [Mon, 01 Oct 2018 17:37:53 +0200] rev 40140
formatter: more details on assertion failure
This is useful when the assertion fails.
Gregory Szorc <gregory.szorc@gmail.com> [Wed, 10 Oct 2018 23:19:42 -0700] rev 40139
wireprotov2: raise ProgrammingError on unknown action
Suggested by @durin42 in review of D4923.
Differential Revision: https://phab.mercurial-scm.org/D4935
Gregory Szorc <gregory.szorc@gmail.com> [Mon, 08 Oct 2018 17:24:28 -0700] rev 40138
wireprotov2: send content encoded frames from server
Now that we have support for negotiating encodings and configuring
an encoder, we can start sending content encoded frames from the
server.
This commit teaches the wireprotov2 server code to send content
encoded frames.
On the mozilla-unified repository with zstd-enabled peers, this change
drastically reduces the total amount of data transferred from server
to client:
before: 7,190,995,812 bytes
after:  1,605,508,691 bytes
Differential Revision: https://phab.mercurial-scm.org/D4927
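As a quick check of the numbers quoted in this commit message, the savings can be worked out directly. This is plain arithmetic on the two byte counts above; the variable names are only for illustration.

```python
before = 7_190_995_812  # bytes transferred without content encoding
after = 1_605_508_691   # bytes transferred with zstd-encoded frames

ratio = after / before   # fraction of the original size still sent
reduction = 1 - ratio    # fraction of the transfer that was saved
factor = before / after  # how many times smaller the transfer became

print(f"after/before: {ratio:.3f}")
print(f"reduction:    {reduction:.1%}")
print(f"factor:       {factor:.2f}x")
```

That is a reduction of roughly 78%, or a transfer about 4.5 times smaller.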
Gregory Szorc <gregory.szorc@gmail.com> [Mon, 08 Oct 2018 15:19:32 -0700] rev 40137
wireprotov2: raise exception in objects() if future has been resolved
Differential Revision: https://phab.mercurial-scm.org/D4926
Gregory Szorc <gregory.szorc@gmail.com> [Fri, 05 Oct 2018 23:49:18 +0000] rev 40136
wireprotov2: don't emit empty frames
Staring at logs revealed the presence of empty frames that should have
contained payload. Let's stop that from happening.
Differential Revision: https://phab.mercurial-scm.org/D4925
Gregory Szorc <gregory.szorc@gmail.com> [Fri, 05 Oct 2018 10:29:36 -0700] rev 40135
wireprotov2: remove functions for creating response frames from bytes
All code in the actual server uses oncommandresponsereadyobjects().
Test code was ported to that method. This resulted in a handful of
subtle test changes.
Differential Revision: https://phab.mercurial-scm.org/D4924
Gregory Szorc <gregory.szorc@gmail.com> [Fri, 05 Oct 2018 09:23:06 -0700] rev 40134
wireprotov2: handle noop action
This action can be returned from the client reactor. We should
handle it.
Differential Revision: https://phab.mercurial-scm.org/D4923
Gregory Szorc <gregory.szorc@gmail.com> [Mon, 08 Oct 2018 17:00:16 -0700] rev 40133
wireprotov2: send protocol settings frame from client
Now that we have client and server reactor support for protocol
settings and encoding frames, we can start to send them out over
the wire!
This commit teaches the client reactor to send out a protocol
settings frame when needed. The httpv2 peer has been taught to
gather a list of supported content encoders and to advertise them
through the client reactor.
Because the client is now sending new frame types by default, this
constitutes a compatibility break in the framing protocol. The
media type version has been bumped accordingly. This will ensure
existing clients won't attempt to send the new frames to old
servers not supporting this explicit media type. I'm not bothering
with the BC annotation because everything wireprotov2 is highly
experimental and nobody should be running a server yet.
Differential Revision: https://phab.mercurial-scm.org/D4922
Gregory Szorc <gregory.szorc@gmail.com> [Mon, 08 Oct 2018 17:10:59 -0700] rev 40132
wireprotov2: define and use stream encoders
Now that we have basic support for defining stream encoding, it is
time to start doing something with it.
We define various classes implementing stream encoders/decoders for
the defined encoding profiles. This is relatively straightforward.
We teach the inputstream and outputstream classes how to encode,
decode, and flush data.
We then teach the clientreactor how to filter received data through
the inputstream decoder.
One of the features of the framing format is that streams can span
requests. This is a differentiating feature from say HTTP/2, which
associates streams with requests. By allowing streams to span requests,
we can reuse compression context data across requests/responses. But
in order to do this, we need a mechanism to "flush" the encoder at
logical boundaries so that receivers receive all data where it is
expected. And a "flush" event is distinct from a "finish" event from
the perspective of certain compressors because a "flush" will retain
compression context state whereas a "finish" operation will not. This
is why encoders have both a flush() and a finish() and each uses
specific flushing semantics on the underlying compressor.
The added tests verify various behavior of decoders via clientreactor.
These tests do test some compression behavior via use of outputstream.
But for all intents and purposes, server reactor support for encoding
is not yet implemented.
Differential Revision: https://phab.mercurial-scm.org/D4921
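The flush/finish distinction described above can be sketched with Python's standard zlib module. This is only an illustration: zlib stands in for zstd, and encode() is a hypothetical helper, not the encoder class added by this commit.

```python
import zlib

# zlib stands in for zstd here; both distinguish "flush" from "finish".
compressor = zlib.compressobj()
decompressor = zlib.decompressobj()


def encode(chunk, last=False):
    """Compress one logical response on a long-lived stream (hypothetical helper)."""
    data = compressor.compress(chunk)
    # Z_SYNC_FLUSH emits everything buffered so far and keeps the
    # compression context. Z_FINISH ends the stream for good.
    mode = zlib.Z_FINISH if last else zlib.Z_SYNC_FLUSH
    return data + compressor.flush(mode)


payload = b"changeset data " * 20
first = encode(payload)
second = encode(payload)
final = encode(b"done", last=True)

# After each flush the receiver can decode the complete response.
decoded = [decompressor.decompress(c) for c in (first, second, final)]
print(len(first), len(second))
```

Because the context survives a flush, the second response can refer back to data already seen in the first, which is the benefit of letting streams span requests.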
Gregory Szorc <gregory.szorc@gmail.com> [Thu, 04 Oct 2018 17:39:16 -0700] rev 40131
wireprotov2: establish dedicated classes for input and output streams
Streams are unidirectional. As part of implementing encoding/decoding
support, it became clear that it didn't make sense for a generic
"stream" class to hold functionality related to both encoding and
decoding. So we create new classes to represent the flavor of
stream.
Differential Revision: https://phab.mercurial-scm.org/D4920
Gregory Szorc <gregory.szorc@gmail.com> [Thu, 04 Oct 2018 17:17:57 -0700] rev 40130
wireprotov2: pass ui into clientreactor and serverreactor
This will allow us to use config options to influence compression
settings.
Differential Revision: https://phab.mercurial-scm.org/D4919
Gregory Szorc <gregory.szorc@gmail.com> [Thu, 04 Oct 2018 16:44:21 -0700] rev 40129
wireprotov2: handle stream encoding settings frames
Like what we just did for the server reactor, we teach the client
reactor to handle stream encoding settings frames. The code is
very similar.
We define a method on the stream class to handle processing the data
within the decoded frames. However, it doesn't yet do anything useful.
Differential Revision: https://phab.mercurial-scm.org/D4918
Gregory Szorc <gregory.szorc@gmail.com> [Thu, 04 Oct 2018 15:43:21 -0700] rev 40128
wireprotov2: document client reactor actions
We should document these so consumers have an easier life.
Differential Revision: https://phab.mercurial-scm.org/D4917
Gregory Szorc <gregory.szorc@gmail.com> [Thu, 04 Oct 2018 16:26:45 -0700] rev 40127
wireprotov2: handle sender protocol settings frames
We teach the server reactor to handle the optional sender protocol
settings frames, which can only be sent at the beginning of frame
exchange.
Right now, we simply decode the data and record the sender protocol
settings on the server reactor instance: we don't yet do anything
meaningful with the data.
Differential Revision: https://phab.mercurial-scm.org/D4916
Gregory Szorc <gregory.szorc@gmail.com> [Thu, 04 Oct 2018 14:05:16 -0700] rev 40126
wireprotov2: update stream encoding specification
The encoding of data within streams in the frame-based protocol is
not yet defined or implemented. This means that all data in wire
protocol version 2 is currently being sent out raw, without
compression. That's obviously not ideal.
This commit formalizes the beginnings of stream encoding support
in the protocol.
I suspect we'll change behavior substantially in the future. My goal
is to get something landed so we can use compression. We can build
out more robust support later.
Because the frame type ID changed, this is strictly a BC break. But
existing code wasn't using the frame. I'll bump the framing protocol
version later once code is introduced to use the new frame.
Differential Revision: https://phab.mercurial-scm.org/D4915
Gregory Szorc <gregory.szorc@gmail.com> [Thu, 04 Oct 2018 15:08:42 -0700] rev 40125
cborutil: cast bytearray to bytes
This code didn't like passing in bytearray instances. Let's cast
bytearray to bytes so it works.
Differential Revision: https://phab.mercurial-scm.org/D4914
Gregory Szorc <gregory.szorc@gmail.com> [Mon, 08 Oct 2018 17:06:24 -0700] rev 40124
tests: disable zstd in test
This makes the test pass in pure installs.
Differential Revision: https://phab.mercurial-scm.org/D4913
Gregory Szorc <gregory.szorc@gmail.com> [Mon, 08 Oct 2018 17:20:41 -0700] rev 40123
wireprotov2: remove "compression" from capabilities response
This is not used. And future commits will change how this mechanism
works. Let's remove it.
As a bonus, this fixes some test failures on pure installs (due to
zstd references).
Differential Revision: https://phab.mercurial-scm.org/D4912
Gregory Szorc <gregory.szorc@gmail.com> [Mon, 08 Oct 2018 16:27:40 -0700] rev 40122
zstandard: vendor python-zstandard 0.10.1
This was just released.
The upstream source distribution from PyPI was extracted. Unwanted
files were removed.
The clang-format ignore list was updated to reflect the new source
of files.
setup.py was updated to pass a new argument to python-zstandard's
function for returning an Extension instance. Upstream had to change
to use relative paths because Python 3.7's packaging doesn't
seem to like absolute paths when defining sources, includes, etc.
The default relative path calculation is relative to setup_zstd.py
which is different from the directory of Mercurial's setup.py.
The project contains a vendored copy of zstandard 1.3.6. The old
version was 1.3.4.
The API should be backwards compatible and nothing in core should
need adjusting. However, there is a new "chunker" API that we
may find useful in places where we want to emit compressed chunks
of a fixed size.
There is a pair of bug fixes in 0.10.0 with regard to
compressobj() and decompressobj() when block flushing is used. I
actually found these bugs when introducing these APIs in Mercurial!
But existing Mercurial code is not affected because we don't
perform block flushing.
# no-check-commit because 3rd party code has different style guidelines
Differential Revision: https://phab.mercurial-scm.org/D4911
Yuya Nishihara <yuya@tcha.org> [Tue, 25 Sep 2018 20:55:03 +0900] rev 40121
rust-chg: install signal handlers to forward signals to server
I use sync::Once as a synchronization primitive because it's quite easy
to use, and is good enough to prevent data races in these C functions.
Yuya Nishihara <yuya@tcha.org> [Mon, 24 Sep 2018 22:19:49 +0900] rev 40120
rust-chg: remove SIGCHLD handler which won't work in oxidized chg
Since the pager is managed by the Rust part, the C code doesn't know the
pager pid. I could make the Rust part pass the pid to C, but installing a
SIGCHLD handler still seems like a horrible idea since we no longer use
handcrafted low-level process management functions.
Instead, I'm thinking of adding an async handler to send SIGPIPE at the exit
of the pager.
Yuya Nishihara <yuya@tcha.org> [Mon, 24 Sep 2018 22:04:57 +0900] rev 40119
rust-chg: extract signal handlers from chg/procutil.c
abortmsgerrno() and debugmsg() are removed, and the public interface instead
returns success/error status. Since signal handlers can't propagate errors,
the result of kill() is just ignored.
Yuya Nishihara <yuya@tcha.org> [Sun, 07 Oct 2018 23:19:49 +0900] rev 40118
help: document about "version" template keywords
Yuya Nishihara <yuya@tcha.org> [Sun, 07 Oct 2018 23:14:21 +0900] rev 40117
help: document about "tags" template keywords
Yuya Nishihara <yuya@tcha.org> [Sun, 07 Oct 2018 23:12:04 +0900] rev 40116
help: document about "status" template keywords
Yuya Nishihara <yuya@tcha.org> [Sun, 07 Oct 2018 23:05:00 +0900] rev 40115
help: document about "resolve" template keywords
Yuya Nishihara <yuya@tcha.org> [Sun, 07 Oct 2018 23:00:50 +0900] rev 40114
help: document about "paths" template keywords
Yuya Nishihara <yuya@tcha.org> [Sun, 07 Oct 2018 22:56:37 +0900] rev 40113
help: document about "identify" template keywords
Yuya Nishihara <yuya@tcha.org> [Sun, 07 Oct 2018 22:50:12 +0900] rev 40112
help: document about "grep" template keywords
Yuya Nishihara <yuya@tcha.org> [Sun, 07 Oct 2018 17:35:25 +0900] rev 40111
chgserver: catch Abort while parsing early args to shut down cleanly
_loadnewui() calls dispatcher functions, which may raise Abort if unparsable
arguments are passed in. The server should catch such errors and translate
them to the "exit 255" instruction so the client can finish the IPC session
cleanly.
Spotted while porting the chg client to Rust.
Yuya Nishihara <yuya@tcha.org> [Fri, 05 Oct 2018 22:08:37 +0900] rev 40110
chg: upgrade client to use "setumask2" command
No compatibility code is added to the client side, since a new client is
unlikely to communicate with an old server.
Yuya Nishihara <yuya@tcha.org> [Thu, 04 Oct 2018 23:25:55 +0900] rev 40109
chgserver: add "setumask2" command which uses correct message frame
The first 4 bytes should be a length field, not a value. Spotted while
porting chg functions to the Rust one.
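A minimal sketch of the framing difference, assuming a 4-byte big-endian length prefix followed by a 4-byte umask value. The byte order, the payload encoding, and the helper names frame_blob() and read_blob() are assumptions for illustration, not taken from chgserver.

```python
import struct


def frame_blob(payload: bytes) -> bytes:
    """Prefix payload with its length as a 4-byte big-endian integer."""
    return struct.pack(">I", len(payload)) + payload


def read_blob(buf: bytes) -> bytes:
    """Read the length field first, then exactly that many payload bytes."""
    (length,) = struct.unpack(">I", buf[:4])
    return buf[4:4 + length]


umask = 0o022

# Old style: the first 4 bytes carry the value itself, so a generic
# reader would misinterpret it as a length.
wrong = struct.pack(">I", umask)

# New style: a length field (4), followed by the 4-byte value.
right = frame_blob(struct.pack(">I", umask))

(decoded,) = struct.unpack(">I", read_blob(right))
print(oct(decoded))
```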
muxator <a.mux@inwind.it> [Tue, 09 Oct 2018 22:29:10 +0200] rev 40108
packaging: "make deb" no longer fails
Release 4.7 rationalized the layout of the build scripts.
Unfortunately, while "make docker-ubuntu-*" and "make docker-debian-*" worked as
expected, "make deb" was broken.
Before this change "make deb" was failing with the following error:
You are not inside a Mercurial repository!
Or, after the latest changes:
You are inside <fullpath>, which is not the root of a Mercurial repository
Moreover, when "make deb" failed, the cleanup routine deleted the wrong
directory (contrib/packaging/debian instead of <reporoot>/debian) resulting in
a corrupted working copy that needed to be hg revert-ed.
After this change the docker targets continue to work, and the deb one is able
to finish.
muxator <a.mux@inwind.it> [Tue, 09 Oct 2018 22:24:38 +0200] rev 40107
packaging: cleanup() did not read the value of $CLEANUP
When the original author put CLEANUP in a conditional statement, he probably
intended it to control the "if". This change tries to restore that
behaviour: the "rm" clause is triggered if and only if CLEANUP is defined and
not empty.
muxator <a.mux@inwind.it> [Tue, 09 Oct 2018 22:18:35 +0200] rev 40106
packaging: builddeb's cleanup needs to expand PWD, safely
Single quotes would not expand the variable.
muxator <a.mux@inwind.it> [Tue, 09 Oct 2018 22:16:25 +0200] rev 40105
packaging: blindly factor out trap's cleanup function in builddeb
This commit blindly extracts builddeb's trap routine in a dedicated function.
While doing so, I think two bugs are exposed, which will be addressed in the
next commits:
- single quoting around '$CLEANUP' will always evaluate to the literal
  '$CLEANUP' regardless of the variable's value. The "if" will always be true.
- the removal operation will not expand $PWD (and a variable expansion would
  need double quotes, anyway).
muxator <a.mux@inwind.it> [Tue, 09 Oct 2018 21:40:49 +0200] rev 40104
packaging: print full path to the packages when builddeb finishes successfully
muxator <a.mux@inwind.it> [Tue, 09 Oct 2018 21:39:39 +0200] rev 40103
packaging: print more specific error messages when builddeb fails
Gregory Szorc <gregory.szorc@gmail.com> [Tue, 09 Oct 2018 12:56:11 -0700] rev 40102
cmdutil: sort unresolved paths
I noticed that `hg status` was printing unresolved paths in a
non-deterministic order. This patch fixes that.
I'm not sure if the sorting should be done in
merge.mergestate.unresolved() instead. Either way fixes the
presentation issue.
Differential Revision: https://phab.mercurial-scm.org/D4929
Yuya Nishihara <yuya@tcha.org> [Tue, 09 Oct 2018 07:46:01 +0900] rev 40101
fuzz: report error if Python code raised exception
I think that's what we wanted to do, given that most of the code block is
surrounded by try-except. 'lazymanifest(mdata)' is moved to the try block
as it can fail.
Yuya Nishihara <yuya@tcha.org> [Tue, 09 Oct 2018 07:42:05 +0900] rev 40100
revlog: explicitly initialize static variables
I know the .bss section is zero-filled, but explicit initialization should
be better as we rely on that.
Joerg Sonnenberger <joerg@bec.de> [Mon, 08 Oct 2018 21:53:32 +0200] rev 40099
tests: do not change sys.path, it can break loading cext.parsers
When running this test with run-tests, the prefix would resolve
mercurial.cext to the source tree and the attempt to load
mercurial.cext.parsers would therefore fail since it doesn't exist in
it. With the regular search path from run-tests, it is picked up from
the temporary prefix correctly.
Differential Revision: https://phab.mercurial-scm.org/D4910
Joerg Sonnenberger <joerg@bec.de> [Mon, 08 Oct 2018 21:51:20 +0200] rev 40098
tests: deal with differences in tic from ncurses and NetBSD
Differential Revision: https://phab.mercurial-scm.org/D4909
Joerg Sonnenberger <joerg@bec.de> [Mon, 08 Oct 2018 20:07:13 +0200] rev 40097
closehead: fix close-head -r listification
Differential Revision: https://phab.mercurial-scm.org/D4908
FUJIWARA Katsunori <foozy@lares.dti.ne.jp> [Thu, 23 Aug 2018 12:25:54 +0900] rev 40096
import-checker: use testparseutil.embedded() to centralize detection logic
This patch also fixes the following issues with embedded() in
import-checker.py:
- it overlooks (or mis-detects) the end of an inline script in doctest style
- it overlooks an inline script in doctest style at the end of a file
  (and ignores an invalid unclosed heredoc at the end of a file, too)
- it overlooks code fragments in the styles below
  - "python <<EOF" (heredoc should be "cat > file <<EOF" style)
  - "cat > foobar.py << ANYLIMIT" (limit mark should be "EOF")
  - "cat << EOF > foobar.py" (filename should be placed before limit mark)
  - "cat >> foobar.py << EOF" (appending is ignored)
FUJIWARA Katsunori <foozy@lares.dti.ne.jp> [Thu, 23 Aug 2018 12:25:54 +0900] rev 40095
tests: use NO_CHECK_EOF as heredoc limit mark to omit checking code fragments
This patch uses NO_CHECK_EOF as the heredoc limit mark instead of EOF, in
order to avoid checking the Python code fragments in
test-contrib-check-code.t, because almost all of them intentionally
contain non-recommended implementations.
FUJIWARA Katsunori <foozy@lares.dti.ne.jp> [Thu, 23 Aug 2018 12:25:54 +0900] rev 40094
contrib: add an utility module to parse test scripts
This patch centralizes the logic to pick up code fragments embedded in
*.t scripts, in order to:
- apply the patterns in check-code.py to such embedded code fragments
  Currently, check-code.py completely ignores embedded code
  fragments. I'll post another patch series to check them.
- replace the similar code path in contrib/import-checker.py
  The current import-checker.py has the problems below. Fixing each of
  them is a little difficult, because the parsing logic and the pattern
  strings are tightly coupled.
  - it overlooks (or mis-detects) the end of an inline script in
    doctest style
    8a8dd6e4a97a fixed part of this issue, but not all of it.
  - it overlooks an inline script in doctest style at the end of a file
    (and ignores an invalid unclosed heredoc at the end of a file, too)
  - it overlooks code fragments in the styles below
    - "python <<EOF" (heredoc should be "cat > file <<EOF" style)
    - "cat > foobar.py << ANYLIMIT" (limit mark should be "EOF")
    - "cat << EOF > foobar.py" (filename should be placed before limit mark)
    - "cat >> foobar.py << EOF" (appending is ignored)
  - it cannot be extended to anything other than Python code fragments
    (e.g. shell scripts, hgrc files, and so on)
This new module can detect python code fragments in styles below:
- inline script in doctest style (starting with a " >>> " line)
- python invocation with heredoc script ("python <<EOF")
- python script in heredoc style (redirected into ".py" file)
As an example of the extensibility of the new module, this patch also
contains an implementation that picks up the code fragments below. This
will be useful for adding further restrictions on them, for example.
- shell script in heredoc style (redirected into ".sh" file)
- hgrc configuration in heredoc style (redirected into hgrc or $HGRCPATH)
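To make the heredoc case concrete, here is a rough sketch of picking up Python fragments redirected into a ".py" file from the lines of a *.t script. It is not the testparseutil implementation: the regular expression, the function name python_heredocs(), and the decision to handle only the "cat > FILE.py <<LIMIT" form are assumptions for illustration. It relies on the *.t convention that commands start with "  $ " and continuation lines with "  > ".

```python
import re

# Hypothetical pattern: "cat > NAME.py <<LIMIT", with any limit mark.
HEREDOC_START = re.compile(
    r"^  \$ cat >\s*(?P<name>\S+\.py)\s*<<\s*(?P<limit>\w+)$"
)


def python_heredocs(lines):
    """Yield (filename, code lines) for each Python heredoc found."""
    name = limit = None
    body = []
    for line in lines:
        if limit is None:
            m = HEREDOC_START.match(line)
            if m:
                name, limit, body = m.group("name"), m.group("limit"), []
        elif line == "  > " + limit:
            # The limit mark closes the fragment.
            yield name, body
            name = limit = None
        elif line.startswith("  > "):
            body.append(line[4:])
        else:
            # Unclosed heredoc: drop the partial fragment.
            name = limit = None


script = [
    "  $ cat > foo.py <<EOF",
    "  > import os",
    "  > print(os.sep)",
    "  > EOF",
    "  $ hg status",
]
fragments = list(python_heredocs(script))
```

Keeping the detection in one function like this, separate from the checks applied to each fragment, is the kind of decoupling the commit message argues for.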
FUJIWARA Katsunori <foozy@lares.dti.ne.jp> [Thu, 23 Aug 2018 12:24:41 +0900] rev 40093
tests: use environment variable indirectly
Using an environment variable directly in heredoc Python code causes a
syntax error when import-checker.py strictly checks module imports,
because "$varname" is not valid Python syntax. "$varname" only becomes
valid after the shell substitutes the environment variable while
writing the text into the file.
The current import-checker.py overlooks the code fragment changed in
this patch because of the restriction below on the line that starts a
code fragment.
- filename must be specified before limit mark
  NG: cat <<EOF > FILE.py
  OK: cat > FILE.py <<EOF
import-checker.py itself is fixed in a subsequent patch.
FUJIWARA Katsunori <foozy@lares.dti.ne.jp> [Thu, 23 Aug 2018 12:20:41 +0900] rev 40092
tests: import multiple modules separately
The current import-checker.py overlooks the code fragment changed in
this patch because of the restrictions below on the line that starts a
code fragment.
- filename must be specified before limit mark
  NG: cat <<EOF > FILE.py
  OK: cat > FILE.py <<EOF
- limit mark must not be quoted
  NG: cat > FILE.py <<'EOF'
  OK: cat > FILE.py <<EOF
import-checker.py itself is fixed in a subsequent patch.
Augie Fackler <augie@google.com> [Mon, 08 Oct 2018 11:50:25 -0400] rev 40091
fuzz: allow manifest fuzzer to detect leaks
Huzzah!
Differential Revision: https://phab.mercurial-scm.org/D4907
Augie Fackler <augie@google.com> [Mon, 08 Oct 2018 11:47:25 -0400] rev 40090
fuzzers: init Python in LLVMFuzzerInitialize and intentionally leak it
This sidesteps leaks (or "leaks", I'm not sure) in CPython, and lets
our fuzzer work.
Differential Revision: https://phab.mercurial-scm.org/D4906
Augie Fackler <augie@google.com> [Mon, 08 Oct 2018 11:42:06 -0400] rev 40089
revlog: if the module is initialized more than once, don't leak nullentry
Caught (annoyingly) by the manifest fuzzer.
Differential Revision: https://phab.mercurial-scm.org/D4905
Martin von Zweigbergk <martinvonz@google.com> [Mon, 01 Oct 2018 14:31:15 -0700] rev 40088
narrow: move remaining narrow-limited dirstate walks to core
In most places we now filter at a higher level (the context object),
but there are a few places that relied on the dirstate walk to be
filtered by the narrowspec. The important cases are those used by `hg
add` and `hg addremove`. This patch updates them to pass in a matcher
instead of relying on the dirstate to do the filtering. The dirstate
filtering is also dropped in narrowdirstate.py.
Not always filtering in the dirstate should be useful for a future `hg
status --include-outside-narrow` option.
These places now end up doing an unrestricted dirstate walk after this
patch:
* debugfileset
* perfwalk
* sparse (but restricted to sparse config)
* largefiles
I'll let anyone who cares about these cases adapt them to work with
narrow if necessary.
Differential Revision: https://phab.mercurial-scm.org/D4901
Martin von Zweigbergk <martinvonz@google.com> [Mon, 01 Oct 2018 10:11:00 -0700] rev 40087
narrow: allow repo.narrowmatch(match) to include exact matches from "match"
Differential Revision: https://phab.mercurial-scm.org/D4900
Martin von Zweigbergk <martinvonz@google.com> [Fri, 28 Sep 2018 22:35:05 -0700] rev 40086
narrow: filter files by narrowspec in ctx.matches()
This has no effect yet because 1) for committed changes, ctx.matches()
just calls ctx.walk(), which we updated in the previous patch, and 2)
for the working copy, the filtering is also done in the overridden
dirstate.walk() in narrowdirstate.
Differential Revision: https://phab.mercurial-scm.org/D4899
Martin von Zweigbergk <martinvonz@google.com> [Fri, 28 Sep 2018 17:09:15 -0700] rev 40085
narrow: only walk files within narrowspec also for committed revisions
Narrow has been walking only paths matching the narrowspec when
walking the working copy. We have not done the same filtering when
walking committed revisions (e.g. "hg files -r "), which seems a
little odd. Let's make it consistent.
Differential Revision: https://phab.mercurial-scm.org/D4898
Martin von Zweigbergk <martinvonz@google.com> [Thu, 27 Sep 2018 23:01:26 -0700] rev 40084
status: intersect matcher with narrow matcher instead of filtering afterwards
I seem to have done a very naive move of the code from the narrow
extension into core in e411774a2e0f (narrow: move status-filtering to
core and to ctx, 2018-08-02). It seems obvious that a better way is to
intersect the matchers.
Note that this means that when requesting status for the working
directory in a narrow repo, we now pass the narrow matcher (possibly
intersected with a user-provided matcher) into _buildstatus() and then
into dirstate.status() and dirstate.walk(), which will then intersect
it again with the narrow matcher. That's functionally fine, but
wasteful. I hope to later remove the dirstate wrapping that adds the
second layer of matcher intersection.
Differential Revision: https://phab.mercurial-scm.org/D4897
Martin von Zweigbergk <martinvonz@google.com> [Fri, 28 Sep 2018 12:29:21 -0700] rev 40083
localrepo: allow narrowmatch() to accept matcher to intersect with
It's pretty common that we need to intersect a matcher we already have
(usually from the user) with the narrow matcher. Let's make
repo.narrowmatch() take an optional matcher to intersect with.
Differential Revision: https://phab.mercurial-scm.org/D4896
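The idea can be sketched with plain predicates. Mercurial's real matchers are objects from its match module; the functions below are hypothetical stand-ins that only show the shape of "take an optional matcher and intersect it with the narrow matcher".

```python
def intersect(m1, m2):
    """Return a predicate accepting only paths that both m1 and m2 accept."""
    return lambda path: m1(path) and m2(path)


def narrowmatch(narrow, match=None):
    """Hypothetical stand-in: return the narrow matcher, optionally intersected."""
    if match is None:
        return narrow
    return intersect(narrow, match)


# Illustrative matchers: the narrowspec covers src/, the user asks for *.py.
narrow = lambda path: path.startswith("src/")
user = lambda path: path.endswith(".py")

m = narrowmatch(narrow, user)
print(m("src/a.py"), m("doc/a.py"), m("src/a.c"))
```

Callers that already hold a user-provided matcher then no longer need to build the intersection themselves.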
Zharaskhan Aman <aman.zharaskhan@gmail.com> [Fri, 05 Oct 2018 01:55:51 +0300] rev 40082
obsolete: fix ValueError when stored note contains ':' char (issue5783)
Newer versions of `amend -n` accept notes containing the ':' character.
A note such as 'Testing::Obstore' raised ValueError, because we were
trying to unpack more than 2 values into key and value.
Differential Revision: https://phab.mercurial-scm.org/D4883
Differential Revision: https://phab.mercurial-scm.org/D4882
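The failure mode is easy to reproduce in isolation. The sketch below assumes metadata entries of the form "key:value"; the function names and the exact entry layout are illustrative, not the code in obsolete.py.

```python
def parse_pair_naive(entry):
    # Splits on every ':', so a value containing ':' yields more than
    # two items and the unpacking raises ValueError.
    key, value = entry.split(":")
    return key, value


def parse_pair(entry):
    # Splitting at most once keeps any further ':' inside the value.
    key, value = entry.split(":", 1)
    return key, value


entry = "note:Testing::Obstore"

try:
    parse_pair_naive(entry)
    naive_failed = False
except ValueError:
    naive_failed = True

print(naive_failed, parse_pair(entry))
```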
Martin von Zweigbergk <martinvonz@google.com> [Fri, 05 Oct 2018 16:06:51 -0700] rev 40081
narrow: update TODO.rst now that we share format with sparse
The narrowspec format was unified with the sparse format in
f64ebe7d2259 (narrowspec: use sparse.parseconfig() to parse narrowspec
file (BC), 2018-08-03).
Differential Revision: https://phab.mercurial-scm.org/D4904
Martin von Zweigbergk <martinvonz@google.com> [Fri, 05 Oct 2018 16:04:25 -0700] rev 40080
narrow: update TODO.rst now that we filter status in ctx
The comment referred to was addressed in e411774a2e0f (narrow: move
status-filtering to core and to ctx, 2018-08-02). I also think
84092edd5c88 (narrow: drop unnecessary overrides of patch, 2018-09-28)
suggests that it was the right thing to do.
Differential Revision: https://phab.mercurial-scm.org/D4903
Martin von Zweigbergk <martinvonz@google.com> [Fri, 05 Oct 2018 16:01:21 -0700] rev 40079
narrow: update TODO.rst now that the narrowspec is in .hg/store
We no longer have the unfortunate wrappostshare() and
unsharenarrowspec() since 576eef1ab43d (narrow: move .hg/narrowspec to
.hg/store/narrowspec (BC), 2018-08-02).
Differential Revision: https://phab.mercurial-scm.org/D4902
Pulkit Goyal <pulkit@yandex-team.ru> [Fri, 05 Oct 2018 23:28:14 +0300] rev 40078
py3: add 8 new passing tests to whitelist found by buildbot
We are getting close!
Differential Revision: https://phab.mercurial-scm.org/D4893
Pulkit Goyal <pulkit@yandex-team.ru> [Fri, 05 Oct 2018 23:31:51 +0300] rev 40077
py3: use '%f' for floats instead of '%s'
I remember Yuya saying we need to use bytestr() or '%r' because '%s' was clever.
Not sure it applies to this or not.
Differential Revision: https://phab.mercurial-scm.org/D4894
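For context, a small sketch of why '%s' is a problem for floats on Python 3, assuming the format string is a bytes literal (as is common in Mercurial's Python 3 port). The variable names are illustrative.

```python
elapsed = 1.5
error = None

try:
    # On Python 3, %s in a bytes format needs a bytes-like object
    # (or one implementing __bytes__), so a float is rejected.
    b"%s" % elapsed
except TypeError as exc:
    error = exc

# %f formats the float directly and works for bytes as well.
formatted = b"%f" % elapsed
print(formatted)
```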
Pulkit Goyal <pulkit@yandex-team.ru> [Fri, 05 Oct 2018 22:52:24 +0300] rev 40076
narrow: move adding of narrow server capabilities to core
We use the experimental.narrow config option introduced in one of the previous
patches and move the functionality of adding narrow server capabilities to core.
Differential Revision: https://phab.mercurial-scm.org/D4891
Pulkit Goyal <pulkit@yandex-team.ru> [Fri, 05 Oct 2018 22:31:12 +0300] rev 40075
wireprotoserver: move narrow capabilities to wireprototypes.py
This is done because wireprotoserver imports wireprotov1server, so you cannot
import wireprotoserver in wireprotov1server to use the capabilities constants.
Differential Revision: https://phab.mercurial-scm.org/D4890
Pulkit Goyal <pulkit@yandex-team.ru> [Fri, 05 Oct 2018 22:19:19 +0300] rev 40074
narrow: introduce a config option to check if narrow is enabled or not
This patch introduces a new config option experimental.narrow which is set to
False by default and set to True by the narrow extension.
While moving narrow-related logic into core, we need to know in places whether
the narrow extension is enabled. Checking the list of enabled extensions is
one solution, but once narrow is built in, we will definitely want a config
option to check whether narrow is turned on.
So this patch introduces a config option, which will evolve into the main switch
for turning the narrow capability on and off once all of narrow is in core.
Differential Revision: https://phab.mercurial-scm.org/D4889
Pulkit Goyal <pulkit@yandex-team.ru> [Fri, 05 Oct 2018 20:24:07 +0300] rev 40073
narrow: move the code to generate a widening bundle2 to core
This is a part of moving more narrow related bits to core.
Differential Revision: https://phab.mercurial-scm.org/D4888
Pulkit Goyal <pulkit@yandex-team.ru> [Tue, 02 Oct 2018 17:09:56 +0300] rev 40072
narrow: start returning bundle2 from widen_bundle()
Differential Revision: https://phab.mercurial-scm.org/D4838
Pulkit Goyal <pulkit@yandex-team.ru> [Fri, 28 Sep 2018 23:42:31 +0300] rev 40071
narrow: the first version of narrow_widen wireprotocol command
This patch introduces a wireprotocol command narrow_widen() which will be used
to widen a narrow copy using `hg tracked` command provided by narrow extension.
The wireprotocol command takes the old and new includes and excludes, common
heads, changegroup version, known revs, and a boolean ellipses, generates a
bundle2 of the required data, and sends it. The client receives the bundle2
and applies it.
We use a bundle2 instead of a changegroup because in the future we might want
to send more things while widening. Thanks to martinvonz for the suggestion.
I am not sure whether we need changegroup version as an argument to the command
as I *think* narrow needs changegroup3 already.
The tests show that we don't exchange phase data now while widening which is
nice. Also we don't check for pushkeys, rbc-cache, bookmarks etc.
This does not support ellipses cases for now but will be supported in future
patches. Since we send bundle2, it won't be hard to plug the ellipses logic in
here.
The existing code for widening a non-ellipses case is also dropped in this
patch.
Differential Revision: https://phab.mercurial-scm.org/D4813
Yuya Nishihara <yuya@tcha.org> [Fri, 05 Oct 2018 21:43:57 +0900] rev 40070
remotenames: abort if literal revset pattern matches nothing
This is the convention of the other namespace revsets such as tag(). Let's
make the remote variants do the same.
Yuya Nishihara <yuya@tcha.org> [Fri, 05 Oct 2018 21:39:41 +0900] rev 40069
remotenames: remove unneeded sorted() from revset implementation
The order is constrained by the subset.
Yuya Nishihara <yuya@tcha.org> [Fri, 05 Oct 2018 21:36:48 +0900] rev 40068
remotenames: don't call a set of nodes as "revs"
Yuya Nishihara <yuya@tcha.org> [Fri, 05 Oct 2018 21:30:55 +0900] rev 40067
remotenames: use util.always instead of handcrafted lambda
Yuya Nishihara <yuya@tcha.org> [Fri, 05 Oct 2018 21:29:21 +0900] rev 40066
remotenames: inline _parseargs() into _revsetutil()
The _parseargs() function gets quite simple, and the 0/1 loop can be rewritten
as "if".
Martin von Zweigbergk <martinvonz@google.com> [Thu, 04 Oct 2018 16:27:40 -0700] rev 40065
repo: create changectx in a single place in localrepo.__getitem__
Differential Revision: https://phab.mercurial-scm.org/D4885
Martin von Zweigbergk <martinvonz@google.com> [Thu, 04 Oct 2018 16:06:36 -0700] rev 40064
repo: remove the last few "pass" statements in localrepo.__getitem__
In case of IndexError or LookupError, we used "pass" statements and
fell through to the end of localrepo.__getitem__. I find the pass
statements easy to miss. Consistently raising and catching exceptions
seems easier to follow.
Oh -- and I didn't plan this before I wrote the above -- that probably
also lets us reuse the "return context.changectx(self, rev, node)" in
a later patch.
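The control-flow style argued for here can be sketched roughly as follows.
This is not the localrepo code; getitem(), LookupFailed and the list-of-nodes
model are hypothetical and only illustrate "raise and catch, with one place
that builds the result".

```python
class LookupFailed(Exception):
    """Hypothetical error type used only in this sketch."""


def getitem(nodes, changeid):
    """Resolve an int revision or a node string to a (rev, node) pair."""
    try:
        if isinstance(changeid, int):
            rev = changeid
            node = nodes[rev]          # may raise IndexError
        else:
            node = changeid
            rev = nodes.index(node)    # may raise ValueError
        # The single place where a result is built and returned.
        return (rev, node)
    except (IndexError, ValueError):
        # No "pass" and fall-through: the failure is converted right here.
        raise LookupFailed("unknown revision %r" % (changeid,))


nodes = ["a1", "b2", "c3"]
by_rev = getitem(nodes, 1)
by_node = getitem(nodes, "c3")
```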
Differential Revision: https://phab.mercurial-scm.org/D4884
Martin von Zweigbergk <martinvonz@google.com> [Thu, 04 Oct 2018 10:38:55 -0700] rev 40063
filectx: correct docstring about "changeid"
The changeid argument must be a revnum (basefile.rev() is defined as
"return self._changeid"), so fix the lie in the docstring. It seems to
have been incorrect for at least 10 years (I didn't check further
back).
Differential Revision: https://phab.mercurial-scm.org/D4881
Martin von Zweigbergk <martinvonz@google.com> [Thu, 04 Oct 2018 10:30:05 -0700] rev 40062
context: drop incorrect and superfluous docstring
It's been incorrect at least since 8b86acc7aa64 (context: drop support
for looking up context by ambiguous changeid (API), 2018-04-28).
Differential Revision: https://phab.mercurial-scm.org/D4880
Augie Fackler <raf@durin42.com> [Thu, 04 Oct 2018 21:35:12 -0400] rev 40061
remotenames: follow-up on D3639 to make revset funcs take only one arg
Per the review discussion on D3639, we want this to just take one
argument. That ended up simplifying the code, so I'm sharing this as a
follow-up to that revision rather than editing in-flight.
Pulkit Goyal <7895pulkit@gmail.com> [Thu, 12 Jul 2018 03:12:09 +0530] rev 40060
remotenames: add names argument to remotenames revset
This patch adds names argument to the revsets provided by the remotenames
extension. The revsets are remotenames(), remotebranches() and
remotebookmarks(). names can be a single name or a list of names, or it can be
omitted, which makes it an optional argument.
If names is/are passed, changesets which have those remotenames will be
returned.
If names are not passed, changesets from all the remotenames are shown.
Passing an invalid remotename does not throw an error.
The name argument also supports pattern matching.
Tests are added for the argument in tests/test-logexchange.t
Differential Revision: https://phab.mercurial-scm.org/D3639
Boris Feld <boris.feld@octobus.net> [Fri, 07 Sep 2018 11:43:48 -0400] rev 40059
copies: add time information to the debug information
Boris Feld <boris.feld@octobus.net> [Fri, 07 Sep 2018 11:16:06 -0400] rev 40058
copies: add a devel debug mode to trace what copy tracing does
Mercurial can spend a lot of time finding renames between two commits. Having
more information about that process helps to understand what makes it slow in
an individual instance (e.g. many files vs. one file).
Gregory Szorc <gregory.szorc@gmail.com> [Tue, 02 Oct 2018 17:34:34 -0700] rev 40057
revlog: rewrite censoring logic
I was able to corrupt a revlog relatively easily with the existing
censoring code. The underlying problem is that the existing code
doesn't fully take delta chains into account. When copying revisions
that occur after the censored revision, the delta base can refer
to a censored revision. Then at read time, things blow up due to the
revision data not being a compressed delta.
This commit rewrites the revlog censoring code to take a higher-level
approach. We now create a new revlog instance pointing at temp files.
We iterate through each revision in the source revlog and insert
those revisions into the new revlog, replacing the censored revision's
data along the way.
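A toy model of the problem and of the rewrite approach, under heavy
simplification: ToyLog, its suffix-only deltas, TOMBSTONE and
censor_by_rewrite() are all hypothetical and are not the revlog
implementation. It only shows why replacing stored data in place breaks later
deltas, and why re-inserting resolved full texts into a fresh store does not.

```python
TOMBSTONE = b"<censored>"


class ToyLog:
    """Toy append-only store; each entry is a full text or a delta.

    A delta here is just (base_rev, suffix): the text is the base's text
    plus the suffix. Real revlog deltas are more general.
    """

    def __init__(self):
        self.entries = []

    def add(self, text):
        # Store a delta against the previous revision when the new text
        # extends it; otherwise store the full text.
        if self.entries:
            base = len(self.entries) - 1
            basetext = self.text(base)
            if text.startswith(basetext):
                self.entries.append(("delta", base, text[len(basetext):]))
                return
        self.entries.append(("full", text))

    def text(self, rev):
        entry = self.entries[rev]
        if entry[0] == "full":
            return entry[1]
        _kind, base, suffix = entry
        return self.text(base) + suffix


def censor_by_rewrite(source, censored_rev):
    """Build a new log, swapping in a tombstone for one revision."""
    dest = ToyLog()
    for rev in range(len(source.entries)):
        if rev == censored_rev:
            dest.add(TOMBSTONE)
        else:
            # Resolve the full text from the source first, so no delta in
            # the new log can depend on the censored data.
            dest.add(source.text(rev))
    return dest


log = ToyLog()
log.add(b"line1\n")
log.add(b"line1\nline2\n")      # stored as a delta against revision 0

# Naive in-place censoring: revision 1 now resolves against the tombstone.
broken = ToyLog()
broken.entries = list(log.entries)
broken.entries[0] = ("full", TOMBSTONE)

clean = censor_by_rewrite(log, 0)
```

In this toy, broken.text(1) no longer equals the original revision 1, while
clean.text(1) does.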
The new implementation isn't as efficient as the old one. This is
because it will fully engage delta computation on insertion. But I
don't think it matters.
The new implementation is a bit hacky because it attempts to reload
the revlog instance with a new revlog index/data file. This is fragile.
But this is needed because the index (which could be backed by C) would
have a cached copy of the old, possibly changed data and that could
lead to problems accessing index or revision data later.
One benefit of the new approach is that we integrate with the
transaction. The old revlog is backed up and if the transaction is
rolled back, the original revlog is restored.
As part of this, we had to teach the transaction about the store
vfs. I'm not super keen about this. But this was the easiest way
to hook things up to the transaction. We /could/ just ignore the
transaction like we were doing before. But any file mutation should
be governed by transaction semantics, including undo during rollback.
Differential Revision: https://phab.mercurial-scm.org/D4869
Gregory Szorc <gregory.szorc@gmail.com> [Tue, 02 Oct 2018 17:28:54 -0700] rev 40056
revlog: move loading of index data into own method
This will allow us to "reload" a revlog instance from a rewritten
index file, which will be used in a subsequent commit.
Differential Revision: https://phab.mercurial-scm.org/D4868
Gregory Szorc <gregory.szorc@gmail.com> [Wed, 03 Oct 2018 10:57:35 -0700] rev 40055
revlog: clear revision cache on hash verification failure
The revision cache is populated after raw revision fulltext is
retrieved but before hash verification. If hash verification
fails, the revision cache will be populated and subsequent
operations to retrieve the invalid fulltext may return the cached
fulltext instead of raising.
This commit changes hash verification so it will invalidate the
revision cache if the cached node fails hash verification. The
side-effect is that subsequent operations to request the revision
text - even the raw revision text - will always fail.
The new behavior is consistent and is definitely less wrong. There
is an open question of whether revision(raw=True) should validate
hashes. But I'm going to punt on this problem. We can always change
behavior later. And to be honest, I'm not sure we should expose
raw=True on the storage interface at all. Another day...
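A minimal sketch of the caching behavior described above. ToyStore and
IntegrityError are hypothetical, and the hash here is a plain SHA-1 of the
text (an assumption of this sketch, not how the real node hash is computed).
The point is only the ordering: the cache is filled before verification, so
a failed check has to evict the entry.

```python
import hashlib


class IntegrityError(Exception):
    """Hypothetical error for a failed hash check."""


class ToyStore:
    def __init__(self, data):
        self._data = data              # node -> stored text
        self._revisioncache = None     # (node, text) or None

    def revision(self, node):
        cached = self._revisioncache
        if cached is not None and cached[0] == node:
            return cached[1]
        text = self._data[node]
        # The cache is populated before the hash is verified.
        self._revisioncache = (node, text)
        self._checkhash(node, text)
        return text

    def _checkhash(self, node, text):
        if hashlib.sha1(text).hexdigest() != node:
            cached = self._revisioncache
            # Evict the bad entry so later reads fail again instead of
            # silently returning the cached, unverified text.
            if cached is not None and cached[0] == node:
                self._revisioncache = None
            raise IntegrityError("integrity check failed for %s" % node)


node = hashlib.sha1(b"expected text").hexdigest()
store = ToyStore({node: b"tampered text"})

failures = 0
for _attempt in range(2):
    try:
        store.revision(node)
    except IntegrityError:
        failures += 1
```

Without the eviction in _checkhash(), the second call in this sketch would
return the tampered text from the cache instead of raising.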
Differential Revision: https://phab.mercurial-scm.org/D4867
Augie Fackler <augie@google.com> [Thu, 06 Sep 2018 02:36:25 -0400] rev 40054
fuzz: new fuzzer for cext/manifest.c
This is a bit messy, because lazymanifest is tightly coupled to the
cpython API for performance reasons. As a result, we have to build a
whole Python without pymalloc (so ASAN can help us out) and link
against that. Then we have to use an embedded Python interpreter. We
could manually drive the lazymanifest in C from that point, but
experimentally just using PyEval_EvalCode isn't really any slower so
we may as well do that and write the innermost guts of the fuzzer in
Python.
Leak detection is currently disabled for this fuzzer because there are
a few global-lifetime things in our extensions that we more or less
intentionally leak and I didn't want to take the detour to work around
that for now.
This should not be pushed to our repo until
https://github.com/google/oss-fuzz/pull/1853 is merged, as this
depends on having the Python tarball around.
Differential Revision: https://phab.mercurial-scm.org/D4879
Gregory Szorc <gregory.szorc@gmail.com> [Wed, 03 Oct 2018 10:32:21 -0700] rev 40053
revlog: rename _cache to _revisioncache
"cache" is generic and revlog instances have multiple caches. Let's
be descriptive about what this is a cache for.
Differential Revision: https://phab.mercurial-scm.org/D4866
Gregory Szorc <gregory.szorc@gmail.com> [Wed, 03 Oct 2018 10:56:48 -0700] rev 40052
testing: add file storage integration for bad hashes and censoring
In order to implement these tests, we need a backdoor to write data
into storage backends while bypassing normal checks. We invent a
callable to do that.
As part of writing the tests, I found a bug with censorrevision()
pretty quickly! After calling censorrevision(), attempting to
access revision data for an affected node raises a cryptic error
related to malformed compression. This appears to be due to the
revlog not adjusting delta chains as part of censoring.
I also found a bug with regards to hash verification and revision
fulltext caching. Essentially, we cache the fulltext before hash
verification. If we look up the fulltext after a failed hash
verification, we don't get a hash verification exception. Furthermore,
the behavior of revision(raw=True) can be inconsistent depending on
the order of operations.
I'll be fixing both these bugs in subsequent commits.
Differential Revision: https://phab.mercurial-scm.org/D4865
Gregory Szorc <gregory.szorc@gmail.com> [Wed, 03 Oct 2018 10:03:41 -0700] rev 40051
testing: add file storage tests for getstrippoint() and strip()
Differential Revision: https://phab.mercurial-scm.org/D4864
Gregory Szorc <gregory.szorc@gmail.com> [Wed, 03 Oct 2018 10:04:04 -0700] rev 40050
wireprotov2: always advertise raw repo requirements
I'm pretty sure my original thinking behind making it conditional
on stream clone support was that the behavior mirrored wire protocol
version 1.
I don't see a compelling reason for us to not advertise the server's
storage requirements. The proper way to advertise stream clone support
in wireprotov2 would be to not advertise the command(s) required to
perform stream clone or to advertise a separate capability denoting
stream clone support.
Stream clone isn't yet implemented on wireprotov2, so we can cross
this bridge later.
Differential Revision: https://phab.mercurial-scm.org/D4863
Gregory Szorc <gregory.szorc@gmail.com> [Wed, 03 Oct 2018 09:48:22 -0700] rev 40049
tests: don't be as verbose in wireprotov2 tests
I don't think that printing low-level I/O and frames is beneficial to
testing command-level functionality. Protocol-level testing, yes. But
command-level functionality shouldn't care about low-level details in
most cases. This output makes tests more verbose and harder to read.
It also makes them harder to maintain, as you need to glob over various
dynamic width fields.
Let's remove these low-level details from many of the wireprotov2
tests.
Differential Revision: https://phab.mercurial-scm.org/D4861
Gregory Szorc <gregory.szorc@gmail.com> [Wed, 03 Oct 2018 12:57:01 -0700] rev 40048
repository: define and use revision flag constants
Revlogs have a per-revision 2 byte field holding integer flags that
define how revision data should be interpreted. For historical reasons,
these integer values are sent verbatim on the wire protocol as part of
changegroup data.
From a semantic standpoint, the flags that go out over the wire are
different from the flags stored internally by revlogs. Failure to
establish this semantic distinction creates unwanted strong coupling
between revlog's internals and the wire protocol.
This commit establishes new constants on the repository module that
define the revision flags used by the wire protocol (and by some
internal storage APIs, sadly). The changegroups internals documentation
has been updated to document them explicitly. Various references
throughout the repo now use the repository constants instead of the
revlog constants. This is done to make it clear that we're operating
on generic revision data and this isn't tied to revlogs.
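The separation can be pictured with a small sketch. The constant names and bit
positions below are illustrative assumptions, not taken from this log; the
only facts used from the message are that the flags live in a 2 byte integer
field and that wire-level code should refer to its own constants rather than
to storage internals.

```python
# Hypothetical wire-level constants (names and bit positions are
# illustrative only).
WIRE_FLAG_CENSORED = 1 << 15
WIRE_FLAG_ELLIPSIS = 1 << 14
WIRE_FLAGS_KNOWN = WIRE_FLAG_CENSORED | WIRE_FLAG_ELLIPSIS


def decode_wire_flags(value):
    """Decode a 16-bit wire flag field into named booleans."""
    if not 0 <= value <= 0xFFFF:
        raise ValueError("flags must fit in 2 bytes: %r" % (value,))
    unknown = value & ~WIRE_FLAGS_KNOWN
    if unknown:
        raise ValueError("unknown flag bits: 0x%04x" % unknown)
    return {
        "censored": bool(value & WIRE_FLAG_CENSORED),
        "ellipsis": bool(value & WIRE_FLAG_ELLIPSIS),
    }


decoded = decode_wire_flags(WIRE_FLAG_CENSORED)
```

A storage backend could keep its own private constants and translate at the
boundary, which is the decoupling the message argues for.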
Differential Revision: https://phab.mercurial-scm.org/D4860
Boris Feld <boris.feld@octobus.net> [Thu, 04 Oct 2018 01:22:25 +0200] rev 40047
context: reverse conditional branch order in introrev
Positive logic will be simpler to follow. It will also help to clarify the
upcoming refactoring.