Gregory Szorc <gregory.szorc@gmail.com> [Sun, 18 Dec 2016 17:02:57 -0800] rev 30778
revlog: add clone method
Upcoming patches will introduce functionality for in-place
repository/store "upgrades." Copying the contents of a revlog
feels sufficiently low-level to warrant being in the revlog
class. So this commit implements that functionality.
Because full delta recomputation can be *very* expensive (we're
talking several hours on the Firefox repository), we support
multiple modes of execution with regards to delta (re)use. This
will allow repository upgrades to choose the "level" of
processing/optimization they wish to perform when converting
revlogs.
It's not obvious from this commit, but "addrevisioncb" will be
used for progress reporting.
Gregory Szorc <gregory.szorc@gmail.com> [Sun, 18 Dec 2016 16:59:04 -0800] rev 30777
repair: begin implementation of in-place upgrading
Now that all the upgrade planning work is in place, we can start
doing the real work: actually upgrading a repository.
The main goal of this commit is to get the "framework" for running
in-place upgrade actions in place.
Rather than get too clever and low-level with regards to in-place
upgrades, our strategy is to create a new, temporary repository,
copy data to it, then replace the old data with the new. This allows
us to reuse a lot of code in localrepo.py around store interaction,
which will eventually consume the bulk of the upgrade code.
But we have to start small. This patch implements adding new
repository requirements. But it still sets up a temporary
repository and locks it and the source repo before performing the
requirements file swap. This means all the plumbing is in place
to implement store copying in subsequent commits.
Gregory Szorc <gregory.szorc@gmail.com> [Sun, 18 Dec 2016 16:51:09 -0800] rev 30776
repair: determine what upgrade will do
This commit introduces code for determining what actions/improvements
an upgrade should perform.
The "upgradefindimprovements" function introduces a mechanism to
return a list of improvements that can be made to a repository.
Each improvement is effectively an action that an upgrade will
perform. Associated with each of these improvements is metadata
that will be used to inform users what's wrong and what an
upgrade will do.
Each "improvement" is categorized as a "deficiency" or an
"optimization." TBH, I'm not thrilled about the terminology and
am receptive to constructive bikeshedding. The main difference
between a "deficiency" and an "optimization" is a deficiency
is always corrected (if it deviates from the current config) and
an "optimization" is an optional action that goes above and beyond
to improve the state of the repository (usually by requiring more
CPU during upgrade).
Our initial set of improvements identifies missing repository
requirements, a single, easily correctable problem with
changelog storage, and a set of "optimizations" related to delta
recalculation.
The main "upgraderepo" function has been expanded to handle
improvements. It queries for the list of improvements and determines
which of them will run based on the current repository state and user
I went through numerous iterations of the output format before
settling on a ReST-inspired definition list format. (I used
bulleted lists in the first submission of this commit and could
not get it to format just right.) Even with the various iterations,
I'm still not super thrilled with the format. But, this is a debug*
command, so that should mean we can refine the output without BC
concerns.
Gregory Szorc <gregory.szorc@gmail.com> [Sun, 18 Dec 2016 16:16:54 -0800] rev 30775
repair: implement requirements checking for upgrades
This commit introduces functionality for upgrading a repository in
place. The first part that's implemented is testing for upgrade
"compatibility." This is done by examining repository requirements.
There are 5 functions returning sets of requirements that control
upgrading. Why so many functions? Mainly to support extensions.
Functions are easier to monkeypatch than module variables.
Astute readers will see that we don't support "manifestv2" and
"treemanifest" requirements in the upgrade mechanism. I don't have
a great answer for why other than this is a complex set of patches
and I don't want to deal with the complexity of these experimental
features just yet. We can teach the upgrade mechanism about them
later, once the basic upgrade mechanism is in place.
This commit also introduces the "upgraderepo" function. This will be
our main routine for performing an in-place upgrade. Currently, it
just implements requirements checking. The structure of some code in
this function may look a bit weird (e.g. the inline function that is
only called once). But this will make sense after future commits.
Gregory Szorc <gregory.szorc@gmail.com> [Thu, 24 Nov 2016 16:24:09 -0800] rev 30774
debugcommands: stub for debugupgraderepo command
Currently, if Mercurial introduces a new repository/store feature or
changes behavior of an existing feature, users must perform an
`hg clone` to create a new repository with hopefully the
correct/optimal settings. Unfortunately, even `hg clone` may not
give the correct results. For example, if you do a local `hg clone`,
you may get hardlinks to revlog files that inherit the old state.
If you `hg clone` from a remote or `hg clone --pull`, changegroup
application may bypass some optimization, such as converting to
generaldelta.
Optimizing a repository is harder than it seems and requires more
than a simple `hg` command invocation.
This commit starts the process of changing that. We introduce
`hg debugupgraderepo`, a command that performs an in-place upgrade
of a repository to use new, optimal features. The command is just
a stub right now. Features will be added in subsequent commits.
This commit does foreshadow some of the behavior of the new command,
notably that it doesn't do anything by default and that it takes
arguments that influence what actions it performs. These will be
explained more in subsequent commits.
Matt Harbison <matt_harbison@yahoo.com> [Wed, 11 Jan 2017 21:47:19 -0500] rev 30773
util: teach stringmatcher to handle forced case insensitive matches
The 'author' and 'desc' revsets are documented to be case insensitive.
Unfortunately, this was implemented in 'author' by forcing the input to
lowercase, including for regex like '\B'. (This actually inverts the meaning of
the sequence.) For backward compatibility, we will keep that a case insensitive
regex, but by using matcher options instead of brute force.
This doesn't preclude future hypothetical 'icase-literal:' style prefixes that
can be provided by the user. Such user specified cases can probably be handled
up front by stripping 'icase-', setting the variable, and letting it drop
through the existing code.
Matt Harbison <matt_harbison@yahoo.com> [Wed, 11 Jan 2017 23:13:51 -0500] rev 30772
revset: point to 'grep' in the 'keyword' help for regex searches
The help for 'grep' already points to 'keyword'.
Martin von Zweigbergk <martinvonz@google.com> [Wed, 11 Jan 2017 23:13:00 -0800] rev 30771
help: explain that revsets can be used where 1 or 2 revs are wanted
We did not seem to document that one can do things like "hg up :@"
where the last revision of the revset ":@".
Martin von Zweigbergk <martinvonz@google.com> [Wed, 11 Jan 2017 22:46:07 -0800] rev 30770
help: explain what the term "revset" means
We refer to revsets in a few places (e.g. in "hg help config"), but we
never explained what they are. Until now.
Martin von Zweigbergk <martinvonz@google.com> [Wed, 11 Jan 2017 11:37:38 -0800] rev 30769
help: merge revsets.txt into revisions.txt
Selecting single and multiple revisions is closely related, so let's
put it in one place, so users can easily find it. We actually did not
even point to "hg help revsets" from "hg help revisions", but now that
they're on a single page, that won't be necessary.
Martin von Zweigbergk <martinvonz@google.com> [Wed, 11 Jan 2017 11:40:40 -0800] rev 30768
tests: use `hg help dates` instead of `hg help revs` in test
The revisions help is already long and will get longer, so switch to
another short and stable topic.
Martin von Zweigbergk <martinvonz@google.com> [Wed, 11 Jan 2017 11:28:54 -0800] rev 30767
help: use a single paragraph to describe full and abbreviated nodeids
The texts describing 40-digit strings and the abbreviated form are
closely related, so make it a single paragraph.
Gregory Szorc <gregory.szorc@gmail.com> [Tue, 10 Jan 2017 23:37:08 -0800] rev 30766
hgweb: support Content Security Policy
Content-Security-Policy (CSP) is a web security feature that allows
servers to declare what loaded content is allowed to do. For example,
a policy can prevent loading of images, JavaScript, CSS, etc unless
the source of that content is whitelisted (by hostname, URI scheme,
hashes of content, etc). It's a nifty security feature that provides
extra mitigation against some attacks, notably XSS.
Mitigation against these attacks is important for Mercurial because
hgweb renders repository data, which is commonly untrusted. While we
make attempts to escape things, etc, there's the possibility that
malicious data could be injected into the site content. If this happens
today, the full power of the web browser is available to that
malicious content. A restrictive CSP policy (defined by the server
operator and sent in an HTTP header which is outside the control of
malicious content), could restrict browser capabilities and mitigate
security problems posed by malicious data.
CSP works by emitting an HTTP header declaring the policy that browsers
should apply. Ideally, this header would be emitted by a layer above
Mercurial (likely the HTTP server doing the WSGI "proxying"). This
works for some CSP policies, but not all.
For example, policies to allow inline JavaScript may require setting
a "nonce" attribute on <script>. This attribute value must be unique
and non-guessable. And, the value must be present in the HTTP header
and the HTML body. This means that coordinating the value between
Mercurial and another HTTP server could be difficult: it is much
easier to generate and emit the nonce in a central location.
This commit introduces support for emitting a
Content-Security-Policy header from hgweb. A config option defines
the header value. If present, the header is emitted. A special
"%nonce%" syntax in the value triggers generation of a nonce and
inclusion in <script> elements in templates. The inclusion of a
nonce does not occur unless "%nonce%" is present. This makes this
commit completely backwards compatible and the feature opt-in.
The nonce is a type 4 UUID, which is the flavor that is randomly
generated. It has 122 random bits, which should be plenty to satisfy
the guarantees of a nonce.
Gregory Szorc <gregory.szorc@gmail.com> [Tue, 10 Jan 2017 20:47:48 -0800] rev 30765
hgweb: call process_dates() via DOM event listener
All the hgweb templates include mercurial.js in their header. All
the hgweb templates have the same <script> boilerplate to run
process_dates(). This patch factors that function call into
mercurial.js as part of a DOMContentLoaded event listener.
Gregory Szorc <gregory.szorc@gmail.com> [Sat, 24 Dec 2016 15:29:32 -0700] rev 30764
protocol: send application/mercurial-0.2 responses to capable clients
With this commit, the HTTP transport now parses the X-HgProto-<N>
header to determine what media type and compression engine to use for
responses. So far, we only compress responses that are already being
compressed with zlib today (stream response types to specific
commands). We can expand things to cover additional response types
later.
The practical side-effect of this commit is that non-zlib compression
engines will be used if both ends support them. This means if both
ends have zstd support, zstd - not zlib - will be used to compress
data!
When cloning the mozilla-unified repository between a local HTTP
server and client, the benefits of non-zlib compression are quite
noticeable:
engine server CPU (s) client CPU (s) bundle size
zlib (l=6) 174.1 283.2 1,148,547,026
zstd (l=1) 99.2 267.3 1,127,513,841
zstd (l=3) 103.1 266.9 1,018,861,363
zstd (l=7) 128.3 269.7 919,190,278
zstd (l=10) 162.0 - 894,547,179
none 95.3 277.2 4,097,566,064
The default zstd compression level is 3. So if you deploy zstd
capable Mercurial to your clients and servers and CPU time on
your server is dominated by "getbundle" requests (clients cloning
and pulling) - and my experience at Mozilla tells me this is often
the case - this commit could drastically reduce your server-side
CPU usage *and* save on bandwidth costs!
Another benefit of this change is that server operators can install
*any* compression engine. While it isn't enabled by default, the
"none" compression engine can now be used to disable wire protocol
compression completely. Previously, commands like "getbundle" always
zlib compressed output, adding considerable overhead to generating
responses. If you are on a high speed network and your server is under
high load, it might be advantageous to trade bandwidth for CPU.
Although, zstd at level 1 doesn't use that much CPU, so I'm not
convinced that disabling compression wholesale is worthwhile. And, my
data seems to indicate a slow down on the client without compression.
I suspect this is due to a lack of buffering resulting in an increase
in socket read() calls and/or the fact we're transferring an extra 3 GB
of data (parsing HTTP chunked transfer and processing extra TCP packets
can add up). This is definitely worth investigating and optimizing. But
since the "none" compressor isn't enabled by default, I'm inclined to
punt on this issue.
This commit introduces tons of tests. Some of these should arguably
have been implemented on previous commits. But it was difficult to
test without the server functionality in place.
Gregory Szorc <gregory.szorc@gmail.com> [Sat, 24 Dec 2016 15:22:18 -0700] rev 30763
httppeer: advertise and support application/mercurial-0.2
Now that servers expose a capability indicating they support
application/mercurial-0.2 and compression, clients can key off
this to say they support responses that are compressed with
various compression formats.
After this commit, the HTTP wire protocol client now sends an
"X-HgProto-<N>" request header indicating its support for
"application/mercurial-0.2" media type and various compression
formats.
This commit also implements support for handling
"application/mercurial-0.2" responses. It simply reads the header
compression engine identifier then routes the remainder of the
response to the appropriate decompressor.
There were some test changes, but only to logging. That points to
an obvious gap in our test coverage. This will be addressed in a
subsequent commit once server support is in place (it is hard to
test without server support).
Gregory Szorc <gregory.szorc@gmail.com> [Sat, 24 Dec 2016 15:21:46 -0700] rev 30762
wireproto: advertise supported media types and compression formats
This commit introduces support for advertising a server's support for
media types and compression formats in accordance with the spec defined
in internals.wireproto.
The bulk of the new code is a helper function in wireproto.py to
obtain a prioritized list of compression engines available to the
wire protocol. While not utilized yet, we implement support
for obtaining the list of compression engines advertised by the
client.
The upcoming HTTP protocol enhancements are a bit lower-level than
existing tests (most existing tests are command centric). So,
this commit establishes a new test file that will be appropriate
for holding tests around the functionality of the HTTP protocol
itself.
Rounding out this change, `hg debuginstall` now prints compression
engines available to the server.
Gregory Szorc <gregory.szorc@gmail.com> [Sat, 24 Dec 2016 13:51:12 -0700] rev 30761
util: declare wire protocol support of compression engines
This patch implements a new compression engine API allowing
compression engines to declare support for the wire protocol.
Support is declared by returning a compression format string
identifier that will be added to payloads to signal the compression
type of data that follows and default integer priorities of the
engine.
Accessor methods have been added to the compression engine manager
class to facilitate use.
Note that the "none" and "bz2" engines declare wire protocol support
but aren't enabled by default due to their priorities being 0. It
is essentially free from a coding perspective to support these
compression formats, so we do it in case anyone may derive use from
it.
Gregory Szorc <gregory.szorc@gmail.com> [Sat, 24 Dec 2016 13:56:36 -0700] rev 30760
internals: document compression negotiation
As part of adding zstd support to all of the things, we'll need
to teach the wire protocol to support non-zlib compression formats.
This commit documents how we'll implement that.
To understand how we arrived at this proposal, let's look at how
things are done today.
The wire protocol today doesn't have a unified format. Instead,
there is a limited facility for differentiating replies as successful
or not. And, each command essentially defines its own response format.
A significant deficiency in the current protocol is the lack of
payload framing over the SSH transport. In the HTTP transport,
chunked transfer is used and the end of an HTTP response body (and
the end of a Mercurial command response) can be identified by a 0
length chunk. This is how HTTP chunked transfer works. But in the
SSH transport, there is no such framing, at least for certain
responses (notably the response to "getbundle" requests). Clients
can't simply read until end of stream because the socket is
persistent and reused for multiple requests. Clients need to know
when they've encountered the end of a request but there is nothing
simple for them to key off of to detect this. So what happens is
the client must decode the payload (as opposed to being dumb and
forwarding frames/packets). This means the payload itself needs
to support identifying end of stream. In some cases (bundle2), it
also means the payload can encode "error" or "interrupt" events
telling the client to e.g. abort processing. The lack of framing
on the SSH transport and the transfer of its responsibilities to
e.g. bundle2 is a massive layering violation and a wart on the
protocol architecture. It needs to be fixed someday by inventing a
proper framing protocol.
So about compression.
The client transport abstractions have a "_callcompressable()"
API. This API is called to invoke a remote command that will
send a compressible response. The response is essentially a
"streaming" response (no framing data at the Mercurial layer)
that is fed into a decompressor.
On the HTTP transport, the decompressor is zlib and only zlib.
There is currently no mechanism for the client to specify an
alternate compression format. And, clients don't advertise what
compression formats they support or ask the server to send a
specific compression format. Instead, it is assumed that non-error
responses to "compressible" commands are zlib compressed.
On the SSH transport, there is no compression at the Mercurial
protocol layer. Instead, compression must be handled by SSH
itself (e.g. `ssh -C`) or within the payload data (e.g. bundle
compression).
For the HTTP transport, adding new compression formats is pretty
straightforward. Once you know what decompressor to use, you can
stream data into the decompressor until you reach a 0 size HTTP
chunk, at which point you are at end of stream.
So our wire protocol changes for the HTTP transport are pretty
straightforward: the client and server advertise what compression
formats they support and an appropriate compression format is
chosen. We introduce a new HTTP media type to hold compressed
payloads. The header of the payload defines the compression format
being used. Whoever is on the receiving end can sniff the first few
bytes route to an appropriate decompressor.
Support for multiple compression formats is advertised on both
server and client. The server advertises a "compression" capability
saying which compression formats it supports and in what order they
are preferred. Clients advertise their support for multiple
compression formats and media types via the introduced "X-HgProto"
request header.
Strictly speaking, servers don't need to advertise which compression
formats they support. But doing so allows clients to fail fast if
they don't support any of the formats the server does. This is useful
in situations like sending bundles, where the client may have to
perform expensive computation before sending data to the server.
Rather than simply advertise a list of supported compression formats,
we introduce an additional "httpmediatype" server capability
advertising which media types the server supports. This means servers
are explicit about what formats they exchange. IMO, this is superior
to inferring support from other capabilities (like "compression").
By advertising compression support on each request in the "X-HgProto"
header and media type and direction at the server level, we are able
to gradually transition existing commands/responses to the new media
type and possibly compression. Contrast with the old world, where we
only supported a single media type and the use of compression was
built-in to the semantics of the command on both client and server.
In the new world, if "application/mercurial-0.2" is supported,
compression is supported. It's that simple.
It's worth noting that we explicitly don't use "Accept,"
"Accept-Encoding," "Content-Encoding," or "Transfer-Encoding" for
content negotiation and compression. People knowledgeable of the HTTP
specifications will say that we should use these because that's
what they are designed to be used for. They have a point and I
sympathize with the argument. Earlier versions of this commit even
defined supported media types in the "Accept" header. However, my
years of experience rolling out services leveraging HTTP has taught
me to not trust the HTTP layer, especially if you are going outside
the normal spec (such as using a custom "Content-Encoding" value to
represent zstd streams). I've seen load balancers, proxies, and other
network devices do very bad and unexpected things to HTTP messages
(like insisting zlib compressed content is decoded and then re-encoded
at a different compression level or even stripping compression
completely). I've found that the best way to avoid surprises when
writing protocols on top of HTTP is to use HTTP as a dumb transport as
much as possible to minimize the chances that an "intelligent" agent
between endpoints will muck with your data. While the widespread use of
TLS is mitigating many intermediate network agents interfering with
HTTP, there are still problems at the edges, with e.g. the origin HTTP
server needing to convert HTTP to and from WSGI and buggy or
feature-lacking HTTP client implementations. I've found the best way to
avoid these problems is to avoid using headers like "Content-Encoding"
and to bake as much logic as possible into media types and HTTP message
bodies. The protocol changes in this commit do rely on a custom HTTP
request header and the "Content-Type" headers. But we used them before,
so we shouldn't be increasing our exposure to "bad" HTTP agents.
For the SSH transport, we can't easily implement content negotiation
to determine compression formats because the SSH transport has no
content negotiation capabilities today. And without a framing protocol,
we don't know how much data to feed into a decompressor. So in order
to implement compression support on the SSH transport, we'd need to
invent a mechanism to represent content types and an outer framing
protocol to stream data robustly. While I'm fully capable of doing
that, it is a lot of work and not something that should be undertaken
lightly. My opinion is that if we're going to change the SSH transport
protocol, we should take a long hard look at implementing a grand
unified protocol that attempts to address all the deficiencies with
the existing protocol. While I want this to happen, that would be
massive scope bloat standing in the way of zstd support. So, I've
decided to take the easy solution: the SSH transport will not gain
support for multiple compression formats. Keep in mind it doesn't
support *any* compression today. So essentially nothing is changing
on the SSH front.
Gregory Szorc <gregory.szorc@gmail.com> [Sat, 24 Dec 2016 14:46:02 -0700] rev 30759
httppeer: extract code for HTTP header spanning
A second consumer of HTTP header spanning will soon be introduced.
Factor out the code to do this so it can be reused.
Gregory Szorc <gregory.szorc@gmail.com> [Tue, 10 Jan 2017 11:20:32 -0800] rev 30758
commands: config option to control bundle compression level
Currently, bundle compression uses the default compression level
for the active compression engine. The default compression level
is tuned as a compromise between speed and size.
Some scenarios may call for a different compression level. For
example, with clone bundles, bundles are generated once and used
several times. Since the cost to generate is paid infrequently,
server operators may wish to trade extra CPU time for better
compression ratios.
This patch introduces an experimental and undocumented config
option to control the bundle compression level. As the inline
comment says, this approach is a bit hacky. I'd prefer for
the compression level to be encoded in the bundle spec. e.g.
"zstd-v2;complevel=15." However, given that the 4.1 freeze is
imminent, I'm not comfortable implementing this user-facing
change without much time to test and consider the implications.
So, we're going with the quick and dirty solution for now.
Having this option in the 4.1 release will enable Mozilla to
easily produce and test zlib and zstd bundles with non-default
compression levels in production. This will help drive future
development of the feature and zstd integration with Mercurial.
Gregory Szorc <gregory.szorc@gmail.com> [Tue, 10 Jan 2017 11:19:37 -0800] rev 30757
bundle2: allow compression options to be passed to compressor
Compression engines allow options to be passed to them to control
behavior. This patch exposes an argument to bundle2.writebundle()
that passes options to the compression engine when writing compressed
bundles. The argument is honored for both bundle1 and bundle2, the
latter requiring a bit of plumbing to pass the value around.
Jun Wu <quark@fb.com> [Wed, 11 Jan 2017 23:39:24 +0800] rev 30756
chg: check snprintf result strictly
This makes the program more robust when somebody changes hgclient's
maxdatasize in the future.
Valters Vingolds <valters@vingolds.ch> [Tue, 10 Jan 2017 09:32:27 +0100] rev 30755
rebase: provide detailed hint to abort message if working dir is not clean
Detailed hint message is now provided when 'pull --rebase' operation detects
unclean working dir, for example:
abort: uncommitted changes
(cannot pull with rebase: please commit or shelve your changes first)
Added tests for uncommitted merge, and for subrepo support verifying that same
hint is also passed to subrepo state check.
Yuya Nishihara <yuya@tcha.org> [Mon, 09 Jan 2017 16:02:56 +0900] rev 30754
revset: parse variable-length arguments of followlines() by getargsdict()
Yuya Nishihara <yuya@tcha.org> [Mon, 09 Jan 2017 15:25:52 +0900] rev 30753
parser: extend buildargsdict() to support variable-length positional args
This can simplify the argument parsing of followlines(). Tests are added by
the next patch.
Yuya Nishihara <yuya@tcha.org> [Mon, 09 Jan 2017 15:15:21 +0900] rev 30752
parser: make buildargsdict() precompute position where keyword args start
This prepares for adding *varargs support. See the next patch.
Jun Wu <quark@fb.com> [Wed, 11 Jan 2017 07:40:52 +0800] rev 30751
chg: change server's process title
This patch uses the newly introduced "setprocname" interface to update the
process title server-side, to make it easier to tell what a worker is actually
doing.
The new title is "chg[worker/$PID]", where PID is the process ID of the
connected client. It can be directly observed using "ps -AF" under Linux, or
"ps -A" under FreeBSD.
Jun Wu <quark@fb.com> [Wed, 11 Jan 2017 07:36:48 +0800] rev 30750
chgserver: add the setprocname interface
This allows clients to change its process title freely.
Anton Shestakov <av6@dwimlabs.net> [Tue, 10 Jan 2017 23:41:58 +0800] rev 30749
hgweb: use archivespecs for links on repo index page too
Moving archivespecs to the module level allows using it from other modules
(such as hgwebdir_mod), and keeping a reference to it in requestcontext allows
current code to just work.
Anton Shestakov <av6@dwimlabs.net> [Tue, 10 Jan 2017 23:34:39 +0800] rev 30748
hgweb: use util.sortdict for archivespecs
Thus we allow dict-like indexing and "in" checks, and also preserve the order
of archive types and can generate links in a certain order (so
requestcontext.archives is no longer needed).
Anton Shestakov <av6@dwimlabs.net> [Wed, 11 Jan 2017 01:25:07 +0800] rev 30747
hgweb: test the order of archive links
Remi Chaintron <remi@fb.com> [Thu, 05 Jan 2017 17:16:51 +0000] rev 30746
revlog: REVIDX_EXTSTORED flag
This flag will be used by the lfs extension to mark the revision data as stored
externally.
Remi Chaintron <remi@fb.com> [Tue, 10 Jan 2017 16:15:21 +0000] rev 30745
revlog: flag processor
Add the ability for revlog objects to process revision flags and apply
registered transforms on read/write operations.
This patch introduces:
- the 'revlog._processflags()' method that looks at revision flags and applies
flag processors registered on them. Due to the need to handle non-commutative
operations, flag transforms are applied in stable order but the order in which
the transforms are applied is reversed between read and write operations.
- the 'addflagprocessor()' method allowing to register processors on flags.
Flag processors are defined as a 3-tuple of (read, write, raw) functions to be
applied depending on the operation being performed.
- an update on 'revlog.addrevision()' behavior. The current flagprocessor design
relies on extensions to wrap around 'addrevision()' to set flags on revision
data, and on the flagprocessor to perform the actual transformation of its
contents. In the lfs case, this means we need to process flags before we meet
the 2GB size check, leading to performing some operations before it happens:
- if flags are set on the revision data, we assume some extensions might be
modifying the contents using the flag processor next, and we compute the
node for the original revision data (still allowing extension to override
the node by wrapping around 'addrevision()').
- we then invoke the flag processor to apply registered transforms (in lfs's
case, drastically reducing the size of large blobs).
- finally, we proceed with the 2GB size check.
Note: In the case a cachedelta is passed to 'addrevision()' and we detect the
flag processor modified the revision data, we chose to trust the flag processor
and drop the cachedelta.
Remi Chaintron <remi@fb.com> [Thu, 05 Jan 2017 17:16:07 +0000] rev 30744
revlog: pass revlog flags to addrevision
Adding the ability to passing flags to addrevision instead of simply passing
default flags to _addrevision will allow extensions relying on flag transforms
to wrap around addrevision() in order to update revlog flags.
The first use case of this patch will be the lfs extension marking nodes as
stored externally when the contents are larger than the defined threshold.
One of the reasons leading to setting flags in addrevision() wrappers in the
flag processor design is that it allows to detect files larger than the 2GB
limit before the check is performed, which allows lfs to transform the contents
into metadata.
Remi Chaintron <remi@fb.com> [Thu, 05 Jan 2017 17:16:07 +0000] rev 30743
revlog: add 'raw' argument to revision and _addrevision
This patch introduces a new 'raw' argument (defaults to False) to revlog's
revision() and _addrevision() methods.
When the 'raw' argument is set to True, it indicates the revision data should be
handled as raw data by the flagprocessor.
Note: Given revlog.addgroup() calls are restricted to changegroup generation, we
can always set raw to True when calling revlog._addrevision() from
revlog.addgroup().
Jun Wu <quark@fb.com> [Tue, 10 Jan 2017 06:59:49 +0800] rev 30742
pager: do not special case chg
Since chg has its own _runpager implementation, it's no longer necessary to
special-case chg in the pager extension. This will effectively enable the
new chg pager code path that runs inside runcommand.
Jun Wu <quark@fb.com> [Tue, 10 Jan 2017 06:59:39 +0800] rev 30741
chg: remove getpager support
We have enough bits to switch to the new chg pager code path in runcommand.
So just remove the legacy getpager support.
This is a red-only patch, and will break chg's pager support temporarily.
Jun Wu <quark@fb.com> [Tue, 10 Jan 2017 06:59:31 +0800] rev 30740
chgserver: implement chgui._runpager
This patch implements chgui._runpager in a relatively simple way. A more
clean way is to move the core logic of "attachio" to "ui", which will be
done later after chg runs uisetup per request.
Jun Wu <quark@fb.com> [Tue, 10 Jan 2017 06:59:21 +0800] rev 30739
chgserver: make S channel support pager request
This patch adds the "pager" support for the S channel. The pager API allows
running some subcommands, namely attachio, and waiting for the client to be
properly synchronized.
Jun Wu <quark@fb.com> [Tue, 10 Jan 2017 06:59:03 +0800] rev 30738
chg: handle pager request client-side
This patch implements the simple S-channel pager handling at chg
client-side.
Note: It does not deal with environ and cwd currently for simplicity, which
will be fixed later.
Jun Wu <quark@fb.com> [Tue, 10 Jan 2017 06:58:51 +0800] rev 30737
chgserver: use util.shellenviron
This avoids code duplication.
Jun Wu <quark@fb.com> [Tue, 10 Jan 2017 06:58:02 +0800] rev 30736
util: extract the logic calculating environment variables
The method will be reused in chgserver. Move it out so it can be reused.
Anton Shestakov <av6@dwimlabs.net> [Sun, 08 Jan 2017 00:52:54 +0800] rev 30735
hgweb: generate archive links in order
It would be nice for archive links to always be in a certain commonly used
order, such as 'zip', 'bz', 'gzip2'. Repo index page (hgwebdir_mod) already
shows archive links in this order, let's do the same in hgweb_mod.
Sadly, archivespecs is a regular unordered dict, and collections.OrderedDict is
new in 2.7. But requestcontext.archives is a tuple of archive types, so it can
be used as an index to archivespecs.
Anton Shestakov <av6@dwimlabs.net> [Sun, 08 Jan 2017 01:24:45 +0800] rev 30734
hgweb: use archivespecs (dict) instead of archives (tuple) for "in" check
Matt Harbison <matt_harbison@yahoo.com> [Sun, 08 Jan 2017 14:37:44 -0500] rev 30733
test-obsolete: stabilize output on platforms without 'serve' support
The conditional was updating the repository, which wasn't reflected in
subsequent logs on Windows, so the conditional is narrowed to just the serve
commands. The serve operation generates log files, so those are deleted to keep
the output of summary consistent.
Matt Harbison <matt_harbison@yahoo.com> [Sun, 08 Jan 2017 13:49:53 -0500] rev 30732
tests: update globs for Windows
The extra glob in test-command-template.t caused it to say no result was
reported. It used to be (within the past year), that both this and the missing
glob cases could be fixed simply by editing any output in the test, and
re-running it in interactive mode. But that no longer works, and I had to diff
*.t against *.t.err. I didn't dig into what changed.
Matt Harbison <matt_harbison@yahoo.com> [Sun, 08 Jan 2017 12:05:10 -0500] rev 30731
help: merge the various operator sections of revsets, filesets and templates
Having sections for specific operator types assumes the user already knows what
type of operators are supported. By having a common heading, the user can
simply lookup help for "(revsets|filesets|templates).operators".
Matt Harbison <matt_harbison@yahoo.com> [Sun, 08 Jan 2017 02:43:01 -0500] rev 30730
help: apply the section headings from revsets to templates
Unlike filesets, there are a few distinct headings that are not shared with
revsets. But common names are used where possible.
Matt Harbison <matt_harbison@yahoo.com> [Sun, 08 Jan 2017 02:40:36 -0500] rev 30729
help: apply the section headings from revsets to filesets
This has the nice property of visually breaking up the wall of text. It also
allows specific smaller sections to be called out. For example,
`hg help filesets.predicates` now prints just the predicate section. At the
moment, the revset headings are a superset of the fileset headings, so there is
consistency in how example, predicate and operator help is called out.
The reference to `hg help patterns` was moved to the overview section, so that
it isn't stuck in the examples section.
Jun Wu <quark@fb.com> [Fri, 06 Jan 2017 16:14:52 +0000] rev 30728
chg: check type read from S channel
The previous patch added the check server-side. This patch added it
client-side.
Jun Wu <quark@fb.com> [Fri, 06 Jan 2017 16:12:25 +0000] rev 30727
chgserver: check type passed to S channel
It currently only supports the "system" type. Add an explicit check.
Jun Wu <quark@fb.com> [Fri, 06 Jan 2017 16:11:03 +0000] rev 30726
chg: send type information via S channel (BC)
Previously S channel is only used to send system commands. It will also be
used to send pager commands. So add a type parameter.
This breaks older chg clients. But chg and hg should always come from a
single commit and be packed into a single package. Supporting running
inconsistent versions of chg and hg seems to be unnecessarily complicated
with little benefit. So just make the change and assume people won't use
inconsistent chg with hg.
Valters Vingolds <valters@vingolds.ch> [Sun, 01 Jan 2017 13:16:29 +0100] rev 30725
rebase: fail-fast the pull if working dir is not clean (BC)
Refuse to run 'hg pull --rebase' if there are uncommitted changes:
so that instead of going ahead with fetching changes and then suddenly aborting
the rebase, we can warn user of uncommitted changes (or unclean repo state)
right up front.
In tests, we create a 'histedit' session to verify that also an unfinished
state is detected and handled.
Yuya Nishihara <yuya@tcha.org> [Fri, 06 Jan 2017 22:50:04 +0900] rev 30724
commit: fix unmodified message detection for the "--- >8 ----" magic
We need the raw editortext to be compared with the templatetext.
Yuya Nishihara <yuya@tcha.org> [Fri, 06 Jan 2017 22:44:39 +0900] rev 30723
commit: update test to actually modify template text
We have a check for unmodified commit message (introduced by
bec1a579ebc4),
which should be enabled for the "--- >8 ---" magic but currently not.
Jun Wu <quark@fb.com> [Mon, 26 Dec 2016 00:25:44 +0000] rev 30722
pager: wrap ui._runpager
As discussed at [1], ui._runpager will be the new low-level API accepting a
pager command to actually run the pager. And ui.pager is the high-level API
which reads config directly from self.
This change is necessary for chgserver to override _runpager cleanly.
[1]: www.mercurial-scm.org/pipermail/mercurial-devel/2016-December/091656.html
Denis Laxalde <denis@laxalde.org> [Sat, 07 Jan 2017 12:24:15 +0100] rev 30721
summary: use ui.label and join to write evolution troubles
Follow-up on
7b526670f540 to avoid a convoluted loop.
Denis Laxalde <denis@laxalde.org> [Sat, 07 Jan 2017 12:07:56 +0100] rev 30720
log: drop unnecessary ui.note label from "trouble: " line
Follow-up on
f05ede08dcf7 and
6d0b1a69f98c.
Denis Laxalde <denis.laxalde@logilab.fr> [Wed, 04 Jan 2017 16:47:49 +0100] rev 30719
revset: add a followlines(file, fromline, toline[, rev]) revset
This revset returns the history of a range of lines (fromline, toline) of a
file starting from `rev` or the current working directory.
Added tests in test-annotate.t which already contains a reasonably complex
repository.