Durham Goode <durham@fb.com> [Sat, 16 May 2015 16:12:00 -0700] rev 25238
match: add root to _buildmatch
A future patch will make _buildmatch able to expand relative include patterns.
Doing so will require knowing the root of the repo, so let's go ahead and pass
it in.
Gregory Szorc <gregory.szorc@gmail.com> [Thu, 21 May 2015 10:41:06 -0700] rev 25237
localrepo: extract stream clone application into reusable function
The existing stream_in method assumes a streaming clone is applied via
the wire protocol. Previous patches have enabled streaming clone data to
be produced and consumed outside the context of the wire protocol.
However, the consuming part was incomplete because it didn't deal with
things like updating the branch caches or writing out a requirements
file.
This patch finishes the separation of stream clone handling from the
wire protocol. After this patch, it is possible to consume stream clones
from arbitrary sources, including files. Mozilla plans to leverage this
to serve pre-generated stream clone files to consumers, drastically
reducing the wall and CPU time required to clone large repositories.
This will enable clones to be nearly as fast as `tar`.
Gregory Szorc <gregory.szorc@gmail.com> [Thu, 21 May 2015 10:27:45 -0700] rev 25236
exchange: move code for consuming streaming clone into exchange
For reasons outlined in the previous commit, we want to make the code
for consuming "stream bundles" reusable. This patch extracts the code
into a standalone function.
Gregory Szorc <gregory.szorc@gmail.com> [Thu, 21 May 2015 10:27:22 -0700] rev 25235
exchange: move code for generating a streaming clone into exchange
Streaming clones are fast because they are essentially tar files.
On mozilla-central, a streaming clone only consumes ~55s CPU time
on clients as opposed to ~340s CPU time for a regular clone or gzip
bundle unbundle.
Mozilla is deploying static file "lookaside" support to our Mercurial
server. Static bundles are pre-generated and uploaded to S3. When a
clone is performed, the static file is fetched, applied, and then an
incremental pull is performed. Unfortunately, on an ideal network
connection this still takes as much wall and CPU time as a regular
clone (although it does save significant server resources).
We like the client-side wall time wins of streaming clones. But we want
to leverage S3-based pre-generated files for serving the bulk of clone
data.
This patch moves the code for producing a "stream bundle" into its
own standalone function, away from the wire protocol. This will enable
stream bundle files to be produced outside the context of the wire
protocol.
A bikeshed on whether exchange is the best module for this function
might be warranted. I selected exchange instead of changegroup because
"stream bundles" aren't changegroups (yet).
Martin von Zweigbergk <martinvonz@google.com> [Tue, 19 May 2015 10:13:43 -0700] rev 25234
dirstate: avoid match.files() in walk()
Martin von Zweigbergk <martinvonz@google.com> [Tue, 28 Oct 2014 22:47:22 -0700] rev 25233
match: introduce boolean prefix() method
tl;dr: This is another step towards a (previously unstated) goal of
eliminating match.files() in conditions.
There are four types of matchers:
* always: Matches everything, checked with always(), files() is empty
* exact: Matches exact set of files, checked with isexact(), files()
contains the files to match
* patterns: Matches more complex patterns, checked with anypats(),
files() contains roots of the matched patterns
* prefix: Matches simple 'path:' patterns as prefixes ('foo' matches
both 'foo' and 'foo/bar'), no single method to check, files()
contains the prefixes to match
For completeness, it would be nice to have a method for checking for
the "prefix" type of matcher as well, so let's add that, making it
return True simply when none of the others do.
The larger goal here is to eliminate uses of match.files() in
conditions (i.e. bool(match.files())). The reason for this is that
there are scenarios when you would like to create a "prefix" matcher
that happens to match no files. One example is for 'hg files -I foo
bar'. The narrowmatcher also restricts the set of files given and it
would not surprise me if we have bugs caused by that already. Note that
'if m.files() and not m.anypats()' and similar is sometimes used to
catch the "exact" and "prefix" cases above.
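A minimal sketch of the four matcher types follows. It is not Mercurial's actual match class: the `SketchMatcher` name and its `kind` argument are hypothetical, chosen only to show prefix() returning True when none of the other checks do, and to show why bool(m.files()) is an unreliable test.

```python
class SketchMatcher(object):
    """Toy stand-in for a matcher (hypothetical, not Mercurial's class)."""

    def __init__(self, kind, files=()):
        # kind is one of 'always', 'exact', 'patterns', 'prefix'
        self._kind = kind
        self._files = list(files)

    def files(self):
        return self._files

    def always(self):
        return self._kind == 'always'

    def isexact(self):
        return self._kind == 'exact'

    def anypats(self):
        return self._kind == 'patterns'

    def prefix(self):
        # True simply when none of the other checks are.
        return not (self.always() or self.isexact() or self.anypats())


# A "prefix" matcher that happens to match no files: files() is empty,
# so bool(m.files()) would wrongly treat it like an "always" matcher.
narrowed = SketchMatcher('prefix', files=[])
```

Here `narrowed.prefix()` is True even though `narrowed.files()` is empty, which is the case a condition on files() cannot distinguish.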
Anton Shestakov <engored@ya.ru> [Thu, 21 May 2015 19:52:36 +0800] rev 25232
hgweb: descend empty directories in monoblue
The ability to "skip" a chain of empty directories in hgweb was added in
c21d236ca897, but monoblue style wasn't updated.
This block is copied from gitweb/map file and just works.
Drew Gottlieb <drgott@google.com> [Mon, 18 May 2015 14:29:20 -0700] rev 25231
match: have visitdir() consider includes and excludes
match.visitdir() used to only look at the match's primary pattern roots to
decide if a treemanifest traverser should descend into a particular directory.
This change logically makes visitdir also consider the match's include and
exclude pattern roots (if applicable) to make this decision.
This is especially important for situations like using narrowhg with multiple
treemanifest revlogs.
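The decision described above can be sketched roughly as follows. This is an illustration only, not the code in match.py: the function signature is hypothetical, and it assumes that every exclude root stands for a whole excluded directory (as a plain 'path:' pattern would).

```python
def _under(directory, root):
    """True if directory is root itself or lies below it."""
    if root == '.':
        return True
    return directory == root or directory.startswith(root + '/')


def _related(directory, root):
    """True if directory is root, lies below root, or is an ancestor of root."""
    if directory == '.':
        return True
    return _under(directory, root) or root.startswith(directory + '/')


def visitdir(directory, fileroots, includeroots=(), excluderoots=()):
    """Decide whether a traverser should descend into directory.

    Hypothetical sketch: all arguments are plain lists of
    repo-relative directory names.
    """
    # Assumption: an exclude root excludes everything beneath it.
    if any(_under(directory, r) for r in excluderoots):
        return False
    # With includes present, the directory must lead to or lie in one.
    if includeroots and not any(_related(directory, r) for r in includeroots):
        return False
    # Finally consult the primary pattern roots, as before.
    return not fileroots or any(_related(directory, r) for r in fileroots)
```

For example, `visitdir('a/b', [], includeroots=['a'], excluderoots=['a/b'])` returns False, while `visitdir('a', ['a/b'])` returns True because 'a' must be entered to reach 'a/b'.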
Anton Shestakov <engored@ya.ru> [Thu, 21 May 2015 00:27:12 +0800] rev 25230
hgweb: remove artificial width constraint from header in monoblue
This width property comes from the beginning of the monoblue theme itself, and
was used to stop the action header ("summary", "shortlog", "changelog", etc)
from clashing with the search form. But it still was happening (on smaller
screens, and with more actions added to hgweb over time).
Effectively, the hardcoded width was preventing the header from fitting into
the available screen space, since it always tried to be 900px wide, even if
that meant horizontal scroll on smaller screens and having the actions on two
lines where one should've been enough. For example,
http://selenic.com/hg/log?style=monoblue has the last two actions ("gz" and
"help") in the header on the second line, even when there seems to be enough
space on the first.
This patch makes the form float, which prevents it from overlaying/clashing
with the action header, and allows the latter to resize itself in the best
possible way.
Matt Mackall <mpm@selenic.com> [Wed, 20 May 2015 15:29:32 -0500] rev 25229
merge with stable
Matt Harbison <matt_harbison@yahoo.com> [Sun, 17 May 2015 22:42:47 -0400] rev 25228
files: recurse into subrepos automatically with an explicit path
FUJIWARA Katsunori <foozy@lares.dti.ne.jp> [Wed, 20 May 2015 01:06:09 +0900] rev 25227
dirstate: use open/read of vfs(opener) explicitly instead of read
This simplifies changes in a subsequent patch, which tries to open the
`.pending` file when the HG_PENDING environment variable is defined.
FUJIWARA Katsunori <foozy@lares.dti.ne.jp> [Wed, 20 May 2015 01:06:09 +0900] rev 25226
dirstate: use self._filename instead of immediate string `dirstate`
This prevents immediate string `dirstate` from multiplying. This is also
a preparation for making dirstate aware of PENDING mechanism in
subsequent patches.
Yuya Nishihara <yuya@tcha.org> [Tue, 19 May 2015 23:29:20 +0900] rev 25225
revset: drop translation marker from error message of _notpublic()
It is a kind of internal error. End users won't see it.
Yuya Nishihara <yuya@tcha.org> [Tue, 19 May 2015 23:26:25 +0900] rev 25224
revset: drop docstring from internal _notpublic() function
It shouldn't be listed in "hg help revset".
Laurent Charignon <lcharignon@fb.com> [Wed, 13 May 2015 20:30:12 -0700] rev 25223
record: make hg record always use the non curses interface
Before this patch, hg record was running hg commit -i, therefore with the
experimental.crecord=True flag, hg record was actually launching the curses
record interface. Some of our users could be confused by that.
This patch makes the hg record command set this flag to False, ensuring that
hg record never shows the curses interface.
commit -i, shelve -i and revert -i remain unchanged and use the curses
interface if the experimental.crecord flag is set.
Martin von Zweigbergk <martinvonz@google.com> [Thu, 09 Apr 2015 17:14:35 -0700] rev 25222
treemanifest: lazily load manifests
Most operations on treemanifests already visit only relevant
submanifests. Notable examples include __getitem__, __contains__,
walk/matches with matcher, diff. By making submanifests lazily loaded,
we speed up all these operations.
The lazy loading is achieved by adding a _load() method that gets
defined where we currently eagerly parse the manifest. We make sure to
call it before any access to _dirs, _files or _flags.
Some timings on the Mozilla repo (with flat manifest timings for
reference):
hg cat -r . README.txt: 1.644s -> 0.096s (0.255s)
hg diff -r .^ -r . : 1.746s -> 0.137s (0.431s)
hg files -r . python : 1.508s -> 0.146s (0.335s)
hg files -r . : 2.125s -> 2.203s (0.712s)
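The lazy-loading idea in this commit can be illustrated with a minimal sketch. The class and parser below are hypothetical (one 'filename nodeid' pair per line is an assumed format, not the real manifest encoding); only the pattern of calling _load() before any access to the parsed data is taken from the description.

```python
def parse_lines(text):
    """Hypothetical parser: one 'filename nodeid' pair per line."""
    return dict(line.split(' ', 1) for line in text.splitlines())


class LazyManifestSketch(object):
    """Toy manifest that defers parsing until its contents are needed."""

    def __init__(self, text, parse=parse_lines):
        self._text = text
        self._parse = parse
        self._loaded = False
        self._files = {}

    def _load(self):
        # Parse at most once, and only when something needs the contents.
        if not self._loaded:
            self._files = self._parse(self._text)
            self._loaded = True

    def __contains__(self, name):
        self._load()
        return name in self._files

    def __getitem__(self, name):
        self._load()
        return self._files[name]
```

Creating a `LazyManifestSketch` costs nothing beyond storing the text; a submanifest that is never consulted is never parsed.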
Martin von Zweigbergk <martinvonz@google.com> [Mon, 18 May 2015 21:31:40 -0700] rev 25221
treemanifest: speed up commit using dirty flag
We currently avoid saving a treemanifest revision if it's the same as
one of its parents. This is checked by comparing the generated text
for all three versions. Let's avoid that when possible by comparing
the nodeids for clean (not dirty) nodes.
On the Mozilla repo, this speeds up commit from 2.836s to 2.343s.
Martin von Zweigbergk <martinvonz@google.com> [Thu, 26 Feb 2015 08:16:13 -0800] rev 25220
treemanifest: speed up diff by keeping track of dirty nodes
Since tree manifests have a nodeid per directory, we can avoid diffing
entire directories if they have the same nodeid. The comparison is
only valid for unmodified treemanifest instances, of course, so we
need to keep track of which have been modified. Therefore, let's add a
dirty flag to treemanifest indicating whether its nodeid can be
trusted. We set it when _files or _dirs is modified, and make diff(),
and its cousin filesnotin(), not descend into subdirectories that are
the same on both sides.
On the Mozilla repo, this speeds up 'hg diff -r .^ -r .' from 1.990s
to 1.762s. The improvement will be much larger when we start lazily
loading subdirectory manifests.
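A rough sketch of the dirty-flag shortcut, under stated assumptions: `TreeSketch` and this `diff()` are hypothetical stand-ins, not the treemanifest code, and the optional `visited` list exists only to make the skipped directories observable.

```python
class TreeSketch(object):
    """Toy directory manifest (hypothetical, not the real treemanifest)."""

    def __init__(self, node=None):
        self.node = node    # nodeid of the stored revision, if any
        self.dirty = False  # True once files/dirs changed after loading
        self.files = {}     # file name -> file nodeid
        self.dirs = {}      # directory name -> TreeSketch


def diff(t1, t2, prefix='', visited=None):
    """Return {path: (nodeid1, nodeid2)} for files that differ."""
    # The nodeid can only be trusted for clean (unmodified) instances.
    if (not t1.dirty and not t2.dirty
            and t1.node is not None and t1.node == t2.node):
        return {}
    if visited is not None:
        visited.append(prefix)
    result = {}
    for name in set(t1.files) | set(t2.files):
        a, b = t1.files.get(name), t2.files.get(name)
        if a != b:
            result[prefix + name] = (a, b)
    empty = TreeSketch()
    for name in set(t1.dirs) | set(t2.dirs):
        result.update(diff(t1.dirs.get(name, empty),
                           t2.dirs.get(name, empty),
                           prefix + name + '/', visited))
    return result
```

When both sides hold a clean subdirectory with the same nodeid, the function returns immediately without looking at any file beneath it.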
FUJIWARA Katsunori <foozy@lares.dti.ne.jp> [Wed, 20 May 2015 04:34:27 +0900] rev 25219
localrepo: use correct argument name for pretxnclose hooks (BC)
Before this patch, "the reason for the transaction" is passed to
`pretxnclose` hooks via the wrongly named argument `xnname` (`HG_XNNAME` for
external hooks).
FUJIWARA Katsunori <foozy@lares.dti.ne.jp> [Wed, 20 May 2015 04:34:27 +0900] rev 25218
localrepo: rename hook argument from TXNID to txnid (BC)
Since its introduction (3.4 or d283517b260b), `TXNID` has been passed to
Python hooks without lowercasing its name, which is wrong.
Martin von Zweigbergk <martinvonz@google.com> [Wed, 05 Nov 2014 11:25:57 -0800] rev 25217
test-walk: add more tests for -I/-X
We had very limited testing of -I and -X, especially when combined
with plain file patterns and with each other. This adds some more
protection against regressions as upcoming patches modify the matcher
code. (Originally meant for my own upcoming patches, but now I know
drgott will be sending some patches soon.)
The only noteworthy case seems to be that both of
hg debugwalk -Xbeans/black beans/black
hg debugwalk -Xbeans beans/black
walk the file. I would personally have expected the -X to trump. I
don't care enough to change it, but I also think it's fair if some
future commit changes the behavior.
Durham Goode <durham@fb.com> [Sat, 16 May 2015 16:06:22 -0700] rev 25216
ignore: use 'include:' rules instead of custom syntax
Now that the matcher supports 'include:' rules, let's change the dirstate.ignore
creation to just create a matcher with a bunch of includes. This allows us to
completely delete ignore.py.
I moved some of the syntax documentation over to readpatternfile in match.py so
we don't lose it.
Durham Goode <durham@fb.com> [Sat, 16 May 2015 15:56:52 -0700] rev 25215
match: add 'include:' syntax
This allows the matcher to understand 'include:path/to/file' style rules. The
files support the standard hgignore syntax and any rules read from the file are
included in the matcher without regard to the file's location in the repository
(i.e. if the included file is in somedir/otherdir, all of its rules will still
apply to the entire repository).
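The flattening behaviour described above can be sketched as follows. This is not the match.py implementation: `expandincludes` and the `readrules` callback are hypothetical names, and silently skipping a file that was already included is an assumption made here to keep the sketch from looping.

```python
def expandincludes(rules, readrules, _seen=None):
    """Flatten 'include:' rules into one repository-wide rule list.

    rules     -- list of rule strings such as 'glob:*.o'
    readrules -- hypothetical callback: path -> list of rule strings
    """
    seen = set() if _seen is None else _seen
    expanded = []
    for rule in rules:
        if rule.startswith('include:'):
            path = rule[len('include:'):]
            if path in seen:
                continue  # assumption: skip files that were already included
            seen.add(path)
            # Rules from the included file are added as-is; where the file
            # lives in the repository does not narrow what they apply to.
            expanded.extend(expandincludes(readrules(path), readrules, seen))
        else:
            expanded.append(rule)
    return expanded


# Usage with an in-memory stand-in for pattern files.
patternfiles = {
    'somedir/otherdir/ignore': ['glob:*.o', 'include:more'],
    'more': ['re:^build/'],
}
flat = expandincludes(['path:tmp', 'include:somedir/otherdir/ignore'],
                      patternfiles.__getitem__)
```

The resulting `flat` list contains the rules from both files alongside 'path:tmp', with no trace of 'somedir/otherdir' attached to them.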
Durham Goode <durham@fb.com> [Mon, 18 May 2015 16:27:56 -0700] rev 25214
match: add optional warn argument
Occasionally the matcher will want to print warning messages instead of throwing
exceptions (like if it encounters a bad syntax parameter when parsing files).
Let's add an optional warn argument that can provide this. The next patch will
actually use this argument.
Durham Goode <durham@fb.com> [Sat, 16 May 2015 15:51:03 -0700] rev 25213
match: add source to kindpats list
Future patches will be adding the ability to recursively include pattern files
in a match rule expression. Part of that behavior will require tracking which
file each pattern came from so we can report errors correctly.
Let's add a 'source' arg to the kindpats list to track this. Initially it will
only be populated by listfile rules.
Matt Mackall <mpm@selenic.com> [Tue, 19 May 2015 08:41:04 -0500] rev 25212
check-code: reintroduce str.format() ban for 3.x porting
In their infinite wisdom, the Python maintainers stripped bytes of its
% and format() methods for 3.x. They've now added % back to 3.5, but
format() is still missing. Since we don't have any particular need for
it, we should keep avoiding it.
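A small illustration of the difference, assuming it is run on Python 3.5 or later (on Python 2, bytes is an alias for str and does have format()):

```python
# %-formatting on bytes works on Python 2 and again on Python 3.5+.
greeting = b'hello %s' % b'world'

# format() is a str method only; bytes objects on Python 3 do not have it.
bytes_has_format = hasattr(bytes, 'format')
str_has_format = hasattr(str, 'format')
```

Code that sticks to % therefore stays portable between bytes and str, which is the reason for keeping the ban.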
Pierre-Yves David <pierre-yves.david@fb.com> [Mon, 18 May 2015 23:43:36 -0500] rev 25211
util: drop the 'unpacker' helper
It is not helping anything anymore.
Pierre-Yves David <pierre-yves.david@fb.com> [Mon, 18 May 2015 16:56:04 -0500] rev 25210
MBTextWrapper: drop dedicated __init__ method
It was only there as a compatibility layer with a version of Python which we do
not support anymore.
Pierre-Yves David <pierre-yves.david@fb.com> [Mon, 18 May 2015 16:54:21 -0500] rev 25209
util: drop the compatibility with Python 2.4 unpacker
Python 2.4 compatibility has packed and sailed.
Pierre-Yves David <pierre-yves.david@fb.com> [Mon, 18 May 2015 16:52:28 -0500] rev 25208
tests: just use 'response.reason'
There is no reason to not have simple code now that Python 2.4 is dead.
Pierre-Yves David <pierre-yves.david@fb.com> [Mon, 18 May 2015 16:51:02 -0500] rev 25207
url: drop awful hack around bug in Python 2.4
It's all just a memory now.
Pierre-Yves David <pierre-yves.david@fb.com> [Mon, 18 May 2015 16:47:26 -0500] rev 25206
httpconnection: drop Python 2.4 specific hack
Python 2.4.1 doesn't provide the full URI, good for it.
Pierre-Yves David <pierre-yves.david@fb.com> [Mon, 18 May 2015 16:46:32 -0500] rev 25205
mail: drop explicit mail import required by Python 2.4
He's dead, Jim.
Pierre-Yves David <pierre-yves.david@fb.com> [Mon, 18 May 2015 16:33:57 -0500] rev 25204
windows: drop Python2.4 specific hack for directory not found handling
A good Python 2.4 hack is a removed Python 2.4 hack.
Pierre-Yves David <pierre-yves.david@fb.com> [Mon, 18 May 2015 16:24:16 -0500] rev 25203
notify: drop import required by Python 2.4
Toto, I've a feeling we're not in anno 2004 anymore.
Pierre-Yves David <pierre-yves.david@fb.com> [Mon, 18 May 2015 16:22:15 -0500] rev 25202
patchbomb: stop explicit import required by Python 2.4
Ding Dong, the witch is dead!
Pierre-Yves David <pierre-yves.david@fb.com> [Mon, 18 May 2015 16:20:12 -0500] rev 25201
pager: drop python 2.4 hack around subprocess
Farewell, we do not need you anymore.
Pierre-Yves David <pierre-yves.david@fb.com> [Mon, 18 May 2015 16:18:18 -0500] rev 25200
check-code: drop ban of 'val if cond else otherval' construct
We now have access to this horrible but less bad than
'cond and val or otherval' syntax.
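For readers wondering why the old idiom counts as the worse of the two, here is a short illustration (the function names are made up for the example). The 'and/or' form silently picks the wrong branch whenever the intended value is falsy:

```python
def pick_old(cond, val, otherval):
    # Pre-2.5 idiom: breaks when val is falsy (0, '', None, [] ...).
    return cond and val or otherval


def pick_new(cond, val, otherval):
    # Conditional expression: always returns val when cond is true.
    return val if cond else otherval
```

With `cond` true and `val` equal to 0, `pick_old` falls through to `otherval`, while `pick_new` returns 0 as intended.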
Pierre-Yves David <pierre-yves.david@fb.com> [Mon, 18 May 2015 16:30:24 -0500] rev 25199
check-code: entirely drop the 'non-py24.py' file from the test
There are no Python 2.4 related errors remaining.
Pierre-Yves David <pierre-yves.david@fb.com> [Mon, 18 May 2015 16:11:44 -0500] rev 25198
check-code: drop the 'format' built-in
I'm not clear what it is doing, but one who knows what it is about can now make
use of it.
Pierre-Yves David <pierre-yves.david@fb.com> [Mon, 18 May 2015 16:09:05 -0500] rev 25197
check-code: drop ban of str.format
After discussion with Augie and Matt, we are fine with it being introduced in
the code base.
Augie Fackler <raf@durin42.com> [Mon, 18 May 2015 22:40:16 -0400] rev 25196
statichttprepo: remove wrong getattr ladder
At least as far back as Python 2.6 the .code attribute is always
defined, and to the best of my detective skills a .getcode() method
has never been a thing.
Matt Mackall <mpm@selenic.com> [Tue, 19 May 2015 07:17:57 -0500] rev 25195
merge with stable
Matt Harbison <matt_harbison@yahoo.com> [Sun, 17 May 2015 22:09:37 -0400] rev 25194
match: explicitly naming a subrepo implies always() for the submatcher
The files command supports naming directories to limit the output to children of
that directory, and it also supports -S to force recursion into a subrepo. But
previously, using -S and naming a subrepo caused nothing to be output. The
reason was narrowmatcher() strips the current subrepo path off of each file,
which would leave an empty list if only the subrepo was named.
When matching on workingctx, dirstate.matches() would see the submatcher is not
always(), so it returned the list of files in dmap for each file in the
matcher, namely an empty list. If a directory in a subrepo was named, the output was as
expected, so this was inconsistent.
The 'not anypats()' check is enforced by an existing test around line 140:
$ hg remove -I 're:.*.txt' sub1
Without the check, this removed all of the files in the subrepo.
Matt Harbison <matt_harbison@yahoo.com> [Sun, 17 May 2015 01:06:10 -0400] rev 25193
context: don't complain about a matcher's subrepo paths in changectx.walk()
Previously, the first added test printed the following:
$ hg files -S -r '.^' sub1/sub2/folder
sub1/sub2/folder: no such file in rev 9bb10eebee29
sub1/sub2/folder: no such file in rev 9bb10eebee29
sub1/sub2/folder/test.txt
One warning occurred each time a subrepo was crossed into.
The second test ensures that the matcher copy stays in place. Without the copy,
the bad() function becomes an increasingly longer chain, and no message would be
printed out for a file missing in the subrepo because the predicate would match
in one of the replaced methods. Manifest doesn't know anything about subrepos,
so it needs help ignoring subrepos when complaining about bad files.
Pierre-Yves David <pierre-yves.david@fb.com> [Mon, 18 May 2015 22:35:27 -0500] rev 25192
ssh: capture output with bundle2 again (issue4642)
I just discovered that we are not displaying ssh server output in real time
anymore. So we can just fall back to the bundle2 output capture for now. This
fixes the race condition issue we were seeing in tests. Re-instating real time
output for ssh would fix the issue too, but let's get the test to pass first.
Laurent Charignon <lcharignon@fb.com> [Fri, 24 Apr 2015 14:30:30 -0700] rev 25191
revset: optimize not public revset
This patch speeds up the computation of the not public() changeset
and incidentally speeds up the computation of divergents() changeset on our big
repo by 100x from 50% to 0.5% of the time spent in smartlog with evolve.
In this patch we optimize not public() to _notpublic() (new revset) and use
the work on phaseset (from the previous commit) to be able to compute
_notpublic() quickly.
We use a non-lazy approach, making the assumption that the number of notpublic
changes will not be in the order of magnitude of the repo size. Adopting a
lazy approach gives a speedup of 5x (vs 100x) only due to the overhead of the
code for lazy generation.
Laurent Charignon <lcharignon@fb.com> [Wed, 01 Apr 2015 11:17:17 -0700] rev 25190
phases: add set per phase in C phase computation
To speed up the computation of draft(), secret(), divergent(), obsolete() and
unstable() we need to have a fast way of getting the list of revisions that
are in draft(), secret() or the union of both: not public().
This patch extends the work on phase computation in C and makes the phase
computation code also return a list of sets, one for each non public phase.
Using these sets we can quickly obtain all the revisions of a given phase.
We do not return a set for the public phase as we expect it to be roughly the
size of the repo. Also, it can be computed easily by subtracting the entries in the
non public phases from all the revs in the repo.
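The subtraction mentioned above amounts to a plain set difference. The sketch below uses hypothetical names and made-up revision numbers; it is not the C phase computation, only an illustration of how per-phase sets give both not public() and public() cheaply.

```python
def notpublicrevs(phasesets):
    """Union of the per-phase sets (draft, secret, ...)."""
    result = set()
    for revs in phasesets:
        result |= revs
    return result


def publicrevs(allrevs, phasesets):
    """Everything in the repo that is in no non public phase set."""
    return set(allrevs) - notpublicrevs(phasesets)


# Made-up example: an 8-revision repo with two drafts and one secret.
draft = {5, 6}
secret = {7}
nonpublic = notpublicrevs([draft, secret])
public = publicrevs(range(8), [draft, secret])
```

Because the non public sets are expected to be small compared with the repo, building them eagerly is cheap, and the large public set is only materialized if someone asks for it.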
Drew Gottlieb <drgott@google.com> [Fri, 08 May 2015 12:30:51 -0700] rev 25189
match: rename _fmap to _fileroots for clarity
fmap isn't a very descriptive name for the set of the match's files.
Drew Gottlieb <drgott@google.com> [Wed, 06 May 2015 15:59:35 -0700] rev 25188
match: remove unnecessary optimization where visitdir() returns 'all'
Match's visitdir() was prematurely optimized to return 'all' in some cases, so
that the caller would not have to call it for directories within the current
directory. This change makes the visitdir system less flexible for future
changes, such as making visitdir consider the match's include and exclude
patterns.
As a demonstration of this optimization not actually improving performance,
I ran 'hg files -r . media' on the Mozilla repository, stored as treemanifest
revlogs.
With best of ten tries, the command took 1.07s both with and without the
optimization, even though the optimization reduced the calls from visitdir()
from 987 to 51.
Augie Fackler <augie@google.com> [Thu, 16 Apr 2015 17:12:33 -0400] rev 25187
dispatch: add support for python-flamegraph[0] profiling
This gives us nicer svg flame graphs for output, which can make
understanding some types of performance problems significantly easier.
0: https://github.com/evanhempel/python-flamegraph/
Augie Fackler <augie@google.com> [Tue, 28 Apr 2015 16:44:37 -0400] rev 25186
extensions: document that `testedwith = 'internal'` is special
Extension authors (notably at companies using hg) have been
cargo-culting the `testedwith = 'internal'` bit from hg's own
extensions, which then defeats our "file bugs over here" logic in
dispatch. Let's be more aggressive about trying to give extension
authors a hint about what testedwith should say.
Martin von Zweigbergk <martinvonz@google.com> [Fri, 10 Apr 2015 23:12:33 -0700] rev 25185
treemanifest: cache directory logs and manifests
Since manifest instances are cached on the manifest log instance, we
can cache directory manifests by caching the directory manifest
logs. The directory manifest log cache is a plain dict, so it never
expires; we assume that we can keep all the directories in memory.
The cache is kept on the root manifestlog, so access to directory
manifest logs now has to go through the root manifest log.
The caching will soon not be only an optimization. When we start
lazily loading directory manifests, we need to make sure we don't
create multiple instances of the log objects. The caching takes care
of that problem.
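A minimal sketch of the caching scheme, with hypothetical class and method names (the real manifest log API is not reproduced here): the root log keeps a plain dict of directory logs, so repeated requests for the same directory return the same object.

```python
class DirLogSketch(object):
    """Toy per-directory manifest log (hypothetical)."""

    def __init__(self, directory):
        self.directory = directory


class RootLogSketch(object):
    """Toy root manifest log that owns the directory log cache."""

    def __init__(self):
        # Plain dict: entries never expire, so all directory logs that
        # were ever requested stay in memory.
        self._dirlogcache = {}

    def dirlog(self, directory):
        if directory not in self._dirlogcache:
            self._dirlogcache[directory] = DirLogSketch(directory)
        return self._dirlogcache[directory]


root = RootLogSketch()
first = root.dirlog('mercurial/')
second = root.dirlog('mercurial/')
```

Here `first is second` holds, which is the property lazy loading relies on: there is never more than one log object per directory.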
Pierre-Yves David <pierre-yves.david@fb.com> [Mon, 18 May 2015 15:40:23 -0500] rev 25184
hook: drop dedicated catch for 'KeyboardInterrupt'
This is no longer under 'Exception' in Python 2.6.
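The class hierarchy behind this cleanup can be checked directly. In the sketch below (helper names are made up), a handler for Exception does not see KeyboardInterrupt, so a dedicated catch placed before it is no longer needed to keep interrupts out of the generic handler:

```python
def run(func):
    """Report which handler, if any, caught what func raised."""
    try:
        func()
    except Exception:
        return 'handled'
    except BaseException:
        # Only for demonstration; real code would normally re-raise.
        return 'escaped'
    return 'ok'


def interrupt():
    raise KeyboardInterrupt()


def fail():
    raise ValueError('boom')
```

`run(fail)` reports 'handled', while `run(interrupt)` reports 'escaped' because KeyboardInterrupt derives from BaseException rather than Exception.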
Pierre-Yves David <pierre-yves.david@fb.com> [Mon, 18 May 2015 15:38:24 -0500] rev 25183
recover: catch any exception, not just Exception
We want recover to be rock solid.
Pierre-Yves David <pierre-yves.david@fb.com> [Mon, 18 May 2015 15:33:21 -0500] rev 25182
exchange: catch down to BaseException when handling bundle2
We can now catch more things.
Pierre-Yves David <pierre-yves.david@fb.com> [Mon, 18 May 2015 13:23:14 -0500] rev 25181
bundle2: use BaseException in bundle2
We can ensure we fail over properly in more cases.
Pierre-Yves David <pierre-yves.david@fb.com> [Mon, 18 May 2015 13:20:19 -0500] rev 25180
check-code: drop ban of BaseException
Let's go back to basics. It is available in Python 2.6.
Pierre-Yves David <pierre-yves.david@fb.com> [Mon, 18 May 2015 13:25:07 -0500] rev 25179
wireproto: turn an 'except' into a 'finally' as suggested by the comment
Look! More hidden footprints!