Gregory Szorc <gregory.szorc@gmail.com> [Sat, 13 May 2017 16:26:43 -0700] rev 32292
changelog: load pending file directly
When changelogs are written, a copy of the index (or inline revlog)
may be written to an 00changelog.i.a file to facilitate hooks and
other processes having access to the pending data before it is
finalized.
The way it works today, the localrepo class loads the changelog
like normal. Then, if it detects a pending transaction, it asks
the changelog class to load a pending changelog. The changelog
class looks for a 00changelog.i.a file. If it exists, it is
loaded and internal data structures on the new revlog class are
copied to the original instance.
The existing mechanism is inefficient because it loads 2 revlog
files. The index, node map, and chunk cache for 00changelog.i
are thrown away and replaced by those for 00changelog.i.a.
The existing mechanism is also brittle because it is a layering
violation to access the data structures being accessed. For example,
the code copies the "chunk cache" because for inline revlogs
this cache contains the raw revision chunks and allows the original
changelog/revlog instance to access revision data for these pending
revisions. This whole behavior of course relies on the revlog
constructor reading the entirety of an inline revlog into memory
and caching it. That's why it is brittle. (I discovered all this
as part of modifying behavior of the chunk cache.)
This patch streamlines the loading of a pending 00changelog.i.a
revlog by doing it directly in the changelog constructor if told
to do so. When this code path is active, we no longer load the
00changelog.i file at all.
The only negative outcome I see from this change is if loading
00changelog.i was somehow facilitating a role. But I can't imagine
what that would be because we throw away its data (the index data
structures are replaced and inline revision data is replaced via
the chunk cache) and since 00changelog.i.a is a copy of
00changelog.i, file content should be identical, so there should
be no meaninful file integrity checking at play. I think this was
all just sub-optimal code.
Martin von Zweigbergk <martinvonz@google.com> [Fri, 10 Feb 2017 16:56:29 -0800] rev 32291
cleanup: use set literals
We no longer support Python 2.6, so we can now use set literals.
Pulkit Goyal <7895pulkit@gmail.com> [Sat, 06 May 2017 04:51:25 +0530] rev 32290
py3: convert date and format arguments str before passing in time.strptime
time.strptime() raises ValueError if the arguments are not str.
Source Code: https://hg.python.org/cpython/file/3.5/Lib/_strptime.py#l307
Pulkit Goyal <7895pulkit@gmail.com> [Thu, 04 May 2017 00:24:21 +0530] rev 32289
py3: convert kwargs' keys to str using pycompat.strkwargs
Jun Wu <quark@fb.com> [Sun, 14 May 2017 09:38:06 -0700] rev 32288
verify: add a config option to skip certain flag processors
Previously, "hg verify" verifies everything, which could be undesirable when
there are expensive flag processor contents.
This patch adds a "verify.skipflags" developer config. A flag processor will
be skipped if (flag & verify.skipflags) == 0.
In the LFS usecase, that means "hg verify --config verify.skipflags=8192"
will not download all LFS blobs, which could be too large to be stored
locally.
Note: "renamed" is also skipped since its default implementation may call
filelog.data() which will trigger the flag processor.
Durham Goode <durham@fb.com> [Mon, 15 May 2017 09:35:27 -0700] rev 32287
changegroup: add bundlecaps back
Commit
282b288aa20c333c removed the unused bundlecaps argument from the
changegroup code. While it is unused in core Mercurial, it was an important
feature for the remotefilelog extension because it allowed the exchange layer to
communicate to the changegroup packer that this was a shallow repo and that
filelogs should not be included. Without bundlecaps, there is currently no other
way to pass that information along without a more extensive refactor of
exchange, bundle, and changegroup code.
This patch backs out the original removal, and merges it with some recent
changes to changegroup apis.
Jun Wu <quark@fb.com> [Wed, 10 May 2017 16:17:58 -0700] rev 32286
flagprocessor: add a fast path when flags is 0
When flags is 0, _processflags could be a no-op instead of iterating through
the flag bits.
Kostia Balytskyi <ikostia@fb.com> [Sat, 13 May 2017 14:52:29 -0700] rev 32285
shelve: make shelvestate use simplekeyvaluefile
Currently shelvestate uses line ordering to differentiate fields. This
makes it hard for extensions to wrap shelve, since if two alternative
versions of code add a new line, correct merging is going to be problematic.
simplekeyvaluefile was introduced fot this purpose specifically.
After this patch:
- shelve will always write a simplekeyvaluefile
- unshelve will check the first line of the file for a version, and if the
version is 1, will read it in a position-based way, if the version is 2,
will read it in a key-value way
As discussed with Yuya previously, this will be able to handle old-style
shelvedstate files, but old Mercurial versions will fail on the attempt to
read shelvedstate file of version 2 with a self-explanatory message:
'abort: this version of shelve is incompatible with the version used
in this repo'
Kostia Balytskyi <ikostia@fb.com> [Sun, 14 May 2017 14:15:07 -0700] rev 32284
shelve: refactor shelvestate loading
This is a preparatory patch which separates file reading from the
minimal validation we have (like turning version into int and
checking that this version is supported). The purpose of this patch
is to be able to read statefile form simplekeyvaluefile, which is
implemented in the following patch.
Matt Harbison <matt_harbison@yahoo.com> [Thu, 11 May 2017 22:33:45 -0400] rev 32283
extdiff: copy back execbit-only changes to the working directory
Some tools like BeyondCompare allow the file mode to be changed. The change
was previously applied if the content of the file changed (either according to
size or mtime), but was not being copied back for a mode-only change. That
would seem to indicate handling this in an 'elif' branch, but I opted not to in
order to avoid copying back the mode without the content changes when mtime and
size are unchanged. (Yes, that's a rare corner case, but all the more reason
not to have a subtle difference in behavior.)
The only way I can think to handle this undetected change is to set each file in
the non-wdir() snapshot to readonly, and check for that attribute (as well as
mtime) when deciding to copy back. That would avoid the overhead of copying the
whole file when only the mode changed. But a chmod in a diff tool is likely
rare. See also
affd753ddaf1.
Gregory Szorc <gregory.szorc@gmail.com> [Sat, 13 May 2017 12:14:24 -0700] rev 32282
tests: remove regular expression matching for Python 2.6
This effectively reverts
52cca17ac523. Some lines still have (re)
due to variable length port numbers. There's not much we can do
about that. But at least this change removes most of the ugliness.
Gregory Szorc <gregory.szorc@gmail.com> [Sat, 13 May 2017 11:58:08 -0700] rev 32281
branchmap: remove use of buffer() to support Python 2.6
The use of buffer() was added in
7359157b9e46 to support Python 2.6,
which we no longer support.
Gregory Szorc <gregory.szorc@gmail.com> [Sat, 13 May 2017 11:55:39 -0700] rev 32280
py3: remove delayed import of importlib
All supported versions of Python now have importlib. This
effectively reverts
b85fa6bf298b.
Gregory Szorc <gregory.szorc@gmail.com> [Sat, 13 May 2017 11:52:44 -0700] rev 32279
tests: use context manager form of assertRaises
Support for using unittest.TestCase.assertRaises as a context
manager was added in Python 2.7. This form is more readable,
especially for complex tests.
While I was here, I also restored the use of assertRaisesRegexp,
which was removed in
c6921568cd20 for Python 2.6 compatibility.
Gregory Szorc <gregory.szorc@gmail.com> [Sat, 13 May 2017 11:42:42 -0700] rev 32278
obsolete: use 2 argument form of enumerate()
The 2 argument form of enumerate was added in Python 2.6. This
change effectively reverts
10880c8aad85.
Gregory Szorc <gregory.szorc@gmail.com> [Sat, 13 May 2017 11:31:36 -0700] rev 32277
tests: remove special handling for undefined memoryview
'memoryview' was introduced in Python 2.7.
4adc090fa2fb added code
to filterpyflakes.py to ignore "undefined name 'memoryview'" pyflakes
warnings. Since we no longer support <Python 2.7, we can remove this
workaround.
Gregory Szorc <gregory.szorc@gmail.com> [Sat, 13 May 2017 11:20:51 -0700] rev 32276
encoding: remove workaround for locale.getpreferredencoding()
locale.getpreferredencoding() was buggy in OS X for Python <2.7.
Since we no longer support Python <2.7, we no longer need this
workaround.
This essentially reverts
2be70ca17311.
Gregory Szorc <gregory.szorc@gmail.com> [Sat, 13 May 2017 11:12:44 -0700] rev 32275
mail: remove code to support < Python 2.7
This code was added in
594b98846ce1. Since we no longer support
Python <2.7, it can be removed.
Gregory Szorc <gregory.szorc@gmail.com> [Thu, 11 May 2017 00:02:32 -0700] rev 32274
help: clarify that colons are allowed in fingerprints values
This was suggested by Lars Rohwedder in
issue5559.
Gregory Szorc <gregory.szorc@gmail.com> [Wed, 10 May 2017 23:49:37 -0700] rev 32273
sslutil: tweak the legacy [hostfingerprints] warning message
Lars Rohwedder noted in
issue5559 that the previous wording was
confusing. I agree.
Martin von Zweigbergk <martinvonz@google.com> [Thu, 11 May 2017 11:37:18 -0700] rev 32272
rebase: allow rebase even if some revisions need no rebase (BC) (
issue5422)
This allows you to do e.g. "hg rebase -d @ -r 'draft()'" even if some
drafts are already based off of @. You'd still need to exclude
obsolete and troubled revisions, though. We will deal with those cases
later.
Implemented by treating state[rev]==rev as "no need to rebase". I
considered adding another fake revision number like revdone=-6. That
would make the code clearer in a few places, but would add extra code
in other places.
I moved the existing test out of test-rebase-base.t and into a new
file and added more tests there, since not all are using --base.
Jun Wu <quark@fb.com> [Wed, 10 May 2017 11:55:22 -0700] rev 32271
chgserver: more explicit about sensitive environ variables
Environment variables like HGUSER, HGEDITOR, HGEDITFROM should not trigger
a new chgserver. This patch uses a whitelist for environ variables starting
with "HG" to reduce the number of servers.
I have went through `grep -o "[\"']HG[A-Z_0-9]*['\"]" -hR . | sort -u` so
the list should be up-to-date.
Kostia Balytskyi <ikostia@fb.com> [Thu, 11 May 2017 08:49:33 -0700] rev 32270
scmutil: make simplekeyvaluefile able to have a non-key-value first line
To ease migration from files with version numbers in their first lines,
we want simplekeyvaluefile to support a non-key-value first line. In this
way, old versions of Mercurial will read such files, discover a newer version
than the one they know how to handle and fail gracefully, rather than with
exception. Shelve's shelvestate file is an example.
Kostia Balytskyi <ikostia@fb.com> [Thu, 11 May 2017 08:39:44 -0700] rev 32269
scmutil: add simplekeyvaluefile reading test
Before this patch, mockvfs did not emulate readlines correctly
and there was no test for simplekeyvaluefile reading.
Pierre-Yves David <pierre-yves.david@ens-lyon.org> [Tue, 02 May 2017 18:57:52 +0200] rev 32268
caches: stop warming the cache after changegroup application
Now that we garantee that branchmap cache is updated at the end of the
transaction we can drop this update. This removes a problematic case with
nested transaction where the new cache could be written on disk before the
transaction is finished (and even roll-backed)
Such premature cache write was visible in the following test:
* tests/test-acl.t
* tests/test-rebase-conflicts.t
In addition, running the cache update later means having more date about the
state of the repository (in particular: phases). So we can generate caches with
more information. This creates harmless changes to the following tests:
* tests/test-hardlinks-whitelisted.t
* tests/test-hardlinks.t
* tests/test-phases.t
* tests/test-tags.t
* tests/test-inherit-mode.t
Pierre-Yves David <pierre-yves.david@ens-lyon.org> [Tue, 02 May 2017 22:27:44 +0200] rev 32267
caches: move the 'updating the branch cache' message in 'updatecaches'
We are about to remove the branchmap cache update in changegroup application.
There is a debug message alongside this update that we do not want to loose. We
move the message beforehand to simplify the test update in the next changeset.
The message move is quite noisy and isolating that noise is useful.
Most tests update are just line reordering since the message is issued at a
later point during the transaction.
After this changes, the message is displayed in more case since local commit
creation also issue it.
Pierre-Yves David <pierre-yves.david@ens-lyon.org> [Tue, 02 May 2017 18:56:07 +0200] rev 32266
caches: stop warming the cache after 'localrepo.commitctx'
Now that we garantee that branchmap cache are updated at the end of the
transaction we can drop that one. This removes a problematic case with nested
transaction where the new cache could be written on disk before the transaction
is finished.
The test change is harmless, since we update the cache at a later point, the
dirstate have been updated in between.
Pierre-Yves David <pierre-yves.david@ens-lyon.org> [Tue, 02 May 2017 21:35:06 +0200] rev 32265
caches: introduce a 'debugupdatecaches' command
That command make sure caches are updated. This is based on
'localrepo.updatecaches' so when we move support for new cache in that function this
command will benefit from it.
Pierre-Yves David <pierre-yves.david@ens-lyon.org> [Tue, 02 May 2017 19:05:58 +0200] rev 32264
caches: call 'repo.updatecache()' in 'repo.destroyed()'
Regenerating the cache after a 'strip' or a 'rollback' is useful. So we call the
generic cache warming function as other caches than just branchmap will be
updated there in the future.
To do so, we have to make 'repo.updatecache()' able to take no arguments. In
such cases, we reload all caches.
Pierre-Yves David <pierre-yves.david@ens-lyon.org> [Tue, 02 May 2017 21:39:43 +0200] rev 32263
caches: introduce a function to warm cache
We have multiple caches that gain from being kept up to date. For example in a
server setup, we want to make sure the branchcache cache is hot for other
read-only clients.
Right now each cache tries to update themself in place where new data have been
added. However the approach is error prone (we might miss some spot) and
fragile. When nested transaction are involved, such cache updates might happen
before a top level transaction is committed. Writing caches for uncommitted
data on disk.
Having a single entry point, run at the end of each successful transaction,
helps to ensure the cache is up to date and refreshed at the right time.
We start with updating the branchmap cache but other will come.
Pierre-Yves David <pierre-yves.david@ens-lyon.org> [Tue, 02 May 2017 18:45:51 +0200] rev 32262
transaction: track newly introduced revisions
Tracking revisions is not the data that will unlock the most new capability.
However, they are the simplest thing to track and still unlock some nice
improvements in regard with caching.
We plug ourself at the changelog level to make sure we do not miss any revision
additions.
The 'revs' set is configured at the repository level because the transaction
itself does not needs to know that much about the business logic.
Pierre-Yves David <pierre-yves.david@ens-lyon.org> [Tue, 02 May 2017 18:31:18 +0200] rev 32261
transaction: introduce "changes" dictionary to precisely track updates
The transaction is already tracking some data intended for hooks (in
'hookargs'). However, that information is minimal as we optimise for
passing data to other processes through environment variables. There are
multiple places were we could use more complete and lower level
information locally (eg: cache update, better report of changes to
hooks, etc...).
For this purpose we introduces a 'changes' dictionary on the
transaction. It is intended to track every changes happening to the
repository (eg: new revs, bookmarks move, phases move, obs-markers,
etc).
For now we just adds the 'changes' dictionary. We'll adds more tracking
and usages over time.
Siddharth Agarwal <sid0@fb.com> [Thu, 11 May 2017 10:50:05 -0700] rev 32260
clone: add a server-side option to disable full getbundles (pull-based clones)
For large enough repositories, pull-based clones take too long, and an attempt
to use them indicates some sort of configuration or other issue or maybe an
outdated Mercurial. Add a config option to disable them.
Siddharth Agarwal <sid0@fb.com> [Mon, 08 May 2017 20:01:06 -0700] rev 32259
clone: warn when streaming was requested but couldn't be performed
This helps both users and the people who support them figure out why
a stream clone couldn't be performed.
In an upcoming patch we're going to add a way for servers to hard
abort on a full getbundle. In those cases servers might expect
clients to perform a stream clone, so it's important to communicate
why one couldn't be done.
Siddharth Agarwal <sid0@fb.com> [Mon, 08 May 2017 18:47:24 -0700] rev 32258
clone: test streaming disabled because client is missing requirement
Turns out we had no coverage for this important case.
Siddharth Agarwal <sid0@fb.com> [Mon, 08 May 2017 17:30:51 -0700] rev 32257
bundle2: don't check for whether we can do stream clones
At the moment this isn't used and all stream clones use the legacy protocol.
In an upcoming diff, canperformstreamclone will print out a message if a stream
clone was requested but couldn't happen for some reason. Removing this call
ensures the message isn't printed twice.
FUJIWARA Katsunori <foozy@lares.dti.ne.jp> [Sat, 13 May 2017 03:37:50 +0900] rev 32256
debugcommands: add debugpickmergetool to examine which merge tool is chosen
Before this patch, there is no convenient way to know which merge tool
is chosen for each managed files without actual merging.
FUJIWARA Katsunori <foozy@lares.dti.ne.jp> [Sat, 13 May 2017 03:31:42 +0900] rev 32255
filemerge: add internal merge tool to dump files forcibly
Internal merge tool :dump implies premerge. Therefore, files aren't
dumped, if premerge runs successfully.
This undocumented behavior might confuse users, if they want to always
dump files. But just making :dump omit premerge might cause backward
compatibility issue for existing automation.
This patch adds new internal merge tool :forcedump, which works as
same as :dump, but omits premerge always.
Internal tools annotated with "nomerge" should merge "change and
delete" correctly, but _forcedump() can't. Therefore, it is annotated
with "mergeonly" to always omit premerge, even though it doesn't merge
files actually.
This patch also adds explanation about premerge to :dump, to clarify
how :dump actually works.
BTW, this patch specifies internal tools with "internal:" prefix in
newly added test scenario in test-merge-tools.t, even though this
prefix is already deprecated. This is only for similarity to other
tests in test-merge-tools.t.
FUJIWARA Katsunori <foozy@lares.dti.ne.jp> [Sat, 13 May 2017 03:28:36 +0900] rev 32254
filemerge: make warning message more i18n friendly
Before this patch, " specified for " part of warning messages
(e.g. "couldn't find merge tool TOOL specified for PAT") isn't
translatable.
FUJIWARA Katsunori <foozy@lares.dti.ne.jp> [Sat, 13 May 2017 03:28:36 +0900] rev 32253
filemerge: show warning about choice of :prompt only at an actual fallback
Before this patch, internal merge tool :prompt shows "no tool found to
merge FILE" line, even if :prompt is explicitly specified as a tool to
be used.
This patch shows warning message about choice of :prompt only at an
actual fallback, in which case any tool is rejected by capability for
binary or symlink.
It is for backward compatibility to omit warning message in
"changedelete" case.
Durham Goode <durham@fb.com> [Tue, 09 May 2017 13:56:46 -0700] rev 32252
treemanifest: allow manifestrevlog to take an explicit treemanifest arg
Previously we relied on the opener options to tell the revlog to be a tree
manifest. This makes it complicated for extensions to create treemanifests and
normal manifests at the same time. Let's add a construtor argument to create a
treemanifest revlog as well.
I considered removing the options['treemanifest'] logic from manifestrevlog
entirely, but doing so shifts the responsibility to the caller which ends up
requiring changes in localrepo, bundlerepo, and unionrepo. I figured having the
dual mechanism was better than polluting other parts of the code base with
treemanifest knowledge.
Yuya Nishihara <yuya@tcha.org> [Wed, 26 Apr 2017 23:02:43 +0900] rev 32251
policy: relax the default for in-place build
We're going to make the 'c' policy more strict, where no missing attribute
will be allowed. Since we want to run 'hg bisect' without rebuilding the C
extension modules, we'll need a looser policy for development environment.
The default for system installation isn't changed.
Note that the current 'c' policy is practically 'allow'-ish as we have lots
of adhoc fallbacks to pure functions.
Jun Wu <quark@fb.com> [Thu, 11 May 2017 14:52:02 -0700] rev 32250
verify: always check rawsize
Previously, verify only checks "rawsize == len(rawtext)", if
"len(fl.read()) != fl.size()".
With flag processor, "len(fl.read()) != fl.size()" does not necessarily mean
"rawsize == len(rawtext)". So we may miss a useful check.
This patch removes the "if len(fl.read()) != fl.size()" condition so the
rawsize check is always performed.
With the condition removed, "fl.read(n)" looks unnecessary so a comment was
added to explain the side effect is wanted.
Martin von Zweigbergk <martinvonz@google.com> [Thu, 11 May 2017 22:38:15 -0700] rev 32249
rebase: rename "target" to "destination" in messages
The help text for rebase calls it "the destination" (never "target"),
so let's use that in messages as well.
Martin von Zweigbergk <martinvonz@google.com> [Thu, 11 May 2017 22:38:03 -0700] rev 32248
rebase: rename "target" to "dest" in variable names
It took me a while to figure out that "target" was actually what's
passed to --dest.
Gregory Szorc <gregory.szorc@gmail.com> [Wed, 10 May 2017 23:32:00 -0700] rev 32247
sslutil: remove conditional cipher code needed for Python 2.6
We dropped support for Python 2.6. So this code to work around a
missing feature on 2.6 is no longer necessary.
Phil Cohen <phillco@fb.com> [Thu, 11 May 2017 18:38:43 -0700] rev 32246
merge: use repo.wvfs.setflags() instead of util.setflags()
Most merge.py code goes through the vfs instead of maniulating
files directly, so let's do the same here.
Martin von Zweigbergk <martinvonz@google.com> [Fri, 12 May 2017 11:20:25 -0700] rev 32245
merge with stable
Jun Wu <quark@fb.com> [Tue, 09 May 2017 21:27:06 -0700] rev 32244
revlog: move part of "addrevision" to "addrawrevision"
"addrawrevision" will be the public API to reuse revision rawdata elsewhere.
It will be used by a future patch.
Jun Wu <quark@fb.com> [Tue, 09 May 2017 20:23:21 -0700] rev 32243
filectx: add an overlayfilectx class
The end goal is to make it possible to avoid potential expensive fctx.data()
when unnecessary.
While memctx is useful for creating new file contexts, there are many cases
where we could reuse an existing raw file revision (amend, histedit, rebase,
process a revision constructed by a remote peer, etc). The overlayfilectx
class is made to support such reuse cases. Together with a later patch, hash
calculation and expensive flag processor could be avoided.
Jun Wu <quark@fb.com> [Tue, 09 May 2017 19:16:48 -0700] rev 32242
filectx: remove __new__
It does not seem to be used anywhere, and breaks a later patch.
Jun Wu <quark@fb.com> [Tue, 09 May 2017 16:34:12 -0700] rev 32241
filectx: add a rawflags method
The new method returns the low-level revlog flag. We already have "rawdata"
so a "rawflags" makes sense.
Both "rawflags" and "rawdata" will be used in a later patch.
Jun Wu <quark@fb.com> [Tue, 09 May 2017 19:53:31 -0700] rev 32240
filectx: move size to basefilectx
See previous patch for context - avoid code duplication.
Jun Wu <quark@fb.com> [Tue, 09 May 2017 19:48:57 -0700] rev 32239
filectx: make renamed a property cache
See previous patch for context - mainly to avoid code duplication.
Jun Wu <quark@fb.com> [Tue, 09 May 2017 19:23:28 -0700] rev 32238
filectx: make flags a property cache
Make basefilectx._flags a property cache, so basefilectx.flags() could be
reasonably reused. This avoids code duplication between memfilectx and a
class added in a later patch.
Jun Wu <quark@fb.com> [Sun, 30 Apr 2017 11:21:05 -0700] rev 32237
commandserver: move printbanner logic to bindsocket
bindsocket now handles listen automatically. "printbanner" seems to be just
a part of "bindsocket". This simplifies the interface a bit.
Jun Wu <quark@fb.com> [Sun, 30 Apr 2017 11:08:27 -0700] rev 32236
commandserver: move "listen" responsibility from service to handler
This enables chg to replace a server socket in an atomic way:
1. bind to a temp address
2. listen
3. rename
Currently 3 happens before 2 so a client may see the socket file but fails
to connect to it.
Gregory Szorc <gregory.szorc@gmail.com> [Mon, 08 May 2017 15:31:34 -0700] rev 32235
hghave: remove py27+ capability
It is now unused. And we require Python 2.7+ now so this check is not
necessary.
Gregory Szorc <gregory.szorc@gmail.com> [Mon, 08 May 2017 15:30:15 -0700] rev 32234
tests: remove test targeting Python 2.6
We just removed support for Python 2.7. This test is dead since it only
ran on <2.7.
Gregory Szorc <gregory.szorc@gmail.com> [Tue, 02 May 2017 16:19:04 -0700] rev 32233
setup: drop support for Python 2.6 (BC)
Per discussion on the mailing list and elsewhere, we've decided that
Python 2.6 is too old to continue supporting. We keep accumulating
hacks/fixes/workarounds for 2.6 and this is taking time away from
more important work.
So with this patch, we officially drop support for Python 2.6 and
require Python 2.7 to run Mercurial.
Gregory Szorc <gregory.szorc@gmail.com> [Sat, 06 May 2017 11:16:59 -0700] rev 32232
perf: move revlog construction and length calculation out of benchmark
We don't need to measure the time it takes to open the revlog or
calculate its length.
This is more consistent with what other perf* functions do.
While I was here, I also renamed the revlog variable from "r" to
"rl" - again in the name of consistency.
Gregory Szorc <gregory.szorc@gmail.com> [Sat, 06 May 2017 11:15:56 -0700] rev 32231
perf: clear revlog caches on every iteration
cmdutil.openrevlog() may return a cached revlog instance. This /may/
be a recent "regression" due to refactoring of the manifest API. I'm
not sure.
Either way, this perf command was broken for at least manifests because
subsequent invocations of the perf function would get cache hits from
previous invocations, invalidating results. In the extreme case,
testing the last revision in the revlog resulted in near-instantanous
execution of subsequent runs (since the fulltext is cached). A time
of ~1us would be reported in this case.
Gregory Szorc <gregory.szorc@gmail.com> [Sat, 06 May 2017 11:12:23 -0700] rev 32230
perf: don't convert rev to node before calling revlog.revision()
Gregory Szorc <gregory.szorc@gmail.com> [Sat, 06 May 2017 12:12:53 -0700] rev 32229
revlog: rename _chunkraw to _getsegmentforrevs()
This completes our rename of internal revlog methods to
distinguish between low-level raw revlog data "segments" and
higher-level, per-revision "chunks."
perf.py has been updated to consult both names so it will work
against older Mercurial versions.
Gregory Szorc <gregory.szorc@gmail.com> [Sat, 06 May 2017 12:02:31 -0700] rev 32228
perf: store reference to revlog._chunkraw in a local variable
To prepare for renaming revlog._chunkraw, we stuff a reference to this
metho in a local variable. This does 2 things. First, it moves the
attribute lookup outside of a loop, which more accurately measures
the time of the code being invoked. Second, it allows us to alias
to different methods depending on their presence (perf.py needs to
support running against old Mercurial versions).
Removing an attribute lookup from a tigh loop appears to shift the
numbers slightly with mozilla-central:
$ hg perfrevlogchunks -c
! read
! wall 0.354789 comb 0.340000 user 0.330000 sys 0.010000 (best of 28)
! wall 0.335932 comb 0.330000 user 0.290000 sys 0.040000 (best of 30)
! read w/ reused fd
! wall 0.342326 comb 0.340000 user 0.320000 sys 0.020000 (best of 29)
! wall 0.332857 comb 0.340000 user 0.290000 sys 0.050000 (best of 30)
! read batch
! wall 0.023623 comb 0.020000 user 0.000000 sys 0.020000 (best of 124)
! wall 0.023666 comb 0.020000 user 0.000000 sys 0.020000 (best of 125)
! read batch w/ reused fd
! wall 0.023828 comb 0.020000 user 0.000000 sys 0.020000 (best of 124)
! wall 0.023556 comb 0.020000 user 0.000000 sys 0.020000 (best of 126)
Gregory Szorc <gregory.szorc@gmail.com> [Sat, 06 May 2017 12:02:12 -0700] rev 32227
revlog: rename internal functions containing "chunk" to use "segment"
Currently, "chunk" is overloaded in revlog terminology to mean
multiple things. One of them refers to a segment of raw data from
the revlog. This commit renames various methods only used within
revlog.py to have "segment" in their name instead of "chunk."
While I was here, I also made the names more descriptive. e.g.
"_loadchunk()" becomes "_readsegment()" because it actually does
I/O.
Jun Wu <quark@fb.com> [Sat, 06 May 2017 16:36:24 -0700] rev 32226
fsmonitor: do not nuke dirstate filecache
In the future, chg may prefill repo's dirstate filecache so it's valuable
and should be kept. Previously we drop both filecache and property cache for
dirstate during fsmonitor reposetup, this patch changes it to only drop
property cache but keep the filecache.
Gregory Szorc <gregory.szorc@gmail.com> [Sat, 06 May 2017 11:01:02 -0700] rev 32225
perf: move gettimer() call
This is more consistent with other perf* functions.
Gregory Szorc <gregory.szorc@gmail.com> [Sat, 06 May 2017 10:59:38 -0700] rev 32224
perf: don't clobber startrev variable
Previously, the "startrev" argument would be ignored due to
"startrev = 0" in the benchmark function. This meant that
`hg perfrevlog` always started at revision 0.
Rename the local variable to "beginrev" so the variable does the
right thing.
Pierre-Yves David <pierre-yves.david@octobus.net> [Fri, 05 May 2017 17:31:15 +0200] rev 32223
bundle: add optional 'tagsfnodecache' data to on disk bundle (
issue5543)
This should help performance when unbundling.
Pierre-Yves David <pierre-yves.david@octobus.net> [Fri, 05 May 2017 17:28:52 +0200] rev 32222
bundle2: move tagsfnodecache generation in a generic function
This will help us reusing the logic for `hg bundle`.
Pierre-Yves David <pierre-yves.david@octobus.net> [Fri, 05 May 2017 17:09:47 +0200] rev 32221
bundle: introduce an higher level function to write bundle on disk
The current function ('writebundle') is focussing on getting an existing
changegroup to disk. It is no easy ways to includes more part in the generated
bundle2. So we introduce a slightly higher level function that is fed the
'outgoing' object (that defines the bundled spec) and the bundlespec parameters
(to control the changegroup generation and inclusion of other parts).
This is creating the third logic dedicated to create a consistent bundle2 (the
other 2 are the push code and the getbundle code). We should probably reconcile
them at some points but they all takes different types of input. So we need to
introduce an intermediate "object" that each different input could be converted
to. Such unified "bundle2 specification" could be fed to some unified code.
We start by having the `hg bundle` related code on its own to helps defines its
specific needs first. Once the common and specific parts of each logic will be
known we can start unification.
Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 04 May 2017 21:47:03 +0200] rev 32220
bundle: handle compression earlier
We can also handle that part before starting any generation.
Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 04 May 2017 21:46:02 +0200] rev 32219
bundle: check changegroup version earlier
We can check if we know how to bundle this changegroup version before actually
starting to generate the changegroup.
Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 04 May 2017 21:44:36 +0200] rev 32218
bundle: check lack of revs to bundle before generating the changegroup
We already have the information so we can check it earlier.
Matt Harbison <matt_harbison@yahoo.com> [Sat, 06 May 2017 23:00:57 -0400] rev 32217
extdiff: copy back files to the working directory if the size changed
In theory, it should be enough to pay attention only to the modification time
when detecting if a snapshotted working directory file changed. In practice,
BeyondCompare preserves all file attributes when syncing files at the directory
level. (If you open the file and sync individual hunks, then mtime does change,
and everything was being copied back as desired.) I'm not sure how many other
synchronization tools would trigger this issue, but it's annoyingly inconsistent
(if a single file is diffed, it isn't snapshotted, so the same BeyondCompare
file sync operation _is_ visible, because wdir() is updated in place.
I filed a bug with them, and they stated it is on their wish list, but won't be
fixed in the near term. This isn't a complete fix (there is still the case of
the size not changing), but this seems like a trivial enough change to fix most
of the problem. I suppose we could fool around with making files in the other
snapshot readonly, and copy back if we see the readonly bit copied. That seems
pretty hacky though, and only works if the external tool copies all attributes.
Matt Harbison <matt_harbison@yahoo.com> [Sat, 06 May 2017 22:48:06 -0400] rev 32216
test-extdiff: enable a previously failing test on Windows
Matt Harbison <matt_harbison@yahoo.com> [Sat, 06 May 2017 19:11:59 -0400] rev 32215
test-extdiff: narrow the range of an '#if execbit' block
Now that output can be conditionalized, the few `chmod +x` specific outputs can
be conditionalized, and the rest of the tests run as normal. Disable one test
that is failing on Windows for now.
Matt Harbison <matt_harbison@yahoo.com> [Sat, 06 May 2017 14:36:26 -0400] rev 32214
test-extdiff: deduplicate tests
Matt Harbison <matt_harbison@yahoo.com> [Sat, 06 May 2017 13:37:00 -0400] rev 32213
test-extdiff: fill in a missing Windows test
Yuya Nishihara <yuya@tcha.org> [Sat, 13 Aug 2016 17:21:58 +0900] rev 32212
policy: eliminate ".pure." from module name only if marked as dual
So we can switch cext/pure modules to new layout one by one.
Yuya Nishihara <yuya@tcha.org> [Fri, 12 Aug 2016 11:06:14 +0900] rev 32211
policy: add "cext" package which will host CPython extension modules
I'm going to restructure cext/pure modules and get rid of our hgimporter
hack. C extension modules will be moved to cext/ directory so old and new
compiled modules can coexist in development tree. This is necessary to
run 'hg bisect' without recompiling.
New extension modules will be loaded by an importer function:
base85 = policy.importmod('base85') # select pure.base85 or cext.base85
This will also allow us to split cffi from pure modules, which is currently
difficult because pure modules can't be imported by name.
Yuya Nishihara <yuya@tcha.org> [Tue, 02 May 2017 18:35:09 +0900] rev 32210
policy: mark all string literals as sysstr or bytes
The policy module won't be imported early in future, which means string
literals will be processed by our Python 3 loader.
Yuya Nishihara <yuya@tcha.org> [Wed, 26 Apr 2017 23:30:52 +0900] rev 32209
debuginstall: check C extensions only if they are loadable per policy
This check is useless in pure installation and I want to make it directly
import C extension modules.
Yuya Nishihara <yuya@tcha.org> [Wed, 26 Apr 2017 22:26:28 +0900] rev 32208
osutil: proxy through util (and platform) modules (API)
See the previous commit for why. Marked as API change since osutil.listdir()
seems widely used in third-party extensions.
The win32mbcs extension is updated to wrap both util. and windows. aliases.
FUJIWARA Katsunori <foozy@lares.dti.ne.jp> [Fri, 12 May 2017 21:46:14 +0900] rev 32207
win32mbcs: wrap underlying pycompat.bytestr to use checkwinfilename safely
win32mbcs wraps some functions, to prevent them from unintentionally
treating backslash (0x5c), which is used as the second or later byte
of multi bytes characters by problematic encodings, as a path
component delimiter on Windows platform.
This wrapping assumes that wrapped functions can safely accept unicode
string arguments.
Unfortunately,
d1937bdcee8c broke this assumption by introducing
pycompat.bytestr() into util.checkwinfilename() for py3 support. After
that, wrapped checkwinfilename() always fails for non-ASCII filename
at pycompat.bytestr() invocation.
This patch wraps underlying pycompat.bytestr() function to use
util.checkwinfilename() safely.
To avoid similar regression in the future, another patch series will
add smoke testing on default branch.
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 09 May 2017 15:08:47 +0200] rev 32206
hghave: prefill more version of Mercurial
The previous code was unable to go above version 4.0.
Mads Kiilerich <madski@unity3d.com> [Thu, 11 May 2017 17:18:40 +0200] rev 32205
graft: fix graft across merges of duplicates of grafted changes
Graft used findmissingrevs to find the candidates for graft duplicates in the
destination. That function operates with the constraint:
1. N is an ancestor of some node in 'heads'
2. N is not an ancestor of any node in 'common'
For our purpose, we do however have to work correctly in cases where the graft
set has multiple roots or where merges between graft ranges are skipped. The
only changesets we can be sure doesn't have ancestors that are grafts of any
changeset in the graftset, are the ones that are common ancestors of *all*
changesets in the graftset. We thus need:
2. N is not an ancestor of all nodes in 'common'
This change will graft more correctly, but it will also in some cases make
graft slower by making it search through a bigger and unnecessary large sets of
changes to find duplicates. In the general case of grafting individual or
linear sets, we do the same amount of work as before.
Mads Kiilerich <madski@unity3d.com> [Tue, 09 May 2017 00:11:30 +0200] rev 32204
graft: test coverage of grafts and how merges can break duplicate detection
This demonstrates unfortunate behaviour: extending the graft range cause the
graft to behave differently. When the graft range includes a merge, we fail to
detect duplicates that are ancestors of the merge.
Yuya Nishihara <yuya@tcha.org> [Wed, 26 Apr 2017 22:05:59 +0900] rev 32203
mpatch: proxy through mdiff module
See the previous commit for why.
Yuya Nishihara <yuya@tcha.org> [Wed, 26 Apr 2017 22:03:37 +0900] rev 32202
bdiff: proxy through mdiff module
See the previous commit for why.
mdiff seems a good place to host bdiff functions. bdiff.bdiff was already
aliased as textdiff, so we use it.
Yuya Nishihara <yuya@tcha.org> [Wed, 26 Apr 2017 21:56:47 +0900] rev 32201
base85: proxy through util module
I'm going to replace hgimporter with a simpler import function, so we can
access to pure/cext modules by name:
# util.py
base85 = policy.importmod('base85') # select pure.base85 or cext.base85
# cffi/base85.py
from ..pure.base85 import * # may re-export pure.base85 functions
This means we'll have to use policy.importmod() function in place of the
standard import statement, but we wouldn't want to write it every place where
C extension modules are used. So this patch makes util host base85 functions.
Yuya Nishihara <yuya@tcha.org> [Tue, 02 May 2017 17:05:22 +0900] rev 32200
mdiff: move re-exports to top
This style seems more common in our codebase.
Yuya Nishihara <yuya@tcha.org> [Tue, 02 May 2017 19:10:55 +0900] rev 32199
test-commit-interactive-curses: remove unused import of parsers
Matt Harbison <matt_harbison@yahoo.com> [Mon, 08 May 2017 23:05:01 -0400] rev 32198
churn: use the non-deprecated template option in the examples
Durham Goode <durham@fb.com> [Mon, 08 May 2017 11:35:23 -0700] rev 32197
strip: make tree stripping O(changes) instead of O(repo)
The old tree stripping logic iterated over every tree revlog in the repo looking
for commits that had revs to be stripped. That's very inefficient in large
repos. Instead, let's look at what files are touched by the strip and only
inspect those revlogs.
I don't have actual perf numbers, since internally we don't use a true
treemanifest, but simply iterating over hundreds of thousands of revlogs takes
many, many seconds, so this should help tremendously when stripping only a few
commits.
Durham Goode <durham@fb.com> [Mon, 08 May 2017 11:35:23 -0700] rev 32196
strip: move tree strip logic to it's own function
This will allow external extensions to modify tree strip behavior more
precisely.
Martin von Zweigbergk <martinvonz@google.com> [Mon, 08 May 2017 09:39:21 -0700] rev 32195
manifest: remove unused property _oldmanifest
The last use seems to have gone away in
7c7d845f8b64 (manifest: make
manifestlog use it's own cache, 2016-11-10).
Gregory Szorc <gregory.szorc@gmail.com> [Mon, 08 May 2017 09:30:26 -0700] rev 32194
sslutil: reference fingerprints config option properly (
issue5559)
The config option is "host:fingerprints" not "host.fingerprints".
This warning message is bad and misleads users.
Pulkit Goyal <7895pulkit@gmail.com> [Fri, 05 May 2017 04:48:42 +0530] rev 32193
py3: convert key to str to make kwargs.pop work in mq
The keys are passed here and there as unicodes and our transformer make things
bytes. Due to that, mq was not poped and this results in error on Py3.
Here we abuse r'' to make that str on Python 3.
Pulkit Goyal <7895pulkit@gmail.com> [Fri, 05 May 2017 04:41:45 +0530] rev 32192
py3: convert kwargs' keys to str before passing in cmdutil.getcommiteditor
Jun Wu <quark@fb.com> [Wed, 03 May 2017 23:50:41 -0700] rev 32191
diff: add a fast path to avoid loading binary contents
When diffing binary contents, with certain configs, we can show
"Binary file <name> has changed" without actual content.
That allows a fast path where we could avoid providing actual binary
contents. Note: in that case we still need to test if two contents are the
same, that's done by using "filectx.cmp", which could have its own fast
path.
Jun Wu <quark@fb.com> [Fri, 05 May 2017 17:20:32 -0700] rev 32190
diff: correct binary testing logic
This seems to be more correct given the table drawn in the previous patch.
Namely, "losedatafn" and "opts.git" are removed, "not opts.text" is added.
- losedatafn: diff output (binary) should not be affected by "losedatafn"
- opts.git: binary testing is helpful for detecting a fast path in the
next path. the fast path can also be used if opts.git is False
- opts.text: if it's set, we should treat the content as non-binary
Jun Wu <quark@fb.com> [Fri, 05 May 2017 16:48:58 -0700] rev 32189
diff: draw a table about binary diff behaviors
The table should make it easier to reason about future changes.
Jun Wu <quark@fb.com> [Wed, 03 May 2017 22:20:44 -0700] rev 32188
diff: use fctx.size() to test empty
fctx.size() could have a fast path that does not require loading content.
Jun Wu <quark@fb.com> [Wed, 03 May 2017 22:16:54 -0700] rev 32187
diff: use fctx.isbinary() to test binary
The end goal is to avoid calling fctx.data() when unnecessary. For example,
if diff.nobinary=1 and files are binary, the expected behavior is to print
"Binary file has changed". That could avoid reading fctx.data() sometimes.
This is mainly to enable an external LFS extension to skip expensive binary
file loading sometimes (read: most of the time with diff.nobinary=1 and
diff.text=0), without any behavior changes to mercurial (i.e. whether a file
is LFS or not does not change any behavior, LFS could be 100% transparent to
users).
Yuya Nishihara <yuya@tcha.org> [Thu, 20 Apr 2017 22:16:12 +0900] rev 32186
pycompat: extract helper to raise exception with traceback
It uses "raise excobj, None, tb" form which I think is simpler and more
useful than "raise exctype, args, tb".
Yuya Nishihara <yuya@tcha.org> [Thu, 04 May 2017 15:23:51 +0900] rev 32185
largefiles: make sure debugstate command is populated before wrapping
Copied the hack from
869d660b8669, which seemed the simplest workaround.
Perhaps debugcommands.py should have its own commands table.
Yuya Nishihara <yuya@tcha.org> [Mon, 01 May 2017 17:23:48 +0900] rev 32184
check-code: ignore re-exports of os.environ in encoding.py
These are valid uses of os.environ.
Yuya Nishihara <yuya@tcha.org> [Wed, 26 Apr 2017 21:51:19 +0900] rev 32183
check-code: exclude demandimport.py and policy.py from Python 3 checks
These modules can't depend on pycompat.py, which means we have to write Py3
hacks in them.
Yuya Nishihara <yuya@tcha.org> [Mon, 01 May 2017 17:10:22 +0900] rev 32182
check-code: rewrite py3 exclusion pattern with negative lookahead
I want to add more patterns, but negative lookbehind requires patterns of
the same length so not useful.
Yuya Nishihara <yuya@tcha.org> [Wed, 03 May 2017 11:16:55 +0900] rev 32181
cleanup: remove useless re-raises of KeyboardInterrupt
KeyboardInterrupt is no longer a subclass of Exception since Python 2.5.
https://docs.python.org/2/whatsnew/2.5.html#pep-352-exceptions-as-new-style-classes
Yuya Nishihara <yuya@tcha.org> [Fri, 12 Aug 2016 11:36:42 +0900] rev 32180
make: drop deprecated rule to process temporary copy of pure modules
Pure modules never be copied to mercurial/ since
511a4384b033.
FUJIWARA Katsunori <foozy@lares.dti.ne.jp> [Sat, 06 May 2017 02:33:00 +0900] rev 32179
help: describe about choice of :prompt as a fallback merge tool explicitly
"merge-tools" help topic has described that the merge of the file
fails if no tool is found to merge binary or symlink, since
c77f6276c9e7 (or Mercurial 1.7), which based on (already removed)
MergeProgram wiki page.
But even at that revision, and of course now, merge of the file
doesn't fail automatically for binary/symlink. ":prompt" (or
equivalent logic) is used, if there is no appropriate tool
configuration for binary/symlink.
Steve Borho <steve@borho.org> [Sat, 06 May 2017 10:18:34 -0500] rev 32178
wix: only one KeyPath is allowed per Component
Martin von Zweigbergk <martinvonz@google.com> [Fri, 05 May 2017 08:49:46 -0700] rev 32177
dirstate: optimize walk() by using match.visitdir()
We already have the logic for restricting directory walks in
match.visitdir() that we use for treemanifests. We should take
advantage of it when walking the working copy as well.
This speeds up "hg st -I rootfilesin:." on the Firefox repo from
0.587s to 0.305s on warm disk (and much more on cold disk). More time
is spent reading the dirstate than walking the working copy after.
I tried to find scenarios where calling match.visitdir() would be a
noticeable overhead, but I couldn't find any. I encourage the reader
to try for themselves, since this is performance-critical code.
Martin von Zweigbergk <martinvonz@google.com> [Fri, 05 May 2017 08:49:07 -0700] rev 32176
match: optimize visitdir() for patterns matching only root directory
Because _rootsanddirs() returns a list of directories to visit
recursively and a list of directories to visit non-recursively. For
patterns such as 'rootfilesin:foo/bar', we clearly need to visit the
directory foo/bar, but we also need to visit its parents. The method
therefore uses util.dirs() to find the parent directories of
'foo/bar'. That method does not include the root directory, but since
we obviously need to visit the root directory, we always added '.' to
the set of directories to visit non-recursively.
The visitdir() method had special handling to consider set(['.']) to
mean that no includes had been specified and would thus visit all
directories. However, when the pattern is 'rootfilesin:.', set(['.'])
is actually the real set of directories to visit and the special
handling of that set meant that all directories got visited instead of
just the root directory.
The fix is simple: add '.' to the set of parent directories in
_rootsanddirs() and stop treating set(['.']) specially. This makes
hg files -r . -I rootfilesin:.
in a treemanifest version of the Firefox repo go from 1.5s to 0.26s on
warm disk (and a *much* bigger improvement on cold disk).
Note that the -I is necessary for no good reason. We just haven't
optimized visitdir() for regular (non-include, non-exclude) patterns
yet.
Martin von Zweigbergk <martinvonz@google.com> [Sat, 11 Mar 2017 12:25:56 -0800] rev 32175
rebase: don't update state dict same way for each root
The update statement does not depend on anything in the loop, so just
move it before the loop and do it once. There are no cases where
update would happen 0 times before (and 1 now); the function returns
early in all such cases.
Martin von Zweigbergk <martinvonz@google.com> [Thu, 04 May 2017 21:11:40 -0700] rev 32174
forget: access status fields by name, not index
Phil Cohen <phillco@fb.com> [Wed, 03 May 2017 18:26:57 -0700] rev 32173
demandimport: add urwid.command_map to ignore list
The useful pudb debugger can be used with Mercurial, but its import of urwid
fails when demandimport is enabled. Add urwid.command_map to the ignore list so
pudb can be used with hg without disabling all of demandimport.