Yuya Nishihara <yuya@tcha.org> [Tue, 02 May 2017 18:35:09 +0900] rev 32210
policy: mark all string literals as sysstr or bytes
The policy module won't be imported early in future, which means string
literals will be processed by our Python 3 loader.
Yuya Nishihara <yuya@tcha.org> [Wed, 26 Apr 2017 23:30:52 +0900] rev 32209
debuginstall: check C extensions only if they are loadable per policy
This check is useless in pure installation and I want to make it directly
import C extension modules.
Yuya Nishihara <yuya@tcha.org> [Wed, 26 Apr 2017 22:26:28 +0900] rev 32208
osutil: proxy through util (and platform) modules (API)
See the previous commit for why. Marked as API change since osutil.listdir()
seems widely used in third-party extensions.
The win32mbcs extension is updated to wrap both util. and windows. aliases.
FUJIWARA Katsunori <foozy@lares.dti.ne.jp> [Fri, 12 May 2017 21:46:14 +0900] rev 32207
win32mbcs: wrap underlying pycompat.bytestr to use checkwinfilename safely
win32mbcs wraps some functions, to prevent them from unintentionally
treating backslash (0x5c), which is used as the second or later byte
of multi bytes characters by problematic encodings, as a path
component delimiter on Windows platform.
This wrapping assumes that wrapped functions can safely accept unicode
string arguments.
Unfortunately,
d1937bdcee8c broke this assumption by introducing
pycompat.bytestr() into util.checkwinfilename() for py3 support. After
that, wrapped checkwinfilename() always fails for non-ASCII filename
at pycompat.bytestr() invocation.
This patch wraps underlying pycompat.bytestr() function to use
util.checkwinfilename() safely.
To avoid similar regression in the future, another patch series will
add smoke testing on default branch.
Pierre-Yves David <pierre-yves.david@octobus.net> [Tue, 09 May 2017 15:08:47 +0200] rev 32206
hghave: prefill more version of Mercurial
The previous code was unable to go above version 4.0.
Mads Kiilerich <madski@unity3d.com> [Thu, 11 May 2017 17:18:40 +0200] rev 32205
graft: fix graft across merges of duplicates of grafted changes
Graft used findmissingrevs to find the candidates for graft duplicates in the
destination. That function operates with the constraint:
1. N is an ancestor of some node in 'heads'
2. N is not an ancestor of any node in 'common'
For our purpose, we do however have to work correctly in cases where the graft
set has multiple roots or where merges between graft ranges are skipped. The
only changesets we can be sure doesn't have ancestors that are grafts of any
changeset in the graftset, are the ones that are common ancestors of *all*
changesets in the graftset. We thus need:
2. N is not an ancestor of all nodes in 'common'
This change will graft more correctly, but it will also in some cases make
graft slower by making it search through a bigger and unnecessary large sets of
changes to find duplicates. In the general case of grafting individual or
linear sets, we do the same amount of work as before.
Mads Kiilerich <madski@unity3d.com> [Tue, 09 May 2017 00:11:30 +0200] rev 32204
graft: test coverage of grafts and how merges can break duplicate detection
This demonstrates unfortunate behaviour: extending the graft range cause the
graft to behave differently. When the graft range includes a merge, we fail to
detect duplicates that are ancestors of the merge.
Yuya Nishihara <yuya@tcha.org> [Wed, 26 Apr 2017 22:05:59 +0900] rev 32203
mpatch: proxy through mdiff module
See the previous commit for why.
Yuya Nishihara <yuya@tcha.org> [Wed, 26 Apr 2017 22:03:37 +0900] rev 32202
bdiff: proxy through mdiff module
See the previous commit for why.
mdiff seems a good place to host bdiff functions. bdiff.bdiff was already
aliased as textdiff, so we use it.
Yuya Nishihara <yuya@tcha.org> [Wed, 26 Apr 2017 21:56:47 +0900] rev 32201
base85: proxy through util module
I'm going to replace hgimporter with a simpler import function, so we can
access to pure/cext modules by name:
# util.py
base85 = policy.importmod('base85') # select pure.base85 or cext.base85
# cffi/base85.py
from ..pure.base85 import * # may re-export pure.base85 functions
This means we'll have to use policy.importmod() function in place of the
standard import statement, but we wouldn't want to write it every place where
C extension modules are used. So this patch makes util host base85 functions.
Yuya Nishihara <yuya@tcha.org> [Tue, 02 May 2017 17:05:22 +0900] rev 32200
mdiff: move re-exports to top
This style seems more common in our codebase.
Yuya Nishihara <yuya@tcha.org> [Tue, 02 May 2017 19:10:55 +0900] rev 32199
test-commit-interactive-curses: remove unused import of parsers
Matt Harbison <matt_harbison@yahoo.com> [Mon, 08 May 2017 23:05:01 -0400] rev 32198
churn: use the non-deprecated template option in the examples
Durham Goode <durham@fb.com> [Mon, 08 May 2017 11:35:23 -0700] rev 32197
strip: make tree stripping O(changes) instead of O(repo)
The old tree stripping logic iterated over every tree revlog in the repo looking
for commits that had revs to be stripped. That's very inefficient in large
repos. Instead, let's look at what files are touched by the strip and only
inspect those revlogs.
I don't have actual perf numbers, since internally we don't use a true
treemanifest, but simply iterating over hundreds of thousands of revlogs takes
many, many seconds, so this should help tremendously when stripping only a few
commits.
Durham Goode <durham@fb.com> [Mon, 08 May 2017 11:35:23 -0700] rev 32196
strip: move tree strip logic to it's own function
This will allow external extensions to modify tree strip behavior more
precisely.
Martin von Zweigbergk <martinvonz@google.com> [Mon, 08 May 2017 09:39:21 -0700] rev 32195
manifest: remove unused property _oldmanifest
The last use seems to have gone away in
7c7d845f8b64 (manifest: make
manifestlog use it's own cache, 2016-11-10).
Gregory Szorc <gregory.szorc@gmail.com> [Mon, 08 May 2017 09:30:26 -0700] rev 32194
sslutil: reference fingerprints config option properly (
issue5559)
The config option is "host:fingerprints" not "host.fingerprints".
This warning message is bad and misleads users.
Pulkit Goyal <7895pulkit@gmail.com> [Fri, 05 May 2017 04:48:42 +0530] rev 32193
py3: convert key to str to make kwargs.pop work in mq
The keys are passed here and there as unicodes and our transformer make things
bytes. Due to that, mq was not poped and this results in error on Py3.
Here we abuse r'' to make that str on Python 3.
Pulkit Goyal <7895pulkit@gmail.com> [Fri, 05 May 2017 04:41:45 +0530] rev 32192
py3: convert kwargs' keys to str before passing in cmdutil.getcommiteditor
Jun Wu <quark@fb.com> [Wed, 03 May 2017 23:50:41 -0700] rev 32191
diff: add a fast path to avoid loading binary contents
When diffing binary contents, with certain configs, we can show
"Binary file <name> has changed" without actual content.
That allows a fast path where we could avoid providing actual binary
contents. Note: in that case we still need to test if two contents are the
same, that's done by using "filectx.cmp", which could have its own fast
path.
Jun Wu <quark@fb.com> [Fri, 05 May 2017 17:20:32 -0700] rev 32190
diff: correct binary testing logic
This seems to be more correct given the table drawn in the previous patch.
Namely, "losedatafn" and "opts.git" are removed, "not opts.text" is added.
- losedatafn: diff output (binary) should not be affected by "losedatafn"
- opts.git: binary testing is helpful for detecting a fast path in the
next path. the fast path can also be used if opts.git is False
- opts.text: if it's set, we should treat the content as non-binary
Jun Wu <quark@fb.com> [Fri, 05 May 2017 16:48:58 -0700] rev 32189
diff: draw a table about binary diff behaviors
The table should make it easier to reason about future changes.
Jun Wu <quark@fb.com> [Wed, 03 May 2017 22:20:44 -0700] rev 32188
diff: use fctx.size() to test empty
fctx.size() could have a fast path that does not require loading content.
Jun Wu <quark@fb.com> [Wed, 03 May 2017 22:16:54 -0700] rev 32187
diff: use fctx.isbinary() to test binary
The end goal is to avoid calling fctx.data() when unnecessary. For example,
if diff.nobinary=1 and files are binary, the expected behavior is to print
"Binary file has changed". That could avoid reading fctx.data() sometimes.
This is mainly to enable an external LFS extension to skip expensive binary
file loading sometimes (read: most of the time with diff.nobinary=1 and
diff.text=0), without any behavior changes to mercurial (i.e. whether a file
is LFS or not does not change any behavior, LFS could be 100% transparent to
users).
Yuya Nishihara <yuya@tcha.org> [Thu, 20 Apr 2017 22:16:12 +0900] rev 32186
pycompat: extract helper to raise exception with traceback
It uses "raise excobj, None, tb" form which I think is simpler and more
useful than "raise exctype, args, tb".
Yuya Nishihara <yuya@tcha.org> [Thu, 04 May 2017 15:23:51 +0900] rev 32185
largefiles: make sure debugstate command is populated before wrapping
Copied the hack from
869d660b8669, which seemed the simplest workaround.
Perhaps debugcommands.py should have its own commands table.
Yuya Nishihara <yuya@tcha.org> [Mon, 01 May 2017 17:23:48 +0900] rev 32184
check-code: ignore re-exports of os.environ in encoding.py
These are valid uses of os.environ.
Yuya Nishihara <yuya@tcha.org> [Wed, 26 Apr 2017 21:51:19 +0900] rev 32183
check-code: exclude demandimport.py and policy.py from Python 3 checks
These modules can't depend on pycompat.py, which means we have to write Py3
hacks in them.
Yuya Nishihara <yuya@tcha.org> [Mon, 01 May 2017 17:10:22 +0900] rev 32182
check-code: rewrite py3 exclusion pattern with negative lookahead
I want to add more patterns, but negative lookbehind requires patterns of
the same length so not useful.
Yuya Nishihara <yuya@tcha.org> [Wed, 03 May 2017 11:16:55 +0900] rev 32181
cleanup: remove useless re-raises of KeyboardInterrupt
KeyboardInterrupt is no longer a subclass of Exception since Python 2.5.
https://docs.python.org/2/whatsnew/2.5.html#pep-352-exceptions-as-new-style-classes
Yuya Nishihara <yuya@tcha.org> [Fri, 12 Aug 2016 11:36:42 +0900] rev 32180
make: drop deprecated rule to process temporary copy of pure modules
Pure modules never be copied to mercurial/ since
511a4384b033.
FUJIWARA Katsunori <foozy@lares.dti.ne.jp> [Sat, 06 May 2017 02:33:00 +0900] rev 32179
help: describe about choice of :prompt as a fallback merge tool explicitly
"merge-tools" help topic has described that the merge of the file
fails if no tool is found to merge binary or symlink, since
c77f6276c9e7 (or Mercurial 1.7), which based on (already removed)
MergeProgram wiki page.
But even at that revision, and of course now, merge of the file
doesn't fail automatically for binary/symlink. ":prompt" (or
equivalent logic) is used, if there is no appropriate tool
configuration for binary/symlink.
Steve Borho <steve@borho.org> [Sat, 06 May 2017 10:18:34 -0500] rev 32178
wix: only one KeyPath is allowed per Component
Martin von Zweigbergk <martinvonz@google.com> [Fri, 05 May 2017 08:49:46 -0700] rev 32177
dirstate: optimize walk() by using match.visitdir()
We already have the logic for restricting directory walks in
match.visitdir() that we use for treemanifests. We should take
advantage of it when walking the working copy as well.
This speeds up "hg st -I rootfilesin:." on the Firefox repo from
0.587s to 0.305s on warm disk (and much more on cold disk). More time
is spent reading the dirstate than walking the working copy after.
I tried to find scenarios where calling match.visitdir() would be a
noticeable overhead, but I couldn't find any. I encourage the reader
to try for themselves, since this is performance-critical code.
Martin von Zweigbergk <martinvonz@google.com> [Fri, 05 May 2017 08:49:07 -0700] rev 32176
match: optimize visitdir() for patterns matching only root directory
Because _rootsanddirs() returns a list of directories to visit
recursively and a list of directories to visit non-recursively. For
patterns such as 'rootfilesin:foo/bar', we clearly need to visit the
directory foo/bar, but we also need to visit its parents. The method
therefore uses util.dirs() to find the parent directories of
'foo/bar'. That method does not include the root directory, but since
we obviously need to visit the root directory, we always added '.' to
the set of directories to visit non-recursively.
The visitdir() method had special handling to consider set(['.']) to
mean that no includes had been specified and would thus visit all
directories. However, when the pattern is 'rootfilesin:.', set(['.'])
is actually the real set of directories to visit and the special
handling of that set meant that all directories got visited instead of
just the root directory.
The fix is simple: add '.' to the set of parent directories in
_rootsanddirs() and stop treating set(['.']) specially. This makes
hg files -r . -I rootfilesin:.
in a treemanifest version of the Firefox repo go from 1.5s to 0.26s on
warm disk (and a *much* bigger improvement on cold disk).
Note that the -I is necessary for no good reason. We just haven't
optimized visitdir() for regular (non-include, non-exclude) patterns
yet.
Martin von Zweigbergk <martinvonz@google.com> [Sat, 11 Mar 2017 12:25:56 -0800] rev 32175
rebase: don't update state dict same way for each root
The update statement does not depend on anything in the loop, so just
move it before the loop and do it once. There are no cases where
update would happen 0 times before (and 1 now); the function returns
early in all such cases.
Martin von Zweigbergk <martinvonz@google.com> [Thu, 04 May 2017 21:11:40 -0700] rev 32174
forget: access status fields by name, not index
Phil Cohen <phillco@fb.com> [Wed, 03 May 2017 18:26:57 -0700] rev 32173
demandimport: add urwid.command_map to ignore list
The useful pudb debugger can be used with Mercurial, but its import of urwid
fails when demandimport is enabled. Add urwid.command_map to the ignore list so
pudb can be used with hg without disabling all of demandimport.
Martin von Zweigbergk <martinvonz@google.com> [Fri, 05 May 2017 10:08:36 -0700] rev 32172
outgoing: run on filtered repo
outgoing has been using an unfiltered repo since
fe67107094fd (discovery:
outgoing pass unfiltered repo to findcommonincoming (
issue3776),
2013-01-28). If I'm reading code and history correctly, it should be
safe to run _outgoing() on a filtered repo since
c5456b64eb07
(discovery: run discovery on filtered repository, 2015-01-07). By
running _outgoing() on a filtered repo, we can also remove the
workaround there for ignoring filtered revisions.
Martin von Zweigbergk <martinvonz@google.com> [Fri, 05 May 2017 14:10:58 -0700] rev 32171
manifest: remove check for non-contexts in _dirmancache
It looks like the _dirmancache has contained only manifest contexts
since
d79c141fdf41 (manifest: remove usages of manifest.read,
2016-11-10).
Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 04 May 2017 12:48:45 +0200] rev 32170
bundle: factor the 'getchangegroup' out
The call in the two branches is identical, so we can just issue it outside of
the conditional.
Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 04 May 2017 12:47:27 +0200] rev 32169
bundle: avoid reset of the 'outgoing' variable
We have a cleaner way to achieve the same effect. Not resetting the variable
will help us to simplify the code.
Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 04 May 2017 12:43:41 +0200] rev 32168
changegroup: deprecate 'getlocalchangroup' (API)
We have 'getchangegroup' with a shorter name for the exactly same feature. Now
that all users are gone we can formally deprecate it.
Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 04 May 2017 12:41:50 +0200] rev 32167
tests: directly 'getchangegroup'
It is identical to 'getlocalchangegroup' with a shorter name.
Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 04 May 2017 12:41:36 +0200] rev 32166
exchange: directly 'getchangegroup'
It is identical to 'getlocalchangegroup' with a shorter name.
Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 04 May 2017 12:41:17 +0200] rev 32165
commands: directly 'getchangegroup'
It is identical to 'getlocalchangegroup' with a shorter name.
Pierre-Yves David <pierre-yves.david@octobus.net> [Thu, 04 May 2017 12:36:45 +0200] rev 32164
changegroup: deduplicate 'getlocalchangegroup'
The two functions 'getlocalchangegroup' and 'getchangegroup' have been strictly
identical for multiple years ('getchangegroup' had a deprecated docstring)
We'll drop one of them (getlocalchangegroup, since it has the longest name).
However, we needs to migrate all users of the dropped one to the new one before
we can deprecate it. In the mean time we drop one of the duplicated definition
and the outdated docstring.
Pulkit Goyal <7895pulkit@gmail.com> [Thu, 04 May 2017 04:57:30 +0530] rev 32163
py3: add test to show 'hg log -Tjson' works
Pulkit Goyal <7895pulkit@gmail.com> [Thu, 04 May 2017 04:52:03 +0530] rev 32162
py3: add test to show 'hg log -G' works
Pulkit Goyal <7895pulkit@gmail.com> [Thu, 04 May 2017 04:42:05 +0530] rev 32161
py3: rename test-check-py3-commands.t to test-py3-commands.t
test-check-*.t is a set of tests which tests certain coding style checks. So
this test was wrongly named, thanks to marmoute for pointing this out.
Pulkit Goyal <7895pulkit@gmail.com> [Thu, 04 May 2017 04:38:20 +0530] rev 32160
py3: use list of bytes rather than bytestring while extending bytes into lists
Python 2:
>>> ls = []
>>> ls.extend(b'abc')
>>> ls
['a', 'b', 'c']
Python 3:
>>> ls = []
>>> ls.extend(b'abc')
>>> ls
[97, 98, 99]
So to make sure things work fine in Py3, this patch does:
>>> ls = []
>>> ls.extend([b'a', b'b', b'c'])
>>> ls
[b'a', b'b', b'c']
After this `hg log -G` works!
Pulkit Goyal <7895pulkit@gmail.com> [Thu, 04 May 2017 01:12:14 +0530] rev 32159
py3: use pycompat.byteskwargs to converts kwargs to bytes
baseformatter._item must contain both keys and values in bytes. So to make
sure that, we convert the opts back to bytes.
Pulkit Goyal <7895pulkit@gmail.com> [Thu, 04 May 2017 00:44:53 +0530] rev 32158
py3: make adefaults keys str to be compatible with getattr
getattr passes a str value of the attribute to be looked and keys in adefaults
dict are bytes which resulted in AttributeError. This patch abuses r'' to
make the keys str.
Pulkit Goyal <7895pulkit@gmail.com> [Wed, 03 May 2017 15:41:28 +0530] rev 32157
py3: abuse r'' to access keys in keyword arguments
Pulkit Goyal <7895pulkit@gmail.com> [Wed, 03 May 2017 15:37:51 +0530] rev 32156
py3: use pycompat.bytechr instead of chr
Pulkit Goyal <7895pulkit@gmail.com> [Fri, 05 May 2017 01:41:54 +0530] rev 32155
py3: use %d to format integers into bytestrings
Pulkit Goyal <7895pulkit@gmail.com> [Fri, 05 May 2017 01:26:49 +0530] rev 32154
py3: use pycompat.bytestr instead of bytes
Pulkit Goyal <7895pulkit@gmail.com> [Fri, 05 May 2017 01:26:13 +0530] rev 32153
py3: slice over bytes to prevent getting ascii values
Pulkit Goyal <7895pulkit@gmail.com> [Sat, 08 Apr 2017 11:02:37 +0530] rev 32152
py3: use encoding.unitolocal instead of .encode(encoding.encoding)
Durham Goode <durham@fb.com> [Wed, 03 May 2017 10:43:59 -0700] rev 32151
rebase: use matcher to optimize manifestmerge
The old merge code would call manifestmerge and calculate the complete diff
between the source to the destination. In many cases, like rebase, the vast
majority of differences between the source and destination are irrelevant
because they are differences between the destination and the common ancestor
only, and therefore don't affect the merge. Since most actions are 'keep', all
the effort to compute them is wasted.
Instead, let's compute the difference between the source and the common ancestor
and only perform the diff of those files against the merge destination. When
using treemanifest, this lets us avoid loading almost the entire tree when
rebasing from a very old ancestor. This speeds up rebase of an old stack of 27
commits by 20x.
In mozilla-central, without treemanifest, when rebasing a commit from
default~100000 to default, this speeds up the manifestmerge step from 2.6s to
1.2s. However, the additional diff adds an overhead to all manifestmerge calls,
especially for flat manifests. When rebasing a commit from default~1 to default
it appears to add 100ms in mozilla-central. While we could put this optimization
behind a flag, I think the fact that it makes merge O(number of changes being
applied) instead of O(number of changes between X and Y) justifies it.
Martin von Zweigbergk <martinvonz@google.com> [Tue, 02 May 2017 23:47:10 -0700] rev 32150
changegroup: delete unused 'bundlecaps' argument (API)
Martin von Zweigbergk <martinvonz@google.com> [Wed, 03 May 2017 10:33:26 -0700] rev 32149
localrepo: reuse exchange.bundle2requested()
It seems like localrepo.getbundle() is trying to do the same thing, so
let's just call the method. That way we get the same condition as
there (matching any "HG2" prefix, not only "HG20").
Pulkit Goyal <7895pulkit@gmail.com> [Fri, 28 Apr 2017 01:13:07 +0530] rev 32148
py3: use raw strings while accessing class.__dict__
The keys of class.__dict__ are unicodes on Python 3.
Pulkit Goyal <7895pulkit@gmail.com> [Tue, 25 Apr 2017 01:52:30 +0530] rev 32147
py3: handle opts correctly for `hg add`
opts in add command were passed again to cmdutil.add() as kwargs so we need
to convert them again to str. Intstead we convert them to bytes when passing
scmutil.match(). Opts handling is also corrected for all the functions which
are called from cmdutil.add().