Philippe Pepiot <philippe.pepiot@logilab.fr> [Thu, 15 Dec 2016 12:17:08 +0100] rev 30588
perf: add historical support of ui.load()
ui.load() has been available since
d83ca854 and at the time of writing isn't
available on stable branch breaking benchmarking newer stable revisions.
Add historical portability policy note on contrib/benchmarks
Jun Wu <quark@fb.com> [Wed, 14 Dec 2016 02:17:59 +0000] rev 30587
chg: ignore HG_* in confighash
The environment variables `HG_*` are usually used by hooks. Unlike `HGPLAIN`
etc, they do not actually affect hg's behavior. So do not include them in
confighash.
This would avoid spawning an unbound number of chg server processes if
commit hook calls hg frequently.
Pulkit Goyal <7895pulkit@gmail.com> [Tue, 13 Dec 2016 20:53:40 +0530] rev 30586
py3: make keys of keyword arguments strings
keys of keyword arguments on Python 3 has to be string. We are dealing with
bytes in our codebase so the keys are also bytes. Done that using
pycompat.strkwargs().
Also after this patch, `hg version` now runs on Python 3.5. Hurray!
Jun Wu <quark@fb.com> [Mon, 12 Dec 2016 08:01:52 +0000] rev 30585
error: make it clear that ProgrammingError is for mercurial developers
The word "developer" could refer to users - people using hg are likely to be
developers. Add adjectives to make it refer to mercurial developers only.
Remi Chaintron <remi@fb.com> [Tue, 13 Dec 2016 14:21:36 +0000] rev 30584
revlog: merge hash checking subfunctions
This patch factors the behavior of both methods into 'checkhash'.
Stanislau Hlebik <stash@fb.com> [Fri, 09 Dec 2016 03:22:26 -0800] rev 30583
bookmarks: make bookmarks.comparebookmarks accept binary nodes (API)
Binary bookmark format should be used internally. It doesn't make sense to have
optional parameters `srchex` and `dsthex`. This patch removes them. It will
also be useful for `bookmarks` bundle2 part because unnecessary conversions
between hex and bin nodes will be avoided.
Stanislau Hlebik <stash@fb.com> [Tue, 22 Nov 2016 01:33:31 -0800] rev 30582
bookmarks: rename `compare()` to `comparebookmarks()` (API)
Next commit will remove optional parameters from `compare()` function.
Let's rename `compare()` to `comparebookmarks()` to avoid ambiguity from
callers from external extensions.
Gábor Stefanik <gabor.stefanik@nng.com> [Mon, 05 Dec 2016 17:40:01 +0100] rev 30581
graft: support grafting changes to new file in renamed directory (
issue5436)
Jun Wu <quark@fb.com> [Mon, 28 Nov 2016 05:45:22 +0000] rev 30580
rebase: calculate ancestors for --base separately (
issue5420)
Previously, the --base option only works with a single "branch" - if there
is one changeset in the "--base" revset whose branching point(s) is/are
different from another changeset in the "--base" revset, "rebase" will error
out with:
abort: source is ancestor of destination
This happens if the user has multiple draft branches, and uses "hg rebase -b
'draft()' -d master", for example. The error message looks cryptic to users
who don't know the implementation detail.
This patch changes the logic to calculate the common ancestor for every
"base" changeset separately so we won't (incorrectly) select "source" which
is an ancestor of the destination.
This patch should not change the behavior where all changesets specified by
"--base" have the same branching point(s).
A new situation is: some of the specified changesets could be rebased, while
some couldn't (because they are descendants of the destination, or they do
not share a common ancestor with the destination). The current behavior is
to show "nothing to rebase" and exits with 1.
This patch maintains the current behavior (show "nothing to rebase") even if
part of the "--base" revset could be rebased. A clearer error message may be
"cannot find branching point for X", or "X is a descendant of destination".
The error message issue is tracked by
issue5422 separately.
A test is added with all kinds of tricky cases I could think of for now.
Pulkit Goyal <7895pulkit@gmail.com> [Wed, 07 Dec 2016 21:53:03 +0530] rev 30579
py3: utility functions to convert keys of kwargs to bytes/unicodes
Keys of keyword arguments need to be str(unicodes) on Python 3. We have a lot
of function where we pass keyword arguments. Having utility functions to help
converting keys to unicodes before passing and convert back them to bytes once
passed into the function will be helpful. We now have functions named
pycompat.strkwargs(dic) and pycompat.byteskwargs(dic) to help us.
Pulkit Goyal <7895pulkit@gmail.com> [Tue, 06 Dec 2016 06:36:36 +0530] rev 30578
py3: make a bytes version of getopt.getopt()
getopt.getopt() deals with unicodes on Python 3 internally and if bytes
arguments are passed, then it will return TypeError. So we have now
pycompat.getoptb() which takes bytes arguments, convert them to unicode, call
getopt.getopt() and then convert the returned value back to bytes and then
return those value.
All the instances of getopt.getopt() are replaced with pycompat.getoptb().
Jun Wu <quark@fb.com> [Tue, 06 Dec 2016 11:44:49 +0000] rev 30577
parsers: use buffer to store revlog index
Previously, the revlog index passed to parse_index2 must be a "string",
which means we have to read the whole revlog index into memory. This patch
makes the code accept a generic Py_buffer, to be more flexible - it could be
a "string", or anything that implements the buffer interface, like a mmap-ed
region.
Note: ideally we want to remove the "data" field. However, it is still used
in parse_index2:
if (idx->inlined) {
cache = Py_BuildValue("iO", 0, idx->data);
....
}
....
tuple = Py_BuildValue("NN", idx, cache);
....
return tuple;
Its only users are revlogio.parseindex and revlog.__init__:
# revlogio.parseindex
index, cache = parsers.parse_index2(data, inline)
return index, getattr(index, 'nodemap', None), cache
# revlog.__init__
d = self._io.parseindex(indexdata, self._inline)
self.index, nodemap, self._chunkcache = d
Maybe we could move the logic (testing inline and returnning "data" object)
to revlog.py. But that should be a separate patch.
Pulkit Goyal <7895pulkit@gmail.com> [Tue, 06 Dec 2016 06:27:58 +0530] rev 30576
fancyopts: switch from fancyopts.getopt.* to getopt.*
In the next patch, we will be creating a bytes version of getopt.getopt() and
doing that will leave getopt as unused import in fancyopts. So before removing
that there are instances in codebase where instead of importing getopt, we
have used fancyopts.getopt. This patch will switch all those cases so that
the next patch can remove the import of getopt from fancyopts without breaking
things.
Pulkit Goyal <7895pulkit@gmail.com> [Mon, 05 Dec 2016 06:46:51 +0530] rev 30575
py3: use pycompat.fsdecode() to pass to imp.* functions
When we try to pass a bytes argument to a function from imp library, it
returns TypeError as it deals with unicodes internally. So we can't use bytes
with imp.* functions. Hunting through this, I found we were returning bytes
path variable to loadpath() on Python 3.5 (yes most of our codebase is
dealing with bytes on Python 3 especially the path variables). Passing unicode
does not fails the purpose of loding the extensions and a module object is
returned.
Jun Wu <quark@fb.com> [Tue, 06 Dec 2016 17:06:39 +0000] rev 30574
localrepo: use ProgrammingError
This is an example usage of ProgrammingError. Let's start migrating
RuntimeError to ProgrammingError.
The code only runs when devel.all-warnings or devel.check-locks is set, so
it does not affect the end-user experience.
Jun Wu <quark@fb.com> [Tue, 06 Dec 2016 14:57:47 +0000] rev 30573
error: add ProgrammingError
We have requirement to express "this is clearly an error caused by the
programmer". The code base uses RuntimeError for that in some places, not
ideal. So let's add a formal exception for that.
Jun Wu <quark@fb.com> [Mon, 05 Dec 2016 21:36:35 +0000] rev 30572
chgserver: call "load" for new ui objects
After
d83ca854fa21, we need to call "ui.load" explicitly to load config
files.
Pulkit Goyal <7895pulkit@gmail.com> [Sun, 04 Dec 2016 23:22:34 +0530] rev 30571
localrepository: remove None as default value of path argument in __init__()
The path variable in localrepository.__init__() has a default value None. So
it gives us a option to create an object to localrespository class without
path variable. But things break if you try to do so. The second line in the
init which will be executed when we try to create a localrepository object
will call os.path.expandvars(path) which returns
TypeError: argument of type 'NoneType' is not iterable
I checked occurrences when it is called and can't find any piece of code
which calls it without path variable. Also if something is calling it, its
should break.
Pulkit Goyal <7895pulkit@gmail.com> [Thu, 01 Dec 2016 13:12:04 +0530] rev 30570
py3: use pycompat.sysstr() in __import__()
__import__() on Python 3 accepts strings which are different from that of
Python 2. Used pycompat.sysstr() to get string accordingly.
Pulkit Goyal <7895pulkit@gmail.com> [Wed, 30 Nov 2016 23:51:11 +0530] rev 30569
py3: avoid use of basestring
"In this case, result is a source variable of a list to be returned, it
shouldn't be unicode. Hence we can use bytes instead of basestring here." -Yuya
Pulkit Goyal <7895pulkit@gmail.com> [Wed, 30 Nov 2016 23:38:50 +0530] rev 30568
py3: use unicodes in __slots__
__slots__ in Python 3 accepts only unicodes and there is no harm using
unicodes in __slots__. So just adding u'' is fine. Previous occurences of this
problem are treated the same way.
Mateusz Kwapich <mitrandir@fb.com> [Mon, 21 Nov 2016 08:09:41 -0800] rev 30567
memctx: allow the metadataonlyctx thats reusing the manifest node
When we have a lot of files writing a new manifest revision can be expensive.
This commit adds a possibility for memctx to reuse a manifest from a different
commit. This can be beneficial for commands that are creating metadata changes
without any actual files changed like "hg metaedit" in evolve extension.
I will send the change for evolve that leverages this once this is accepted.
Mateusz Kwapich <mitrandir@fb.com> [Thu, 17 Nov 2016 10:59:15 -0800] rev 30566
localrepo: make it possible to reuse manifest when commiting context
This makes the commit function understand the context that's reusing manifest.
Mateusz Kwapich <mitrandir@fb.com> [Thu, 17 Nov 2016 10:59:15 -0800] rev 30565
manifest: expose the parents() method
Gregory Szorc <gregory.szorc@gmail.com> [Mon, 28 Nov 2016 21:07:51 -0800] rev 30564
httppeer: assign Vary request header last
In preparation for adding another value to it in a subsequent patch.
While I was here, I added some empty lines because walls of text
are hard to read.
Gregory Szorc <gregory.szorc@gmail.com> [Mon, 28 Nov 2016 20:46:42 -0800] rev 30563
wireproto: only advertise HTTP-specific capabilities to HTTP peers (BC)
Previously, the capabilities list was protocol agnostic and we
advertised the same capabilities list to all clients, regardless of
transport protocol.
A few capabilities are specific to HTTP. I see no good reason why we
should advertise them to SSH clients. So this patch limits their
advertisement to HTTP clients.
This patch is BC, but SSH clients shouldn't be using the removed
capabilities so there should be no impact.
Gregory Szorc <gregory.szorc@gmail.com> [Mon, 28 Nov 2016 20:46:59 -0800] rev 30562
protocol: declare transport protocol name
We add an attribute to the HTTP and SSH protocol implementations
identifying the transport so future patches can conditionally
expose capabilities on a per-transport basis.
Mads Kiilerich <madski@unity3d.com> [Wed, 16 Nov 2016 19:45:35 +0100] rev 30561
bdiff: early pruning of common prefix before doing expensive computations
It seems quite common that files don't change completely. New lines are often
pretty much appended, and modifications will often only change a small section
of the file which on average will be in the middle.
There can thus be a big win by pruning a common prefix before starting the more
expensive search for longest common substrings.
Worst case, it will scan through a long sequence of similar bytes without
encountering a newline. Splitlines will then have to do the same again ...
twice for each side. If similar lines are found, splitlines will save the
double iteration and hashing of the lines ... plus there will be less lines to
find common substrings in.
This change might in some cases make the algorith pick shorter or less optimal
common substrings. We can't have the cake and eat it.
This make hg --time bundle --base null -r 4.0 go from 14.5 to 15 s - a 3%
increase.
On mozilla-unified:
perfbdiff -m
3041e4d59df2
! wall 0.053088 comb 0.060000 user 0.060000 sys 0.000000 (best of 100) to
! wall 0.024618 comb 0.020000 user 0.020000 sys 0.000000 (best of 116)
perfbdiff
0e9928989e9c --alldata --count 10
! wall 0.702075 comb 0.700000 user 0.700000 sys 0.000000 (best of 15) to
! wall 0.579235 comb 0.580000 user 0.580000 sys 0.000000 (best of 18)
Yuya Nishihara <yuya@tcha.org> [Sat, 22 Oct 2016 15:02:11 +0900] rev 30560
formatter: add overview of API and example as doctest
Yuya Nishihara <yuya@tcha.org> [Sat, 22 Oct 2016 14:35:10 +0900] rev 30559
ui: factor out ui.load() to create a ui without loading configs (API)
This allows us to write doctests depending on a ui object, but not on global
configs.
ui.load() is a class method so we can do wsgiui.load(). All ui() calls but
for doctests are replaced with ui.load(). Some of them could be changed to
not load configs later.
Anton Shestakov <av6@dwimlabs.net> [Thu, 08 Dec 2016 23:59:36 +0800] rev 30558
hgweb: add missing slash to file log url in rss style
Jun Wu <quark@fb.com> [Wed, 30 Nov 2016 19:23:04 +0000] rev 30557
check-code: add a rule to forbid "cp -r"
See the commit message of the previous patch for the reason. In short,
according to the current POSIX standard, "-r" is "removed", and "-R" is the
current standard way to do "copy file hierarchies".
Jun Wu <quark@fb.com> [Wed, 30 Nov 2016 19:25:18 +0000] rev 30556
tests: replace "cp -r" with "cp -R"
The POSIX documentation about "cp" [1] says:
....
RATIONALE
....
Earlier versions of this standard included support for the -r option to
copy file hierarchies. The -r option is historical practice on BSD and
BSD-derived systems. This option is no longer specified by POSIX.1-2008
but may be present in some implementations. The -R option was added as a
close synonym to the -r option, selected for consistency with all other
options in this volume of POSIX.1-2008 that do recursive directory
descent.
The difference between -R and the removed -r option is in the treatment
by cp of file types other than regular and directory. It was
implementation-defined how the - option treated special files to allow
both historical implementations and those that chose to support -r with
the same abilities as -R defined by this volume of POSIX.1-2008. The
original -r flag, for historic reasons, did not handle special files any
differently from regular files, but always read the file and copied its
contents. This had obvious problems in the presence of special file
types; for example, character devices, FIFOs, and sockets.
....
....
Issue 6
The -r option is marked obsolescent.
....
Issue 7
....
The obsolescent -r option is removed.
....
(No "Issue 8" yet)
Therefore it's clear that "cp -R" is strictly better than "cp -r".
The issue was discovered when running tests on OS X after
0d87b1caed92.
[1]: pubs.opengroup.org/onlinepubs/
9699919799/utilities/cp.html
Martijn Pieters <mjpieters@fb.com> [Wed, 30 Nov 2016 16:39:36 +0000] rev 30555
posix: give the cached symlink a real target
The NamedTemporaryFile file is cleared up so checklink ends up as a dangling
symlink, causing cp -r in tests to complain on both Solaris and OS X. Use
a permanent file instead when there is a .hg/cache directory.
Kostia Balytskyi <ikostia@fb.com> [Tue, 29 Nov 2016 07:20:03 -0800] rev 30554
shelve: move patch extension to a string constant
We are using 'name + ".patch"' pattern throughout the shelve code to
identify the existence of a shelve with a particular name. In two
cases however we use 'name + ".hg"' instead. This commit makes
'patch' be used in all places and "emphasizes" it by moving
'patch' to live in a constant. Also, this allows to extract file
name without extension like this:
f[:-(1 + len(patchextension))]
instead of:
f[:-6]
which is good IMO.
This is a first patch from this initial "obsshelve" series. This
series does not include tests, although locally I have all of
test-shelve.t ported to test obs-shelve as well. I will send tests
later as a separate series.
Kevin Bullock <kbullock+mercurial@ringworld.org> [Thu, 01 Dec 2016 15:55:45 -0600] rev 30553
merge with stable
Kevin Bullock <kbullock@ringworld.org> [Thu, 01 Dec 2016 14:13:28 -0600] rev 30552
Added signature for changeset
b3b1ae98f6a0
Kevin Bullock <kbullock@ringworld.org> [Thu, 01 Dec 2016 14:13:23 -0600] rev 30551
Added tag 4.0.1 for changeset
b3b1ae98f6a0
Wagner Bruna <wbruna@yahoo.com> [Fri, 25 Nov 2016 07:39:02 -0200] rev 30550
i18n-pt_BR: synchronized with
819f96b82fa4
Kostia Balytskyi <ikostia@fb.com> [Tue, 29 Nov 2016 04:11:05 -0800] rev 30549
shelve: fix use of unexpected working dirs in test-shelve.t
Fixing some clowniness where we created ~four levels of nested repos
and once (my test case :( ) did not even cd into a created repo.
Jun Wu <quark@fb.com> [Mon, 28 Nov 2016 23:38:46 +0000] rev 30548
crecord: change the verb according to the operation
This will make crecord consistent with record when being used in the revert
situation. It will say "Select hunks to revert / discard" accordingly.
This should make the revert crecord interface less confusing.
Jun Wu <quark@fb.com> [Mon, 28 Nov 2016 23:37:29 +0000] rev 30547
crecord: change help text for the space key dynamically
A follow-up of the previous patch, to make the text simple and clear about
whether it's to "select" or "deselect".
Jun Wu <quark@fb.com> [Mon, 28 Nov 2016 23:33:02 +0000] rev 30546
crecord: rewrite status line text (BC)
Previously, we display fixed text in the 2-line status header. Now we want
to customize some word there to make the "revert" action clear. However, if
we simply replace the verb using '%s' like this:
"SELECT CHUNKS: (j/k/up/dn/pgup/pgdn) move cursor; "
"(space/A) toggle hunk/all; (e)dit hunk;"),
" (f)old/unfold; (c)onfirm %s; (q)uit; (?) help " % verb
"| [X]=hunk %s **=folded, toggle [a]mend mode" % verb
It could cause trouble for i18n - some languages may expect things like
"%(verb) confirm", for example.
Therefore, this patch chooses to break the hard-coded 2-line sentences into
"segment"s which could be translated (and replaced) separately.
With the clean-up, I'm also changing the content being displayed, to make it
cleaner and more friendly to (new) users, namely:
- Replace "SELECT CHUNKS" to "Select hunks to record". Because:
- To eliminate "hunk" / "chunk" inconsistency.
- "record" is used in the "text" UI. Do not use "apply", to make it
consistent.
- To make it clear you are choosing what to record, not revert, or
discard etc. This is probably the most important information the user
should know. So let's put it first.
- "to record" could be replaced to others depending on the operation.
The follow-up patches will address them.
- Move "[x]" and "**" legends first to explain the current interface. New
users should understand what the legends mean, followed by what they can
do in the interface.
- Replace "j/k/up/dn/pgup/pgdn" with "arrow keys". Because:
- "arrow keys" is more friendly to new users.
- Mentioning "j/k" first may make emacs users angry. We should stay
neutral about editors.
- "pgup/pgdn" actually don't work very well. For example, within a hunk
of 100-line insertion, "pgdn" just moves one single line.
- Left/Right arrow keys are useful for movement and discovery of
"expanding" a block.
- Replace "fold/unfold" with "collapse/expand", "fold" is well known as
a different meaning in histedit and evolve.
- Replace "(space/A) toggle hunk/all" to "space: select". Because:
- "A: toggle all" is not that useful
- It's unclear how "hunk" could be "toggled" to a dumb user. Let's
make it clear it's all about "select".
- A follow-up will make it use "unselect" when we know the current item
is selected.
- Remove "(f)old". Use arrow keys instead.
- Remove "toggle [a]mend mode". It's just confusing and could be useless.
- Remove "(e)dit hunk". It's powerful but not friendly to new users.
- Replace "(q)uit" to "q: abort" to make it clear you will lose changes.
The result looks like the following in a 73-char-width terminal:
Select hunks to record - [x]=selected **=collapsed c: confirm q: abort
arrow keys: move/expand/collapse space: select ?: help
If the terminal is 132-char wide, the text could fit in a single line.
Jun Wu <quark@fb.com> [Wed, 23 Nov 2016 22:23:15 +0000] rev 30545
crecord: make _getstatuslines update numstatuslines
We are going to make the text in the status window dynamically generated,
so its size would be dynamic. Change getstatuslines to update
"numstatuslines" automatically. Fix an issue where "numstatuslines" being 1
makes the chunkpad disappear.
Jun Wu <quark@fb.com> [Mon, 28 Nov 2016 23:12:54 +0000] rev 30544
crecord: move status window text calculation to a separate method
We will do some changes there in the next patches. The new method would also
be the "source of truth" of the content of the status window (so if the
status window needs more than 2 lines, it would be calculated from the new
method).
Also, moved "statuswin.refresh" to make the code compact and easier to read.
Cotizo Sima <cotizo@fb.com> [Mon, 28 Nov 2016 04:34:01 -0800] rev 30543
revlog: ensure that flags do not overflow 2 bytes
This patch adds a line that ensures we are not setting by mistake a set of flags
overlfowing the 2 bytes they are allocated. Given the way the data is packed in
the revlog header, overflowing 2 bytes will result in setting a wrong offset.
Augie Fackler <augie@google.com> [Sun, 27 Nov 2016 20:44:52 -0500] rev 30542
merge with stable
Gregory Szorc <gregory.szorc@gmail.com> [Fri, 25 Nov 2016 09:59:39 -0800] rev 30541
debugcommands: sort command order
The diff is a bit large but it is straight code moving without any
logical modifications.
Gregory Szorc <gregory.szorc@gmail.com> [Fri, 25 Nov 2016 09:55:05 -0800] rev 30540
tests: add test that @commands in debugcommands.py are sorted
I felt like inline Python in test-check-code.t was the most
appropriate place for this, as other linters in contrib/ seem to
be source file agnostic.
The test currently fails.
Simon Farnsworth <simonfar@fb.com> [Fri, 25 Nov 2016 07:30:46 -0800] rev 30539
fsmonitor: be robust in the face of bad state
fsmonitor could write out bad state if interrupted part way through, and
would then crash when it tried to read it back in.
Make both sides of the operation more robust - reading state should fail
cleanly, and we can use atomictemp to write out cleanly as the file is
small. Between the two, we shouldn't crash with an IndexError any more.
Mads Kiilerich <madski@unity3d.com> [Wed, 23 Nov 2016 23:47:38 +0100] rev 30538
merge: use original file extension for temporary files
Some merge tools (like Araxis?) can pick merge mode based on the file
extension. That didn't work well when temporary files were given random
suffixes. It seems to work better when the random part is before the extension.
As usual, when using $output, $local will have the .orig extension. That could
perhaps be the subject of another change another day.
Gregory Szorc <gregory.szorc@gmail.com> [Sat, 26 Nov 2016 09:14:41 -0800] rev 30537
ui: use try..finally in configoverride
@contextmanager almost always have their "yield" inside a try..finally
block. This is because if the calling code inside the activated
context manager raises, the code after the "yield" won't get
executed. A "finally" block, however, will get executed in this
scenario.
Gregory Szorc <gregory.szorc@gmail.com> [Sat, 26 Nov 2016 09:07:11 -0800] rev 30536
util: limit output chunk size in zlib decompression
This is essentially a port of
65bd4b8e48bd, which was inadvertently
dropped by
8cd7d0fefd30.
Jun Wu <quark@fb.com> [Wed, 23 Nov 2016 18:13:11 +0000] rev 30535
crecord: filter text via i18n
There are some text in the user interface that are not filtered by i18n.
This patch adds the missing "_" call. So the text could be translated.
Jun Wu <quark@fb.com> [Wed, 23 Nov 2016 19:03:24 +0000] rev 30534
revert: pass operation to crecord
So crecord would know what to display
Jun Wu <quark@fb.com> [Wed, 23 Nov 2016 19:22:36 +0000] rev 30533
crecord: add an "operation" field
The field would provide extra information to help us to make the curses UI
text less confusing.
Denis Laxalde <denis.laxalde@logilab.fr> [Fri, 25 Nov 2016 09:10:30 +0100] rev 30532
revert: prompt before removing files in interactive mode
Prior to this change, files to be removed (i.e. files added since the revision
to revert to) were unconditionally removed despite the interactive mode. Now
prompt before actually removing the files, as this is done for other actions
(e.g. forget).
Denis Laxalde <denis.laxalde@logilab.fr> [Fri, 25 Nov 2016 09:09:31 +0100] rev 30531
revert: indicate the default choice when prompting to forget files
Denis Laxalde <denis.laxalde@logilab.fr> [Fri, 25 Nov 2016 09:09:03 +0100] rev 30530
style: avoid an unnecessary line split
Gregory Szorc <gregory.szorc@gmail.com> [Wed, 17 Aug 2016 21:01:02 -0700] rev 30529
debugcommands: move 'debugdeltachain' in the new module
Gregory Szorc <gregory.szorc@gmail.com> [Wed, 17 Aug 2016 21:00:11 -0700] rev 30528
debugcommands: move 'debugindex' and 'debugindexdot' in the new module
Gregory Szorc <gregory.szorc@gmail.com> [Wed, 17 Aug 2016 20:59:13 -0700] rev 30527
debugcommands: move 'debugignore' in the new module
Gregory Szorc <gregory.szorc@gmail.com> [Thu, 10 Nov 2016 09:44:47 -0800] rev 30526
debugcommands: move 'debuggetbundle' in the new module
Gregory Szorc <gregory.szorc@gmail.com> [Wed, 17 Aug 2016 20:58:16 -0700] rev 30525
debugcommands: move 'debugfsinfo' in the new module
Gregory Szorc <gregory.szorc@gmail.com> [Wed, 17 Aug 2016 20:57:57 -0700] rev 30524
debugcommands: move 'debugfileset' in the new module
Remi Chaintron <remi@fb.com> [Wed, 23 Nov 2016 17:36:35 +0000] rev 30523
censor: flag internal documentation
Kostia Balytskyi <ikostia@fb.com> [Wed, 23 Nov 2016 14:58:52 -0800] rev 30522
shelve: make --keep option survive user intervention (
issue5431)
Currently if user runs 'hg unshelve --keep' and merge conflicts
occur, the information about --keep provided by user is lost and
shelf is deleted after 'hg unshelve --continue'. This is obviously
not desired, so this patch fixes it.
Jun Wu <quark@fb.com> [Thu, 24 Nov 2016 01:15:34 +0000] rev 30521
worker: use os._exit for posix worker in all cases
Like commandserver, the worker should never run other resource cleanup logic.
Previously this is not true for workers if they have exceptions other than
KeyboardInterrupt.
This actually caused a real-world deadlock with remotefilelog:
1. remotefilelog/fileserverclient creates a sshpeer. pipei/o/e get created.
2. worker inherits that sshpeer's pipei/o/e.
3. worker runs sshpeer.cleanup (only happens without os._exit)
4. worker closes pipeo/i, which will normally make the sshpeer read EOF from
its stdin and exit. But the master process still have pipeo, so no EOF.
5. worker reads pipee (stderr of sshpeer), which never completes because
the ssh process does not exit, does not close its stderr.
6. master waits for all workers, which never completes because they never
complete sshpeer.cleanup.
This could also be addressed by closing these fds after fork, which is not
easy because Python 2.x does not have an official "afterfork" hook. Hacking
os.fork is also ugly. Besides, sshpeer is probably not the only troublemarker.
The patch changes _posixworker so all its code paths will use os._exit to
avoid running unwanted resource clean-ups.
Jun Wu <quark@fb.com> [Thu, 24 Nov 2016 00:48:40 +0000] rev 30520
dispatch: move part of callcatch to scmutil
Per discussion at
39149b6036e6 [1], we need "callcatch" in worker.py. Move
it to scmutil.py to avoid cycles.
Note that dispatch's callcatch handles some additional high-level exceptions
related to config parsing, and commands. Moving them to scmutil will make
scmutil depend on "commands" or require "_formatparse" and "_getsimilar"
(and "difflib") to be moved as well. In the worker use-case, it is forked
when config and commands are fully loaded. So it should not care about those
exceptions.
[1]: https://www.mercurial-scm.org/pipermail/mercurial-devel/2016-August/087116.html
Pulkit Goyal <7895pulkit@gmail.com> [Wed, 23 Nov 2016 00:03:11 +0530] rev 30519
py3: use pycompat.getcwd() instead of os.getcwd()
We have pycompat.getcwd() which returns bytes path on Python 3. This patch
changes most of the occurences of the os.getcwd() with pycompat one.
Gregory Szorc <gregory.szorc@gmail.com> [Wed, 17 Aug 2016 20:57:15 -0700] rev 30518
debugcommands: move 'debugextensions' to the new module
Gregory Szorc <gregory.szorc@gmail.com> [Wed, 17 Aug 2016 20:56:11 -0700] rev 30517
debugcommands: move 'debugdiscovery' in the module
And a lot of imports with it.
Gregory Szorc <gregory.szorc@gmail.com> [Wed, 17 Aug 2016 20:43:31 -0700] rev 30516
debugcommands: move 'debugdate' in the new module
Gregory Szorc <gregory.szorc@gmail.com> [Wed, 17 Aug 2016 20:43:05 -0700] rev 30515
debugcommands: move 'debugrevlogopts' into the new module
This move contains the first reference to debugrevlogopts in
debugcommands.py. We'll eventually want to move that over. We
hold off for now because it would introduce a module import cycle.
Gregory Szorc <gregory.szorc@gmail.com> [Wed, 17 Aug 2016 20:41:54 -0700] rev 30514
debugcommands: move 'debugdag' into the new module
Yuya Nishihara <yuya@tcha.org> [Sat, 15 Oct 2016 14:30:16 +0900] rev 30513
chgserver: make it a core module and drop extension flags
It was an extension just because there were several dependency cycles I
needed to address.
I don't add 'chgserver' to extensions._builtin since chgserver is considered
an internal extension so nobody should enable it by their config.
Yuya Nishihara <yuya@tcha.org> [Sat, 15 Oct 2016 14:37:18 +0900] rev 30512
chgserver: delay importing commands and dispatch modules
This is a workaround for future import cycle: dispatch -> commands -> server
-> chgserver -> commands. Some of the problems can be fixed later on pager
and chg refactoring.
Yuya Nishihara <yuya@tcha.org> [Sat, 15 Oct 2016 14:24:29 +0900] rev 30511
chgserver: drop CHGINTERNALMARK by chgunixservice()
Prepares for the removal of uisetup(). We just need to do that at the start
of the chg server, so chgunixservice() should be fine.
Yuya Nishihara <yuya@tcha.org> [Sat, 15 Oct 2016 14:19:16 +0900] rev 30510
server: add public function to select either cmdserver or hgweb
Yuya Nishihara <yuya@tcha.org> [Sat, 15 Oct 2016 14:09:36 +0900] rev 30509
server: move service factory from hgweb
Yuya Nishihara <yuya@tcha.org> [Sat, 15 Oct 2016 14:06:46 +0900] rev 30508
hgweb: extract app factory
I'll move createservice() to the server module, but createapp() seems good to
remain in the hgweb module because of its dependency on hgweb/hgwebdir_mod.
Yuya Nishihara <yuya@tcha.org> [Sat, 15 Oct 2016 13:57:17 +0900] rev 30507
server: move service table and factory from commandserver
This is necessary to solve future dependency cycle between commandserver.py
and chgserver.py.
'cmd' prefix is added to table and function names to avoid conflicts with
hgweb.
Yuya Nishihara <yuya@tcha.org> [Sat, 15 Oct 2016 13:47:43 +0900] rev 30506
server: move cmdutil.service() to new module (API)
And call it runservice() because I'll soon add createservice().
The main reason I'm going to introduce the 'server' module is to solve
future dependency cycle between chgserver.py and commandserver.py.
The 'server' module sits at the same layer as the cmdutil. I believe it's
generally good to get rid of things from the big cmdutil module.
Gregory Szorc <gregory.szorc@gmail.com> [Wed, 17 Aug 2016 20:41:05 -0700] rev 30505
debugcommands: move 'debugcomplete' in the new module
Gregory Szorc <gregory.szorc@gmail.com> [Wed, 17 Aug 2016 20:40:13 -0700] rev 30504
debugcommands: move 'debugcommands' in the new module
The commit message isn't an illusion. There is a "debugcommands"
module and command.
Gregory Szorc <gregory.szorc@gmail.com> [Wed, 17 Aug 2016 20:38:29 -0700] rev 30503
debugcommands: move 'debugcheckstate' in the new module
Gregory Szorc <gregory.szorc@gmail.com> [Wed, 17 Aug 2016 20:37:54 -0700] rev 30502
debugcommands: move debug{create,apply}streambundleclone to the new module
Gregory Szorc <gregory.szorc@gmail.com> [Wed, 17 Aug 2016 21:07:22 -0700] rev 30501
debugcommands: move 'debugbundle' in the new module
Pulkit Goyal <7895pulkit@gmail.com> [Tue, 22 Nov 2016 18:46:50 +0530] rev 30500
py3: add os.getcwdb() to have bytes path
Following the behaviour of Python 3, os.getcwd() return unicodes. We need
bytes version as path variables are bytes in UNIX. Python 3 has os.getcwdb()
which returns current working directory in bytes.
Like rest of the things there in pycompat, like osname, ossep, we need to
rewrite every instance of os.getcwd to pycompat.getcwd to make them work
correctly on Python 3.
Gregory Szorc <gregory.szorc@gmail.com> [Tue, 22 Nov 2016 18:13:02 -0800] rev 30499
help: clarify contents of revlog index
The previous wording indicated that field at index 3 was the
size of the decompressed chunk, not the size of the full
revision text.
Danek Duvall <danek.duvall@oracle.com> [Tue, 22 Nov 2016 13:32:05 -0800] rev 30498
zstd: fix compilation with Solaris Studio
Without these changes, Solaris Studio (12.4) gives us "syntax error: empty
declaration" on these two lines.
Augie Fackler <augie@google.com> [Mon, 21 Nov 2016 21:36:46 -0500] rev 30497
cmdutil: turn forward of checkunresolved into a deprecation warning
As with dirstateguard, I really doubt anyone outside core was using
this, as my grep over the repositories I keep locally suggests nobody
was using this. If others are comfortable with it, let's drop the
forward entirely.
Augie Fackler <augie@google.com> [Mon, 21 Nov 2016 21:32:55 -0500] rev 30496
localrepo: refer to checkunresolved by its new name
Augie Fackler <augie@google.com> [Mon, 21 Nov 2016 21:32:39 -0500] rev 30495
rebase: refer to checkunresolved by its new name
Augie Fackler <augie@google.com> [Mon, 21 Nov 2016 21:31:45 -0500] rev 30494
checkunresolved: move to new package to help avoid import cycles
This will allow localrepo to stop using cmdutil, which should avoid
some future import cycles. There's room for an adventurous soul to
delve deeper into merge.py and figure out how to disentangle more of
it - it appears to be a nexus of cycle problems. Some of it might be
able to move into this new mergeutil package.
Augie Fackler <augie@google.com> [Mon, 21 Nov 2016 21:16:54 -0500] rev 30493
cmdutil: mark dirstateguard as deprecated
I sincerely doubt this is used in external code, as grepping the
extensions I keep locally (including Facebook's hgexperimental and
evolve) indicate nobody outside of core uses this. As such, I'd also
welcome just dropping this name forward entirely.
Augie Fackler <augie@google.com> [Mon, 21 Nov 2016 21:06:34 -0500] rev 30492
localrepo: refer to dirstateguard by its new name
Augie Fackler <augie@google.com> [Mon, 21 Nov 2016 21:06:22 -0500] rev 30491
commands: refer to dirstateguard by its new name
Augie Fackler <augie@google.com> [Mon, 21 Nov 2016 21:27:12 -0500] rev 30490
rebase: refer to dirstateguard by its new name
Augie Fackler <augie@google.com> [Mon, 21 Nov 2016 21:05:52 -0500] rev 30489
mq: refer to dirstateguard by its new name
Augie Fackler <augie@google.com> [Mon, 21 Nov 2016 21:29:32 -0500] rev 30488
dirstateguard: move to new module so I can break some layering violations
Recently in a review I noticed that localrepo almost has no reason to
import cmdutil anymore. Also, cmdutil is a little on the enormous
side, so breaking this class out strikes me as a win.
Augie Fackler <augie@google.com> [Mon, 21 Nov 2016 22:17:45 -0500] rev 30487
keepalive: discard legacy Python support for error handling
We never changed the behavior defined by this attribute anyway, so
just jettison all of this support.
Augie Fackler <augie@google.com> [Mon, 21 Nov 2016 21:52:19 -0500] rev 30486
mergemod: drop support for merge.update without a target
This was to be deleted after 3.9.
Augie Fackler <augie@google.com> [Mon, 21 Nov 2016 21:51:23 -0500] rev 30485
dispatch: stop supporting non-use of @command
We said we'd delete this after 3.8. It's time.
Gregory Szorc <gregory.szorc@gmail.com> [Mon, 21 Nov 2016 20:12:51 -0800] rev 30484
httppeer: document why super() isn't used
Adding a follow-up to document lack of super() per Augie's
request.
Stanislau Hlebik <stash@fb.com> [Thu, 17 Nov 2016 00:59:41 -0800] rev 30483
exchange: add `_getbookmarks()` function
This function will be used to generate bookmarks bundle2 part.
It is a separate function in order to make it easy to overwrite it
in extensions. Passing `kwargs` to the function makes it easy to
add new parameters in extensions.
Stanislau Hlebik <stash@fb.com> [Thu, 17 Nov 2016 00:59:41 -0800] rev 30482
bookmarks: use listbinbookmarks() in listbookmarks()
Stanislau Hlebik <stash@fb.com> [Thu, 17 Nov 2016 00:59:41 -0800] rev 30481
bookmarks: introduce listbinbookmarks()
`bookmarks` bundle2 part will work with binary nodes. To avoid unnecessary
conversions between binary and hex nodes let's add `listbinbookmarks()` that
returns binary nodes. For now this function is a copy-paste of
listbookmarks(). In the next patch this copy-paste will be removed.
Kostia Balytskyi <ikostia@fb.com> [Mon, 21 Nov 2016 16:22:26 -0800] rev 30480
ui: add configoverride context manager
I feel like this idea might've been discussed before, so please
feel free to point me to the right mailing list entry to read
about why it should not be done.
We have a common pattern of the following code:
backup = ui.backupconfig(section, name)
try:
ui.setconfig(section, name, temporaryvalue, source)
do_something()
finally:
ui.restoreconfig(backup)
IMO, this looks better:
with ui.configoverride({(section, name): temporaryvalue}, source):
do_something()
Especially this becomes more convenient when one has to backup multiple
config values before doing something. In such case, adding a new value
to backup requires codemod in three places.
Augie Fackler <augie@google.com> [Mon, 21 Nov 2016 18:17:02 -0500] rev 30479
archival: simplify code and drop message about Python 2.5
Augie Fackler <augie@google.com> [Mon, 21 Nov 2016 17:52:32 -0500] rev 30478
bugzilla: stop mentioning Pythons older than 2.6
We don't support those anyway.
Augie Fackler <augie@google.com> [Mon, 21 Nov 2016 17:51:39 -0500] rev 30477
tests: update sitecustomize to use uuid1() instead of randrange()
The comments mention that uuid would be better, so let's go ahead and
make good on an old idea.
Augie Fackler <augie@google.com> [Mon, 21 Nov 2016 17:48:13 -0500] rev 30476
win32mbcs: drop code that was catering to Python 2.3 and earlier
Augie Fackler <augie@google.com> [Mon, 21 Nov 2016 17:47:11 -0500] rev 30475
httppeer: drop an except block that says it happens only on Python 2.3
Yuya Nishihara <yuya@tcha.org> [Fri, 21 Oct 2016 00:03:46 +0900] rev 30474
windows: do not replace sys.stdout by winstdout
Now we use util.stdout everywhere.
Yuya Nishihara <yuya@tcha.org> [Thu, 20 Oct 2016 23:53:36 +0900] rev 30473
py3: bulk replace sys.stdin/out/err by util's
Almost all sys.stdin/out/err in hgext/ and mercurial/ are replaced by util's.
There are a few exceptions:
- lsprof.py and statprof.py are untouched since they are a kind of vendor
code and they never import mercurial modules right now.
- ui._readline() needs to replace sys.stdin and stdout to pass them to
raw_input(). We'll need another workaround here.
Yuya Nishihara <yuya@tcha.org> [Thu, 20 Oct 2016 23:40:24 +0900] rev 30472
py3: provide bytes stdin/out/err through util module
Since standard streams are TextIO on Python 3, we can't use sys.stdin/out/err
directly. Fortunately we can get the underlying BytesIO via .buffer as long as
the streams aren't replaced by e.g. StringIO.
stdin/out/err are provided through util so we can wrap them by platform API.
Yuya Nishihara <yuya@tcha.org> [Fri, 21 Oct 2016 00:09:38 +0900] rev 30471
util: rewrite pycompat imports to make pyflakes always happy
I'll add more imports which would confuse pyflakes.
Yuya Nishihara <yuya@tcha.org> [Thu, 20 Oct 2016 23:27:09 +0900] rev 30470
windows: do not replace sys.__stdout__
Now we don't use sys.__stdout__ except for getting its fileno(), so we no
longer have to wrap it by winstdout.
This helps adding pycompat.stdin/out/err.
Pulkit Goyal <7895pulkit@gmail.com> [Mon, 21 Nov 2016 15:38:56 +0530] rev 30469
py3: update test-check-py3-compat.t output
This part remains unchanged because it runs in Python 3 only.
Pulkit Goyal <7895pulkit@gmail.com> [Mon, 21 Nov 2016 15:35:22 +0530] rev 30468
py3: use pycompat.sysargv in dispatch.run()
Another one to have a bytes result from sys.argv in Python 3.
This one is also a part of running `hg version` on Python 3.
Pulkit Goyal <7895pulkit@gmail.com> [Mon, 21 Nov 2016 15:26:47 +0530] rev 30467
py3: use pycompat.sysargv in scmposix.systemrcpath()
sys.argv returns unicodes on Python 3. We have pycompat.sysargv which returns
bytes encoded using os.fsencode(). After this patch scmposix.systemrcpath()
returns bytes in Python 3 world. This change is also a part of making
`hg version` run in Python 3.
Gregory Szorc <gregory.szorc@gmail.com> [Sun, 20 Nov 2016 13:50:45 -0800] rev 30466
wireproto: perform chunking and compression at protocol layer (API)
Currently, the "streamres" response type is populated with a generator
of chunks with compression possibly already applied. This puts the onus
on commands to perform chunking and compression. Architecturally, I
think this is the wrong place to perform this work. I think commands
should say "here is the data" and the protocol layer should take care
of encoding the final bytes to put on the wire.
Additionally, upcoming commits will improve wire protocol support for
compression. Having a central place for performing compression in the
protocol transport layer will be easier than having to deal with
compression at the commands layer.
This commit refactors the "streamres" response type to accept either
a generator or an object with "read." Additionally, the type now
accepts a flag indicating whether the response is a "version 1
compressible" response. This basically identifies all commands
currently performing compression. I could have used a special type
for this, but a flag works just as well. The argument name
foreshadows the introduction of wire protocol changes, hence the "v1."
The code for chunking and compressing has been moved to the output
generation function for each protocol transport. Some code has been
inlined, resulting in the deletion of now unused methods.
Gregory Szorc <gregory.szorc@gmail.com> [Sun, 20 Nov 2016 13:55:53 -0800] rev 30465
httppeer: use compression engine API for decompressing responses
In preparation for supporting multiple compression formats on the
wire protocol, we need all users of the wire protocol to use
compression engine APIs.
This commit ports the HTTP wire protocol client to use the
compression engine API.
The code for handling the HTTPException is a bit hacky. Essentially,
HTTPException could be thrown by any read() from the socket. However,
as part of porting the API, we no longer have a generator wrapping
the socket and we don't have a single place where we can trap the
exception. We solve this by introducing a proxy class that intercepts
read() and converts the exception appropriately.
In the future, we could introduce a new compression engine API that
supports emitting a generator of decompressed chunks. This would
eliminate the need for the proxy class. As I said when I introduced
the decompressorreader() API, I'm not fond of it and would support
transitioning to something better. This can be done as a follow-up,
preferably once all the code is using the compression engine API and
we have a better idea of the API needs of all the consumers.
Gregory Szorc <gregory.szorc@gmail.com> [Sat, 19 Nov 2016 18:31:40 -0800] rev 30464
httppeer: do decompression inside _callstream
The current HTTP transport protocol only compresses certain command
responses and requires calls to that command to call
"_callcompressable," which zlib decompresses the response
transparently.
Upcoming changes will enable *any* response to be compressed with
varying compression formats. In order to handle this better, this
commit moves the decompression bits to the main function performing
the HTTP request. We introduce an underscore-prefixed argument to
denote this behavior so it doesn't conflict with a named argument
to a command.
Gregory Szorc <gregory.szorc@gmail.com> [Sat, 19 Nov 2016 17:11:12 -0800] rev 30463
keepalive: reorder header precedence
There are 3 sources of headers used by this function:
* The default headers defined by the URL opener
* Headers that are copied on redirects
* Headers that aren't copied on redirects
Previously, we applied the default headers from the URL
opener last. This feels wrong to me as those headers are
the most low level and something built on top of the URL
opener may wish to override them. So, this commit changes
the order to apply them with the least precedence.
While I was here, I removed a Python version test that is
no longer necessary.
Gregory Szorc <gregory.szorc@gmail.com> [Sat, 19 Nov 2016 10:54:21 -0800] rev 30462
debuginstall: print compression engine support
Since compression engines may be provided by extensions and since
not all registered compression engines may be available to use,
it seems useful to provide a mechanism to see the state of known
compression engines.
This commit teaches `hg debuginstall` to print info on known and
available compression engines.
Gregory Szorc <gregory.szorc@gmail.com> [Sun, 20 Nov 2016 16:56:21 -0800] rev 30461
bdiff: don't check border condition in loop
This is pretty much a copy of
d500ddae7494, just to a different loop.
The condition `p == plast` (`plast == a + len - 1`) was only true on
the final iteration of the loop. So it was wasteful to check for it
on every iteration. We decrease the iteration count by 1 and add an
explicit check for `p == plast` after the loop.
Again, we see modest wins.
From the mozilla-unified repository:
$ perfbdiff -m
3041e4d59df2
! wall 0.035502 comb 0.040000 user 0.040000 sys 0.000000 (best of 100)
! wall 0.030480 comb 0.030000 user 0.030000 sys 0.000000 (best of 100)
$ perfbdiff
0e9928989e9c --alldata --count 100
! wall 4.097394 comb 4.100000 user 4.100000 sys 0.000000 (best of 3)
! wall 3.597798 comb 3.600000 user 3.600000 sys 0.000000 (best of 3)
The 2nd example throws a total of ~3.3GB of data at bdiff. This
change increases the throughput from ~811 MB/s to ~924 MB/s.
Kostia Balytskyi <ikostia@fb.com> [Sat, 19 Nov 2016 15:41:37 -0800] rev 30460
conflicts: make spacing consistent in conflict markers
The way default marker template was defined before this patch,
the spacing before dash in conflict markes was dependent on
whether changeset is a tip one or not. This is a relevant part
of template:
'{ifeq(tags, "tip", "", "{tags} "}'
If revision is a tip revision with no other tags, this would
resolve to an empty string, but for revisions which are not tip
and don't have any other tags, this would resolve to a single
space string. In the end this causes weirdnesses like the ones
you can see in the affected tests.
This is a not a big deal, but double spacing may be visually
less pleasant.
Please note that test changes where commit hashes change are
the result of marking files as resolved without removing markers.
Durham Goode <durham@fb.com> [Thu, 10 Nov 2016 09:21:41 -0800] rev 30459
rebase: move bookmark update to before rebase clearing
Bookmark fixing should probably happen before the rebase starts to clean up, so
let's move it before clearrebased. This will also help a future patch where we
want to add more clear logic to the existing clear section.
Gábor Stefanik <gabor.stefanik@nng.com> [Fri, 28 Oct 2016 17:44:28 +0200] rev 30458
setup: include a dummy $PATH in the custom environment used by build.py
This is required for building with pypiwin32, the pip-installable replacement
for pywin32.
Kostia Balytskyi <ikostia@fb.com> [Fri, 11 Nov 2016 07:01:27 -0800] rev 30457
shelve: move unshelve-finishing logic to a separate function
Finishing unshelve involves two steps now:
- stripping a changelog
- aborting a transaction
Obs-based shelve will not require these things, so isolating this logic
into a separate function where the normal/obs-shelve branching is
going to be implemented seems to be like a nice idea.
Behavior-wise this change moves 'unshelvecleanup' from being between
changelog stripping and transaction abortion to being after them.
I don't think this has any negative effects.
Kostia Balytskyi <ikostia@fb.com> [Thu, 10 Nov 2016 11:02:39 -0800] rev 30456
shelve: move file-forgetting logic to a separate function
This is just a readability improvement.
Kostia Balytskyi <ikostia@fb.com> [Thu, 10 Nov 2016 10:57:10 -0800] rev 30455
shelve: move rebasing logic to a separate function
Rebasing restored shelved commit onto the right destination is done
differently in traditional and obs-based unshelve:
- for traditional, we just rebase it
- for obs-based, we need to check whether a successor of
the restored commit already exists in the destination (this
might happen when unshelving twice on the same destination)
This is the reason why this piece of logic should be in its own
function: to not have excessive complexity in the main function.
Kostia Balytskyi <ikostia@fb.com> [Thu, 10 Nov 2016 10:51:06 -0800] rev 30454
shelve: move commit restoration logic to a separate function
Kostia Balytskyi <ikostia@fb.com> [Sun, 13 Nov 2016 03:35:52 -0800] rev 30453
shelve: move temporary commit creation to a separate function
Committing working copy changes before rebasing a shelved commit
on top of them is an independent piece of behavior, which fits
into its own function.
Similar to the previous series, this and a couple of following
patches are for unshelve refactoring.
Gregory Szorc <gregory.szorc@gmail.com> [Thu, 17 Nov 2016 20:30:00 -0800] rev 30452
commands: print chunk type in debugrevlog
Each data entry ("chunk") in a revlog has a type based on the first
byte of the data. This type indicates how to interpret the data.
This seems like a useful thing to be able to query through a debug
command. So let's add that to `hg debugrevlog`.
This does make `hg debugrevlog` slightly slower, as it has to read
more than just the index. However, even on the mozilla-unified
manifest (which is ~200MB spread over ~350K revisions), this takes
<400ms.
Gregory Szorc <gregory.szorc@gmail.com> [Thu, 17 Nov 2016 20:17:51 -0800] rev 30451
perf: add command for measuring revlog chunk operations
Upcoming commits will teach revlogs to leverage the new compression
engine API so that new compression formats can more easily be
leveraged in revlogs. We want to be sure this refactoring doesn't
regress performance. So this commit introduces "perfrevchunks" to
explicitly test performance of reading, decompressing, and
recompressing revlog chunks.
Here is output when run on the mozilla-unified repo:
$ hg perfrevlogchunks -c
! read
! wall 0.346603 comb 0.350000 user 0.340000 sys 0.010000 (best of 28)
! read w/ reused fd
! wall 0.337707 comb 0.340000 user 0.320000 sys 0.020000 (best of 30)
! read batch
! wall 0.013206 comb 0.020000 user 0.000000 sys 0.020000 (best of 221)
! read batch w/ reused fd
! wall 0.013259 comb 0.030000 user 0.010000 sys 0.020000 (best of 222)
! chunk
! wall 1.909939 comb 1.910000 user 1.900000 sys 0.010000 (best of 6)
! chunk batch
! wall 1.750677 comb 1.760000 user 1.740000 sys 0.020000 (best of 6)
! compress
! wall 5.668004 comb 5.670000 user 5.670000 sys 0.000000 (best of 3)
$ hg perfrevlogchunks -m
! read
! wall 0.365834 comb 0.370000 user 0.350000 sys 0.020000 (best of 26)
! read w/ reused fd
! wall 0.350160 comb 0.350000 user 0.320000 sys 0.030000 (best of 28)
! read batch
! wall 0.024777 comb 0.020000 user 0.000000 sys 0.020000 (best of 119)
! read batch w/ reused fd
! wall 0.024895 comb 0.030000 user 0.000000 sys 0.030000 (best of 118)
! chunk
! wall 2.514061 comb 2.520000 user 2.480000 sys 0.040000 (best of 4)
! chunk batch
! wall 2.380788 comb 2.380000 user 2.360000 sys 0.020000 (best of 5)
! compress
! wall 9.815297 comb 9.820000 user 9.820000 sys 0.000000 (best of 3)
We already see some interesting data, such as how much slower
non-batched chunk reading is and that zlib compression appears to be
>2x slower than decompression.
I didn't have the data when I wrote this commit message, but I ran this
on Mozilla's NFS-based Mercurial server and the time for reading with a
reused file descriptor was faster. So I think it is worth testing both
with and without file descriptor reuse so we can make informed
decisions about recycling file descriptors.
Gregory Szorc <gregory.szorc@gmail.com> [Thu, 17 Nov 2016 20:09:10 -0800] rev 30450
setup: add flag to build_ext to control building zstd
Downstream packagers will inevitably want to disable building the
vendored python-zstandard Python package. Rather than force them
to patch setup.py, let's give them a knob to use.
distutils Command classes support defining custom options. It requires
setting certain class attributes (yes, class attributes: instance
attributes don't work because the class type is consulted before it
is instantiated).
We already have a custom child class of build_ext, so we set these
class attributes, implement some scaffolding, and override
build_extensions to filter the Extension instance for the zstd
extension if the `--no-zstd` argument is specified.
Example usage:
$ python setup.py build_ext --no-zstd
Jun Wu <quark@fb.com> [Wed, 09 Nov 2016 16:01:34 +0000] rev 30449
drawdag: update test repos by drawing the changelog DAG in ASCII
Currently, we have "debugbuilddag" which is a powerful tool to build test
cases but not intuitive. We may end up running "hg log" in the test to make
the test more readable.
This patch adds a "drawdag" extension with a "debugdrawdag" command for
similar testing purpose. Unlike the cryptic "debugbuilddag" command, it
reads an ASCII graph that is intuitive to human, so the test case can be
more readable.
Unlike "debugbuilddag", "drawdag" does not require an empty repo. So it can
be used to add new changesets to an existing repo.
Since the "drawdag" logic is not that trivial and only makes sense for
testing purpose, the extension is added to the "tests" directory, to make
the core logic clean. If we find it useful (for example, to demonstrate
cases and help user understand some cases) and want to ship it by default in
the future, we can move it to a ship-by-default "debugdrawdag" at that time.
Mads Kiilerich <madski@unity3d.com> [Wed, 14 Jan 2015 01:15:26 +0100] rev 30448
posix: give checklink a fast path that cache the check file and is read only
util.checklink would create a symlink and remove it again. That would sometimes
happen multiple times. Write operations are relatively expensive and give disk
tear and noise for applications monitoring file system activity.
Instead of creating a symlink and deleting it again, just create it once and
leave it in .hg/cache/check-link . If the file exists, just verify that
os.islink reports true. We will assume that this check is as good as symlink
creation not failing.
Note: The symlink left in .hg/cache has to resolve to a file - otherwise 'make
dist' will fail ...
test-symlink-os-yes-fs-no.py does some monkey patching to simulate a platform
without symlink support. The slightly different testing method requires
additional monkeying.
Mads Kiilerich <madski@unity3d.com> [Thu, 17 Nov 2016 12:59:36 +0100] rev 30447
posix: move checklink test file to .hg/cache
This avoids unnecessary churn in the working directory.
It is not necessarily a fully valid assumption that .hg/cache is on the same
filesystem as the working directory, but I think it is an acceptable
approximation. It could also be the case that different parts of the working
directory is on different mount points so checking in the root folder could
also be wrong.
Mads Kiilerich <madski@unity3d.com> [Wed, 14 Jan 2015 01:15:26 +0100] rev 30446
posix: give checkexec a fast path; keep the check files and test read only
Before, Mercurial would create a new temporary file every time, stat it, change
its exec mode, stat it again, and delete it. Most of this dance was done to
handle the rare and not-so-essential case of VFAT mounts on unix. The cost of
that was paid by the much more common and important case of using normal file
systems.
Instead, try to create and preserve .hg/cache/checkisexec and
.hg/cache/checknoexec with and without exec flag set. If the files exist and
have correct exec flags set, we can conclude that that file system supports the
exec flag. Best case, the whole exec check can thus be done with two stat
calls. Worst case, we delete the wrong files and check as usual. That will be
because temporary loss of exec bit or on file systems without support for the
exec bit. In that case we check as we did before, with the additional overhead
of one extra stat call.
It is possible that this different test algorithm in some cases on odd file
systems will give different behaviour. Again, I think it will be rare and
special cases and I think it is worth the risk.
test-clone.t happens to show the situation where checkisexec is left behind
from the old style check, while checknoexec only will be created next time a
exec check will be performed.
Mads Kiilerich <madski@unity3d.com> [Wed, 14 Jan 2015 01:15:26 +0100] rev 30445
posix: simplify checkexec check
Use a slightly simpler logic that in some cases can avoid an unnecessary chmod
and stat.
Instead of flipping the X bits, make it more clear that we rely on no X bits
being set on initial file creation, and that at least some of them stick after
they all have been set.
Mads Kiilerich <madski@unity3d.com> [Thu, 17 Nov 2016 12:59:36 +0100] rev 30444
posix: move checkexec test file to .hg/cache
This avoids unnecessary churn in the working directory.
It is not necessarily a fully valid assumption that .hg/cache is on the same
filesystem as the working directory, but I think it is an acceptable
approximation. It could also be the case that different parts of the working
directory is on different mount points so checking in the root folder could
also be wrong.
Durham Goode <durham@fb.com> [Thu, 17 Nov 2016 15:31:19 -0800] rev 30443
manifest: move manifestctx creation into manifestlog.get()
Most manifestctx creation already happened in manifestlog.get(), but there was
one spot in the manifestctx class itself that created an instance manually. This
patch makes that one instance go through the manifestlog. This means extensions
can just wrap manifestlog.get() and it will cover all manifestctx creations. It
also means this code path now hits the manifestlog cache.
Gregory Szorc <gregory.szorc@gmail.com> [Fri, 11 Nov 2016 01:10:07 -0800] rev 30442
util: implement zstd compression engine
Now that zstd is vendored and being built (in some configurations), we
can implement a compression engine for zstd!
The zstd engine is a little different from existing engines. Because
it may not always be present, we have to defer load the module in case
importing it fails. We facilitate this via a cached property that holds
a reference to the module or None. The "available" method is
implemented to reflect reality.
The zstd engine declares its ability to handle bundles using the
"zstd" human name and the "ZS" internal name. The latter was chosen
because internal names are 2 characters (by only convention I think)
and "ZS" seems reasonable.
The engine, like others, supports specifying the compression level.
However, there are no consumers of this API that yet pass in that
argument. I have plans to change that, so stay tuned.
Since all we need to do to support bundle generation with a new
compression engine is implement and register the compression engine,
bundle generation with zstd "just works!" Tests demonstrating this
have been added.
How does performance of zstd for bundle generation compare? On the
mozilla-unified repo, `hg bundle --all -t <engine>-v2` yields the
following on my i7-6700K on Linux:
engine CPU time bundle size vs orig size throughput
none 97.0s 4,054,405,584 100.0% 41.8 MB/s
bzip2 (l=9) 393.6s 975,343,098 24.0% 10.3 MB/s
gzip (l=6) 184.0s 1,140,533,074 28.1% 22.0 MB/s
zstd (l=1) 108.2s 1,119,434,718 27.6% 37.5 MB/s
zstd (l=2) 111.3s 1,078,328,002 26.6% 36.4 MB/s
zstd (l=3) 113.7s 1,011,823,727 25.0% 35.7 MB/s
zstd (l=4) 116.0s 1,008,965,888 24.9% 35.0 MB/s
zstd (l=5) 121.0s 977,203,148 24.1% 33.5 MB/s
zstd (l=6) 131.7s 927,360,198 22.9% 30.8 MB/s
zstd (l=7) 139.0s 912,808,505 22.5% 29.2 MB/s
zstd (l=12) 198.1s 854,527,714 21.1% 20.5 MB/s
zstd (l=18) 681.6s 789,750,690 19.5% 5.9 MB/s
On compression, zstd for bundle generation delivers:
* better compression than gzip with significantly less CPU utilization
* better than bzip2 compression ratios while still being significantly
faster than gzip
* ability to aggressively tune compression level to achieve
significantly smaller bundles
That last point is important. With clone bundles, a server can
pre-generate a bundle file, upload it to a static file server, and
redirect clients to transparently download it during clone. The server
could choose to produce a zstd bundle with the highest compression
settings possible. This would take a very long time - a magnitude
longer than a typical zstd bundle generation - but the result would
be hundreds of megabytes smaller! For the clone volume we do at
Mozilla, this could translate to petabytes of bandwidth savings
per year and faster clones (due to smaller transfer size).
I don't have detailed numbers to report on decompression. However,
zstd decompression is fast: >1 GB/s output throughput on this machine,
even through the Python bindings. And it can do that regardless of the
compression level of the input. By the time you have enough data to
worry about overhead of decompression, you have plenty of other things
to worry about performance wise.
zstd is wins all around. I can't wait to implement support for it
on the wire protocol and in revlogs.
Gregory Szorc <gregory.szorc@gmail.com> [Thu, 10 Nov 2016 23:38:41 -0800] rev 30441
hghave: add check for zstd support
Not all configurations will support zstd. Add a check so we can
conditionalize tests.
Gregory Szorc <gregory.szorc@gmail.com> [Thu, 10 Nov 2016 23:34:15 -0800] rev 30440
exchange: obtain compression engines from the registrar
util.compengines has knowledge of all registered compression engines
and the metadata that associates them with various bundle types.
This patch removes the now redundant declaration of this metadata from
exchange.py and obtains it from the new source.
The effect of this patch is that once a new compression engine is
registered with util.compengines, `hg bundle -t <engine>` will just
work.
Gregory Szorc <gregory.szorc@gmail.com> [Thu, 10 Nov 2016 23:29:01 -0800] rev 30439
bundle2: equate 'UN' with no compression
An upcoming patch will change the "alg" argument passed to this
function from None to "UN" when no compression is wanted.
The existing implementation of bundle2 does not set a "Compression"
parameter if no compression is used. In theory, setting
"Compression=UN" should work. But I haven't audited the code to see if
all client versions supporting bundle2 will accept this.
Rather than take the risk, avoid the BC breakage and treat "UN"
the same as None.
Gregory Szorc <gregory.szorc@gmail.com> [Thu, 10 Nov 2016 23:15:02 -0800] rev 30438
util: check for compression engine availability before returning
If a requested compression engine is registered but not available,
requesting it will now abort.
To be honest, I'm not sure if this is the appropriate mechanism
for handling optional compression engines. I won't know until
all uses of compression (bundles, wire protocol, revlogs, etc)
are using the new API and zstd (our planned optional engine)
is implemented. So this API could change.
Gregory Szorc <gregory.szorc@gmail.com> [Thu, 10 Nov 2016 23:03:48 -0800] rev 30437
util: expose an "available" API on compression engines
When the zstd compression engine is introduced, it won't work in all
installations, namely pure Python installs. So, we need a mechanism to
declare whether a compression engine is available. We don't want to
conditionally register the compression engine because it is sometimes
useful to know when a compression engine name or encountered data is
valid but just not available versus unknown.
Gregory Szorc <gregory.szorc@gmail.com> [Thu, 10 Nov 2016 22:26:35 -0800] rev 30436
setup: compile zstd C extension
Now that zstd and python-zstandard are vendored, we can start compiling
them as part of the install.
python-zstandard provides a self-contained Python function that returns
a distutils.extension.Extension, so it is really easy to add zstd
to our setup.py without having to worry about defining source files,
include paths, etc. The function even allows specifying the module
name the extension should be compiled as. This conveniently allows us
to compile the module into the "mercurial" package so "our" version
won't collide with a version installed under the canonical "zstd"
module name.
Gregory Szorc <gregory.szorc@gmail.com> [Thu, 10 Nov 2016 22:15:58 -0800] rev 30435
zstd: vendor python-zstandard 0.5.0
As the commit message for the previous changeset says, we wish
for zstd to be a 1st class citizen in Mercurial. To make that
happen, we need to enable Python to talk to the zstd C API. And
that requires bindings.
This commit vendors a copy of existing Python bindings. Why do we
need to vendor? As the commit message of the previous commit says,
relying on systems in the wild to have the bindings or zstd present
is a losing proposition. By distributing the zstd and bindings with
Mercurial, we significantly increase our chances that zstd will
work. Since zstd will deliver a better end-user experience by
achieving better performance, this benefits our users. Another
reason is that the Python bindings still aren't stable and the
API is somewhat fluid. While Mercurial could be coded to target
multiple versions of the Python bindings, it is safer to bundle
an explicit, known working version.
The added Python bindings are mostly a fully-featured interface
to the zstd C API. They allow one-shot operations, streaming,
reading and writing from objects implements the file object
protocol, dictionary compression, control over low-level compression
parameters, and more. The Python bindings work on Python 2.6,
2.7, and 3.3+ and have been tested on Linux and Windows. There are
CFFI bindings, but they are lacking compared to the C extension.
Upstream work will be needed before we can support zstd with PyPy.
But it will be possible.
The files added in this commit come from Git commit
e637c1b214d5f869cf8116c550dcae23ec13b677 from
https://github.com/indygreg/python-zstandard and are added without
modifications. Some files from the upstream repository have been
omitted, namely files related to continuous integration.
In the spirit of full disclosure, I'm the maintainer of the
"python-zstandard" project and have authored 100% of the code
added in this commit. Unfortunately, the Python bindings have
not been formally code reviewed by anyone. While I've tested
much of the code thoroughly (I even have tests that fuzz APIs),
there's a good chance there are bugs, memory leaks, not well
thought out APIs, etc. If someone wants to review the code and
send feedback to the GitHub project, it would be greatly
appreciated.
Despite my involvement with both projects, my opinions of code
style differ from Mercurial's. The code in this commit introduces
numerous code style violations in Mercurial's linters. So, the code
is excluded from most lints. However, some violations I agree with.
These have been added to the known violations ignore list for now.
Gregory Szorc <gregory.szorc@gmail.com> [Thu, 10 Nov 2016 21:45:29 -0800] rev 30434
zstd: vendor zstd 1.1.1
zstd is a new compression format and it is awesome, yielding
higher compression ratios and significantly faster compression
and decompression operations compared to zlib (our current
compression engine of choice) across the board.
We want zstd to be a 1st class citizen in Mercurial and to eventually
be the preferred compression format for various operations.
This patch starts the formal process of supporting zstd by vendoring
a copy of zstd. Why do we need to vendor zstd? Good question.
First, zstd is relatively new and not widely available yet. If we
didn't vendor zstd or distribute it with Mercurial, most users likely
wouldn't have zstd installed or even available to install. What good
is a feature if you can't use it? Vendoring and distributing the zstd
sources gives us the highest liklihood that zstd will be available to
Mercurial installs.
Second, the Python bindings to zstd (which will be vendored in a
separate changeset) make use of zstd APIs that are only available
via static linking. One reason they are only available via static
linking is that they are unstable and could change at any time.
While it might be possible for the Python bindings to attempt to
talk to different versions of the zstd C library, the safest thing to
do is link against a specific, known-working version of zstd. This
is why the Python zstd bindings themselves vendor zstd and why we
must as well. This also explains why the added files are in a
"python-zstandard" directory.
The added files are from the 1.1.1 release of zstd (Git commit
4c0b44f8ced84c4c8edfa07b564d31e4fa3e8885 from
https://github.com/facebook/zstd) and are added without modifications.
Not all files from the zstd "distribution" have been added. Notably
missing are files to support interacting with "legacy," pre-1.0
versions of zstd. The decision of which files to include is made by
the upstream python-zstandard project (which I'm the author of). The
files in this commit are a snapshot of the files from the 0.5.0
release of that project, Git commit
e637c1b214d5f869cf8116c550dcae23ec13b677 from
https://github.com/indygreg/python-zstandard.
Mads Kiilerich <madski@unity3d.com> [Tue, 15 Nov 2016 21:56:49 +0100] rev 30433
bdiff: give slight preference to removing trailing lines
[This change could be folded into the previous changeset to minimize the repo
churn ...]
Similar to the previous change, introduce an exception to the general
preference for matches in the middle of bdiff ranges: If the best match on the
B side starts at the beginning of the bdiff range, don't aim for the
middle-most A side match but for the earliest.
New (later) matches on the A side will only be considered better if the
corresponding match on the B side *not* is at the beginning of the range.
Thus, if the best (middle-most) match on the B side turns out to be at the
beginning of the range, the earliest match on the A side will be used.
The bundle size for 4.0 (hg bundle --base null -r 4.0 x.hg) happens to go from
22807275 to
22808120 bytes - a 0.004% increase.
Mads Kiilerich <madski@unity3d.com> [Tue, 15 Nov 2016 21:56:49 +0100] rev 30432
bdiff: give slight preference to appending lines
[This change could be folded into the previous changeset to minimize the repo
churn ...]
The general preference to matches in the middle of bdiff ranges helps getting
balanced recursion and efficient computation. But, as previous changes have
shown, it might also give diffs that seems "obviously wrong".
To mitigate that: If the best match on the A side starts at the beginning of
the bdiff range, don't aim for the middle-most B side match but for the
earliest.
This will make the matches balanced (by both sides being "early") even though
the bisection will be less balanced. Still, this case only apply if the *best*
and middle-most match was fully unbalanced on the A side. Each recursion will
thus even in this worst case reduce the problem significantly and we are not
re-introducing the problem that was fixed in
f1ca249696ed.
The bundle size for 4.0 (hg bundle --base null -r 4.0 x.hg) happens to go from
22806817 to
22807275 bytes - a 0.002% increase.
This make the recent test-bdiff.py changes give a more pretty output ... but
they no longer show that the recursion is around middle matches (because it in
these cases isn't).
Mads Kiilerich <madski@unity3d.com> [Tue, 08 Nov 2016 18:37:33 +0100] rev 30431
bdiff: give slight preference to longest matches in the middle of the B side
We already have a slight preference for matches close to the middle on the A
side. Now, do the same on the B side.
j is iterating the b range backwards and we thus accept a new j if the previous
match was in the upper half.
This makes the test-bhalf diff "correct". It obviously also gives more
preference to balanced recursion than to appending to sequences. That is kind
of correct, but will also unfortunately make some bundles bigger. No doubt, we
can also create examples where it will make them smaller ...
The bundle size for 4.0 (hg bundle --base null -r 4.0 x.hg) happens to go from
22803824 to
22806817 bytes - an 0.01% increase.
Mads Kiilerich <madski@unity3d.com> [Tue, 08 Nov 2016 18:37:33 +0100] rev 30430
bdiff: rearrange the "better longest match" code
This is primarily to make the code more managable and prepare for later
changes.
More specific assignments might also be slightly faster, even thought it also
might generate a bit more code.
Mads Kiilerich <madski@unity3d.com> [Tue, 08 Nov 2016 18:37:33 +0100] rev 30429
bdiff: adjust criteria for getting optimal longest match in the A side middle
We prefer matches closer to the middle to balance recursion, as introduced in
f1ca249696ed.
For ranges with uneven length, matches starting exactly in the middle should
have preference. That will be optimal for matches of length 1. We will thus
accept equality in the half check.
For ranges with even length, half was ceil'ed when calculated but we got the
preference for low matches from the 'less than half' check. To get the same
result as before when we also accept equality, floor it. Without that,
test-annotate.t would show some different (still correct but less optimal)
results.
This will change the heuristics. Tests shows a slightly different output - and
sometimes slightly smaller bundles.
The bundle size for 4.0 (hg bundle --base null -r 4.0 x.hg) happens to go from
22804885 to
22803824 bytes - an 0.005% reduction.
Mads Kiilerich <madski@unity3d.com> [Tue, 08 Nov 2016 18:37:33 +0100] rev 30428
tests: explore some bdiff cases
Mads Kiilerich <madski@unity3d.com> [Tue, 15 Nov 2016 21:56:49 +0100] rev 30427
tests: make test-bdiff.py easier to maintain
Add more stdout logging to help navigate the .out file.
Gregory Szorc <gregory.szorc@gmail.com> [Thu, 17 Nov 2016 08:52:52 -0800] rev 30426
perf: unbust perfbdiff --alldata
This broke in
f84fc6a92817 due to a refactored manifest API.
The fix is a bit hacky - perfbdiff doesn't yet support tree manifests
for example. But it gets the job done.
A test has been added for --alldata so this doesn't happen again.
Yuya Nishihara <yuya@tcha.org> [Thu, 17 Nov 2016 20:57:09 +0900] rev 30425
worker: discard waited pid by anyone who noticed it first
This makes sure all waited pids are removed before calling killworkers()
even if waitpid()-pids.discard() sequence is interrupted by another SIGCHLD.
Yuya Nishihara <yuya@tcha.org> [Thu, 17 Nov 2016 21:08:58 +0900] rev 30424
worker: kill workers after all zombie processes are reaped
Since we now wait child processes in non-blocking way (changed by
7bc25549e084
and
e8fb03cfbbde), we don't have to kill them in the middle of the waitpid()
loop. This change will help solving a possible race of waitpid()-pids.discard()
sequence and another SIGCHLD.
waitforworkers() is called by cleanup(), in which case we do killworkers()
beforehand so we can remove killworkers() from waitforworkers().
Yuya Nishihara <yuya@tcha.org> [Thu, 17 Nov 2016 20:44:05 +0900] rev 30423
worker: make sure killworkers() never be interrupted by another SIGCHLD
killworkers() iterates over pids, which can be updated by SIGCHLD handler.
So we should either copy pids or prevent killworkers() from being interrupted
by SIGCHLD. I chose the latter as it is simpler and can make pids handling
more consistent.
This fixes a possible "set changed size during iteration" error at
killworkers() before cleanup().
Yuya Nishihara <yuya@tcha.org> [Thu, 17 Nov 2016 21:43:01 +0900] rev 30422
worker: fix missed break on successful waitpid()
Follow-up for
5069a8a40b1b.
Augie Fackler <augie@google.com> [Thu, 10 Nov 2016 16:49:42 -0500] rev 30421
filterpyflakes: dramatically simplify the entire thing by blacklisting
We've only got one kind of pyflakes failure left in our codebase, so
it's time to switch over to a blacklist-based checking scheme. I've
left in the filtering of two undefined names for now out of paranoia,
but those can probably go too.
Augie Fackler <augie@google.com> [Thu, 10 Nov 2016 16:07:24 -0500] rev 30420
run-tests: forward Python USER_BASE from site (
issue5425)
We do this so that any linters installed via pip install --user don't
break. See https://docs.python.org/2/library/site.html#site.USER_BASE
for a description of what this nonsense is all about.
An alternative would be to not set HOME, but that'll cause other
problems (see
issue2707), or to forward every single path entry from
sys.path in PYTHONPATH (which seems sketchy in its own way).
Mads Kiilerich <madski@unity3d.com> [Mon, 14 Nov 2016 22:43:25 +0100] rev 30419
shelve: add missing space in help text
The change is trivial and unlikely to have been translated so we update
translation files too.
Jun Wu <quark@fb.com> [Tue, 15 Nov 2016 20:25:51 +0000] rev 30418
util: improve iterfile so it chooses code path wisely
We have performance concerns on "iterfile" as it is 4X slower on normal
files. While modern systems have the nice property that reading a "fast"
(on-disk) file cannot be interrupted and should be made use of.
This patch dumps the related knowledge in comments. And "iterfile" chooses
code paths wisely:
1. If it's CPython 3, or PyPY, use the fast path.
2. If fp is a normal file, use the fast path.
3. If fp is not a normal file and CPython version >= 2.7.4, use the same
workaround (4x slower) as before.
4. If fp is not a normal file and CPython version < 2.7.4, use another
workaround (2x slower but may block longer then necessary) which
basically re-invents the buffer + readline logic in Python.
This will give us good confidence on both correctness and performance
dealing with EINTR in iterfile(fp) for all known supported Python versions.
Augie Fackler <augie@google.com> [Wed, 16 Nov 2016 23:29:28 -0500] rev 30417
merge with stable
Jun Wu <quark@fb.com> [Sat, 12 Nov 2016 03:06:07 +0000] rev 30416
worker: stop using a separate thread waiting for children
Now that we have a SIGCHLD hander, and it could get executed when waiting
for I/O. It's no longer necessary to have a separated waitpid thread. So
just remove it.
Jun Wu <quark@fb.com> [Sat, 12 Nov 2016 03:07:22 +0000] rev 30415
worker: add a SIGCHLD handler to collect worker immediately
As planned by previous patches, add a SIGCHLD handler to get notifications
about worker exits, and deals with worker failure immediately.
Note that the SIGCHLD handler gets unregistered before killworkers(), so
SIGCHLD won't interrupt "killworkers" - making it harder to send kill
signals to waited processes.
Jun Wu <quark@fb.com> [Tue, 15 Nov 2016 02:12:16 +0000] rev 30414
worker: make waitforworkers reentrant
We are going to use it in the SIGCHLD handler. The handler will be executed
in the main thread with the non-blocking version of waitpid, while the
waitforworkers thread runs the blocking version. It's possible that one of
them collects a worker and makes the other error out (no child to wait).
This patch handles these errors: ECHILD is ignored. EINTR needs a retry.
The "pids" set is designed to be only modifiable by "waitforworkers". And we
only remove items after a successful waitpid. Since a child process can only
be "waitpid"-ed once. It's guaranteed that "pids.remove(p)" won't be called
with duplicated "p"s. And once a "p" is removed from "pids", that "p" does
not need to be killed or waited any more.
Jun Wu <quark@fb.com> [Tue, 15 Nov 2016 02:10:40 +0000] rev 30413
worker: change "pids" to a set
There is no need to keep any order of the "pids" array. A set is more
efficient for the "remove" operation. And the following patch will use that.
Jun Wu <quark@fb.com> [Thu, 28 Jul 2016 20:57:07 +0100] rev 30412
worker: allow waitforworkers to be non-blocking
This patch adds a boolean flag to waitforworkers and makes it non-blocking
if set to True.
This is to make it possible that we can reap our workers while keep other
unrelated children untouched, after receiving SIGCHLD.
Jun Wu <quark@fb.com> [Thu, 28 Jul 2016 20:51:20 +0100] rev 30411
worker: wait worker pid explicitly
Before this patch, waitforworkers uses os.wait() to collect child workers, and
only wait len(pids) processes. This can have serious issues if other code
spawns new processes and does not reap them: 1. worker.py may get wrong exit
code and kill innocent workers. 2. worker.py may continue without waiting for
all workers to complete.
This patch fixes the issue by using waitpid to wait worker pid explicitly.
However, this patch introduces a new issue: worker failure may not be handled
immediately. The issue will be addressed in next patches.
Jun Wu <quark@fb.com> [Thu, 28 Jul 2016 20:49:57 +0100] rev 30410
worker: move killworkers and waitforworkers up
We need to use them in the SIGCHLD handler and SIGCHLD handler should be
installed before fork.
Jun Wu <quark@fb.com> [Fri, 11 Nov 2016 21:11:17 +0000] rev 30409
osutil: implement setprocname to set process title for some platforms
This patch adds a simple setprocname method to osutil. The operation is not
defined by any standard and is platform-specific, the current implementation
tries to cover some major platforms (ex. Linux, OS X, FreeBSD) that is
relatively easy to support. Other platforms (Windows [4], other BSDs, ...)
can be added in the future.
The current implementation supports two methods to change process title:
a. setproctitle if available (works in FreeBSD).
b. rewrite argv in place (works in Linux [1] and Mac OS X). [2] [3]
[1]: Linux has "prctl(PR_SET_NAME, ...)" but 1) it has 16-byte limit, which
is too small; 2) it is not quite equivalent to what we want - it changes
"/proc/self/comm", not "/proc/self/cmdline" - "comm" change won't show up
in "ps" output unless "-o comm" is used.
[2]: The implementation does not rewrite the **environ buffer like some
other implementations do, just to make the code simpler and safer. However,
this also means the buffer size we can rewrite is significantly shorter. If
we are really greedy and want the "environ" space, we can change the
implementation later.
[3]: It requires a CPython private API: Py_GetArgcArgv to get the original
argv. Unfortunately Python 3 makes a copy of argv and returns the wchar_t
version, so it is not supported for now. (if we really want to, we could
count backwards from "char **environ", given known argc and argv, not sure
if that's a good idea - probably not)
[4]: The feature is aimed to make it easier for forked command server
processes to show what they are doing. Since Windows does not support
fork(), despite it's a major platform, its support is not added in this
patch.
Jun Wu <quark@fb.com> [Fri, 11 Nov 2016 20:45:40 +0000] rev 30408
setup: test setproctitle before building osutil
We are going to use setproctitle (provided by FreeBSD) if it's available in
the next patch. Therefore provide a macro to give some clues to the C
pre-processor so it could choose code path wisely.
Henning Schild <henning@hennsch.de> [Sat, 12 Nov 2016 13:36:17 +0100] rev 30407
patch: remove unused git parameter from patch.diffstat()
Since
628a4a9e411d the parameter is not used anymore.
Philippe Pepiot <philippe.pepiot@logilab.fr> [Thu, 29 Sep 2016 10:16:34 +0200] rev 30406
perf: add asv benchmarks
Airspeed velocity (ASV) is a python framework for benchmarking Python packages
over their lifetime. The results are displayed in an interactive web frontend.
Add ASV benchmarks for mercurial that use contrib/perf.py extension that could
be run against multiple reference repositories.
The benchmark suite now includes revsets from contrib/base-revsets.txt with
variants, perftags, perfstatus, perfmanifest and perfheads.
Installation requires asv>=0.2, python-hglib and virtualenv
This is part of PerformanceTrackingSuitePlan
https://www.mercurial-scm.org/wiki/PerformanceTrackingSuitePlan
Philippe Pepiot <philippe.pepiot@logilab.fr> [Tue, 15 Nov 2016 16:10:57 +0100] rev 30405
perf: omit copying ui and redirect to ferr if buffer API is in use
This allow to get the output of contrib/perf.py commands using the
ui.pushbuffer() API.
Durham Goode <durham@fb.com> [Mon, 14 Nov 2016 15:24:07 -0800] rev 30404
manifest: change treemanifestctx to construct subtrees from the manifestlog
Previously, treemanifestctx would directly construct its subtrees. By making it
get the subtrees through manifestlog.get() we consolidate all treemanifestctx
creation into manifestlog.get() and therefore extensions that need to wrap
manifestctx creation (like narrow-hg) can intercept manifestctxs at that single
place.
This also means fetching subtrees will take advantage of the manifestlog ctx
cache now, which it did not before.
Durham Goode <durham@fb.com> [Mon, 14 Nov 2016 15:17:27 -0800] rev 30403
manifest: make revlog verification optional
This patches adds an parameter to manifestlog.get() to disable hash checking.
This will be used in an upcoming patch to support treemanifestctx reading
sub-trees without loading them from the revlog. (This is already supported but
does not go through the manifestlog.get() code path)
Gregory Szorc <gregory.szorc@gmail.com> [Thu, 10 Nov 2016 09:45:42 -0800] rev 30402
debugcommands: move debugbuilddag
And we drop some now unused imports from commands.py.
Gregory Szorc <gregory.szorc@gmail.com> [Wed, 17 Aug 2016 21:07:38 -0700] rev 30401
debugcommands: introduce standalone module for debug commands
commands.py is our largest .py file by nearly 2x. Debug commands live
in a world of their own. So let's extract them to their own module.
We start with "debugancestor."
We currently reuse the commands table with commands.py and have a hack
in dispatch.py for loading debugcommands.py. In the future, we could
potentially use a separate commands table and avoid the import of
debugcommands.py.
Jun Wu <quark@fb.com> [Mon, 14 Nov 2016 23:17:15 +0000] rev 30400
convert: migrate to util.iterfile
Jun Wu <quark@fb.com> [Mon, 14 Nov 2016 23:16:05 +0000] rev 30399
match: migrate to util.iterfile
Jun Wu <quark@fb.com> [Mon, 14 Nov 2016 23:15:01 +0000] rev 30398
store: migrate to util.iterfile
Jun Wu <quark@fb.com> [Mon, 14 Nov 2016 23:14:06 +0000] rev 30397
patch: migrate to util.iterfile
Jun Wu <quark@fb.com> [Mon, 14 Nov 2016 23:12:11 +0000] rev 30396
worker: migrate to util.iterfile
Jun Wu <quark@fb.com> [Mon, 14 Nov 2016 23:32:54 +0000] rev 30395
util: add iterfile to workaround a fileobj.__iter__ issue with EINTR
The fileobj.__iter__ implementation in Python 2.7.12 (hg changeset
45d4cea97b04) is buggy: it cannot handle EINTR correctly.
In Objects/fileobject.c:
size_t Py_UniversalNewlineFread(....) {
....
if (!f->f_univ_newline)
return fread(buf, 1, n, stream);
....
}
According to the "fread" man page:
If an error occurs, or the end of the file is reached, the return value
is a short item count (or zero).
Therefore it's possible for "fread" (and "Py_UniversalNewlineFread") to
return a positive value while errno is set to EINTR and ferror(stream)
changes from zero to non-zero.
There are multiple "Py_UniversalNewlineFread": "file_read", "file_readinto",
"file_readlines", "readahead". While the first 3 have code to handle the
EINTR case, the last one "readahead" doesn't:
static int readahead(PyFileObject *f, Py_ssize_t bufsize) {
....
chunksize = Py_UniversalNewlineFread(
f->f_buf, bufsize, f->f_fp, (PyObject *)f);
....
if (chunksize == 0) {
if (ferror(f->f_fp)) {
PyErr_SetFromErrno(PyExc_IOError);
....
}
}
....
}
It means "readahead" could ignore EINTR, if "Py_UniversalNewlineFread"
returns a non-zero value. And at the next time "readahead" got executed, if
"Py_UniversalNewlineFread" returns 0, "readahead" would raise a Python error
without a incorrect errno - could be 0 - thus "IOError: [Errno 0] Error".
The only user of "readahead" is "readahead_get_line_skip".
The only user of "readahead_get_line_skip" is "file_iternext", aka.
"fileobj.__iter__", which should be avoided.
There are multiple places where the pattern "for x in fp" is used. This
patch adds a "iterfile" method in "util.py" so we can migrate our code from
"for x in fp" to "fox x in util.iterfile(fp)".
Augie Fackler <augie@google.com> [Thu, 10 Nov 2016 16:37:18 -0500] rev 30394
filterpyflakes: whitelist listcomp aliasing checking
The test change is because of how filterpyflakes is organized - a line
number changed.
Augie Fackler <augie@google.com> [Thu, 10 Nov 2016 16:35:54 -0500] rev 30393
verify: avoid shadowing two variables with a list comprehension
The variable names are clearly worse now, but since we're really just
transposing key and value I'm not too worried about the clarity loss.
Augie Fackler <augie@google.com> [Thu, 10 Nov 2016 16:35:10 -0500] rev 30392
revset: avoid shadowing a variable with a list comprehension
Augie Fackler <augie@google.com> [Thu, 10 Nov 2016 16:34:43 -0500] rev 30391
revlog: avoid shadowing several variables using list comprehensions
Augie Fackler <augie@google.com> [Thu, 10 Nov 2016 16:33:41 -0500] rev 30390
minirst: avoid shadowing a variable in a list comprehension
Augie Fackler <augie@google.com> [Thu, 10 Nov 2016 16:33:23 -0500] rev 30389
hbisect: avoid shadowing a variable in a list comprehension
Augie Fackler <augie@google.com> [Thu, 10 Nov 2016 16:33:07 -0500] rev 30388
filemerge: avoid shadowing a variable in a list comprehension
Augie Fackler <augie@google.com> [Thu, 10 Nov 2016 16:32:51 -0500] rev 30387
color: avoid shadowing a variable inside a list comprehension
Augie Fackler <augie@google.com> [Thu, 10 Nov 2016 16:32:38 -0500] rev 30386
memory: avoid shadowing variables inside a list comprehension
Kostia Balytskyi <ikostia@fb.com> [Thu, 10 Nov 2016 03:15:41 -0800] rev 30385
shelve: move shelve-finishing logic to a separate function
With future obs-based shelve, finishing shelve will be different
from just aborting a transaction and I would like to keep both
variants of this functionality in a separate function.
Kostia Balytskyi <ikostia@fb.com> [Thu, 10 Nov 2016 03:20:28 -0800] rev 30384
shelve: move unknown files handling to a separate function
This change has nothing to do with future obsshelve introduction,
it is done just for readability purposes.
Kostia Balytskyi <ikostia@fb.com> [Thu, 10 Nov 2016 03:07:20 -0800] rev 30383
shelve: move actual created commit shelving to a separate function
Currently, this code does not have any branching, it just bundles
a commit and saves a patch file. Later, obsolescence-based shelve
will be added, so this code will also create some obsmarkers and
will be one of the few places where obsshelve will be different
from traditional shelve.
Kostia Balytskyi <ikostia@fb.com> [Thu, 10 Nov 2016 03:33:01 -0800] rev 30382
shelve: move 'nothing changed' messaging to a separate function
This has nothing to do with the future obsshelve implementation, I just
thought that moving this messaging to a separate function will improve
shelve code readability.
Kostia Balytskyi <ikostia@fb.com> [Thu, 10 Nov 2016 03:26:31 -0800] rev 30381
shelve: move commitfunc creation to a separate function
Special commitfuncs are created as closures at least twice in shelve's
code and one time special commitfunc is used within another closure.
They all serve very specific purposes like temporarily tweak some
configuration or enable editor, etc. This is not immediately important
to someone reading shelve code, so I think moving this logic to a separate
function is a good idea.
Kostia Balytskyi <ikostia@fb.com> [Thu, 10 Nov 2016 03:24:07 -0800] rev 30380
shelve: move mutableancestors to not be a closure
There's no value in it being a closure and everyone who tries to read
the outer function code will be distracted by it. IMO moving it out
significantly improves readability, especially given how clear it is
what mutableancestors function does from its name.
Kostia Balytskyi <ikostia@fb.com> [Thu, 10 Nov 2016 03:22:55 -0800] rev 30379
shelve: move shelve name generation to a separate function
This has nothing to do with future obsshelve introduction, done just
for readability purposes.
Kostia Balytskyi <ikostia@fb.com> [Thu, 10 Nov 2016 03:07:20 -0800] rev 30378
shelve: move possible shelve file extensions to a single place
This and a couple of following patches are a preparation to
implementing obsolescense-enabled shelve which was discussed
on a Sprint. If this refactoring is not done, shelve is going
to look even more hackish than now.
This particular commit introduces a slight behavior change. Previously,
if only .hg/shelve/name.patch file exists, but .hg/name.hg does not,
'hg shelve -d name' would fail saying "shelve not found". Now deletion
will only fail if .patch file does not exist (since .patch is used
as an indicator of an existing shelve). Other shelve files being absent
are skipped silently to accommodate for future introduction of obs-based
shelve, which will mean that for some shelves .hg and .patch files exist,
while for others .hg and .oshelve.
Durham Goode <durham@fb.com> [Thu, 10 Nov 2016 02:13:19 -0800] rev 30377
manifest: delete manifest.manifest class
Now that nothing uses the primary manifest class, we can delete it.
Durham Goode <durham@fb.com> [Thu, 10 Nov 2016 02:13:19 -0800] rev 30376
localrepo: delete localrepo.manifest
Now that nothing uses normal manifests, we can delete localrepo.manifest.
Durham Goode <durham@fb.com> [Thu, 10 Nov 2016 02:13:19 -0800] rev 30375
manifest: remove last uses of repo.manifest
Now that all the functionality has been moved to manifestlog/manifestrevlog/etc,
we can finally change all the uses of repo.manifest to use the new versions. A
future diff will then delete repo.manifest.
One additional change in this commit is to change repo.manifestlog to be a
@storecache property instead of @property. This is required by some uses of
repo.manifest require that it be settable (contrib/perf.py and the static http
server). We can't do this in a prior change because we can't use @storecache on
this until repo.manifest is no longer used anywhere.
Durham Goode <durham@fb.com> [Fri, 11 Nov 2016 01:20:13 -0800] rev 30374
manifest: add unionmanifestlog support
As part of deprecating manifest, we need to make the union repo support
manifestlog.
Durham Goode <durham@fb.com> [Fri, 11 Nov 2016 01:15:59 -0800] rev 30373
manifest: add bundlemanifestlog support
As part of deprecating manifest.manifest we need to make bundlerepo support
manifestlog.
Durham Goode <durham@fb.com> [Thu, 10 Nov 2016 02:13:19 -0800] rev 30372
manifest: make manifestlog use it's own cache
As we start to make manifestlog the primary manifest source, the dependency on
manifest.manifest will cause circular dependency problems. Let's break this
dependency by making manifestlog use it's own cache. In a near future patch we
will remove the previous manifest cache so we're not duplicating it.
Durham Goode <durham@fb.com> [Thu, 10 Nov 2016 02:13:19 -0800] rev 30371
manifest: delete unused dirlog and _newmanifest functions
As part of migrating all manifest functionality out of manifest.manifest, let's
migrate a couple spots off of manifest.dirlog() to use the revlog specific
accessor. Then we can delete manifest.dirlog() and other unused functions.
Durham Goode <durham@fb.com> [Thu, 10 Nov 2016 02:13:19 -0800] rev 30370
manifest: move clearcaches to manifestlog
This is part of removing all functionality from manifest.manifest so we can
delete the class entirely.
Durham Goode <durham@fb.com> [Thu, 10 Nov 2016 02:13:19 -0800] rev 30369
manifest: remove usages of manifest.read
Now that the two manifestctx implementations have working read() functions,
let's remove the existing uses of manifest.read and drop the function.
Durham Goode <durham@fb.com> [Thu, 10 Nov 2016 02:13:19 -0800] rev 30368
manifest: remove dependency on manifestrevlog being able to create trees
A future patch will be removing the read() function from the manifest class.
Since manifestrevlog currently depends on the read function that manifest
implements (as a derived class), we need to break the dependency from
manifestrevlog to read(). We do this by adding an argument to
manifestrevlog.write() which provides it with the ability to read a manifest.
This is good in general because it further separates revlog as the storage
format from the actual inmemory data structure implementation.
Xidorn Quan <me@upsuper.org> [Fri, 11 Nov 2016 13:06:05 +1100] rev 30367
color: show mode warning based on ui.formatted
ui.interactive is only for input and ui.formatted is for output.
Augie Fackler <augie@google.com> [Thu, 10 Nov 2016 15:14:05 -0500] rev 30366
protocol: drop unused import of zlib
Something weird is happening that breaks pyflakes installed via 'pip
install --user'. I haven't had a chance to finish debugging this, but
this at least fixes the build.
Yuya Nishihara <yuya@tcha.org> [Tue, 08 Nov 2016 22:41:45 +0900] rev 30365
hook: lower inflated use of sys.__stdout__ and __stderr__
They were introduced at
9f76df0edb7d, where sys.stdout could be replaced by
sys.stderr. After that, we've changed the way of stdout redirection by
afccc64eea73, so we no longer need to reference the original __stdout__ and
__stderr__ objects.
Let's move away from using __std*__ objects so we can simply wrap sys.std*
objects for Python 3 porting.
Yuya Nishihara <yuya@tcha.org> [Tue, 08 Nov 2016 22:22:22 +0900] rev 30364
hook: flush stdout before restoring stderr redirection
There was a similar issue to
8b011ededfb2. If an in-process hook writes
to stdout, the data may be buffered. In which case, stdout must be flushed
before restoring its file descriptor. Otherwise, remaining data would be sent
over the ssh wire and corrupts the protocol.
Note that this is a different redirection from the one I've just removed.
Yuya Nishihara <yuya@tcha.org> [Thu, 20 Oct 2016 22:39:59 +0900] rev 30363
hook: do not redirect stdout/err/in to ui while running in-process hooks (BC)
It was introduced by
a59058fd074a to address command-server issues. After
that, I've made a complete fix by
69f86b937035, so we don't need to replace
sys.stdio objects to protect the IPC channels.
This change means we no longer see data written to sys.stdout/err by an
in-process hook on command server. I think that's okay because the canonical
way is to use ui functions and in-process hooks should respect the Mercurial
API.
This will help Python 3 porting, where sys.stdout is TextIO but ui.fout is
BytesIO.
Durham Goode <durham@fb.com> [Thu, 10 Nov 2016 02:21:15 -0800] rev 30362
merge: change modified indicator to be 20 bytes
Previously we indicated that the .hgsubstate file was dirty by adding a '+' to
the end of its hash in the wctx manifest. This made is complicated to have new
manifest implementations that rely on the node length being fixed.
In previous patches we added added and modified node placeholders, so let's use
those to indicate dirty here as well. It doesn't look like anything ever
depended on this '+' (aside from it being different to the parent), so nothing
else needed to change here.
Durham Goode <durham@fb.com> [Thu, 10 Nov 2016 02:19:16 -0800] rev 30361
dirstate: change added/modified placeholder hash length to 20 bytes
Previously the added/modified placeholder hash for manifests generated from the
dirstate was a 21byte long string consisting of the p1 file hash plus a single
character to indicate an add or a modify. Normal hashes are only 20 bytes long.
This makes it complicated to implement more efficient manifest implementations
which rely on the hashes being fixed length.
Let's change this hash to just be 20 bytes long, and rely on the astronomical
improbability of an actual hash being these 20 bytes (just like we rely on no
hash every being the nullid).
This changes the possible behavior slightly in that the hash for all
added/modified entries in the dirstate manifest will now be the same (so simple
node comparisons would say they are equal), but we should never be doing simple
node comparisons on these nodes even with the old hashes, because they did not
accurately represent the content (i.e. two files based off the same p1 file
node, with different working copy contents would have the same hash (even with
the appended character) in the old scheme too, so we couldn't depend on the
hashes period).
Durham Goode <durham@fb.com> [Thu, 10 Nov 2016 02:17:22 -0800] rev 30360
dirstate: change placeholder hash length to 20 bytes
Previously the new-node placeholder hash for manifests generated from the
dirstate was a 21byte long string of "!" characters. Normal hashes are only 20
bytes long. This makes it complicated to implement more efficient manifest
implementations which rely on the hashes being fixed length.
Let's change this hash to just be 20 bytes long, and rely on the astronomical
improbability of an actual hash being 20 "!" bytes in a row (just like we rely
on no hash ever being the nullid).
A future diff will do this for added and modified dirstate markers as well, so
we're putting the new newnodeid in node.py so there's a common place for these
placeholders.
Gregory Szorc <gregory.szorc@gmail.com> [Mon, 07 Nov 2016 18:57:54 -0800] rev 30359
util: remove compressorobj API from compression engines
All callers have been replaced with "compressstream." It is quite
low-level and redundant with "compressstream." So eliminate it.
Gregory Szorc <gregory.szorc@gmail.com> [Mon, 07 Nov 2016 18:54:35 -0800] rev 30358
hgweb: use compression engine API for zlib compression
More low-level compression code elimination because we now have nice
APIs.
This patch also demonstrates why we needed and implemented the
"level" option on the "compressstream" API.
Gregory Szorc <gregory.szorc@gmail.com> [Mon, 07 Nov 2016 18:46:37 -0800] rev 30357
bundle2: use compressstream compression engine API
Compression engines now have an API for compressing a stream of
chunks. Switch to it and make low-level compression code disappear.
Gregory Szorc <gregory.szorc@gmail.com> [Mon, 07 Nov 2016 18:57:07 -0800] rev 30356
util: add a stream compression API to compression engines
It is a common pattern throughout the code to perform compression
on an iterator of chunks, yielding an iterator of compressed chunks.
Let's formalize that as part of the compression engine API.
The zlib and bzip2 implementations allow an optional "level" option
to control the compression level. The default values are the same as
what the Python modules use. This option will be used in subsequent
patches.
Gregory Szorc <gregory.szorc@gmail.com> [Mon, 07 Nov 2016 18:39:08 -0800] rev 30355
util: remove decompressors dict (API)
All in-tree consumers are now using the compengines registrar.
Extensions should switch to it as well.
Gregory Szorc <gregory.szorc@gmail.com> [Mon, 07 Nov 2016 18:38:13 -0800] rev 30354
changegroup: use compression engines API
The new API doesn't have the equivalence for None and 'UN' so we
introduce code to use 'UN' explicitly.
Gregory Szorc <gregory.szorc@gmail.com> [Mon, 07 Nov 2016 18:36:48 -0800] rev 30353
bundle2: use compression engines API to obtain decompressor
Like the recent change for the compressor side, this too is
relatively straightforward. We now store a compression engine
on the instance instead of a low-level decompressor. Again, this
will allow us to easily transition to different compression engine
APIs when they are implemented.
Gregory Szorc <gregory.szorc@gmail.com> [Mon, 07 Nov 2016 18:34:51 -0800] rev 30352
util: remove compressors dict (API)
We no longer have any in-tree consumers of this object. Use
util.compengines instead.
Gregory Szorc <gregory.szorc@gmail.com> [Mon, 07 Nov 2016 18:35:43 -0800] rev 30351
bundle2: use new compression engine API for compression
Now that we have a new API to define compression engines, let's put it
to use!
The new code stores a reference to the compression engine instead of
a low-level compressor object. This will allow us to more easily
transition to different APIs on the compression engine interface
once we implement them.
As part of this, we change the registration in bundletypes to use 'UN'
instead of None. Previously, util.compressors had the no-op compressor
registered under both the 'UN' and None keys. Since we're switching to
a new API, I don't see the point in carrying this dual registration
forward.
Gregory Szorc <gregory.szorc@gmail.com> [Mon, 07 Nov 2016 18:31:39 -0800] rev 30350
util: create new abstraction for compression engines
Currently, util.py has "compressors" and "decompressors" dicts
mapping compression algorithms to callables returning objects that
perform well-defined operations. In addition, revlog.py has code
for calling into a compressor or decompressor explicitly. And, there
is code in the wire protocol for performing zlib compression.
The 3rd party lz4revlog extension has demonstrated the utility of
supporting alternative compression formats for revlog storage. But
it stops short of supporting lz4 for bundles and the wire protocol.
There are also plans to support zstd as a general compression
replacement.
So, there appears to be a market for a unified API for registering
compression engines. This commit starts the process of establishing
one.
This commit establishes a base class/interface for defining
compression engines and how they will be used. A collection class
to hold references to registered compression engines has also been
introduced.
The built-in zlib, bz2, truncated bz2, and no-op compression engines
are registered with a singleton instance of the collection class.
The compression engine API will change once consumers are ported
to the new API and some common patterns can be simplified at the
engine API level. So don't get too attached to the API...
Augie Fackler <augie@google.com> [Sun, 09 Oct 2016 09:25:39 -0400] rev 30349
config: mark parser regexes as bytes explicitly
r-strings are not transformed into bytes by our source transformer magic.