Sat, 13 Feb 2016 23:20:47 +0900 templater: expand list of parsed templates to template node
Yuya Nishihara <yuya@tcha.org> [Sat, 13 Feb 2016 23:20:47 +0900] rev 28547
templater: expand list of parsed templates to template node This patch eliminates a nested data structure other than the parsed tree. ('template', [(op, data), ..]) -> ('template', (op, data), ..) New expanded tree can be processed by common parser functions. This change will help implementing template aliases. Because a (template ..) node should have at least one child node, an empty template (template []) is mapped to (string ''). Also a trivial string (template [(string ..)]) node is unwrapped to (string ..) at parsing phase, instead of compiling phase.
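The reshaping described above can be sketched roughly as follows. This is a minimal illustration under assumptions, not the actual templater code: the function name `expandtemplate` is hypothetical, and the node shapes are taken only from the description in this message.

```python
def expandtemplate(tree):
    """Turn ('template', [child, ..]) into ('template', child, ..).

    Hypothetical helper; only the node shapes come from the commit message.
    """
    op, children = tree
    assert op == 'template'
    if not children:
        # a (template ..) node needs at least one child, so an empty
        # template is mapped to an empty string node
        return ('string', '')
    if len(children) == 1 and children[0][0] == 'string':
        # a trivial string template is unwrapped at the parsing phase
        return children[0]
    # general case: splice the children into the node itself
    return (op,) + tuple(children)
```

With the children spliced into the node, a generic tree walker can treat a template node like any other (op, arg, ..) tuple.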
Sun, 14 Feb 2016 15:42:49 +0900 templater: relax type of mapped template
Yuya Nishihara <yuya@tcha.org> [Sun, 14 Feb 2016 15:42:49 +0900] rev 28546
templater: relax type of mapped template Now compiled template fragments are packed into a generic type, (func, data), a string can be a valid template. This change allows us to unwrap a trivial string node. See the next patch for details.
Sat, 13 Feb 2016 23:54:24 +0900 templater: lift parsed and compiled templates to generic data types
Yuya Nishihara <yuya@tcha.org> [Sat, 13 Feb 2016 23:54:24 +0900] rev 28545
templater: lift parsed and compiled templates to generic data types Before this patch, parsed and compiled templates were kept as lists. That was inconvenient for applying transformation such as alias expansion. This patch changes the types of the outermost objects as follows:

stage     old             new
--------  --------------  ------------------------------
parsed    [(op, ..)]      ('template', [(op, ..)])
compiled  [(func, data)]  (runtemplate, [(func, data)])

New templater.parse() function has the same signature as revset.parse() and fileset.parse().
Tue, 15 Mar 2016 15:50:57 -0700 tests: python executable path should always be globbed
Danek Duvall <danek.duvall@oracle.com> [Tue, 15 Mar 2016 15:50:57 -0700] rev 28544
tests: python executable path should always be globbed Although this is coming in under the guise of consistency, part of the desire for this is that at least as part of the official Solaris builds, we build with a versioned python interpreter, such as "python2.7", which doesn't match "*python".
Mon, 14 Mar 2016 15:01:27 +0000 crecord: use ui.interface to choose curses interface
Simon Farnsworth <simonfar@fb.com> [Mon, 14 Mar 2016 15:01:27 +0000] rev 28543
crecord: use ui.interface to choose curses interface Use ui.interface to select curses mode, instead of experimental.crecord.
Mon, 14 Mar 2016 15:01:27 +0000 ui: add new config flag for interface selection
Simon Farnsworth <simonfar@fb.com> [Mon, 14 Mar 2016 15:01:27 +0000] rev 28542
ui: add new config flag for interface selection This patch introduces a new config flag ui.interface to select the interface for interactive commands. It currently only applies to chunk selection. The config can be overridden on a per feature basis with the flag ui.interface.<feature>. For the moment, the only feature is 'chunkselector'; moving forward we expect to have 'histedit' and other commands there. If an incorrect value is given to ui.interface we print a warning and use the default interface: text. If HGPLAIN is specified we also use the default interface: text. Note that we fail quickly if a feature does not handle all the interfaces that we permit in ui.interface; in future, we could design a fallback path (e.g. blackpearl to curses, curses to text), but let's leave that until we need it.
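The selection rules above can be sketched as a small lookup. This is only an illustration under assumptions: `choose_interface`, the plain dict used as config, and the set of permitted values are hypothetical and do not mirror the real ui API. In particular, how an invalid per-feature value is handled is simplified here.

```python
import os

# Assumed set of permitted values; the message only names text and curses.
VALID_INTERFACES = ('text', 'curses')
DEFAULT_INTERFACE = 'text'


def choose_interface(config, feature, environ=None, warn=print):
    """Pick the interface for a feature from a flat config dict (hypothetical)."""
    environ = os.environ if environ is None else environ
    if 'HGPLAIN' in environ:
        # HGPLAIN always forces the default interface
        return DEFAULT_INTERFACE
    # the per-feature flag overrides the general one
    value = config.get('ui.interface.%s' % feature)
    if value is None:
        value = config.get('ui.interface', DEFAULT_INTERFACE)
    if value not in VALID_INTERFACES:
        warn('invalid interface %r, using %s' % (value, DEFAULT_INTERFACE))
        return DEFAULT_INTERFACE
    return value
```

For example, `choose_interface({'ui.interface': 'curses'}, 'chunkselector', environ={})` selects curses, while setting HGPLAIN in `environ` falls back to text.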
Fri, 11 Mar 2016 10:30:08 +0000 extensions: also search for extension in the 'hgext3rd' package
Pierre-Yves David <pierre-yves.david@fb.com> [Fri, 11 Mar 2016 10:30:08 +0000] rev 28541
extensions: also search for extension in the 'hgext3rd' package Mercurial extensions are not meant to be normal Python packages/modules. Yet the lack of an official location to install them means that a lot of them actually install as root level Python packages, polluting the global Python package namespace and risking collision with more legitimate packages. As we recently discovered, core Python actually supports namespace packages: a way for multiple distinct "distributions" to share a common top level package without fear of installation headaches. (Namespace packages allow submodules installed in different locations (of the 'sys.path') to be imported properly. So we are fine as long as each extension includes a proper 'hgext3rd.__init__.py' to declare the namespace package.) Therefore we introduce a 'hgext3rd' namespace package and search for extensions in it. We'll then recommend that third party extensions install themselves in it. Strictly speaking we could just get third party extensions to install in 'hgext' as it is also a namespace package. However, this would make the integration of formerly third party extensions in the main distribution more complicated as the third party install would overwrite the file from the main install. Moreover, having an explicit split between third party and core extensions seems like a good idea. The name 'hgext3rd' has been picked because it is short and seems explicit enough. Other alternatives I could think of were:
- hgextcontrib
- hgextother
- hgextunofficial
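The namespace package mechanism can be demonstrated with a short, self-contained sketch. It assumes the pkgutil-style declaration (`pkgutil.extend_path`) in each `__init__.py`; the package name `demo_hgext3rd` and the module names are hypothetical, chosen so the demo cannot clash with a real install.

```python
import importlib
import os
import sys
import tempfile

NAMESPACE = 'demo_hgext3rd'  # hypothetical stand-in for 'hgext3rd'

# pkgutil-style namespace declaration shipped by every distribution
INIT = (
    "from pkgutil import extend_path\n"
    "__path__ = extend_path(__path__, __name__)\n"
)


def make_distribution(root, module, body):
    """Create <root>/<NAMESPACE>/{__init__,<module>}.py."""
    pkgdir = os.path.join(root, NAMESPACE)
    os.makedirs(pkgdir)
    with open(os.path.join(pkgdir, '__init__.py'), 'w') as fp:
        fp.write(INIT)
    with open(os.path.join(pkgdir, module + '.py'), 'w') as fp:
        fp.write(body)


def load_from_two_locations():
    """Install two 'distributions' in separate dirs and import from both."""
    first = tempfile.mkdtemp()
    second = tempfile.mkdtemp()
    make_distribution(first, 'exta', "name = 'a'\n")
    make_distribution(second, 'extb', "name = 'b'\n")
    sys.path[:0] = [first, second]
    importlib.invalidate_caches()
    exta = importlib.import_module(NAMESPACE + '.exta')
    extb = importlib.import_module(NAMESPACE + '.extb')
    return exta.name, extb.name
```

Both submodules import even though they live in different `sys.path` entries, which is the property the message relies on. The temporary directories are left in place for brevity.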
Sun, 13 Mar 2016 05:17:06 +0900 hgext: use templatekeyword to mark a function as template keyword
FUJIWARA Katsunori <foozy@lares.dti.ne.jp> [Sun, 13 Mar 2016 05:17:06 +0900] rev 28540
hgext: use templatekeyword to mark a function as template keyword This patch replaces registration of template keyword function in bundled extensions by registrar.templatekeyword decorator all at once.
Sun, 13 Mar 2016 05:17:06 +0900 templatekw: use templatekeyword to mark a function as template keyword
FUJIWARA Katsunori <foozy@lares.dti.ne.jp> [Sun, 13 Mar 2016 05:17:06 +0900] rev 28539
templatekw: use templatekeyword to mark a function as template keyword Using a decorator can localize changes for adding (or removing) a template keyword function in source code. This patch also removes the leading ":KEYWORD:" part in the help document of each keyword, because using templatekeyword makes it useless. For similarity to the decorators introduced by subsequent patches, this patch uses 'templatekeyword' instead of 'keyword' as the decorator name, even though the former is a little redundant in 'templatekw.py'.

file                name              reason
=================== ================= ==================================
templatekw.py       templatekeyword   for similarity to others
templatefilters.py  templatefilter    'filter' hides Python built-in one
templaters.py       templatefunc      'func' is too generic
Sun, 13 Mar 2016 05:17:06 +0900 registrar: add templatekeyword to mark a function as template keyword (API)
FUJIWARA Katsunori <foozy@lares.dti.ne.jp> [Sun, 13 Mar 2016 05:17:06 +0900] rev 28538
registrar: add templatekeyword to mark a function as template keyword (API) _templateregistrarbase is defined as a super class of templatekeyword, for ease of adding template common features between "keyword", "filter" and "function". This patch also adds loadkeyword() to templatekw, because this combination helps to figure out how they cooperate with each other. Listing loadkeyword() in dispatch.extraloaders causes template keyword functions to be loaded implicitly when a (3rd party) extension is loaded. This change requires that the "templatekeyword" attribute of a (3rd party) extension is registrar.templatekeyword or so.
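How a registrar-style decorator and a loader cooperate can be sketched as below. This is an assumption-laden illustration, not the real registrar module: the class name `keywordregistrar`, its internal `_table`, and the simplified keyword function signature are all hypothetical.

```python
class keywordregistrar(object):
    """Hypothetical stand-in for a registrar-style decorator factory."""

    def __init__(self):
        self._table = {}

    def __call__(self, name):
        def decorate(func):
            # record the function under its keyword name and return it unchanged
            self._table[name] = func
            return func
        return decorate


def loadkeyword(keywords, registrarobj):
    """Merge functions collected by a registrar into the keyword table."""
    keywords.update(registrarobj._table)


# What an extension module might contain (simplified signature):
templatekeyword = keywordregistrar()


@templatekeyword('shout')
def showshout(**args):
    return args['desc'].upper()


# What the loader side might do when the extension is loaded:
keywords = {}
loadkeyword(keywords, templatekeyword)
```

The decorator keeps registration next to the function definition, so adding or removing a keyword touches a single place in the source.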
Wed, 16 Mar 2016 11:57:09 +0000 chgserver: do not keep repo object
Jun Wu <quark@fb.com> [Wed, 16 Mar 2016 11:57:09 +0000] rev 28537
chgserver: do not keep repo object The current chgserver design is to use one server to handle multiple repos which have the same [extensions] config. Previously the client uses --cwd / to avoid creating a repo object. Now we need to set repo to None before we have code to make the "serve" command norepo when it's chg.
Sat, 12 Mar 2016 04:24:11 +0000 chgserver: invalidate the server if extensions fail to load
Jun Wu <quark@fb.com> [Sat, 12 Mar 2016 04:24:11 +0000] rev 28536
chgserver: invalidate the server if extensions fail to load Previously, if extensions failed to load, the chg server would just keep working without those extensions. It would print a warning message, but only when a new server started. This patch invalidates the server if any extension failed to load, but still serves the client (hopefully just) once. It will help chg pass some test cases of test-bad-extension.t.
Mon, 14 Mar 2016 13:48:33 +0000 chgserver: add an explicit "reconnect" instruction to validate
Jun Wu <quark@fb.com> [Mon, 14 Mar 2016 13:48:33 +0000] rev 28535
chgserver: add an explicit "reconnect" instruction to validate In some rare cases (next patch), we may want validate to do "unlink" without forcing the client to reconnect. This patch adds a new "reconnect" instruction and makes "unlink" not reconnect by default.
Mon, 14 Mar 2016 11:06:34 +0000 dispatch: flush ui before returning from dispatch
Jun Wu <quark@fb.com> [Mon, 14 Mar 2016 11:06:34 +0000] rev 28534
dispatch: flush ui before returning from dispatch A chg client may exit after receiving the result from runcommand. It is necessary to do a flush to make sure the warning message is printed out and the process waiting for the chg client will actually see the output. This helps chg to pass test-alias.t.
Tue, 15 Mar 2016 00:14:53 +0900 tests: make tests for convert with svn portable
FUJIWARA Katsunori <foozy@lares.dti.ne.jp> [Tue, 15 Mar 2016 00:14:53 +0900] rev 28533
tests: make tests for convert with svn portable svn 1.6.x (at least, 1.6.12 or 1.6.17) might display empty lines, even though svn 1.9.x (at least, 1.9.3) doesn't. To make tests for convert with svn portable, this patch adds "|(^$)" regexp to egrep in filter_svn_output. To avoid similar issues in the future, this patch adds "|(^$)" regexp to all filter_svn_output (and adjusts test-subrepo-svn.t), even though only test-convert-svn-source.t fails with svn 1.6.x, AFAIK.
Tue, 15 Mar 2016 14:10:46 -0700 merge with stable
Matt Mackall <mpm@selenic.com> [Tue, 15 Mar 2016 14:10:46 -0700] rev 28532
merge with stable
Fri, 11 Mar 2016 20:34:49 -0500 test-pager: add a test for pager with color enabled
Augie Fackler <augie@google.com> [Fri, 11 Mar 2016 20:34:49 -0500] rev 28531
test-pager: add a test for pager with color enabled
Fri, 11 Mar 2016 11:37:00 -0500 http: support sending hgargs via POST body instead of in GET or headers
Augie Fackler <augie@google.com> [Fri, 11 Mar 2016 11:37:00 -0500] rev 28530
http: support sending hgargs via POST body instead of in GET or headers narrowhg (for its narrow spec) and remotefilelog (for its large batch requests) would like to be able to make requests with argument sets so absurdly large that they blow out total request size limit on some http servers. As a workaround, support stuffing args at the start of the POST body. We will probably want to leave this behavior off by default in servers forever, because it makes the old "POSTs are only for writes" assumption wrong, which might break some of the simpler authentication configurations.
Mon, 14 Mar 2016 21:15:59 -0400 fsmonitor: flag msc_stdint as no-check-code
Augie Fackler <augie@google.com> [Mon, 14 Mar 2016 21:15:59 -0400] rev 28529
fsmonitor: flag msc_stdint as no-check-code I'd rather not modify code that we're vendoring, so I'm just marking it this way.
Mon, 14 Mar 2016 17:53:47 +0100 fsmonitor: use custom stdint.h file when compiling with Visual C
Sune Foldager <sune.foldager@me.com> [Mon, 14 Mar 2016 17:53:47 +0100] rev 28528
fsmonitor: use custom stdint.h file when compiling with Visual C Visual C/C++ 9, which Python 2.7 is compatible with, doesn't have C99 support and thus doesn't contain a stdint.h file. This changeset adds a custom version of stdint.h, created specifically for Visual C, and uses it when building with that compiler.
Sun, 13 Mar 2016 02:36:03 +0100 tests: handle getaddrinfo reporting "No address associated with hostname"
Mads Kiilerich <madski@unity3d.com> [Sun, 13 Mar 2016 02:36:03 +0100] rev 28527
tests: handle getaddrinfo reporting "No address associated with hostname" This has been seen on some Fedora 23 systems.
Mon, 14 Mar 2016 14:08:28 -0700 httpconnection: remove obsolete comment about open()
Martin von Zweigbergk <martinvonz@google.com> [Mon, 14 Mar 2016 14:08:28 -0700] rev 28526
httpconnection: remove obsolete comment about open() When httpsendfile was moved from url.py into httpconnection.py in e7525a555a64 (url: use new http support if requested by the user, 2011-05-06), the comment about not being able to just call open() became obsolete.
Sun, 13 Mar 2016 14:03:58 -0700 sslutil: allow multiple fingerprints per host
Gregory Szorc <gregory.szorc@gmail.com> [Sun, 13 Mar 2016 14:03:58 -0700] rev 28525
sslutil: allow multiple fingerprints per host Certificate pinning via [hostfingerprints] is a useful security feature. Currently, we only support one fingerprint per hostname. This is simple but it fails in the real world: * Switching certificates breaks clients until they change the pinned certificate fingerprint. This incurs client downtime and can require massive amounts of coordination to perform certificate changes. * Some servers operate with multiple certificates on the same hostname. This patch adds support for defining multiple certificate fingerprints per host. This overcomes the deficiencies listed above. I anticipate the primary use case of this feature will be to define both the old and new certificate so a certificate transition can occur with minimal interruption, so this scenario has been called out in the help documentation.
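The matching rule can be sketched as "accept the certificate if any pinned fingerprint matches". This is a minimal illustration under assumptions: the function names are hypothetical, fingerprints are assumed to be SHA-1 hex digests of the DER-encoded certificate written with optional colons, and the real sslutil code may differ.

```python
import hashlib


def normalize(fingerprint):
    """Strip colons and lowercase, so 'AB:CD' and 'abcd' compare equal."""
    return fingerprint.replace(':', '').lower()


def fingerprint_matches(cert_der, pinned):
    """Return True if the certificate matches any pinned fingerprint.

    cert_der: raw certificate bytes (assumed DER encoding)
    pinned:   list of fingerprints configured for the host (assumption)
    """
    actual = hashlib.sha1(cert_der).hexdigest()
    return any(normalize(fp) == actual for fp in pinned)
```

During a certificate transition, a host entry would list both the old and the new fingerprint, so clients keep working on either side of the switch.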
Sun, 13 Mar 2016 13:51:01 -0700 help: add empty lines to hostfingerprints section
Gregory Szorc <gregory.szorc@gmail.com> [Sun, 13 Mar 2016 13:51:01 -0700] rev 28524
help: add empty lines to hostfingerprints section I think this is now much easier to read.
Sat, 12 Mar 2016 18:51:07 -0800 help: document requirements
Gregory Szorc <gregory.szorc@gmail.com> [Sat, 12 Mar 2016 18:51:07 -0800] rev 28523
help: document requirements We didn't have unified documentation of the various repository requirements. This patch changes that.
Sun, 13 Mar 2016 01:59:18 +0530 showstack: use absolute_import
Pulkit Goyal <7895pulkit@gmail.com> [Sun, 13 Mar 2016 01:59:18 +0530] rev 28522
showstack: use absolute_import
Mon, 14 Mar 2016 14:12:13 +0530 contrib: use absolute_import in win32/hgwebdir_wsgi.py
Pulkit Goyal <7895pulkit@gmail.com> [Mon, 14 Mar 2016 14:12:13 +0530] rev 28521
contrib: use absolute_import in win32/hgwebdir_wsgi.py
Sun, 27 Dec 2015 13:38:46 +0900 dispatch: catch KeyboardInterrupt more broadly
Yuya Nishihara <yuya@tcha.org> [Sun, 27 Dec 2015 13:38:46 +0900] rev 28520
dispatch: catch KeyboardInterrupt more broadly Because _runcatch() can run long operations in its exception handler, it wasn't enough to catch KeyboardInterrupt at the same level. For example, "hg unknown" will load all extension modules, so we could easily make it crash with Ctrl-C.
Sun, 13 Mar 2016 16:46:49 -0700 histedit: have dropmissing abort on empty plan
Mateusz Kwapich <mitrandir@fb.com> [Sun, 13 Mar 2016 16:46:49 -0700] rev 28519
histedit: have dropmissing abort on empty plan We noticed that many users have the intuition of leaving the editor empty when they want to abort the operation. The fact that dropmissing allows the user to delete all edited commits is not intuitive even for users that asked for it. Let's protect people from this footgun.
Sun, 13 Mar 2016 02:29:11 +0100 streamclone: fix error when store files grow while stream cloning stable
Mads Kiilerich <madski@unity3d.com> [Sun, 13 Mar 2016 02:29:11 +0100] rev 28518
streamclone: fix error when store files grow while stream cloning Effectively a backout of 9fea6b38a8da, but updated to using 'with'.
Sun, 13 Mar 2016 02:28:46 +0100 tests: add test of stream clone of repo that is changing stable
Mads Kiilerich <madski@unity3d.com> [Sun, 13 Mar 2016 02:28:46 +0100] rev 28517
tests: add test of stream clone of repo that is changing This reveals an error introduced by 9fea6b38a8da.
Mon, 14 Mar 2016 12:52:35 +0000 chgserver: handle ParseError during validate
Jun Wu <quark@fb.com> [Mon, 14 Mar 2016 12:52:35 +0000] rev 28516
chgserver: handle ParseError during validate Currently the validate command in chgserver expects that the config can be loaded without issues. But the config can be broken, in which case chg prints a stacktrace instead of the parsing error if a chg server is already running. This patch adds a handler for ParseError in validate and a new instruction "exit" to make the client exit without abortmsg. A test is also added to make sure it will behave as expected.
Mon, 14 Mar 2016 12:32:09 +0000 dispatch: extract common logic for handling ParseError
Jun Wu <quark@fb.com> [Mon, 14 Mar 2016 12:32:09 +0000] rev 28515
dispatch: extract common logic for handling ParseError The way ParseError is handled at two different places in dispatch.py is the same. Move common logic into _formatparse.
Mon, 14 Mar 2016 11:23:04 +0000 chgserver: resolve relative path before sending via system channel
Jun Wu <quark@fb.com> [Mon, 14 Mar 2016 11:23:04 +0000] rev 28514
chgserver: resolve relative path before sending via system channel The chgserver may have a different cwd from the client because of the side effect of "--cwd" and other possible os.chdir calls done by extensions. Therefore relative paths can be misunderstood by the client. This patch solves it by expanding relative cwd paths to absolute ones before sending them via the 'S' channel. It can help chg to pass a testcase in test-alias.t later.
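The idea can be sketched in a few lines. The helper name `resolvepath` is hypothetical and the sketch only shows the path handling, not the channel protocol.

```python
import os


def resolvepath(path, servercwd):
    """Make a path absolute relative to the server's cwd (hypothetical helper).

    The client may run in a different directory, so a relative path must be
    anchored on the server side before it is sent.
    """
    if os.path.isabs(path):
        return path
    return os.path.normpath(os.path.join(servercwd, path))
```

Once the path is absolute, the client interprets it the same way regardless of its own working directory.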
Sat, 12 Mar 2016 13:19:19 -0800 mercurial: use pure Python module policy on Python 3
Gregory Szorc <gregory.szorc@gmail.com> [Sat, 12 Mar 2016 13:19:19 -0800] rev 28513
mercurial: use pure Python module policy on Python 3 The C extensions don't yet work with Python 3. Let's minimize the work required to get Mercurial running on Python 3 by always using the pure Python module policy on Python 3.
Sat, 12 Mar 2016 22:17:30 +0900 chg: provide early exception to user
Yuya Nishihara <yuya@tcha.org> [Sat, 12 Mar 2016 22:17:30 +0900] rev 28512
chg: provide early exception to user See the previous patch for details. Since the socket will be closed by the server, handleresponse() will never return:

Traceback (most recent call last):
  ...
chg: abort: failed to read channel
Sat, 12 Mar 2016 22:03:30 +0900 cmdserver: write early exception to 'e' channel in 'unix' mode
Yuya Nishihara <yuya@tcha.org> [Sat, 12 Mar 2016 22:03:30 +0900] rev 28511
cmdserver: write early exception to 'e' channel in 'unix' mode In 'unix' mode, the server is typically detached from the console. Therefore a client couldn't see the exception that occurred while instantiating the server object. This patch tries to catch the early error and send it to the 'e' channel even if the server isn't instantiated yet. This means the error may be sent before the initial hello message. So it's up to the client implementation whether to handle the early error message or error out as a protocol violation. The error handling code is also copied to chgserver.py. I'll factor it out later if we manage to get chg to pass the test suite.
Sun, 13 Mar 2016 01:32:42 +0530 contrib: make memory.py use absolute_import
Pulkit Goyal <7895pulkit@gmail.com> [Sun, 13 Mar 2016 01:32:42 +0530] rev 28510
contrib: make memory.py use absolute_import
Sun, 13 Mar 2016 01:08:39 +0530 check-code: use absolute_import and print_function
Pulkit Goyal <7895pulkit@gmail.com> [Sun, 13 Mar 2016 01:08:39 +0530] rev 28509
check-code: use absolute_import and print_function
Fri, 11 Mar 2016 21:27:26 -0800 encoding: use range() instead of xrange()
Gregory Szorc <gregory.szorc@gmail.com> [Fri, 11 Mar 2016 21:27:26 -0800] rev 28508
encoding: use range() instead of xrange() Python 3 doesn't have xrange(). Instead, range() on Python 3 is lazy, like xrange() is on Python 2. The benefits of xrange() over range() are when there are very large ranges that are too expensive to pre-allocate. The code here is only creating <128 values, so the benefits of xrange() should be negligible. With this patch, encoding.py imports safely on Python 3.
Fri, 11 Mar 2016 21:23:34 -0800 encoding: make HFS+ ignore code Python 3 compatible
Gregory Szorc <gregory.szorc@gmail.com> [Fri, 11 Mar 2016 21:23:34 -0800] rev 28507
encoding: make HFS+ ignore code Python 3 compatible unichr() doesn't exist in Python 3. chr() is the equivalent there. Unfortunately, we can't use chr() outright because on Python 2 it only accepts values smaller than 256. Also, Python 3 returns an int when accessing a character of a bytes type (s[x]). So, we have to ord() the values in the assert statement.
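A common way to bridge these two differences is sketched below. It is an illustration only, not the code from encoding.py; the helper name `bytevalues` is hypothetical.

```python
import sys

if sys.version_info[0] >= 3:
    # Python 3 has no unichr(); chr() covers the full Unicode range there
    unichr = chr


def bytevalues(data):
    """Return the integer value of each byte on both Python 2 and 3.

    Indexing or iterating bytes yields ints on Python 3 but length-1
    strings on Python 2, so ord() is applied only where needed.
    """
    return [b if isinstance(b, int) else ord(b) for b in data]
```

With the alias in place, `unichr(0x200c)` yields the same character on either major version.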
Fri, 11 Mar 2016 10:28:58 +0000 extensions: factor import error reporting out
Pierre-Yves David <pierre-yves.david@fb.com> [Fri, 11 Mar 2016 10:28:58 +0000] rev 28506
extensions: factor import error reporting out To clarify third party extensions lookup, we are about to add a third place where extensions are searched for. So we factor the error reporting logic out to be able to easily reuse it in the next patch.
Fri, 11 Mar 2016 10:24:54 +0000 extensions: extract the 'importh' closure as normal function
Pierre-Yves David <pierre-yves.david@fb.com> [Fri, 11 Mar 2016 10:24:54 +0000] rev 28505
extensions: extract the 'importh' closure as normal function There is no reason for this to be a closure so we extract it for clarity.
Fri, 11 Mar 2016 15:40:58 -0800 zeroconf: remove leftover camelcase identifier
Martin von Zweigbergk <martinvonz@google.com> [Fri, 11 Mar 2016 15:40:58 -0800] rev 28504
zeroconf: remove leftover camelcase identifier eb9d0e828c30 (zeroconf: remove camelcase in identifiers, 2016-03-01) forgot one occurrence of "numAuthorities", which makes test-paths.t fail for me. I don't even know what zeroconf is, but this patch seems obviously correct and it fixes the failing test case.
Sat, 12 Mar 2016 04:35:42 +0900 hg: acquire wlock while updating the working directory via updatetotally
FUJIWARA Katsunori <foozy@lares.dti.ne.jp> [Sat, 12 Mar 2016 04:35:42 +0900] rev 28503
hg: acquire wlock while updating the working directory via updatetotally updatetotally() might be invoked outside wlock scope (e.g. invocation via postincoming() at "hg unbundle" or "hg pull"). In such a case, acquisition of wlock is needed for a consistent view, because parallel "hg update" and/or "hg bookmarks" might change working directory status while executing updatetotally(). Strictly speaking, truly consistent updating should also acquire the store lock, because the active bookmark might be moved to another one outside wlock scope (e.g. pulling from another repository causes updating the current active one). Acquisition of wlock in this patch ensures consistency at the same level as past "hg update".
Sat, 12 Mar 2016 04:35:42 +0900 commands: add postincoming docstring for explanation of arguments
FUJIWARA Katsunori <foozy@lares.dti.ne.jp> [Sat, 12 Mar 2016 04:35:42 +0900] rev 28502
commands: add postincoming docstring for explanation of arguments
Sat, 12 Mar 2016 04:35:42 +0900 commands: centralize code to update with extra care for non-file components
FUJIWARA Katsunori <foozy@lares.dti.ne.jp> [Sat, 12 Mar 2016 04:35:42 +0900] rev 28501
commands: centralize code to update with extra care for non-file components This patch centralizes similar code paths to update the working directory with extra care for non-file components (e.g. bookmark) into the newly added function updatetotally(). 'if True' at the beginning of updatetotally() is redundant in this patch, but useful to reduce the amount of changes in the subsequent patch.
Sat, 12 Mar 2016 04:35:42 +0900 update: omit redundant activating message for already active bookmark
FUJIWARA Katsunori <foozy@lares.dti.ne.jp> [Sat, 12 Mar 2016 04:35:42 +0900] rev 28500
update: omit redundant activating message for already active bookmark This patch also adds "hg bookmarks" invocation into tests, where redundant message is omitted but bookmark activity isn't clear from context.
Fri, 11 Mar 2016 11:44:03 -0800 tests: make test-verify-repo-operations.py not run by default
Martin von Zweigbergk <martinvonz@google.com> [Fri, 11 Mar 2016 11:44:03 -0800] rev 28499
tests: make test-verify-repo-operations.py not run by default test-verify-repo-operations.py currently starts way too late and extends the running time with -j50 on my machine from around 3:48 min to 6:30 min. We could of course make it run earlier, but the test case seems unlikely to find bugs not covered by other tests, so let's mark it "slow" instead. I think this type of test is better suited to running separately in a long-running job.
Fri, 29 Jan 2016 14:37:16 +0000 ui: log devel warnings
timeless <timeless@mozdev.org> [Fri, 29 Jan 2016 14:37:16 +0000] rev 28498
ui: log devel warnings
Fri, 11 Mar 2016 17:22:04 +0000 util: refactor getstackframes
timeless <timeless@mozdev.org> [Fri, 11 Mar 2016 17:22:04 +0000] rev 28497
util: refactor getstackframes
Fri, 11 Mar 2016 16:50:14 +0000 util: reword debugstacktrace comment
timeless <timeless@mozdev.org> [Fri, 11 Mar 2016 16:50:14 +0000] rev 28496
util: reword debugstacktrace comment
Sun, 06 Mar 2016 15:40:20 -0800 changelog: avoid slicing raw data until needed
Gregory Szorc <gregory.szorc@gmail.com> [Sun, 06 Mar 2016 15:40:20 -0800] rev 28495
changelog: avoid slicing raw data until needed Before, we were slicing the original raw text and storing individual variables with values corresponding to each field. This is avoidable overhead. With this patch, we store the offsets of the fields at construction time and perform the slice when a property is accessed. This appears to show a very marginal performance win on its own and the gains are so small as to not be worth reporting. However, this patch marks the end of our parsing refactor, so it is worth reporting the gains from the entire series:

author(mpm)                                                     0.896565  0.795987  89%
desc(bug)                                                       0.887169  0.803438  90%
date(2015)                                                      0.878797  0.773961  88%
extra(rebase_source)                                            0.865446  0.761603  88%
author(mpm) or author(greg)                                     1.801832  1.576025  87%
author(mpm) or desc(bug)                                        1.812438  1.593335  88%
date(2015) or branch(default)                                   0.968276  0.875270  90%
author(mpm) or desc(bug) or date(2015) or extra(rebase_source)  3.656193  3.183104  87%

Pretty consistent speed-up across the board for any revset accessing changelog revision data. Not bad! It's also worth noting that PyPy appears to experience a similar to marginally greater speed-up as well!

According to statprof, revsets accessing changelog revision data are now clearly dominated by zlib decompression (16-17% of execution time). Surprisingly, it appears the most expensive part of revision parsing are the various text.index() calls to search for newlines! These appear to cumulatively add up to 5+% of execution time. I reckon implementing the parsing in C would make things marginally faster. If accessing larger strings (such as the commit message), encoding.tolocal() is the most expensive procedure outside of decompression.
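The "store offsets, slice on access" approach can be sketched as follows. This is a simplified illustration, not the actual changelogrevision class: the class name `lazyrevision` is hypothetical, and the entry layout (manifest line, user line, date line, file lines, blank line, description) is an assumption made for the demo.

```python
class lazyrevision(object):
    """Hypothetical sketch: record newline offsets once, slice lazily."""

    def __init__(self, text):
        self._text = text
        nl1 = text.index('\n')
        nl2 = text.index('\n', nl1 + 1)
        nl3 = text.index('\n', nl2 + 1)
        # The file list ends at the first blank line. With no files, the
        # blank line directly follows the date line.
        if text[nl3 + 1:nl3 + 2] == '\n':
            doublenl = nl3
        else:
            doublenl = text.index('\n\n', nl3 + 1)
        self._offsets = (nl1, nl2, nl3, doublenl)

    @property
    def manifest(self):
        return self._text[:self._offsets[0]]

    @property
    def user(self):
        nl1, nl2 = self._offsets[0], self._offsets[1]
        return self._text[nl1 + 1:nl2]

    @property
    def date(self):
        # raw date field; further parsing is left out of this sketch
        nl2, nl3 = self._offsets[1], self._offsets[2]
        return self._text[nl2 + 1:nl3]

    @property
    def files(self):
        nl3, doublenl = self._offsets[2], self._offsets[3]
        if nl3 == doublenl:
            return []
        return self._text[nl3 + 1:doublenl].split('\n')

    @property
    def description(self):
        return self._text[self._offsets[3] + 2:]
```

Only the fields a caller touches are ever sliced, so a revset that reads just the user never pays for building the file list or the description.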
Sun, 06 Mar 2016 13:13:54 -0800 changelog: parse description last
Gregory Szorc <gregory.szorc@gmail.com> [Sun, 06 Mar 2016 13:13:54 -0800] rev 28494
changelog: parse description last Before, we first searched for the double newline as the first step in the parse then moved to the front of the string and worked our way to the back again. This made sense when we were splitting the raw text on the double newline. But in our new newline scanning based approach, this feels awkward. This patch updates the parsing logic to parse the text linearly and deal with the description field last. Because we're avoiding an extra string scan, revsets appear to demonstrate a very slight performance win. But the percentage change is marginal, so the numbers aren't worth reporting.
Sun, 06 Mar 2016 14:31:06 -0800 changelog: lazily parse files
Gregory Szorc <gregory.szorc@gmail.com> [Sun, 06 Mar 2016 14:31:06 -0800] rev 28493
changelog: lazily parse files More of the same. Again, modest revset performance wins:

author(mpm)                    0.896565  0.822961  0.805156
desc(bug)                      0.887169  0.847054  0.798101
date(2015)                     0.878797  0.811613  0.786689
extra(rebase_source)           0.865446  0.797756  0.777408
author(mpm) or author(greg)    1.801832  1.668172  1.626547
author(mpm) or desc(bug)       1.812438  1.677608  1.613941
date(2015) or branch(default)  0.968276  0.896032  0.869017
Sun, 06 Mar 2016 14:30:25 -0800 changelog: lazily parse date/extra field
Gregory Szorc <gregory.szorc@gmail.com> [Sun, 06 Mar 2016 14:30:25 -0800] rev 28492
changelog: lazily parse date/extra field This is probably the most complicated patch in the parsing refactor. Because the date and extras are encoded in the same field, we stuff the entire field into a dedicated variable and add a property for accessing the sub-components of each. There is some duplicated code here. But the code is relatively simple, so it shouldn't be a big deal. We see revset performance wins across the board:

author(mpm)                                                     0.896565  0.876713  0.822961
desc(bug)                                                       0.887169  0.895514  0.847054
date(2015)                                                      0.878797  0.820987  0.811613
extra(rebase_source)                                            0.865446  0.823811  0.797756
author(mpm) or author(greg)                                     1.801832  1.784160  1.668172
author(mpm) or desc(bug)                                        1.812438  1.822756  1.677608
date(2015) or branch(default)                                   0.968276  0.910981  0.896032
author(mpm) or desc(bug) or date(2015) or extra(rebase_source)  3.656193  3.516788  3.265024

We see a speed-up on revsets accessing date and extras because the new parsing code only parses what you access. Even though they are stored in the same text field, we avoid parsing dates when accessing extras and vice-versa. But strangely revsets accessing both date and extras appeared to speed up as well! I'm not sure if this is due to refactoring the parsing code or due to an optimization in revsets. You can't argue with the results!
Sun, 06 Mar 2016 14:29:46 -0800 changelog: lazily parse user
Gregory Szorc <gregory.szorc@gmail.com> [Sun, 06 Mar 2016 14:29:46 -0800] rev 28491
changelog: lazily parse user Same strategy as before. Revsets not accessing the user demonstrate a slight performance win:

desc(bug)                      0.887169  0.910400  0.895514
date(2015)                     0.878797  0.870697  0.820987
extra(rebase_source)           0.865446  0.841644  0.823811
date(2015) or branch(default)  0.968276  0.945792  0.910981
Sun, 06 Mar 2016 14:29:13 -0800 changelog: lazily parse manifest node
Gregory Szorc <gregory.szorc@gmail.com> [Sun, 06 Mar 2016 14:29:13 -0800] rev 28490
changelog: lazily parse manifest node Like the description, we store the raw bytes and convert from hex on access. This patch also marks the beginning of our new parsing method, which is based on newline offsets and doesn't rely on str.split(). Many revsets showed a performance improvement:

author(mpm)                    0.896565  0.869085  0.868598
desc(bug)                      0.887169  0.928164  0.910400
extra(rebase_source)           0.865446  0.871500  0.841644
author(mpm) or author(greg)    1.801832  1.791589  1.731503
author(mpm) or desc(bug)       1.812438  1.851003  1.798764
date(2015) or branch(default)  0.968276  0.974027  0.945792
Sun, 06 Mar 2016 14:28:46 -0800 changelog: lazily parse description
Gregory Szorc <gregory.szorc@gmail.com> [Sun, 06 Mar 2016 14:28:46 -0800] rev 28489
changelog: lazily parse description Before, the description field was converted to a localstr at parse time. With this patch, we store the raw description and convert to a localstr when it is first accessed. We see a revset speedup for revsets that don't access the description:

author(mpm)                                                     0.896565  0.914234  0.869085
date(2015)                                                      0.878797  0.891980  0.862525
extra(rebase_source)                                            0.865446  0.912514  0.871500
author(mpm) or author(greg)                                     1.801832  1.860402  1.791589
date(2015) or branch(default)                                   0.968276  0.994673  0.974027
author(mpm) or desc(bug) or date(2015) or extra(rebase_source)  3.656193  3.721032  3.643593

As you can see, most of these revsets are already faster than from before this refactoring: we have already offset the performance loss from the introduction of the new class representing parsed changelog entries!
Sun, 06 Mar 2016 13:26:37 -0800 context: use changelogrevision
Gregory Szorc <gregory.szorc@gmail.com> [Sun, 06 Mar 2016 13:26:37 -0800] rev 28488
context: use changelogrevision Upcoming patches will make the changelogrevision object perform lazy parsing. Let's switch to it. Because we're switching from a tuple to an object, everything that accesses the internal cached attribute needs to be updated to access via attributes. A nice side-effect is this makes the code easier to read! Surprisingly, this appears to make revsets accessing this data slightly faster (values are before series, p1, this patch):

author(mpm)                                                     0.896565  0.929984  0.914234
desc(bug)                                                       0.887169  0.935642  0.921073
date(2015)                                                      0.878797  0.908094  0.891980
extra(rebase_source)                                            0.865446  0.922624  0.912514
author(mpm) or author(greg)                                     1.801832  1.902112  1.860402
author(mpm) or desc(bug)                                        1.812438  1.860977  1.844850
date(2015) or branch(default)                                   0.968276  1.005824  0.994673
author(mpm) or desc(bug) or date(2015) or extra(rebase_source)  3.656193  3.743381  3.721032