demandimport: avoid infinite recursion at actual module importing (issue5304)
Before this patch, importing a C module in a Windows environment causes
infinite recursion if py2exe is used with the -b2 option.
When importing a C module "a.b", the extra hooking by py2exe's
zipextimporter causes the following:
0. assumptions before accessing "b" of "a":
- a built-in module object has been created for "a"
(= "a" is actually imported),
- a _demandmod has been created for "a.b" as a proxy object
(= "a.b" is not yet imported), and
- the attribute "b" of "a" is initialized with the latter
1. invocation of __import__ via _hgextimport() in _demandmod._load()
for "a.b" implies _demandimport() for "a.b"
This is unintentional, because it means that _hgextimport() might
return a _demandmod instead of the built-in module object.
2. _demandimport() at (1) is invoked not in the context of "a", but in
the context of zipextimporter
Just after the invocation of _hgextimport() in _demandimport(), the
attribute "b" of the built-in module object for "a" is still
bound to the proxy object for "a.b", because the context of "a" isn't
updated by actually importing "a.b", even though the built-in
module object for "a.b" already appears in sys.modules.
Therefore, chainmodules() returns the _demandmod for "a.b", which is
obtained from the attribute "b" of "a".
3. processfromitem() on "a.b" causes _demandmod._load() for "a.b"
again
_demandimport() takes the context of "a" in this case.
Therefore, the attributes below are bound to the built-in module
object for "a.b", as expected:
- "b" of built-in module object for "a"
- _module of _demandmod for "a.b"
4. but _demandimport() invoked at (1) returns the _demandmod object,
because _demandimport() just returns the object returned by
chainmodules() at (2) above
5. then, _demandmod._load() recurses infinitely
_demandimport() returns the _demandmod for "a.b", which is "self" in
_demandmod._load().
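The failure mode in steps 0-5 can be sketched with a toy model. This is
not the demandimport code: LazyProxy, access_recurses() and the
"bindings" dict are hypothetical names made up for illustration, and the
lambda stands in for an import hook that hands back whatever is
currently bound as "a.b". It only assumes that a proxy which ends up
delegating attribute access to itself can never resolve an attribute.

```python
import types


class LazyProxy(object):
    """Hypothetical, much simplified stand-in for _demandmod."""

    def __init__(self, name, importer):
        self._name = name
        self._importer = importer
        self._module = None

    def _load(self):
        if self._module is None:
            # Trust whatever the import hook returns, without checking
            # whether it is a real module or the proxy itself.
            self._module = self._importer(self._name)

    def __getattr__(self, attr):
        # Only reached for attributes the proxy itself does not have.
        self._load()
        return getattr(self._module, attr)


def access_recurses(obj, attr):
    """Return True if reading obj.attr exhausts the recursion limit."""
    try:
        getattr(obj, attr)
    except RuntimeError:  # RecursionError is a subclass of RuntimeError
        return True
    return False


# Healthy case: the hook returns the real module object.
healthy = LazyProxy("a.b", lambda name: types.SimpleNamespace(answer=42))

# Faulty case: the hook hands back the proxy still bound as "a.b".
bindings = {}
looping = LazyProxy("a.b", lambda name: bindings[name])
bindings["a.b"] = looping
```

Here healthy.answer evaluates to 42, while
access_recurses(looping, "answer") returns True, because the proxy keeps
delegating the lookup to itself.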
To avoid infinite recursion at actual module importing, this patch
uses self._module if _hgextimport() returns the _demandmod itself. If
_demandmod._module isn't yet bound at this point, execution is
aborted, because the actual import failed.
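The idea of the guard can be sketched in isolation as follows. This is
only a simplified illustration, not the patched code: GuardedProxy,
nested_import(), broken_import() and the "bindings" dict are
hypothetical names, and nested_import() merely imitates a hook that
binds the real module as a side effect (as in step 3) but still returns
the proxy (as in step 4).

```python
import types


class GuardedProxy(object):
    """Hypothetical, simplified proxy with a "returned myself" guard."""

    def __init__(self, name, importer):
        self._name = name
        self._importer = importer
        self._module = None

    def _load(self):
        if self._module is None:
            mod = self._importer(self._name)
            if mod is self:
                # The hook returned the proxy itself. A nested load
                # should already have bound self._module; if it has
                # not, importing again would give the same result,
                # so abort.
                mod = self._module
                assert mod is not None and mod is not self, self._name
                return
            self._module = mod

    def __getattr__(self, attr):
        # Only reached for attributes the proxy itself does not have.
        self._load()
        return getattr(self._module, attr)


real = types.SimpleNamespace(value=42)  # stands in for the C module
bindings = {}


def nested_import(name):
    # Bind the real module as a side effect, but return the proxy.
    proxy = bindings[name]
    proxy._module = real
    return proxy


def broken_import(name):
    # Return the proxy without ever binding the real module.
    return bindings[name]


bindings["a.b"] = GuardedProxy("a.b", nested_import)
bindings["x.y"] = GuardedProxy("x.y", broken_import)
```

With these definitions, bindings["a.b"].value resolves to 42 through the
already bound module, while reading any attribute of bindings["x.y"]
raises AssertionError instead of recursing.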
In this patch, _demandmod._module is examined not on the _demandimport()
side, but on the _demandmod._load() side, because:
- the former has several exit points
- apart from _demandimport() itself, only the latter uses _hgextimport()
BTW, this issue occurs only in the code path for non-.py/.pyc files in
zipextimporter (strictly speaking, in _memimporter) of py2exe.
Even if zipextimporter is enabled, .py/.pyc files are handled by
zipimporter, which doesn't imply an unintentional _demandimport() at
invocation of __import__ via _hgextimport().
--- a/mercurial/demandimport.py Fri Jul 29 00:45:24 2016 +0200
+++ b/mercurial/demandimport.py Sun Jul 31 05:39:59 2016 +0900
@@ -94,6 +94,23 @@
         if not self._module:
             head, globals, locals, after, level, modrefs = self._data
             mod = _hgextimport(_import, head, globals, locals, None, level)
+            if mod is self:
+                # In this case, _hgextimport() above should imply
+                # _demandimport(). Otherwise, _hgextimport() never
+                # returns _demandmod. This isn't intentional behavior,
+                # in fact. (see also issue5304 for detail)
+                #
+                # If self._module is already bound at this point, self
+                # should be already _load()-ed while _hgextimport().
+                # Otherwise, there is no way to import actual module
+                # as expected, because (re-)invoking _hgextimport()
+                # should cause same result.
+                # This is reason why _load() returns without any more
+                # setup but assumes self to be already bound.
+                mod = self._module
+                assert mod and mod is not self, "%s, %s" % (self, mod)
+                return
+
             # load submodules
             def subload(mod, p):
                 h, t = p, None