Mercurial > hg-stable
view tests/test-largefiles-cache.t @ 26457:7e81305092a0
demandimport: replace more references to _demandmod instances
_demandmod instances may be referenced by multiple importing modules.
Before this patch, the _demandmod instance only maintained a reference
to its first consumer when using the "from X import Y" syntax. This is
because we only created a single _demandmod instance (attached to the
parent X module). If multiple modules A and B performed
"from X import Y", we'd produce a single _demandmod instance
"demandmod" with the following references:
X.Y = <demandmod>
A.Y = <demandmod>
B.Y = <demandmod>
The locals from the first consumer (A) would be stored in <demandmod1>.
When <demandmod1> was loaded, we'd look at the locals for the first
consumer and replace the symbol, if necessary. This resulted in state:
X.Y = <module>
A.Y = <module>
B.Y = <demandmod>
B's reference to Y wasn't updated and was still using the proxy object
because we just didn't record that B had a reference to <demandmod> that
needed updating!
With this patch, we add support for tracking which modules in addition
to the initial importer have a reference to the _demandmod instance and
we replace those references at module load time.
In the case of posix.py, this fixes an issue where the "encoding" module
was being proxied, resulting in hundreds of thousands of
__getattribute__ lookups on the _demandmod instance during dirstate
operations on mozilla-central, speeding up execution by many
milliseconds. There are likely several other operation that benefit from
this change as well.
The new mechanism isn't perfect: references in locals (not globals) may
likely linger. So, if there is an import inside a function and a symbol
from that module is used in a hot loop, we could have unwanted overhead
from proxying through _demandmod. Non-global imports are discouraged
anyway. So hopefully this isn't a big deal in practice. We could
potentially deploy a code checker that bans use of attribute lookups of
function-level-imported modules inside loops.
This deficiency in theory could be avoided by storing the set of globals
and locals dicts to update in the _demandmod instance. However, I tried
this and it didn't work. One reason is that some globals are _demandmod
instances. We could work around this, but it's a bit more work. There
also might be other module import foo at play. The solution as
implemented is better than what we had and IMO is good enough for the
time being.
It's worth noting that this sub-optimal behavior was made worse by the
introduction of absolute_import and its recommended "from . import X"
syntax for importing modules from the "mercurial" package. If we ever
wrote performance tests, measuring the amount of module imports and
__getattribute__ proxy calls through _demandmod instances would be
something I'd have it check.
author | Gregory Szorc <gregory.szorc@gmail.com> |
---|---|
date | Sun, 04 Oct 2015 11:17:43 -0700 |
parents | 2a3f24786d09 |
children | d881c072050a |
line wrap: on
line source
Create user cache directory $ USERCACHE=`pwd`/cache; export USERCACHE $ cat <<EOF >> ${HGRCPATH} > [extensions] > hgext.largefiles= > [largefiles] > usercache=${USERCACHE} > EOF $ mkdir -p ${USERCACHE} Create source repo, and commit adding largefile. $ hg init src $ cd src $ echo large > large $ hg add --large large $ hg commit -m 'add largefile' $ hg rm large $ hg commit -m 'branchhead without largefile' $ hg up -qr 0 $ cd .. Discard all cached largefiles in USERCACHE $ rm -rf ${USERCACHE} Create mirror repo, and pull from source without largefile: "pull" is used instead of "clone" for suppression of (1) updating to tip (= caching largefile from source repo), and (2) recording source repo as "default" path in .hg/hgrc. $ hg init mirror $ cd mirror $ hg pull ../src pulling from ../src requesting all changes adding changesets adding manifests adding file changes added 2 changesets with 1 changes to 1 files (run 'hg update' to get a working copy) Update working directory to "tip", which requires largefile("large"), but there is no cache file for it. So, hg must treat it as "missing"(!) file. $ hg update -r0 getting changed largefiles large: largefile 7f7097b041ccf68cc5561e9600da4655d21c6d18 not available from file:/*/$TESTTMP/mirror (glob) 0 largefiles updated, 0 removed 1 files updated, 0 files merged, 0 files removed, 0 files unresolved $ hg status ! large Update working directory to null: this cleanup .hg/largefiles/dirstate $ hg update null getting changed largefiles 0 largefiles updated, 0 removed 0 files updated, 0 files merged, 1 files removed, 0 files unresolved Update working directory to tip, again. $ hg update -r0 getting changed largefiles large: largefile 7f7097b041ccf68cc5561e9600da4655d21c6d18 not available from file:/*/$TESTTMP/mirror (glob) 0 largefiles updated, 0 removed 1 files updated, 0 files merged, 0 files removed, 0 files unresolved $ hg status ! large $ cd .. Verify that largefiles from pulled branchheads are fetched, also to an empty repo $ hg init mirror2 $ hg -R mirror2 pull src -r0 pulling from src adding changesets adding manifests adding file changes added 1 changesets with 1 changes to 1 files (run 'hg update' to get a working copy) #if unix-permissions Portable way to print file permissions: $ cat > ls-l.py <<EOF > #!/usr/bin/env python > import sys, os > path = sys.argv[1] > print '%03o' % (os.lstat(path).st_mode & 0777) > EOF $ chmod +x ls-l.py Test that files in .hg/largefiles inherit mode from .hg/store, not from file in working copy: $ cd src $ chmod 750 .hg/store $ chmod 660 large $ echo change >> large $ hg commit -m change created new head $ ../ls-l.py .hg/largefiles/e151b474069de4ca6898f67ce2f2a7263adf8fea 640 Test permission of with files in .hg/largefiles created by update: $ cd ../mirror $ rm -r "$USERCACHE" .hg/largefiles # avoid links $ chmod 750 .hg/store $ hg pull ../src --update -q $ ../ls-l.py .hg/largefiles/e151b474069de4ca6898f67ce2f2a7263adf8fea 640 Test permission of files created by push: $ hg serve -R ../src -d -p $HGPORT --pid-file hg.pid \ > --config "web.allow_push=*" --config web.push_ssl=no $ cat hg.pid >> $DAEMON_PIDS $ echo change >> large $ hg commit -m change $ rm -r "$USERCACHE" $ hg push -q http://localhost:$HGPORT/ $ ../ls-l.py ../src/.hg/largefiles/b734e14a0971e370408ab9bce8d56d8485e368a9 640 $ cd .. #endif Test issue 4053 (remove --after on a deleted, uncommitted file shouldn't say it is missing, but a remove on a nonexistent unknown file still should. Same for a forget.) $ cd src $ touch x $ hg add x $ mv x y $ hg remove -A x y ENOENT ENOENT: * (glob) not removing y: file is untracked [1] $ hg add y $ mv y z $ hg forget y z ENOENT ENOENT: * (glob) not removing z: file is already untracked [1] Largefiles are accessible from the share's store $ cd .. $ hg share -q src share_dst --config extensions.share= $ hg -R share_dst update -r0 getting changed largefiles 1 largefiles updated, 0 removed 1 files updated, 0 files merged, 0 files removed, 0 files unresolved $ echo modified > share_dst/large $ hg -R share_dst ci -m modified created new head Only dirstate is in the local store for the share, and the largefile is in the share source's local store. Avoid the extra largefiles added in the unix conditional above. $ hash=`hg -R share_dst cat share_dst/.hglf/large` $ echo $hash e2fb5f2139d086ded2cb600d5a91a196e76bf020 $ find share_dst/.hg/largefiles/* | sort share_dst/.hg/largefiles/dirstate $ find src/.hg/largefiles/* | egrep "(dirstate|$hash)" | sort src/.hg/largefiles/dirstate src/.hg/largefiles/e2fb5f2139d086ded2cb600d5a91a196e76bf020