view tests/test-convert-darcs.t @ 44118:f81c17ec303c

hgdemandimport: apply lazy module loading to sys.meta_path finders

Python's `sys.meta_path` finders are the primary objects whose job it is to find a module at import time. When `import` is called, Python iterates objects in this list and calls `o.find_spec(...)` to find a `ModuleSpec` (or None if the module couldn't be found by that finder). If no meta path finder can find a module, import fails.

One of the default meta path finders is `PathFinder`. Its job is to import modules from the filesystem and is probably the most important importer. This finder looks at `sys.path` and `sys.path_hooks` to do its job.

The `ModuleSpec` returned by `MetaPathImporter.find_spec()` has a `loader` attribute, which defines the concrete module loader to use. `sys.path_hooks` is a hook point for teaching `PathFinder` to instantiate custom loader types.

Previously, we injected a custom `sys.path_hook` that told `PathFinder` to wrap the default loaders with a loader that creates a module object that is lazy.

This approach worked. But its main limitation was that it only applied to the `PathFinder` meta path importer. There are other meta path importers that are registered. And in the case of PyOxidizer loading modules from memory, `PathFinder` doesn't come into play since PyOxidizer's own meta path importer was handling all imports.

This commit changes our approach to lazy module loading by proxying all meta path importers. Specifically, we overload the `find_spec()` method to swap in a wrapped loader on the `ModuleSpec` before it is returned. The end result of this is all meta path importers should be lazy.

As much as I would have loved to utilize .__class__ manipulation to achieve this, some meta path importers are implemented in C/Rust in such a way that they cannot be monkeypatched. This is why we use __getattribute__ to define a proxy.

Also, this change could theoretically open us up to regressions in meta path importers whose loader is creating module objects which can't be monkeypatched. But I'm not aware of any of these in the wild. So I think we'll be safe.

According to hyperfine, this change yields a decent startup time win of 5-6ms:

```
Benchmark #1: ~/.pyenv/versions/3.6.10/bin/python ./hg version
  Time (mean ± σ):      86.8 ms ±   0.5 ms    [User: 78.0 ms, System: 8.7 ms]
  Range (min … max):    86.0 ms …  89.1 ms    50 runs

  Time (mean ± σ):      81.1 ms ±   2.7 ms    [User: 74.5 ms, System: 6.5 ms]
  Range (min … max):    77.8 ms …  90.5 ms    50 runs

Benchmark #2: ~/.pyenv/versions/3.7.6/bin/python ./hg version
  Time (mean ± σ):      78.9 ms ±   0.6 ms    [User: 70.2 ms, System: 8.7 ms]
  Range (min … max):    78.1 ms …  81.2 ms    50 runs

  Time (mean ± σ):      73.4 ms ±   0.6 ms    [User: 65.3 ms, System: 8.0 ms]
  Range (min … max):    72.4 ms …  75.7 ms    50 runs

Benchmark #3: ~/.pyenv/versions/3.8.1/bin/python ./hg version
  Time (mean ± σ):      78.1 ms ±   0.6 ms    [User: 70.2 ms, System: 7.9 ms]
  Range (min … max):    77.4 ms …  80.9 ms    50 runs

  Time (mean ± σ):      72.1 ms ±   0.4 ms    [User: 64.4 ms, System: 7.6 ms]
  Range (min … max):    71.4 ms …  74.1 ms    50 runs
```

Differential Revision: https://phab.mercurial-scm.org/D7954
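
The proxying idea described in the commit message can be illustrated with a minimal sketch. This is not the code from this changeset: `LazyFinderProxy` and `lazy_load` are hypothetical names, and the standard library's `importlib.util.LazyLoader` stands in for hgdemandimport's own lazy loader. The sketch only assumes what the message states: attribute access is forwarded through `__getattribute__`, and `find_spec()` swaps a wrapped loader into the `ModuleSpec` before returning it.

```python
import importlib.machinery
import importlib.util
import sys
import types


class LazyFinderProxy:
    """Hypothetical proxy around a single sys.meta_path finder."""

    def __init__(self, finder):
        self._finder = finder

    def __getattribute__(self, name):
        # Only find_spec() is overridden; every other attribute is
        # forwarded to the wrapped finder, so the finder itself is never
        # monkeypatched (it may be implemented in C or Rust).
        if name == "find_spec":
            return object.__getattribute__(self, "find_spec")
        return getattr(object.__getattribute__(self, "_finder"), name)

    def find_spec(self, fullname, path=None, target=None):
        finder = object.__getattribute__(self, "_finder")
        spec = finder.find_spec(fullname, path, target)
        # LazyLoader requires a loader that implements exec_module().
        if spec is not None and hasattr(spec.loader, "exec_module"):
            spec.loader = importlib.util.LazyLoader(spec.loader)
        return spec


def lazy_load(finder, name):
    """Load `name` through a proxied finder, deferring its execution."""
    spec = LazyFinderProxy(finder).find_spec(name)
    module = importlib.util.module_from_spec(spec)
    sys.modules[name] = module
    spec.loader.exec_module(module)  # only installs the lazy placeholder
    return module


colorsys = lazy_load(importlib.machinery.PathFinder, "colorsys")
deferred = type(colorsys) is not types.ModuleType  # still a placeholder
colorsys.rgb_to_hsv(1.0, 0.0, 0.0)  # first attribute access runs the module
loaded = type(colorsys) is types.ModuleType
```

To apply the same idea process-wide, one could replace each entry of `sys.meta_path` with such a proxy. The sketch deliberately wraps a single finder so that it does not change import behavior for the rest of the process.
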
author Gregory Szorc <gregory.szorc@gmail.com>
date Mon, 20 Jan 2020 23:51:25 -0800
parents ab929a174f7b
children

#require darcs

  $ echo "[extensions]" >> $HGRCPATH
  $ echo "convert=" >> $HGRCPATH
  $ DARCS_EMAIL='test@example.org'; export DARCS_EMAIL

initialize darcs repo

  $ mkdir darcs-repo
  $ cd darcs-repo
  $ darcs init -q
  $ echo a > a
  $ darcs record -a -l -m p0
  Finished recording patch 'p0'
  $ cd ..

branch and update

  $ darcs get -q darcs-repo darcs-clone >/dev/null
  $ cd darcs-clone
  $ echo c >> a
  $ echo c > c
  $ darcs record -a -l -m p1.1
  Finished recording patch 'p1.1'
  $ cd ..

skip if we can't import elementtree

  $ if hg convert darcs-repo darcs-dummy 2>&1 | grep ElementTree > /dev/null; then
  >     echo 'skipped: missing feature: elementtree module'
  >     exit 80
  > fi

update source

  $ cd darcs-repo
  $ echo b >> a
  $ echo b > b
  $ darcs record -a -l -m p1.2
  Finished recording patch 'p1.2'

  $ darcs pull -q -a --no-set-default ../darcs-clone
  Backing up ./a(*) (glob)
  We have conflicts in the following files:
  ./a
   (?)
  $ sleep 1
  $ echo e > a
  $ echo f > f
  $ mkdir dir
  $ echo d > dir/d
  $ echo d > dir/d2
  $ darcs record -a -l -m p2
  Finished recording patch 'p2'

test file and directory move

  $ darcs mv -q f ff

test remove + move

  $ darcs remove -q dir/d2
  $ rm dir/d2
  $ darcs mv -q dir dir2
  $ darcs record -a -l -m p3
  Finished recording patch 'p3'

The converter does not currently handle patch conflicts well.
When one occurs, it reverts *all* changes and moves forward,
letting the conflict-resolving patch fix the collisions.
Unfortunately, non-conflicting changes, such as the addition of
the "c" file in patch p1.1, are reverted too.
The manifest below not listing "c" is therefore a bug.

  $ cd ..
  $ hg convert darcs-repo darcs-repo-hg
  initializing destination darcs-repo-hg repository
  scanning source...
  sorting...
  converting...
  4 p0
  3 p1.2
  2 p1.1
  1 p2
  0 p3
  $ hg log -R darcs-repo-hg -g --template '{rev} "{desc|firstline}" ({author}) files: {files}\n' "$@"
  4 "p3" (test@example.org) files: dir/d dir/d2 dir2/d f ff
  3 "p2" (test@example.org) files: a dir/d dir/d2 f
  2 "p1.1" (test@example.org) files: 
  1 "p1.2" (test@example.org) files: a b
  0 "p0" (test@example.org) files: a

  $ hg up -q -R darcs-repo-hg
  $ hg -R darcs-repo-hg manifest --debug
  7225b30cdf38257d5cc7780772c051b6f33e6d6b 644   a
  1e88685f5ddec574a34c70af492f95b6debc8741 644   b
  37406831adc447ec2385014019599dfec953c806 644   dir2/d
  b783a337463792a5c7d548ad85a7d3253c16ba8c 644   ff

#if no-outer-repo

try converting darcs1 repository

  $ hg clone -q "$TESTDIR/bundles/darcs1.hg" darcs
  $ hg convert -s darcs darcs/darcs1 2>&1 | grep darcs-1.0
  darcs-1.0 repository format is unsupported, please upgrade

#endif