view mercurial/policy.py @ 37711:65a23cc8e75b
cborutil: implement support for streaming encoding, bytestring decoding
The vendored cbor2 package is... a bit disappointing.
On the encoding side, it insists that you pass it something with
a write() to send data to. That means if you want to emit data to
a generator, you have to construct e.g. an io.BytesIO(), write()
to it, then get the data back out. There can be non-trivial overhead
involved.
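As a rough illustration of the buffering pattern described above, here is a minimal sketch. `write_only_encode` is a hypothetical stand-in for any encoder that insists on a file-like object; it is not the cbor2 API, and it only handles bytestrings shorter than 24 bytes to keep the example small.

```python
import io


def write_only_encode(value, fp):
    """Hypothetical encoder that can only write to a file-like object.

    Assumption for this sketch: only bytestrings shorter than 24 bytes.
    """
    if len(value) >= 24:
        raise ValueError('sketch only handles short bytestrings')
    # Major type 2 (bytestring) in the top 3 bits, length in the low 5 bits.
    fp.write(bytes([0x40 | len(value)]))
    fp.write(value)


def encode_each(values):
    """Adapt the write()-only encoder to a generator via a temporary buffer."""
    for value in values:
        buf = io.BytesIO()             # one throwaway buffer per value
        write_only_encode(value, buf)
        yield buf.getvalue()           # copy the encoded data back out


encoded = list(encode_each([b'hi', b'']))
```

Every value pays for a buffer allocation plus a copy out of it, which is the overhead being complained about.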
The encoder also doesn't support indefinite types - bytestrings, arrays,
and maps that don't have a known length. Again, this is really
unfortunate because it requires you to buffer the entire source and
destination in memory to encode large things.
On the decoding side, it supports reading indefinite length types.
But it buffers them completely before returning. More sadness.
This commit implements "streaming" encoders for various CBOR types.
Encoding emits a generator of hunks. So you can efficiently stream
encoded data elsewhere.
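A minimal sketch of what "emits a generator of hunks" could look like for a definite-length bytestring. The function names here are illustrative assumptions, not necessarily the names used in the actual cborutil module; the byte layout follows the CBOR header rules (major type in the top 3 bits, length or length-width marker in the low 5 bits).

```python
import struct

MAJOR_TYPE_BYTESTRING = 2


def encode_length(major_type, length):
    """Return the CBOR header bytes for a major type and a length."""
    head = major_type << 5
    if length < 24:
        return struct.pack('>B', head | length)
    elif length < 256:
        return struct.pack('>BB', head | 24, length)
    elif length < 65536:
        return struct.pack('>BH', head | 25, length)
    elif length < 4294967296:
        return struct.pack('>BL', head | 26, length)
    else:
        return struct.pack('>BQ', head | 27, length)


def stream_encode_bytestring(data):
    """Yield the header and the payload as separate hunks.

    The payload is yielded as-is, so it is never copied into a buffer.
    """
    yield encode_length(MAJOR_TYPE_BYTESTRING, len(data))
    yield data


hunks = list(stream_encode_bytestring(b'hello'))
```

A caller can forward each hunk to a socket or another generator without joining them first.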
It also implements support for emitting indefinite length bytestrings,
arrays, and maps.
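For reference, an indefinite length bytestring in CBOR is a start marker (0x5f), a series of definite-length bytestring chunks, and a break byte (0xff). Indefinite arrays and maps follow the same shape with 0x9f and 0xbf as start markers. The sketch below shows the bytestring case only; the function names are hypothetical and chosen for this example.

```python
import struct

BREAK = b'\xff'
BEGIN_INDEFINITE_BYTESTRING = b'\x5f'


def _bytestring_header(length):
    """Header for a definite-length bytestring chunk (major type 2)."""
    if length < 24:
        return struct.pack('>B', 0x40 | length)
    elif length < 256:
        return struct.pack('>BB', 0x40 | 24, length)
    elif length < 65536:
        return struct.pack('>BH', 0x40 | 25, length)
    elif length < 4294967296:
        return struct.pack('>BL', 0x40 | 26, length)
    else:
        return struct.pack('>BQ', 0x40 | 27, length)


def stream_encode_indefinite_bytestring(chunks):
    """Encode an iterable of byte chunks without knowing the total length."""
    yield BEGIN_INDEFINITE_BYTESTRING
    for chunk in chunks:
        yield _bytestring_header(len(chunk))
        yield chunk
    yield BREAK


encoded = b''.join(stream_encode_indefinite_bytestring([b'ab', b'c']))
```

Because `chunks` can itself be a generator, neither the source nor the encoded output has to be held in memory in full.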
On the decoding side, we only implement support for decoding an
indefinite length bytestring from a file object. It will emit a
generator of raw chunks from the source.
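The decoding side could be sketched roughly as follows. This is an assumption-laden illustration, not the code from the commit: the function name, the error handling, and the use of ValueError are all choices made for this example.

```python
import io
import struct

# Additional-info value -> (struct format, byte width) for explicit lengths.
_LENGTH_FORMATS = {24: ('>B', 1), 25: ('>H', 2), 26: ('>L', 4), 27: ('>Q', 8)}


def _read_exact(fh, n):
    """Read exactly n bytes or raise on a short read."""
    data = fh.read(n)
    if len(data) != n:
        raise ValueError('unexpected end of input')
    return data


def read_indefinite_bytestring(fh):
    """Yield raw chunks of an indefinite length bytestring from a file object."""
    if _read_exact(fh, 1) != b'\x5f':
        raise ValueError('not an indefinite length bytestring')
    while True:
        initial = _read_exact(fh, 1)[0]
        if initial == 0xff:            # break byte ends the sequence
            return
        major_type, info = initial >> 5, initial & 0x1f
        if major_type != 2:
            raise ValueError('chunk is not a bytestring')
        if info < 24:
            length = info
        elif info in _LENGTH_FORMATS:
            fmt, width = _LENGTH_FORMATS[info]
            length = struct.unpack(fmt, _read_exact(fh, width))[0]
        else:
            raise ValueError('invalid chunk length encoding')
        yield _read_exact(fh, length)  # emit each chunk as soon as it is read


chunks = list(read_indefinite_bytestring(io.BytesIO(b'\x5f\x42ab\x41c\xff')))
```

Each chunk is handed to the caller as soon as it is read, so the decoder never buffers the complete value.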
I didn't want to reinvent so many wheels. But profiling the wire
protocol revealed that constructing io.BytesIO() instances to
temporarily hold results has non-trivial overhead.
We're talking >15% of execution time for operations like
"transfer the fulltexts of all files in a revision." So I can
justify this effort.
Fortunately, CBOR is a relatively straightforward format. And we have
a reference implementation in the repo we can test against.
Differential Revision: https://phab.mercurial-scm.org/D3303
author | Gregory Szorc <gregory.szorc@gmail.com> |
---|---|
date | Sat, 14 Apr 2018 16:36:15 -0700 |
parents | 2025bf60adb2 |
children | 0304f22497fa |
# policy.py - module policy logic for Mercurial.
#
# Copyright 2015 Gregory Szorc <gregory.szorc@gmail.com>
#
# This software may be used and distributed according to the terms of the
# GNU General Public License version 2 or any later version.

from __future__ import absolute_import

import os
import sys

# Rules for how modules can be loaded. Values are:
#
#    c - require C extensions
#    allow - allow pure Python implementation when C loading fails
#    cffi - required cffi versions (implemented within pure module)
#    cffi-allow - allow pure Python implementation if cffi version is missing
#    py - only load pure Python modules
#
# By default, fall back to the pure modules so the in-place build can
# run without recompiling the C extensions. This will be overridden by
# __modulepolicy__ generated by setup.py.
policy = b'allow'
_packageprefs = {
    # policy: (versioned package, pure package)
    b'c': (r'cext', None),
    b'allow': (r'cext', r'pure'),
    b'cffi': (r'cffi', None),
    b'cffi-allow': (r'cffi', r'pure'),
    b'py': (None, r'pure'),
}

try:
    from . import __modulepolicy__
    policy = __modulepolicy__.modulepolicy
except ImportError:
    pass

# PyPy doesn't load C extensions.
#
# The canonical way to do this is to test platform.python_implementation().
# But we don't import platform and don't bloat for it here.
if r'__pypy__' in sys.builtin_module_names:
    policy = b'cffi'

# Environment variable can always force settings.
if sys.version_info[0] >= 3:
    if r'HGMODULEPOLICY' in os.environ:
        policy = os.environ[r'HGMODULEPOLICY'].encode(r'utf-8')
else:
    policy = os.environ.get(r'HGMODULEPOLICY', policy)

def _importfrom(pkgname, modname):
    # from .<pkgname> import <modname> (where . is looked through this module)
    fakelocals = {}
    pkg = __import__(pkgname, globals(), fakelocals, [modname], level=1)
    try:
        fakelocals[modname] = mod = getattr(pkg, modname)
    except AttributeError:
        raise ImportError(r'cannot import name %s' % modname)
    # force import; fakelocals[modname] may be replaced with the real module
    getattr(mod, r'__doc__', None)
    return fakelocals[modname]

# keep in sync with "version" in C modules
_cextversions = {
    (r'cext', r'base85'): 1,
    (r'cext', r'bdiff'): 3,
    (r'cext', r'mpatch'): 1,
    (r'cext', r'osutil'): 4,
    (r'cext', r'parsers'): 4,
}

# map import request to other package or module
_modredirects = {
    (r'cext', r'charencode'): (r'cext', r'parsers'),
    (r'cffi', r'base85'): (r'pure', r'base85'),
    (r'cffi', r'charencode'): (r'pure', r'charencode'),
    (r'cffi', r'parsers'): (r'pure', r'parsers'),
}

def _checkmod(pkgname, modname, mod):
    expected = _cextversions.get((pkgname, modname))
    actual = getattr(mod, r'version', None)
    if actual != expected:
        raise ImportError(r'cannot import module %s.%s '
                          r'(expected version: %d, actual: %r)'
                          % (pkgname, modname, expected, actual))

def importmod(modname):
    """Import module according to policy and check API version"""
    try:
        verpkg, purepkg = _packageprefs[policy]
    except KeyError:
        raise ImportError(r'invalid HGMODULEPOLICY %r' % policy)
    assert verpkg or purepkg
    if verpkg:
        pn, mn = _modredirects.get((verpkg, modname), (verpkg, modname))
        try:
            mod = _importfrom(pn, mn)
            if pn == verpkg:
                _checkmod(pn, mn, mod)
            return mod
        except ImportError:
            if not purepkg:
                raise
    pn, mn = _modredirects.get((purepkg, modname), (purepkg, modname))
    return _importfrom(pn, mn)
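To make the lookup order in importmod() easier to follow, here is a standalone sketch that only computes which (package, module) pairs would be tried, in order, for a given policy. It performs no imports and no version checks. `import_candidates` is a hypothetical helper written for this illustration and is not part of policy.py; the two tables are copied from the module above, using plain strings for brevity.

```python
# Copied from policy.py above: policy -> (versioned package, pure package).
PACKAGE_PREFS = {
    'c': ('cext', None),
    'allow': ('cext', 'pure'),
    'cffi': ('cffi', None),
    'cffi-allow': ('cffi', 'pure'),
    'py': (None, 'pure'),
}

# Copied from policy.py above: redirect an import request elsewhere.
MOD_REDIRECTS = {
    ('cext', 'charencode'): ('cext', 'parsers'),
    ('cffi', 'base85'): ('pure', 'base85'),
    ('cffi', 'charencode'): ('pure', 'charencode'),
    ('cffi', 'parsers'): ('pure', 'parsers'),
}


def import_candidates(policy, modname):
    """Return the (package, module) pairs importmod() would try, in order."""
    try:
        verpkg, purepkg = PACKAGE_PREFS[policy]
    except KeyError:
        raise ImportError('invalid HGMODULEPOLICY %r' % policy)
    candidates = []
    for pkg in (verpkg, purepkg):
        if pkg:
            # Apply the redirect table, defaulting to the request itself.
            candidates.append(MOD_REDIRECTS.get((pkg, modname), (pkg, modname)))
    return candidates


order = import_candidates('allow', 'charencode')
```

With the `allow` policy, a request for `charencode` is tried as `cext.parsers` first and falls back to `pure.charencode`. With `cffi`, a request for `parsers` resolves directly to `pure.parsers` and has no further fallback.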