view hgdemandimport/__init__.py @ 37711:65a23cc8e75b
cborutil: implement support for streaming encoding, bytestring decoding
The vendored cbor2 package is... a bit disappointing.
On the encoding side, it insists that you pass it something with
a write() to send data to. That means if you want to emit data from
a generator, you have to construct something like an io.BytesIO(),
write() to it, then get the data back out. There can be non-trivial
overhead involved.
The encoder also doesn't support indefinite types - bytestrings, arrays,
and maps that don't have a known length. Again, this is really
unfortunate because it requires you to buffer the entire source and
destination in memory to encode large things.
On the decoding side, it supports reading indefinite length types.
But it buffers them completely before returning. More sadness.
This commit implements "streaming" encoders for various CBOR types.
Encoding emits a generator of hunks. So you can efficiently stream
encoded data elsewhere.
It also implements support for emitting indefinite length bytestrings,
arrays, and maps.
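
As a rough illustration of the approach described above (not the actual cborutil code from this commit), a streaming bytestring encoder can be sketched as generators that yield header and payload hunks. The function names below are hypothetical. The byte layout follows the CBOR specification: major type 2 for bytestrings, additional info 31 for indefinite length, and 0xff as the "break" marker.

```python
import struct

MAJOR_TYPE_BYTESTRING = 2
SUBTYPE_INDEFINITE = 31
BREAK = b'\xff'


def encode_length(major_type, length):
    """Return the CBOR header bytes for a definite-length item."""
    if length < 24:
        # Small lengths are stored directly in the initial byte.
        return struct.pack('>B', major_type << 5 | length)
    elif length < 256:
        return struct.pack('>BB', major_type << 5 | 24, length)
    elif length < 65536:
        return struct.pack('>BH', major_type << 5 | 25, length)
    elif length < 4294967296:
        return struct.pack('>BL', major_type << 5 | 26, length)
    else:
        return struct.pack('>BQ', major_type << 5 | 27, length)


def stream_encode_bytestring(data):
    """Yield the hunks of a definite-length bytestring (hypothetical name)."""
    yield encode_length(MAJOR_TYPE_BYTESTRING, len(data))
    yield data


def stream_encode_indefinite_bytestring(chunks):
    """Yield the hunks of an indefinite-length bytestring.

    ``chunks`` is any iterable of bytes, so the full payload never has
    to be held in memory at once.
    """
    yield struct.pack('>B', MAJOR_TYPE_BYTESTRING << 5 | SUBTYPE_INDEFINITE)
    for chunk in chunks:
        # Each chunk is itself a definite-length bytestring.
        yield from stream_encode_bytestring(chunk)
    yield BREAK
```

A caller can forward each hunk to a socket or file as it is produced, or join the hunks with b''.join() when a single buffer is needed. No intermediate io.BytesIO() is involved.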
On the decoding side, we only implement support for decoding an
indefinite length bytestring from a file object. It will emit a
generator of raw chunks from the source.
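
A minimal sketch of what such a decoder could look like, assuming a binary file object positioned at the start of an indefinite-length bytestring. The names decode_indefinite_bytestring and read_exact are hypothetical and are not taken from this commit. Only the CBOR wire format (0x5f start byte, definite-length chunks, 0xff break) comes from the specification.

```python
import io
import struct


def read_exact(fh, size):
    """Read exactly ``size`` bytes or raise on a truncated stream."""
    data = fh.read(size)
    if len(data) != size:
        raise ValueError('unexpected end of CBOR stream')
    return data


def decode_indefinite_bytestring(fh):
    """Yield raw chunks of an indefinite-length bytestring read from ``fh``."""
    if read_exact(fh, 1)[0] != 0x5f:
        raise ValueError('not an indefinite-length bytestring')

    while True:
        initial = read_exact(fh, 1)[0]
        if initial == 0xff:
            # "Break" marker: the bytestring is complete.
            return

        major_type, info = initial >> 5, initial & 0x1f
        if major_type != 2:
            raise ValueError('chunk is not a bytestring')

        if info < 24:
            length = info
        elif info == 24:
            length = read_exact(fh, 1)[0]
        elif info == 25:
            length = struct.unpack('>H', read_exact(fh, 2))[0]
        elif info == 26:
            length = struct.unpack('>L', read_exact(fh, 4))[0]
        elif info == 27:
            length = struct.unpack('>Q', read_exact(fh, 8))[0]
        else:
            # Chunks inside an indefinite bytestring must be definite-length.
            raise ValueError('invalid chunk length encoding')

        yield read_exact(fh, length)
```

Because this is a generator, each chunk is handed to the caller as soon as it is read. For example, iterating over decode_indefinite_bytestring(io.BytesIO(encoded)) processes one chunk at a time instead of buffering the whole value.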
I didn't want to reinvent so many wheels. But profiling the wire
protocol revealed that constructing io.BytesIO() instances to
temporarily hold results has non-trivial overhead.
We're talking >15% of execution time for operations like
"transfer the fulltexts of all files in a revision." So I can
justify this effort.
Fortunately, CBOR is a relatively straightforward format. And we have
a reference implementation in the repo we can test against.
Differential Revision: https://phab.mercurial-scm.org/D3303
author | Gregory Szorc <gregory.szorc@gmail.com> |
---|---|
date | Sat, 14 Apr 2018 16:36:15 -0700 |
parents | 3cfc9070245f |
children | 670eb4fa1b86 |
# hgdemandimport - global demand-loading of modules for Mercurial
#
# Copyright 2017 Facebook Inc.
#
# This software may be used and distributed according to the terms of the
# GNU General Public License version 2 or any later version.

'''demandimport - automatic demand-loading of modules'''

# This is in a separate package from mercurial because in Python 3,
# demand loading is per-package. Keeping demandimport in the mercurial package
# would disable demand loading for any modules in mercurial.

from __future__ import absolute_import

import os
import sys

if sys.version_info[0] >= 3:
    from . import demandimportpy3 as demandimport
else:
    from . import demandimportpy2 as demandimport

# Extensions can add to this list if necessary.
ignore = [
    '__future__',
    '_hashlib',
    # ImportError during pkg_resources/__init__.py:fixup_namespace_package
    '_imp',
    '_xmlplus',
    'fcntl',
    'nt', # pathlib2 tests the existence of built-in 'nt' module
    'win32com.gen_py',
    'win32com.shell', # 'appdirs' tries to import win32com.shell
    '_winreg', # 2.7 mimetypes needs immediate ImportError
    'pythoncom',
    # imported by tarfile, not available under Windows
    'pwd',
    'grp',
    # imported by profile, itself imported by hotshot.stats,
    # not available under Windows
    'resource',
    # this trips up many extension authors
    'gtk',
    # setuptools' pkg_resources.py expects "from __main__ import x" to
    # raise ImportError if x not defined
    '__main__',
    '_ssl', # conditional imports in the stdlib, issue1964
    '_sre', # issue4920
    'rfc822',
    'mimetools',
    'sqlalchemy.events', # has import-time side effects (issue5085)
    # setuptools 8 expects this module to explode early when not on windows
    'distutils.msvc9compiler',
    '__builtin__',
    'builtins',
    'urwid.command_map', # for pudb
    ]

_pypy = '__pypy__' in sys.builtin_module_names

if _pypy:
    ignore.extend([
        # _ctypes.pointer is shadowed by "from ... import pointer" (PyPy 5)
        '_ctypes.pointer',
    ])

demandimport.init(ignore)

# Re-export.
isenabled = demandimport.isenabled
disable = demandimport.disable
deactivated = demandimport.deactivated

def enable():
    # chg pre-imports modules so do not enable demandimport for it
    if ('CHGINTERNALMARK' not in os.environ
        and os.environ.get('HGDEMANDIMPORT') != 'disable'):
        demandimport.enable()