annotate contrib/python-zstandard/README.rst @ 39989:c9026e9297e3

identify: only query remote bookmarks if needed Instead of all the time when operating on a remote repo. This perf regression was introduced in 15a79ac823e8, in 4.3. This datahint method returns nothing for -Tjson, -Tpickle, -Tdebug --config ui.formatdebug=true and --config ui.formatjson, so the bookmarks won't show up. I don't know what these formatters are for. plainformatter and templateformatter work properly, and the few other uses of datahint should have the same kind of problem. There is further weirdness where "--template '{node}'" is not enough to avoid querying the bookmarks, you also need to pass --id or -q. Differential Revision: https://phab.mercurial-scm.org/D4819
author Valentin Gatien-Baron <vgatien-baron@janestreet.com>
date Mon, 01 Oct 2018 09:58:42 -0400
parents b1fb341d8a61
children 73fef626dae3
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
rev   line source
30435
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
1 ================
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
2 python-zstandard
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
3 ================
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
4
30822
b54a2984cdd4 zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30435
diff changeset
5 This project provides Python bindings for interfacing with the
b54a2984cdd4 zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30435
diff changeset
6 `Zstandard <http://www.zstd.net>`_ compression library. A C extension
30895
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
7 and CFFI interface are provided.
30435
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
8
30895
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
9 The primary goal of the project is to provide a rich interface to the
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
10 underlying C API through a Pythonic interface while not sacrificing
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
11 performance. This means exposing most of the features and flexibility
30435
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
12 of the C API while not sacrificing usability or safety that Python provides.
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
13
37495
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
14 The canonical home for this project lives in a Mercurial repository run by
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
15 the author. For convenience, that repository is frequently synchronized to
30822
b54a2984cdd4 zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30435
diff changeset
16 https://github.com/indygreg/python-zstandard.
b54a2984cdd4 zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30435
diff changeset
17
30435
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
18 | |ci-status| |win-ci-status|
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
19
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
20 Requirements
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
21 ============
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
22
37495
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
23 This extension is designed to run with Python 2.7, 3.4, 3.5, and 3.6
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
24 on common platforms (Linux, Windows, and OS X). x86 and x86_64 are well-tested
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
25 on Windows. Only x86_64 is well-tested on Linux and macOS.
30435
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
26
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
27 Installing
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
28 ==========
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
29
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
30 This package is uploaded to PyPI at https://pypi.python.org/pypi/zstandard.
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
31 So, to install this package::
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
32
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
33 $ pip install zstandard
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
34
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
35 Binary wheels are made available for some platforms. If you need to
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
36 install from a source distribution, all you should need is a working C
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
37 compiler and the Python development headers/libraries. On many Linux
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
38 distributions, you can install a ``python-dev`` or ``python-devel``
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
39 package to provide these dependencies.
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
40
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
41 Packages are also uploaded to Anaconda Cloud at
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
42 https://anaconda.org/indygreg/zstandard. See that URL for how to install
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
43 this package with ``conda``.
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
44
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
45 Performance
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
46 ===========
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
47
37495
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
48 zstandard is a highly tunable compression algorithm. In its default settings
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
49 (compression level 3), it will be faster at compression and decompression and
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
50 will have better compression ratios than zlib on most data sets. When tuned
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
51 for speed, it approaches lz4's speed and ratios. When tuned for compression
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
52 ratio, it approaches lzma ratios and compression speed, but decompression
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
53 speed is much faster. See the official zstandard documentation for more.
30435
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
54
37495
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
55 zstandard and this library support multi-threaded compression. There is a
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
56 mechanism to compress large inputs using multiple threads.
30435
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
57
37495
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
58 The performance of this library is usually very similar to what the zstandard
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
59 C API can deliver. Overhead in this library is due to general Python overhead
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
60 and can't easily be avoided by *any* zstandard Python binding. This library
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
61 exposes multiple APIs for performing compression and decompression so callers
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
62 can pick an API suitable for their need. Contrast with the compression
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
63 modules in Python's standard library (like ``zlib``), which only offer limited
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
64 mechanisms for performing operations. The API flexibility means consumers can
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
65 choose to use APIs that facilitate zero copying or minimize Python object
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
66 creation and garbage collection overhead.
30435
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
67
37495
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
68 This library is capable of single-threaded throughputs well over 1 GB/s. For
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
69 exact numbers, measure yourself. The source code repository has a ``bench.py``
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
70 script that can be used to measure things.
30435
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
71
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
72 API
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
73 ===
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
74
37495
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
75 To interface with Zstandard, simply import the ``zstandard`` module::
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
76
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
77 import zstandard
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
78
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
79 It is a popular convention to alias the module as a different name for
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
80 brevity::
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
81
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
82 import zstandard as zstd
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
83
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
84 This module attempts to import and use either the C extension or CFFI
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
85 implementation. On Python platforms known to support C extensions (like
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
86 CPython), it raises an ImportError if the C extension cannot be imported.
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
87 On Python platforms known to not support C extensions (like PyPy), it only
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
88 attempts to import the CFFI implementation and raises ImportError if that
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
89 can't be done. On other platforms, it first tries to import the C extension
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
90 then falls back to CFFI if that fails and raises ImportError if CFFI fails.
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
91
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
92 To change the module import behavior, a ``PYTHON_ZSTANDARD_IMPORT_POLICY``
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
93 environment variable can be set. The following values are accepted:
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
94
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
95 default
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
96 The behavior described above.
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
97 cffi_fallback
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
98 Always try to import the C extension then fall back to CFFI if that
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
99 fails.
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
100 cext
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
101 Only attempt to import the C extension.
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
102 cffi
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
103 Only attempt to import the CFFI implementation.
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
104
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
105 In addition, the ``zstandard`` module exports a ``backend`` attribute
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
106 containing the string name of the backend being used. It will be one
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
107 of ``cext`` or ``cffi`` (for *C extension* and *cffi*, respectively).
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
108
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
109 The types, functions, and attributes exposed by the ``zstandard`` module
31796
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
110 are documented in the sections below.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
111
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
112 .. note::
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
113
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
114 The documentation in this section makes references to various zstd
37495
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
115 concepts and functionality. The source repository contains a
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
116 ``docs/concepts.rst`` file explaining these in more detail.
30435
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
117
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
118 ZstdCompressor
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
119 --------------
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
120
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
121 The ``ZstdCompressor`` class provides an interface for performing
37495
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
122 compression operations. Each instance is essentially a wrapper around a
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
123 ``ZSTD_CCtx`` from the C API.
30435
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
124
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
125 Each instance is associated with parameters that control compression
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
126 behavior. These come from the following named arguments (all optional):
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
127
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
128 level
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
129 Integer compression level. Valid values are between 1 and 22.
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
130 dict_data
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
131 Compression dictionary to use.
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
132
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
133 Note: When using dictionary data and ``compress()`` is called multiple
37495
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
134 times, the ``ZstdCompressionParameters`` derived from an integer
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
135 compression ``level`` and the first compressed data's size will be reused
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
136 for all subsequent operations. This may not be desirable if source data
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
137 size varies significantly.
30435
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
138 compression_params
37495
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
139 A ``ZstdCompressionParameters`` instance defining compression settings.
30435
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
140 write_checksum
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
141 Whether a 4 byte checksum should be written with the compressed data.
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
142 Defaults to False. If True, the decompressor can verify that decompressed
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
143 data matches the original input data.
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
144 write_content_size
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
145 Whether the size of the uncompressed data will be written into the
37495
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
146 header of compressed data. Defaults to True. The data will only be
30435
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
147 written if the compressor knows the size of the input data. This is
37495
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
148 often not true for streaming compression.
30435
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
149 write_dict_id
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
150 Whether to write the dictionary ID into the compressed data.
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
151 Defaults to True. The dictionary ID is only written if a dictionary
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
152 is being used.
31796
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
153 threads
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
154 Enables and sets the number of threads to use for multi-threaded compression
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
155 operations. Defaults to 0, which means to use single-threaded compression.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
156 Negative values will resolve to the number of logical CPUs in the system.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
157 Read below for more info on multi-threaded compression. This argument only
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
158 controls thread count for operations that operate on individual pieces of
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
159 data. APIs that spawn multiple threads for working on multiple pieces of
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
160 data have their own ``threads`` argument.
30435
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
161
37495
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
162 ``compression_params`` is mutually exclusive with ``level``, ``write_checksum``,
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
163 ``write_content_size``, ``write_dict_id``, and ``threads``.
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
164
30822
b54a2984cdd4 zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30435
diff changeset
165 Unless specified otherwise, assume that no two methods of ``ZstdCompressor``
b54a2984cdd4 zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30435
diff changeset
166 instances can be called from multiple Python threads simultaneously. In other
b54a2984cdd4 zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30435
diff changeset
167 words, assume instances are not thread safe unless stated otherwise.
b54a2984cdd4 zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30435
diff changeset
168
37495
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
169 Utility Methods
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
170 ^^^^^^^^^^^^^^^
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
171
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
172 ``frame_progression()`` returns a 3-tuple containing the number of bytes
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
173 ingested, consumed, and produced by the current compression operation.
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
174
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
175 ``memory_size()`` obtains the memory utilization of the underlying zstd
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
176 compression context, in bytes.::
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
177
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
178 cctx = zstd.ZstdCompressor()
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
179 memory = cctx.memory_size()
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
180
30435
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
181 Simple API
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
182 ^^^^^^^^^^
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
183
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
184 ``compress(data)`` compresses and returns data as a one-shot operation.::
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
185
30822
b54a2984cdd4 zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30435
diff changeset
186 cctx = zstd.ZstdCompressor()
30435
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
187 compressed = cctx.compress(b'data to compress')
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
188
31796
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
189 The ``data`` argument can be any object that implements the *buffer protocol*.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
190
37495
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
191 Stream Reader API
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
192 ^^^^^^^^^^^^^^^^^
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
193
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
194 ``stream_reader(source)`` can be used to obtain an object conforming to the
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
195 ``io.RawIOBase`` interface for reading compressed output as a stream::
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
196
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
197 with open(path, 'rb') as fh:
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
198 cctx = zstd.ZstdCompressor()
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
199 with cctx.stream_reader(fh) as reader:
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
200 while True:
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
201 chunk = reader.read(16384)
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
202 if not chunk:
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
203 break
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
204
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
205 # Do something with compressed chunk.
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
206
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
207 The stream can only be read within a context manager. When the context
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
208 manager exits, the stream is closed and the underlying resource is
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
209 released and future operations against the compression stream stream will fail.
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
210
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
211 The ``source`` argument to ``stream_reader()`` can be any object with a
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
212 ``read(size)`` method or any object implementing the *buffer protocol*.
30822
b54a2984cdd4 zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30435
diff changeset
213
37495
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
214 ``stream_reader()`` accepts a ``size`` argument specifying how large the input
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
215 stream is. This is used to adjust compression parameters so they are
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
216 tailored to the source size.::
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
217
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
218 with open(path, 'rb') as fh:
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
219 cctx = zstd.ZstdCompressor()
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
220 with cctx.stream_reader(fh, size=os.stat(path).st_size) as reader:
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
221 ...
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
222
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
223 If the ``source`` is a stream, you can specify how large ``read()`` requests
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
224 to that stream should be via the ``read_size`` argument. It defaults to
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
225 ``zstandard.COMPRESSION_RECOMMENDED_INPUT_SIZE``.::
30822
b54a2984cdd4 zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30435
diff changeset
226
37495
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
227 with open(path, 'rb') as fh:
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
228 cctx = zstd.ZstdCompressor()
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
229 # Will perform fh.read(8192) when obtaining data to feed into the
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
230 # compressor.
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
231 with cctx.stream_reader(fh, read_size=8192) as reader:
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
232 ...
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
233
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
234 The stream returned by ``stream_reader()`` is neither writable nor seekable
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
235 (even if the underlying source is seekable). ``readline()`` and
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
236 ``readlines()`` are not implemented because they don't make sense for
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
237 compressed data. ``tell()`` returns the number of compressed bytes
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
238 emitted so far.
30822
b54a2984cdd4 zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30435
diff changeset
239
30435
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
240 Streaming Input API
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
241 ^^^^^^^^^^^^^^^^^^^
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
242
37495
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
243 ``stream_writer(fh)`` (which behaves as a context manager) allows you to *stream*
30435
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
244 data into a compressor.::
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
245
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
246 cctx = zstd.ZstdCompressor(level=10)
37495
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
247 with cctx.stream_writer(fh) as compressor:
30435
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
248 compressor.write(b'chunk 0')
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
249 compressor.write(b'chunk 1')
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
250 ...
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
251
37495
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
252 The argument to ``stream_writer()`` must have a ``write(data)`` method. As
30822
b54a2984cdd4 zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30435
diff changeset
253 compressed data is available, ``write()`` will be called with the compressed
30435
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
254 data as its argument. Many common Python types implement ``write()``, including
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
255 open file handles and ``io.BytesIO``.
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
256
37495
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
257 ``stream_writer()`` returns an object representing a streaming compressor
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
258 instance. It **must** be used as a context manager. That object's
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
259 ``write(data)`` method is used to feed data into the compressor.
30435
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
260
30822
b54a2984cdd4 zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30435
diff changeset
261 A ``flush()`` method can be called to evict whatever data remains within the
b54a2984cdd4 zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30435
diff changeset
262 compressor's internal state into the output object. This may result in 0 or
b54a2984cdd4 zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30435
diff changeset
263 more ``write()`` calls to the output object.
b54a2984cdd4 zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30435
diff changeset
264
30895
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
265 Both ``write()`` and ``flush()`` return the number of bytes written to the
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
266 object's ``write()``. In many cases, small inputs do not accumulate enough
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
267 data to cause a write and ``write()`` will return ``0``.
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
268
30435
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
269 If the size of the data being fed to this streaming compressor is known,
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
270 you can declare it before compression begins::
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
271
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
272 cctx = zstd.ZstdCompressor()
37495
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
273 with cctx.stream_writer(fh, size=data_len) as compressor:
30435
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
274 compressor.write(chunk0)
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
275 compressor.write(chunk1)
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
276 ...
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
277
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
278 Declaring the size of the source data allows compression parameters to
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
279 be tuned. And if ``write_content_size`` is used, it also results in the
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
280 content size being written into the frame header of the output data.
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
281
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
282 The size of chunks being ``write()`` to the destination can be specified::
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
283
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
284 cctx = zstd.ZstdCompressor()
37495
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
285 with cctx.stream_writer(fh, write_size=32768) as compressor:
30435
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
286 ...
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
287
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
288 To see how much memory is being used by the streaming compressor::
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
289
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
290 cctx = zstd.ZstdCompressor()
37495
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
291 with cctx.stream_writer(fh) as compressor:
30435
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
292 ...
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
293 byte_size = compressor.memory_size()
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
294
37495
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
295 Thte total number of bytes written so far are exposed via ``tell()``::
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
296
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
297 cctx = zstd.ZstdCompressor()
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
298 with cctx.stream_writer(fh) as compressor:
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
299 ...
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
300 total_written = compressor.tell()
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
301
30435
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
302 Streaming Output API
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
303 ^^^^^^^^^^^^^^^^^^^^
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
304
37495
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
305 ``read_to_iter(reader)`` provides a mechanism to stream data out of a
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
306 compressor as an iterator of data chunks.::
30435
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
307
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
308 cctx = zstd.ZstdCompressor()
37495
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
309 for chunk in cctx.read_to_iter(fh):
30435
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
310 # Do something with emitted data.
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
311
37495
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
312 ``read_to_iter()`` accepts an object that has a ``read(size)`` method or
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
313 conforms to the buffer protocol.
30435
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
314
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
315 Uncompressed data is fetched from the source either by calling ``read(size)``
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
316 or by fetching a slice of data from the object directly (in the case where
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
317 the buffer protocol is being used). The returned iterator consists of chunks
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
318 of compressed data.
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
319
30822
b54a2984cdd4 zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30435
diff changeset
320 If reading from the source via ``read()``, ``read()`` will be called until
b54a2984cdd4 zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30435
diff changeset
321 it raises or returns an empty bytes (``b''``). It is perfectly valid for
b54a2984cdd4 zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30435
diff changeset
322 the source to deliver fewer bytes than were what requested by ``read(size)``.
b54a2984cdd4 zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30435
diff changeset
323
37495
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
324 Like ``stream_writer()``, ``read_to_iter()`` also accepts a ``size`` argument
30435
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
325 declaring the size of the input stream::
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
326
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
327 cctx = zstd.ZstdCompressor()
37495
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
328 for chunk in cctx.read_to_iter(fh, size=some_int):
30435
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
329 pass
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
330
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
331 You can also control the size that data is ``read()`` from the source and
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
332 the ideal size of output chunks::
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
333
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
334 cctx = zstd.ZstdCompressor()
37495
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
335 for chunk in cctx.read_to_iter(fh, read_size=16384, write_size=8192):
30435
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
336 pass
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
337
37495
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
338 Unlike ``stream_writer()``, ``read_to_iter()`` does not give direct control
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
339 over the sizes of chunks fed into the compressor. Instead, chunk sizes will
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
340 be whatever the object being read from delivers. These will often be of a
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
341 uniform size.
30822
b54a2984cdd4 zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30435
diff changeset
342
30435
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
343 Stream Copying API
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
344 ^^^^^^^^^^^^^^^^^^
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
345
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
346 ``copy_stream(ifh, ofh)`` can be used to copy data between 2 streams while
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
347 compressing it.::
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
348
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
349 cctx = zstd.ZstdCompressor()
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
350 cctx.copy_stream(ifh, ofh)
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
351
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
352 For example, say you wish to compress a file::
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
353
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
354 cctx = zstd.ZstdCompressor()
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
355 with open(input_path, 'rb') as ifh, open(output_path, 'wb') as ofh:
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
356 cctx.copy_stream(ifh, ofh)
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
357
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
358 It is also possible to declare the size of the source stream::
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
359
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
360 cctx = zstd.ZstdCompressor()
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
361 cctx.copy_stream(ifh, ofh, size=len_of_input)
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
362
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
363 You can also specify how large the chunks that are ``read()`` and ``write()``
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
364 from and to the streams::
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
365
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
366 cctx = zstd.ZstdCompressor()
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
367 cctx.copy_stream(ifh, ofh, read_size=32768, write_size=16384)
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
368
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
369 The stream copier returns a 2-tuple of bytes read and written::
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
370
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
371 cctx = zstd.ZstdCompressor()
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
372 read_count, write_count = cctx.copy_stream(ifh, ofh)
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
373
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
374 Compressor API
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
375 ^^^^^^^^^^^^^^
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
376
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
377 ``compressobj()`` returns an object that exposes ``compress(data)`` and
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
378 ``flush()`` methods. Each returns compressed data or an empty bytes.
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
379
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
380 The purpose of ``compressobj()`` is to provide an API-compatible interface
37495
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
381 with ``zlib.compressobj``, ``bz2.BZ2Compressor``, etc. This allows callers to
30435
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
382 swap in different compressor objects while using the same API.
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
383
30822
b54a2984cdd4 zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30435
diff changeset
384 ``flush()`` accepts an optional argument indicating how to end the stream.
b54a2984cdd4 zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30435
diff changeset
385 ``zstd.COMPRESSOBJ_FLUSH_FINISH`` (the default) ends the compression stream.
b54a2984cdd4 zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30435
diff changeset
386 Once this type of flush is performed, ``compress()`` and ``flush()`` can
b54a2984cdd4 zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30435
diff changeset
387 no longer be called. This type of flush **must** be called to end the
b54a2984cdd4 zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30435
diff changeset
388 compression context. If not called, returned data may be incomplete.
b54a2984cdd4 zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30435
diff changeset
389
b54a2984cdd4 zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30435
diff changeset
390 A ``zstd.COMPRESSOBJ_FLUSH_BLOCK`` argument to ``flush()`` will flush a
b54a2984cdd4 zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30435
diff changeset
391 zstd block. Flushes of this type can be performed multiple times. The next
b54a2984cdd4 zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30435
diff changeset
392 call to ``compress()`` will begin a new zstd block.
30435
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
393
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
394 Here is how this API should be used::
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
395
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
396 cctx = zstd.ZstdCompressor()
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
397 cobj = cctx.compressobj()
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
398 data = cobj.compress(b'raw input 0')
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
399 data = cobj.compress(b'raw input 1')
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
400 data = cobj.flush()
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
401
30822
b54a2984cdd4 zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30435
diff changeset
402 Or to flush blocks::
b54a2984cdd4 zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30435
diff changeset
403
b54a2984cdd4 zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30435
diff changeset
404 cctx.zstd.ZstdCompressor()
b54a2984cdd4 zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30435
diff changeset
405 cobj = cctx.compressobj()
b54a2984cdd4 zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30435
diff changeset
406 data = cobj.compress(b'chunk in first block')
b54a2984cdd4 zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30435
diff changeset
407 data = cobj.flush(zstd.COMPRESSOBJ_FLUSH_BLOCK)
b54a2984cdd4 zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30435
diff changeset
408 data = cobj.compress(b'chunk in second block')
b54a2984cdd4 zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30435
diff changeset
409 data = cobj.flush()
b54a2984cdd4 zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30435
diff changeset
410
30435
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
411 For best performance results, keep input chunks under 256KB. This avoids
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
412 extra allocations for a large output object.
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
413
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
414 It is possible to declare the input size of the data that will be fed into
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
415 the compressor::
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
416
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
417 cctx = zstd.ZstdCompressor()
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
418 cobj = cctx.compressobj(size=6)
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
419 data = cobj.compress(b'foobar')
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
420 data = cobj.flush()
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
421
31796
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
422 Batch Compression API
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
423 ^^^^^^^^^^^^^^^^^^^^^
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
424
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
425 (Experimental. Not yet supported in CFFI bindings.)
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
426
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
427 ``multi_compress_to_buffer(data, [threads=0])`` performs compression of multiple
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
428 inputs as a single operation.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
429
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
430 Data to be compressed can be passed as a ``BufferWithSegmentsCollection``, a
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
431 ``BufferWithSegments``, or a list containing byte like objects. Each element of
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
432 the container will be compressed individually using the configured parameters
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
433 on the ``ZstdCompressor`` instance.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
434
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
435 The ``threads`` argument controls how many threads to use for compression. The
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
436 default is ``0`` which means to use a single thread. Negative values use the
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
437 number of logical CPUs in the machine.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
438
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
439 The function returns a ``BufferWithSegmentsCollection``. This type represents
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
440 N discrete memory allocations, eaching holding 1 or more compressed frames.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
441
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
442 Output data is written to shared memory buffers. This means that unlike
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
443 regular Python objects, a reference to *any* object within the collection
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
444 keeps the shared buffer and therefore memory backing it alive. This can have
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
445 undesirable effects on process memory usage.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
446
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
447 The API and behavior of this function is experimental and will likely change.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
448 Known deficiencies include:
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
449
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
450 * If asked to use multiple threads, it will always spawn that many threads,
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
451 even if the input is too small to use them. It should automatically lower
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
452 the thread count when the extra threads would just add overhead.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
453 * The buffer allocation strategy is fixed. There is room to make it dynamic,
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
454 perhaps even to allow one output buffer per input, facilitating a variation
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
455 of the API to return a list without the adverse effects of shared memory
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
456 buffers.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
457
30435
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
458 ZstdDecompressor
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
459 ----------------
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
460
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
461 The ``ZstdDecompressor`` class provides an interface for performing
37495
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
462 decompression. It is effectively a wrapper around the ``ZSTD_DCtx`` type from
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
463 the C API.
30435
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
464
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
465 Each instance is associated with parameters that control decompression. These
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
466 come from the following named arguments (all optional):
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
467
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
468 dict_data
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
469 Compression dictionary to use.
37495
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
470 max_window_size
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
471 Sets an uppet limit on the window size for decompression operations in
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
472 kibibytes. This setting can be used to prevent large memory allocations
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
473 for inputs using large compression windows.
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
474 format
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
475 Set the format of data for the decoder. By default, this is
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
476 ``zstd.FORMAT_ZSTD1``. It can be set to ``zstd.FORMAT_ZSTD1_MAGICLESS`` to
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
477 allow decoding frames without the 4 byte magic header. Not all decompression
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
478 APIs support this mode.
30435
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
479
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
480 The interface of this class is very similar to ``ZstdCompressor`` (by design).
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
481
30822
b54a2984cdd4 zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30435
diff changeset
482 Unless specified otherwise, assume that no two methods of ``ZstdDecompressor``
b54a2984cdd4 zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30435
diff changeset
483 instances can be called from multiple Python threads simultaneously. In other
b54a2984cdd4 zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30435
diff changeset
484 words, assume instances are not thread safe unless stated otherwise.
b54a2984cdd4 zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30435
diff changeset
485
37495
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
486 Utility Methods
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
487 ^^^^^^^^^^^^^^^
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
488
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
489 ``memory_size()`` obtains the size of the underlying zstd decompression context,
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
490 in bytes.::
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
491
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
492 dctx = zstd.ZstdDecompressor()
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
493 size = dctx.memory_size()
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
494
30435
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
495 Simple API
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
496 ^^^^^^^^^^
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
497
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
498 ``decompress(data)`` can be used to decompress an entire compressed zstd
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
499 frame in a single operation.::
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
500
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
501 dctx = zstd.ZstdDecompressor()
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
502 decompressed = dctx.decompress(data)
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
503
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
504 By default, ``decompress(data)`` will only work on data written with the content
37495
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
505 size encoded in its header (this is the default behavior of
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
506 ``ZstdCompressor().compress()`` but may not be true for streaming compression). If
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
507 compressed data without an embedded content size is seen, ``zstd.ZstdError`` will
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
508 be raised.
30435
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
509
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
510 If the compressed data doesn't have its content size embedded within it,
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
511 decompression can be attempted by specifying the ``max_output_size``
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
512 argument.::
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
513
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
514 dctx = zstd.ZstdDecompressor()
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
515 uncompressed = dctx.decompress(data, max_output_size=1048576)
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
516
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
517 Ideally, ``max_output_size`` will be identical to the decompressed output
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
518 size.
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
519
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
520 If ``max_output_size`` is too small to hold the decompressed data,
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
521 ``zstd.ZstdError`` will be raised.
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
522
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
523 If ``max_output_size`` is larger than the decompressed data, the allocated
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
524 output buffer will be resized to only use the space required.
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
525
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
526 Please note that an allocation of the requested ``max_output_size`` will be
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
527 performed every time the method is called. Setting to a very large value could
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
528 result in a lot of work for the memory allocator and may result in
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
529 ``MemoryError`` being raised if the allocation fails.
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
530
37495
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
531 .. important::
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
532
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
533 If the exact size of decompressed data is unknown (not passed in explicitly
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
534 and not stored in the zstandard frame), for performance reasons it is
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
535 encouraged to use a streaming API.
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
536
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
537 Stream Reader API
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
538 ^^^^^^^^^^^^^^^^^
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
539
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
540 ``stream_reader(source)`` can be used to obtain an object conforming to the
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
541 ``io.RawIOBase`` interface for reading decompressed output as a stream::
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
542
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
543 with open(path, 'rb') as fh:
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
544 dctx = zstd.ZstdDecompressor()
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
545 with dctx.stream_reader(fh) as reader:
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
546 while True:
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
547 chunk = reader.read(16384)
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
548 if not chunk:
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
549 break
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
550
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
551 # Do something with decompressed chunk.
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
552
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
553 The stream can only be read within a context manager. When the context
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
554 manager exits, the stream is closed and the underlying resource is
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
555 released and future operations against the stream will fail.
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
556
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
557 The ``source`` argument to ``stream_reader()`` can be any object with a
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
558 ``read(size)`` method or any object implementing the *buffer protocol*.
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
559
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
560 If the ``source`` is a stream, you can specify how large ``read()`` requests
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
561 to that stream should be via the ``read_size`` argument. It defaults to
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
562 ``zstandard.DECOMPRESSION_RECOMMENDED_INPUT_SIZE``.::
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
563
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
564 with open(path, 'rb') as fh:
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
565 dctx = zstd.ZstdDecompressor()
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
566 # Will perform fh.read(8192) when obtaining data for the decompressor.
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
567 with dctx.stream_reader(fh, read_size=8192) as reader:
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
568 ...
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
569
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
570 The stream returned by ``stream_reader()`` is not writable.
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
571
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
572 The stream returned by ``stream_reader()`` is *partially* seekable.
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
573 Absolute and relative positions (``SEEK_SET`` and ``SEEK_CUR``) forward
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
574 of the current position are allowed. Offsets behind the current read
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
575 position and offsets relative to the end of stream are not allowed and
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
576 will raise ``ValueError`` if attempted.
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
577
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
578 ``tell()`` returns the number of decompressed bytes read so far.
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
579
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
580 Not all I/O methods are implemented. Notably missing is support for
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
581 ``readline()``, ``readlines()``, and linewise iteration support. Support for
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
582 these is planned for a future release.
30435
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
583
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
584 Streaming Input API
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
585 ^^^^^^^^^^^^^^^^^^^
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
586
37495
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
587 ``stream_writer(fh)`` can be used to incrementally send compressed data to a
30435
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
588 decompressor.::
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
589
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
590 dctx = zstd.ZstdDecompressor()
37495
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
591 with dctx.stream_writer(fh) as decompressor:
30435
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
592 decompressor.write(compressed_data)
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
593
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
594 This behaves similarly to ``zstd.ZstdCompressor``: compressed data is written to
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
595 the decompressor by calling ``write(data)`` and decompressed output is written
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
596 to the output object by calling its ``write(data)`` method.
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
597
30895
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
598 Calls to ``write()`` will return the number of bytes written to the output
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
599 object. Not all inputs will result in bytes being written, so return values
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
600 of ``0`` are possible.
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
601
30435
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
602 The size of chunks being ``write()`` to the destination can be specified::
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
603
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
604 dctx = zstd.ZstdDecompressor()
37495
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
605 with dctx.stream_writer(fh, write_size=16384) as decompressor:
30435
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
606 pass
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
607
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
608 You can see how much memory is being used by the decompressor::
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
609
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
610 dctx = zstd.ZstdDecompressor()
37495
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
611 with dctx.stream_writer(fh) as decompressor:
30435
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
612 byte_size = decompressor.memory_size()
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
613
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
614 Streaming Output API
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
615 ^^^^^^^^^^^^^^^^^^^^
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
616
37495
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
617 ``read_to_iter(fh)`` provides a mechanism to stream decompressed data out of a
30435
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
618 compressed source as an iterator of data chunks.::
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
619
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
620 dctx = zstd.ZstdDecompressor()
37495
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
621 for chunk in dctx.read_to_iter(fh):
30435
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
622 # Do something with original data.
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
623
37495
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
624 ``read_to_iter()`` accepts an object with a ``read(size)`` method that will
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
625 return compressed bytes or an object conforming to the buffer protocol that
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
626 can expose its data as a contiguous range of bytes.
30435
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
627
37495
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
628 ``read_to_iter()`` returns an iterator whose elements are chunks of the
30435
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
629 decompressed data.
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
630
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
631 The size of requested ``read()`` from the source can be specified::
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
632
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
633 dctx = zstd.ZstdDecompressor()
37495
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
634 for chunk in dctx.read_to_iter(fh, read_size=16384):
30435
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
635 pass
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
636
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
637 It is also possible to skip leading bytes in the input data::
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
638
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
639 dctx = zstd.ZstdDecompressor()
37495
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
640 for chunk in dctx.read_to_iter(fh, skip_bytes=1):
30435
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
641 pass
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
642
37495
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
643 .. tip::
30435
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
644
37495
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
645 Skipping leading bytes is useful if the source data contains extra
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
646 *header* data. Traditionally, you would need to create a slice or
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
647 ``memoryview`` of the data you want to decompress. This would create
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
648 overhead. It is more efficient to pass the offset into this API.
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
649
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
650 Similarly to ``ZstdCompressor.read_to_iter()``, the consumer of the iterator
30435
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
651 controls when data is decompressed. If the iterator isn't consumed,
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
652 decompression is put on hold.
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
653
37495
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
654 When ``read_to_iter()`` is passed an object conforming to the buffer protocol,
30435
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
655 the behavior may seem similar to what occurs when the simple decompression
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
656 API is used. However, this API works when the decompressed size is unknown.
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
657 Furthermore, if feeding large inputs, the decompressor will work in chunks
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
658 instead of performing a single operation.
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
659
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
660 Stream Copying API
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
661 ^^^^^^^^^^^^^^^^^^
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
662
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
663 ``copy_stream(ifh, ofh)`` can be used to copy data across 2 streams while
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
664 performing decompression.::
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
665
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
666 dctx = zstd.ZstdDecompressor()
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
667 dctx.copy_stream(ifh, ofh)
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
668
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
669 e.g. to decompress a file to another file::
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
670
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
671 dctx = zstd.ZstdDecompressor()
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
672 with open(input_path, 'rb') as ifh, open(output_path, 'wb') as ofh:
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
673 dctx.copy_stream(ifh, ofh)
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
674
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
675 The size of chunks being ``read()`` and ``write()`` from and to the streams
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
676 can be specified::
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
677
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
678 dctx = zstd.ZstdDecompressor()
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
679 dctx.copy_stream(ifh, ofh, read_size=8192, write_size=16384)
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
680
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
681 Decompressor API
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
682 ^^^^^^^^^^^^^^^^
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
683
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
684 ``decompressobj()`` returns an object that exposes a ``decompress(data)``
37495
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
685 method. Compressed data chunks are fed into ``decompress(data)`` and
30435
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
686 uncompressed output (or an empty bytes) is returned. Output from subsequent
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
687 calls needs to be concatenated to reassemble the full decompressed byte
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
688 sequence.
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
689
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
690 The purpose of ``decompressobj()`` is to provide an API-compatible interface
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
691 with ``zlib.decompressobj`` and ``bz2.BZ2Decompressor``. This allows callers
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
692 to swap in different decompressor objects while using the same API.
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
693
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
694 Each object is single use: once an input frame is decoded, ``decompress()``
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
695 can no longer be called.
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
696
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
697 Here is how this API should be used::
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
698
37495
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
699 dctx = zstd.ZstdDecompressor()
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
700 dobj = dctx.decompressobj()
30435
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
701 data = dobj.decompress(compressed_chunk_0)
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
702 data = dobj.decompress(compressed_chunk_1)
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
703
37495
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
704 By default, calls to ``decompress()`` write output data in chunks of size
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
705 ``DECOMPRESSION_RECOMMENDED_OUTPUT_SIZE``. These chunks are concatenated
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
706 before being returned to the caller. It is possible to define the size of
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
707 these temporary chunks by passing ``write_size`` to ``decompressobj()``::
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
708
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
709 dctx = zstd.ZstdDecompressor()
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
710 dobj = dctx.decompressobj(write_size=1048576)
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
711
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
712 .. note::
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
713
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
714 Because calls to ``decompress()`` may need to perform multiple
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
715 memory (re)allocations, this streaming decompression API isn't as
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
716 efficient as other APIs.
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
717
31796
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
718 Batch Decompression API
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
719 ^^^^^^^^^^^^^^^^^^^^^^^
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
720
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
721 (Experimental. Not yet supported in CFFI bindings.)
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
722
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
723 ``multi_decompress_to_buffer()`` performs decompression of multiple
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
724 frames as a single operation and returns a ``BufferWithSegmentsCollection``
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
725 containing decompressed data for all inputs.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
726
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
727 Compressed frames can be passed to the function as a ``BufferWithSegments``,
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
728 a ``BufferWithSegmentsCollection``, or as a list containing objects that
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
729 conform to the buffer protocol. For best performance, pass a
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
730 ``BufferWithSegmentsCollection`` or a ``BufferWithSegments``, as
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
731 minimal input validation will be done for that type. If calling from
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
732 Python (as opposed to C), constructing one of these instances may add
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
733 overhead cancelling out the performance overhead of validation for list
37495
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
734 inputs.::
31796
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
735
37495
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
736 dctx = zstd.ZstdDecompressor()
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
737 results = dctx.multi_decompress_to_buffer([b'...', b'...'])
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
738
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
739 The decompressed size of each frame MUST be discoverable. It can either be
31796
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
740 embedded within the zstd frame (``write_content_size=True`` argument to
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
741 ``ZstdCompressor``) or passed in via the ``decompressed_sizes`` argument.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
742
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
743 The ``decompressed_sizes`` argument is an object conforming to the buffer
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
744 protocol which holds an array of 64-bit unsigned integers in the machine's
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
745 native format defining the decompressed sizes of each frame. If this argument
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
746 is passed, it avoids having to scan each frame for its decompressed size.
37495
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
747 This frame scanning can add noticeable overhead in some scenarios.::
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
748
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
749 frames = [...]
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
750 sizes = struct.pack('=QQQQ', len0, len1, len2, len3)
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
751
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
752 dctx = zstd.ZstdDecompressor()
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
753 results = dctx.multi_decompress_to_buffer(frames, decompressed_sizes=sizes)
31796
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
754
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
755 The ``threads`` argument controls the number of threads to use to perform
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
756 decompression operations. The default (``0``) or the value ``1`` means to
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
757 use a single thread. Negative values use the number of logical CPUs in the
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
758 machine.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
759
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
760 .. note::
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
761
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
762 It is possible to pass a ``mmap.mmap()`` instance into this function by
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
763 wrapping it with a ``BufferWithSegments`` instance (which will define the
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
764 offsets of frames within the memory mapped region).
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
765
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
766 This function is logically equivalent to performing ``dctx.decompress()``
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
767 on each input frame and returning the result.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
768
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
769 This function exists to perform decompression on multiple frames as fast
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
770 as possible by having as little overhead as possible. Since decompression is
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
771 performed as a single operation and since the decompressed output is stored in
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
772 a single buffer, extra memory allocations, Python objects, and Python function
37495
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
773 calls are avoided. This is ideal for scenarios where callers know up front that
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
774 they need to access data for multiple frames, such as when *delta chains* are
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
775 being used.
31796
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
776
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
777 Currently, the implementation always spawns multiple threads when requested,
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
778 even if the amount of work to do is small. In the future, it will be smarter
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
779 about avoiding threads and their associated overhead when the amount of
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
780 work to do is small.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
781
37495
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
782 Prefix Dictionary Chain Decompression
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
783 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
30895
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
784
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
785 ``decompress_content_dict_chain(frames)`` performs decompression of a list of
37495
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
786 zstd frames produced using chained *prefix* dictionary compression. Such
30895
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
787 a list of frames is produced by compressing discrete inputs where each
37495
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
788 non-initial input is compressed with a *prefix* dictionary consisting of the
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
789 content of the previous input.
30895
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
790
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
791 For example, say you have the following inputs::
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
792
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
793 inputs = [b'input 1', b'input 2', b'input 3']
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
794
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
795 The zstd frame chain consists of:
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
796
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
797 1. ``b'input 1'`` compressed in standalone/discrete mode
37495
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
798 2. ``b'input 2'`` compressed using ``b'input 1'`` as a *prefix* dictionary
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
799 3. ``b'input 3'`` compressed using ``b'input 2'`` as a *prefix* dictionary
30895
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
800
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
801 Each zstd frame **must** have the content size written.
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
802
37495
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
803 The following Python code can be used to produce a *prefix dictionary chain*::
30895
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
804
31796
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
805 def make_chain(inputs):
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
806 frames = []
30895
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
807
31796
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
808 # First frame is compressed in standalone/discrete mode.
37495
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
809 zctx = zstd.ZstdCompressor()
31796
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
810 frames.append(zctx.compress(inputs[0]))
30895
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
811
37495
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
812 # Subsequent frames use the previous fulltext as a prefix dictionary
31796
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
813 for i, raw in enumerate(inputs[1:]):
37495
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
814 dict_data = zstd.ZstdCompressionDict(
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
815 inputs[i], dict_type=zstd.DICT_TYPE_RAWCONTENT)
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
816 zctx = zstd.ZstdCompressor(dict_data=dict_data)
31796
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
817 frames.append(zctx.compress(raw))
30895
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
818
31796
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
819 return frames
30895
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
820
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
821 ``decompress_content_dict_chain()`` returns the uncompressed data of the last
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
822 element in the input chain.
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
823
37495
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
824
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
825 .. note::
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
826
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
827 It is possible to implement *prefix dictionary chain* decompression
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
828 on top of other APIs. However, this function will likely be faster -
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
829 especially for long input chains - as it avoids the overhead of instantiating
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
830 and passing around intermediate objects between C and Python.
30895
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
831
31796
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
832 Multi-Threaded Compression
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
833 --------------------------
30435
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
834
31796
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
835 ``ZstdCompressor`` accepts a ``threads`` argument that controls the number
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
836 of threads to use for compression. The way this works is that input is split
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
837 into segments and each segment is fed into a worker pool for compression. Once
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
838 a segment is compressed, it is flushed/appended to the output.
30435
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
839
37495
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
840 .. note::
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
841
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
842 These threads are created at the C layer and are not Python threads. So they
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
843 work outside the GIL. It is therefore possible to CPU saturate multiple cores
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
844 from Python.
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
845
31796
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
846 The segment size for multi-threaded compression is chosen from the window size
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
847 of the compressor. This is derived from the ``window_log`` attribute of a
37495
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
848 ``ZstdCompressionParameters`` instance. By default, segment sizes are in the 1+MB
31796
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
849 range.
30435
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
850
31796
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
851 If multi-threaded compression is requested and the input is smaller than the
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
852 configured segment size, only a single compression thread will be used. If the
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
853 input is smaller than the segment size multiplied by the thread pool size or
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
854 if data cannot be delivered to the compressor fast enough, not all requested
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
855 compressor threads may be active simultaneously.
30435
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
856
31796
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
857 Compared to non-multi-threaded compression, multi-threaded compression has
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
858 higher per-operation overhead. This includes extra memory operations,
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
859 thread creation, lock acquisition, etc.
30435
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
860
31796
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
861 Due to the nature of multi-threaded compression using *N* compression
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
862 *states*, the output from multi-threaded compression will likely be larger
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
863 than non-multi-threaded compression. The difference is usually small. But
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
864 there is a CPU/wall time versus size trade off that may warrant investigation.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
865
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
866 Output from multi-threaded compression does not require any special handling
37495
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
867 on the decompression side. To the decompressor, data generated with single
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
868 threaded compressor looks the same as data generated by a multi-threaded
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
869 compressor and does not require any special handling or additional resource
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
870 requirements.
30435
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
871
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
872 Dictionary Creation and Management
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
873 ----------------------------------
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
874
37495
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
875 Compression dictionaries are represented with the ``ZstdCompressionDict`` type.
30435
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
876
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
877 Instances can be constructed from bytes::
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
878
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
879 dict_data = zstd.ZstdCompressionDict(data)
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
880
37495
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
881 It is possible to construct a dictionary from *any* data. If the data doesn't
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
882 begin with a magic header, it will be treated as a *prefix* dictionary.
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
883 *Prefix* dictionaries allow compression operations to reference raw data
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
884 within the dictionary.
30895
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
885
37495
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
886 It is possible to force the use of *prefix* dictionaries or to require a
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
887 dictionary header:
30435
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
888
37495
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
889 dict_data = zstd.ZstdCompressionDict(data,
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
890 dict_type=zstd.DICT_TYPE_RAWCONTENT)
30435
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
891
37495
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
892 dict_data = zstd.ZstdCompressionDict(data,
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
893 dict_type=zstd.DICT_TYPE_FULLDICT)
30435
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
894
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
895 You can see how many bytes are in the dictionary by calling ``len()``::
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
896
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
897 dict_data = zstd.train_dictionary(size, samples)
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
898 dict_size = len(dict_data) # will not be larger than ``size``
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
899
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
900 Once you have a dictionary, you can pass it to the objects performing
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
901 compression and decompression::
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
902
37495
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
903 dict_data = zstd.train_dictionary(131072, samples)
30435
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
904
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
905 cctx = zstd.ZstdCompressor(dict_data=dict_data)
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
906 for source_data in input_data:
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
907 compressed = cctx.compress(source_data)
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
908 # Do something with compressed data.
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
909
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
910 dctx = zstd.ZstdDecompressor(dict_data=dict_data)
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
911 for compressed_data in input_data:
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
912 buffer = io.BytesIO()
37495
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
913 with dctx.stream_writer(buffer) as decompressor:
30435
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
914 decompressor.write(compressed_data)
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
915 # Do something with raw data in ``buffer``.
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
916
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
917 Dictionaries have unique integer IDs. You can retrieve this ID via::
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
918
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
919 dict_id = zstd.dictionary_id(dict_data)
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
920
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
921 You can obtain the raw data in the dict (useful for persisting and constructing
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
922 a ``ZstdCompressionDict`` later) via ``as_bytes()``::
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
923
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
924 dict_data = zstd.train_dictionary(size, samples)
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
925 raw_data = dict_data.as_bytes()
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
926
37495
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
927 By default, when a ``ZstdCompressionDict`` is *attached* to a
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
928 ``ZstdCompressor``, each ``ZstdCompressor`` performs work to prepare the
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
929 dictionary for use. This is fine if only 1 compression operation is being
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
930 performed or if the ``ZstdCompressor`` is being reused for multiple operations.
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
931 But if multiple ``ZstdCompressor`` instances are being used with the dictionary,
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
932 this can add overhead.
31796
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
933
37495
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
934 It is possible to *precompute* the dictionary so it can readily be consumed
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
935 by multiple ``ZstdCompressor`` instances::
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
936
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
937 d = zstd.ZstdCompressionDict(data)
31796
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
938
37495
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
939 # Precompute for compression level 3.
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
940 d.precompute_compress(level=3)
31796
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
941
37495
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
942 # Precompute with specific compression parameters.
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
943 params = zstd.ZstdCompressionParameters(...)
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
944 d.precompute_compress(compression_params=params)
31796
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
945
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
946 .. note::
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
947
37495
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
948 When a dictionary is precomputed, the compression parameters used to
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
949 precompute the dictionary overwrite some of the compression parameters
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
950 specified to ``ZstdCompressor.__init__``.
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
951
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
952 Training Dictionaries
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
953 ^^^^^^^^^^^^^^^^^^^^^
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
954
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
955 Unless using *prefix* dictionaries, dictionary data is produced by *training*
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
956 on existing data::
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
957
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
958 dict_data = zstd.train_dictionary(size, samples)
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
959
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
960 This takes a target dictionary size and list of bytes instances and creates and
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
961 returns a ``ZstdCompressionDict``.
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
962
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
963 The dictionary training mechanism is known as *cover*. More details about it are
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
964 available in the paper *Effective Construction of Relative Lempel-Ziv
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
965 Dictionaries* (authors: Liao, Petri, Moffat, Wirth).
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
966
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
967 The cover algorithm takes parameters ``k` and ``d``. These are the
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
968 *segment size* and *dmer size*, respectively. The returned dictionary
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
969 instance created by this function has ``k`` and ``d`` attributes
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
970 containing the values for these parameters. If a ``ZstdCompressionDict``
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
971 is constructed from raw bytes data (a content-only dictionary), the
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
972 ``k`` and ``d`` attributes will be ``0``.
31796
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
973
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
974 The segment and dmer size parameters to the cover algorithm can either be
37495
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
975 specified manually or ``train_dictionary()`` can try multiple values
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
976 and pick the best one, where *best* means the smallest compressed data size.
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
977 This later mode is called *optimization* mode.
31796
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
978
37495
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
979 If none of ``k``, ``d``, ``steps``, ``threads``, ``level``, ``notifications``,
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
980 or ``dict_id`` (basically anything from the underlying ``ZDICT_cover_params_t``
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
981 struct) are defined, *optimization* mode is used with default parameter
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
982 values.
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
983
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
984 If ``steps`` or ``threads`` are defined, then *optimization* mode is engaged
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
985 with explicit control over those parameters. Specifying ``threads=0`` or
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
986 ``threads=1`` can be used to engage *optimization* mode if other parameters
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
987 are not defined.
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
988
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
989 Otherwise, non-*optimization* mode is used with the parameters specified.
31796
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
990
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
991 This function takes the following arguments:
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
992
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
993 dict_size
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
994 Target size in bytes of the dictionary to generate.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
995 samples
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
996 A list of bytes holding samples the dictionary will be trained from.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
997 k
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
998 Parameter to cover algorithm defining the segment size. A reasonable range
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
999 is [16, 2048+].
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1000 d
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1001 Parameter to cover algorithm defining the dmer size. A reasonable range is
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1002 [6, 16]. ``d`` must be less than or equal to ``k``.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1003 dict_id
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1004 Integer dictionary ID for the produced dictionary. Default is 0, which uses
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1005 a random value.
37495
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
1006 steps
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
1007 Number of steps through ``k`` values to perform when trying parameter
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
1008 variations.
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
1009 threads
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
1010 Number of threads to use when trying parameter variations. Default is 0,
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
1011 which means to use a single thread. A negative value can be specified to
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
1012 use as many threads as there are detected logical CPUs.
31796
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1013 level
37495
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
1014 Integer target compression level when trying parameter variations.
31796
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1015 notifications
37495
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
1016 Controls writing of informational messages to ``stderr``. ``0`` (the
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
1017 default) means to write nothing. ``1`` writes errors. ``2`` writes
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
1018 progression info. ``3`` writes more details. And ``4`` writes all info.
31796
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1019
30435
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
1020 Explicit Compression Parameters
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
1021 -------------------------------
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
1022
37495
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
1023 Zstandard offers a high-level *compression level* that maps to lower-level
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
1024 compression parameters. For many consumers, this numeric level is the only
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
1025 compression setting you'll need to touch.
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
1026
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
1027 But for advanced use cases, it might be desirable to tweak these lower-level
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
1028 settings.
30435
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
1029
37495
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
1030 The ``ZstdCompressionParameters`` type represents these low-level compression
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
1031 settings.
30435
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
1032
37495
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
1033 Instances of this type can be constructed from a myriad of keyword arguments
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
1034 (defined below) for complete low-level control over each adjustable
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
1035 compression setting.
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
1036
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
1037 From a higher level, one can construct a ``ZstdCompressionParameters`` instance
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
1038 given a desired compression level and target input and dictionary size
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
1039 using ``ZstdCompressionParameters.from_level()``. e.g.::
30435
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
1040
37495
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
1041 # Derive compression settings for compression level 7.
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
1042 params = zstd.ZstdCompressionParameters.from_level(7)
30435
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
1043
37495
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
1044 # With an input size of 1MB
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
1045 params = zstd.ZstdCompressionParameters.from_level(7, source_size=1048576)
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
1046
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
1047 Using ``from_level()``, it is also possible to override individual compression
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
1048 parameters or to define additional settings that aren't automatically derived.
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
1049 e.g.::
30435
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
1050
37495
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
1051 params = zstd.ZstdCompressionParameters.from_level(4, window_log=10)
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
1052 params = zstd.ZstdCompressionParameters.from_level(5, threads=4)
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
1053
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
1054 Or you can define low-level compression settings directly::
30435
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
1055
37495
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
1056 params = zstd.ZstdCompressionParameters(window_log=12, enable_ldm=True)
30435
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
1057
37495
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
1058 Once a ``ZstdCompressionParameters`` instance is obtained, it can be used to
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
1059 configure a compressor::
30435
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
1060
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
1061 cctx = zstd.ZstdCompressor(compression_params=params)
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
1062
37495
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
1063 The named arguments and attributes of ``ZstdCompressionParameters`` are as
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
1064 follows:
30435
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
1065
37495
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
1066 * format
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
1067 * compression_level
30895
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
1068 * window_log
37495
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
1069 * hash_log
30895
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
1070 * chain_log
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
1071 * search_log
37495
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
1072 * min_match
30895
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
1073 * target_length
37495
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
1074 * compression_strategy
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
1075 * write_content_size
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
1076 * write_checksum
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
1077 * write_dict_id
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
1078 * job_size
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
1079 * overlap_size_log
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
1080 * compress_literals
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
1081 * force_max_window
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
1082 * enable_ldm
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
1083 * ldm_hash_log
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
1084 * ldm_min_match
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
1085 * ldm_bucket_size_log
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
1086 * ldm_hash_every_log
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
1087 * threads
30895
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
1088
37495
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
1089 Some of these are very low-level settings. It may help to consult the official
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
1090 zstandard documentation for their behavior. Look for the ``ZSTD_p_*`` constants
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
1091 in ``zstd.h`` (https://github.com/facebook/zstd/blob/dev/lib/zstd.h).
30435
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
1092
30895
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
1093 Frame Inspection
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
1094 ----------------
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
1095
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
1096 Data emitted from zstd compression is encapsulated in a *frame*. This frame
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
1097 begins with a 4 byte *magic number* header followed by 2 to 14 bytes describing
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
1098 the frame in more detail. For more info, see
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
1099 https://github.com/facebook/zstd/blob/master/doc/zstd_compression_format.md.
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
1100
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
1101 ``zstd.get_frame_parameters(data)`` parses a zstd *frame* header from a bytes
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
1102 instance and return a ``FrameParameters`` object describing the frame.
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
1103
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
1104 Depending on which fields are present in the frame and their values, the
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
1105 length of the frame parameters varies. If insufficient bytes are passed
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
1106 in to fully parse the frame parameters, ``ZstdError`` is raised. To ensure
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
1107 frame parameters can be parsed, pass in at least 18 bytes.
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
1108
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
1109 ``FrameParameters`` instances have the following attributes:
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
1110
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
1111 content_size
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
1112 Integer size of original, uncompressed content. This will be ``0`` if the
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
1113 original content size isn't written to the frame (controlled with the
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
1114 ``write_content_size`` argument to ``ZstdCompressor``) or if the input
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
1115 content size was ``0``.
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
1116
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
1117 window_size
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
1118 Integer size of maximum back-reference distance in compressed data.
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
1119
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
1120 dict_id
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
1121 Integer of dictionary ID used for compression. ``0`` if no dictionary
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
1122 ID was used or if the dictionary ID was ``0``.
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
1123
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
1124 has_checksum
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
1125 Bool indicating whether a 4 byte content checksum is stored at the end
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
1126 of the frame.
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
1127
37495
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
1128 ``zstd.frame_header_size(data)`` returns the size of the zstandard frame
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
1129 header.
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
1130
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
1131 ``zstd.frame_content_size(data)`` returns the content size as parsed from
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
1132 the frame header. ``-1`` means the content size is unknown. ``0`` means
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
1133 an empty frame. The content size is usually correct. However, it may not
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
1134 be accurate.
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
1135
30435
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
1136 Misc Functionality
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
1137 ------------------
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
1138
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
1139 estimate_decompression_context_size()
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
1140 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
1141
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
1142 Estimate the memory size requirements for a decompressor instance.
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
1143
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
1144 Constants
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
1145 ---------
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
1146
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
1147 The following module constants/attributes are exposed:
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
1148
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
1149 ZSTD_VERSION
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
1150 This module attribute exposes a 3-tuple of the Zstandard version. e.g.
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
1151 ``(1, 0, 0)``
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
1152 MAX_COMPRESSION_LEVEL
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
1153 Integer max compression level accepted by compression functions
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
1154 COMPRESSION_RECOMMENDED_INPUT_SIZE
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
1155 Recommended chunk size to feed to compressor functions
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
1156 COMPRESSION_RECOMMENDED_OUTPUT_SIZE
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
1157 Recommended chunk size for compression output
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
1158 DECOMPRESSION_RECOMMENDED_INPUT_SIZE
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
1159 Recommended chunk size to feed into decompresor functions
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
1160 DECOMPRESSION_RECOMMENDED_OUTPUT_SIZE
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
1161 Recommended chunk size for decompression output
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
1162
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
1163 FRAME_HEADER
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
1164 bytes containing header of the Zstandard frame
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
1165 MAGIC_NUMBER
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
1166 Frame header as an integer
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
1167
37495
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
1168 CONTENTSIZE_UNKNOWN
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
1169 Value for content size when the content size is unknown.
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
1170 CONTENTSIZE_ERROR
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
1171 Value for content size when content size couldn't be determined.
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
1172
30435
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
1173 WINDOWLOG_MIN
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
1174 Minimum value for compression parameter
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
1175 WINDOWLOG_MAX
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
1176 Maximum value for compression parameter
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
1177 CHAINLOG_MIN
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
1178 Minimum value for compression parameter
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
1179 CHAINLOG_MAX
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
1180 Maximum value for compression parameter
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
1181 HASHLOG_MIN
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
1182 Minimum value for compression parameter
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
1183 HASHLOG_MAX
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
1184 Maximum value for compression parameter
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
1185 SEARCHLOG_MIN
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
1186 Minimum value for compression parameter
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
1187 SEARCHLOG_MAX
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
1188 Maximum value for compression parameter
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
1189 SEARCHLENGTH_MIN
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
1190 Minimum value for compression parameter
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
1191 SEARCHLENGTH_MAX
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
1192 Maximum value for compression parameter
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
1193 TARGETLENGTH_MIN
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
1194 Minimum value for compression parameter
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
1195 STRATEGY_FAST
30895
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
1196 Compression strategy
30435
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
1197 STRATEGY_DFAST
30895
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
1198 Compression strategy
30435
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
1199 STRATEGY_GREEDY
30895
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
1200 Compression strategy
30435
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
1201 STRATEGY_LAZY
30895
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
1202 Compression strategy
30435
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
1203 STRATEGY_LAZY2
30895
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
1204 Compression strategy
30435
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
1205 STRATEGY_BTLAZY2
30895
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
1206 Compression strategy
30435
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
1207 STRATEGY_BTOPT
30895
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
1208 Compression strategy
37495
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
1209 STRATEGY_BTULTRA
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
1210 Compression strategy
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
1211
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
1212 FORMAT_ZSTD1
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
1213 Zstandard frame format
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
1214 FORMAT_ZSTD1_MAGICLESS
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
1215 Zstandard frame format without magic header
30895
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
1216
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
1217 Performance Considerations
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
1218 --------------------------
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
1219
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
1220 The ``ZstdCompressor`` and ``ZstdDecompressor`` types maintain state to a
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
1221 persistent compression or decompression *context*. Reusing a ``ZstdCompressor``
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
1222 or ``ZstdDecompressor`` instance for multiple operations is faster than
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
1223 instantiating a new ``ZstdCompressor`` or ``ZstdDecompressor`` for each
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
1224 operation. The differences are magnified as the size of data decreases. For
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
1225 example, the difference between *context* reuse and non-reuse for 100,000
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
1226 100 byte inputs will be significant (possiby over 10x faster to reuse contexts)
37495
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
1227 whereas 10 100,000,000 byte inputs will be more similar in speed (because the
30895
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
1228 time spent doing compression dwarfs time spent creating new *contexts*).
30435
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
1229
31796
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1230 Buffer Types
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1231 ------------
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1232
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1233 The API exposes a handful of custom types for interfacing with memory buffers.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1234 The primary goal of these types is to facilitate efficient multi-object
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1235 operations.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1236
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1237 The essential idea is to have a single memory allocation provide backing
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1238 storage for multiple logical objects. This has 2 main advantages: fewer
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1239 allocations and optimal memory access patterns. This avoids having to allocate
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1240 a Python object for each logical object and furthermore ensures that access of
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1241 data for objects can be sequential (read: fast) in memory.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1242
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1243 BufferWithSegments
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1244 ^^^^^^^^^^^^^^^^^^
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1245
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1246 The ``BufferWithSegments`` type represents a memory buffer containing N
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1247 discrete items of known lengths (segments). It is essentially a fixed size
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1248 memory address and an array of 2-tuples of ``(offset, length)`` 64-bit
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1249 unsigned native endian integers defining the byte offset and length of each
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1250 segment within the buffer.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1251
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1252 Instances behave like containers.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1253
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1254 ``len()`` returns the number of segments within the instance.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1255
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1256 ``o[index]`` or ``__getitem__`` obtains a ``BufferSegment`` representing an
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1257 individual segment within the backing buffer. That returned object references
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1258 (not copies) memory. This means that iterating all objects doesn't copy
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1259 data within the buffer.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1260
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1261 The ``.size`` attribute contains the total size in bytes of the backing
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1262 buffer.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1263
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1264 Instances conform to the buffer protocol. So a reference to the backing bytes
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1265 can be obtained via ``memoryview(o)``. A *copy* of the backing bytes can also
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1266 be obtained via ``.tobytes()``.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1267
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1268 The ``.segments`` attribute exposes the array of ``(offset, length)`` for
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1269 segments within the buffer. It is a ``BufferSegments`` type.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1270
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1271 BufferSegment
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1272 ^^^^^^^^^^^^^
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1273
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1274 The ``BufferSegment`` type represents a segment within a ``BufferWithSegments``.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1275 It is essentially a reference to N bytes within a ``BufferWithSegments``.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1276
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1277 ``len()`` returns the length of the segment in bytes.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1278
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1279 ``.offset`` contains the byte offset of this segment within its parent
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1280 ``BufferWithSegments`` instance.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1281
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1282 The object conforms to the buffer protocol. ``.tobytes()`` can be called to
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1283 obtain a ``bytes`` instance with a copy of the backing bytes.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1284
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1285 BufferSegments
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1286 ^^^^^^^^^^^^^^
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1287
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1288 This type represents an array of ``(offset, length)`` integers defining segments
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1289 within a ``BufferWithSegments``.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1290
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1291 The array members are 64-bit unsigned integers using host/native bit order.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1292
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1293 Instances conform to the buffer protocol.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1294
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1295 BufferWithSegmentsCollection
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1296 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1297
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1298 The ``BufferWithSegmentsCollection`` type represents a virtual spanning view
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1299 of multiple ``BufferWithSegments`` instances.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1300
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1301 Instances are constructed from 1 or more ``BufferWithSegments`` instances. The
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1302 resulting object behaves like an ordered sequence whose members are the
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1303 segments within each ``BufferWithSegments``.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1304
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1305 ``len()`` returns the number of segments within all ``BufferWithSegments``
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1306 instances.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1307
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1308 ``o[index]`` and ``__getitem__(index)`` return the ``BufferSegment`` at
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1309 that offset as if all ``BufferWithSegments`` instances were a single
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1310 entity.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1311
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1312 If the object is composed of 2 ``BufferWithSegments`` instances with the
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1313 first having 2 segments and the second have 3 segments, then ``b[0]``
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1314 and ``b[1]`` access segments in the first object and ``b[2]``, ``b[3]``,
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1315 and ``b[4]`` access segments from the second.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1316
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1317 Choosing an API
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1318 ===============
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1319
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1320 There are multiple APIs for performing compression and decompression. This is
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1321 because different applications have different needs and the library wants to
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1322 facilitate optimal use in as many use cases as possible.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1323
37495
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
1324 From a high-level, APIs are divided into *one-shot* and *streaming*: either you
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
1325 are operating on all data at once or you operate on it piecemeal.
31796
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1326
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1327 The *one-shot* APIs are useful for small data, where the input or output
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1328 size is known. (The size can come from a buffer length, file size, or
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1329 stored in the zstd frame header.) A limitation of the *one-shot* APIs is that
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1330 input and output must fit in memory simultaneously. For say a 4 GB input,
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1331 this is often not feasible.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1332
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1333 The *one-shot* APIs also perform all work as a single operation. So, if you
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1334 feed it large input, it could take a long time for the function to return.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1335
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1336 The streaming APIs do not have the limitations of the simple API. But the
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1337 price you pay for this flexibility is that they are more complex than a
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1338 single function call.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1339
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1340 The streaming APIs put the caller in control of compression and decompression
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1341 behavior by allowing them to directly control either the input or output side
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1342 of the operation.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1343
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1344 With the *streaming input*, *compressor*, and *decompressor* APIs, the caller
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1345 has full control over the input to the compression or decompression stream.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1346 They can directly choose when new data is operated on.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1347
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1348 With the *streaming ouput* APIs, the caller has full control over the output
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1349 of the compression or decompression stream. It can choose when to receive
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1350 new data.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1351
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1352 When using the *streaming* APIs that operate on file-like or stream objects,
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1353 it is important to consider what happens in that object when I/O is requested.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1354 There is potential for long pauses as data is read or written from the
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1355 underlying stream (say from interacting with a filesystem or network). This
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1356 could add considerable overhead.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1357
37495
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
1358 Thread Safety
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
1359 =============
31796
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1360
37495
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
1361 ``ZstdCompressor`` and ``ZstdDecompressor`` instances have no guarantees
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
1362 about thread safety. Do not operate on the same ``ZstdCompressor`` and
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
1363 ``ZstdDecompressor`` instance simultaneously from different threads. It is
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
1364 fine to have different threads call into a single instance, just not at the
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
1365 same time.
31796
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1366
37495
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
1367 Some operations require multiple function calls to complete. e.g. streaming
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
1368 operations. A single ``ZstdCompressor`` or ``ZstdDecompressor`` cannot be used
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
1369 for simultaneously active operations. e.g. you must not start a streaming
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
1370 operation when another streaming operation is already active.
31796
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1371
37495
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
1372 The C extension releases the GIL during non-trivial calls into the zstd C
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
1373 API. Non-trivial calls are notably compression and decompression. Trivial
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
1374 calls are things like parsing frame parameters. Where the GIL is released
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
1375 is considered an implementation detail and can change in any release.
31796
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1376
37495
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
1377 APIs that accept bytes-like objects don't enforce that the underlying object
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
1378 is read-only. However, it is assumed that the passed object is read-only for
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
1379 the duration of the function call. It is possible to pass a mutable object
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
1380 (like a ``bytearray``) to e.g. ``ZstdCompressor.compress()``, have the GIL
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
1381 released, and mutate the object from another thread. Such a race condition
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
1382 is a bug in the consumer of python-zstandard. Most Python data types are
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
1383 immutable, so unless you are doing something fancy, you don't need to
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
1384 worry about this.
31796
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30895
diff changeset
1385
30435
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
1386 Note on Zstandard's *Experimental* API
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
1387 ======================================
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
1388
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
1389 Many of the Zstandard APIs used by this module are marked as *experimental*
37495
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
1390 within the Zstandard project.
30435
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
1391
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
1392 It is unclear how Zstandard's C API will evolve over time, especially with
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
1393 regards to this *experimental* functionality. We will try to maintain
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
1394 backwards compatibility at the Python API level. However, we cannot
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
1395 guarantee this for things not under our control.
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
1396
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
1397 Since a copy of the Zstandard source code is distributed with this
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
1398 module and since we compile against it, the behavior of a specific
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
1399 version of this module should be constant for all of time. So if you
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
1400 pin the version of this module used in your projects (which is a Python
37495
b1fb341d8a61 zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 31796
diff changeset
1401 best practice), you should be shielded from unwanted future changes.
30435
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
1402
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
1403 Donate
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
1404 ======
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
1405
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
1406 A lot of time has been invested into this project by the author.
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
1407
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
1408 If you find this project useful and would like to thank the author for
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
1409 their work, consider donating some money. Any amount is appreciated.
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
1410
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
1411 .. image:: https://www.paypalobjects.com/en_US/i/btn/btn_donate_LG.gif
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
1412 :target: https://www.paypal.com/cgi-bin/webscr?cmd=_donations&business=gregory%2eszorc%40gmail%2ecom&lc=US&item_name=python%2dzstandard&currency_code=USD&bn=PP%2dDonationsBF%3abtn_donate_LG%2egif%3aNonHosted
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
1413 :alt: Donate via PayPal
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
1414
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
1415 .. |ci-status| image:: https://travis-ci.org/indygreg/python-zstandard.svg?branch=master
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
1416 :target: https://travis-ci.org/indygreg/python-zstandard
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
1417
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
1418 .. |win-ci-status| image:: https://ci.appveyor.com/api/projects/status/github/indygreg/python-zstandard?svg=true
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
1419 :target: https://ci.appveyor.com/project/indygreg/python-zstandard
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
1420 :alt: Windows build status