Mercurial > hg
annotate contrib/python-zstandard/README.rst @ 50825:973fbb27ab45
backout: migrate `opts` to native kwargs
It will take a bit to unwind `cmdutil.commit`, so there's a conversion to
byteskwargs there, without changing the type of `opts` in this function. That's
also useful to flag it as needing to be changed.
author | Matt Harbison <matt_harbison@yahoo.com> |
---|---|
date | Sun, 20 Aug 2023 00:27:27 -0400 |
parents | de7838053207 |
children |
rev | line source |
---|---|
30435
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1 ================ |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
2 python-zstandard |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
3 ================ |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
4 |
30822
b54a2984cdd4
zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30435
diff
changeset
|
5 This project provides Python bindings for interfacing with the |
b54a2984cdd4
zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30435
diff
changeset
|
6 `Zstandard <http://www.zstd.net>`_ compression library. A C extension |
30895
c32454d69b85
zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30822
diff
changeset
|
7 and CFFI interface are provided. |
30435
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
8 |
30895
c32454d69b85
zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30822
diff
changeset
|
9 The primary goal of the project is to provide a rich interface to the |
c32454d69b85
zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30822
diff
changeset
|
10 underlying C API through a Pythonic interface while not sacrificing |
c32454d69b85
zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30822
diff
changeset
|
11 performance. This means exposing most of the features and flexibility |
30435
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
12 of the C API while not sacrificing usability or safety that Python provides. |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
13 |
37495
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
14 The canonical home for this project lives in a Mercurial repository run by |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
15 the author. For convenience, that repository is frequently synchronized to |
30822
b54a2984cdd4
zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30435
diff
changeset
|
16 https://github.com/indygreg/python-zstandard. |
b54a2984cdd4
zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30435
diff
changeset
|
17 |
42937
69de49c4e39c
zstandard: vendor python-zstandard 0.12
Gregory Szorc <gregory.szorc@gmail.com>
parents:
42070
diff
changeset
|
18 | |ci-status| |
30435
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
19 |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
20 Requirements |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
21 ============ |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
22 |
43994
de7838053207
zstandard: vendor python-zstandard 0.13.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
42937
diff
changeset
|
23 This extension is designed to run with Python 2.7, 3.5, 3.6, 3.7, and 3.8 |
42070
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
24 on common platforms (Linux, Windows, and OS X). On PyPy (both PyPy2 and PyPy3) we support version 6.0.0 and above. |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
25 x86 and x86_64 are well-tested on Windows. Only x86_64 is well-tested on Linux and macOS. |
30435
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
26 |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
27 Installing |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
28 ========== |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
29 |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
30 This package is uploaded to PyPI at https://pypi.python.org/pypi/zstandard. |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
31 So, to install this package:: |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
32 |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
33 $ pip install zstandard |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
34 |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
35 Binary wheels are made available for some platforms. If you need to |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
36 install from a source distribution, all you should need is a working C |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
37 compiler and the Python development headers/libraries. On many Linux |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
38 distributions, you can install a ``python-dev`` or ``python-devel`` |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
39 package to provide these dependencies. |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
40 |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
41 Packages are also uploaded to Anaconda Cloud at |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
42 https://anaconda.org/indygreg/zstandard. See that URL for how to install |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
43 this package with ``conda``. |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
44 |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
45 Performance |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
46 =========== |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
47 |
37495
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
48 zstandard is a highly tunable compression algorithm. In its default settings |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
49 (compression level 3), it will be faster at compression and decompression and |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
50 will have better compression ratios than zlib on most data sets. When tuned |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
51 for speed, it approaches lz4's speed and ratios. When tuned for compression |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
52 ratio, it approaches lzma ratios and compression speed, but decompression |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
53 speed is much faster. See the official zstandard documentation for more. |
30435
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
54 |
37495
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
55 zstandard and this library support multi-threaded compression. There is a |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
56 mechanism to compress large inputs using multiple threads. |
30435
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
57 |
37495
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
58 The performance of this library is usually very similar to what the zstandard |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
59 C API can deliver. Overhead in this library is due to general Python overhead |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
60 and can't easily be avoided by *any* zstandard Python binding. This library |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
61 exposes multiple APIs for performing compression and decompression so callers |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
62 can pick an API suitable for their need. Contrast with the compression |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
63 modules in Python's standard library (like ``zlib``), which only offer limited |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
64 mechanisms for performing operations. The API flexibility means consumers can |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
65 choose to use APIs that facilitate zero copying or minimize Python object |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
66 creation and garbage collection overhead. |
30435
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
67 |
37495
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
68 This library is capable of single-threaded throughputs well over 1 GB/s. For |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
69 exact numbers, measure yourself. The source code repository has a ``bench.py`` |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
70 script that can be used to measure things. |
30435
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
71 |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
72 API |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
73 === |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
74 |
37495
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
75 To interface with Zstandard, simply import the ``zstandard`` module:: |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
76 |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
77 import zstandard |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
78 |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
79 It is a popular convention to alias the module as a different name for |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
80 brevity:: |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
81 |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
82 import zstandard as zstd |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
83 |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
84 This module attempts to import and use either the C extension or CFFI |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
85 implementation. On Python platforms known to support C extensions (like |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
86 CPython), it raises an ImportError if the C extension cannot be imported. |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
87 On Python platforms known to not support C extensions (like PyPy), it only |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
88 attempts to import the CFFI implementation and raises ImportError if that |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
89 can't be done. On other platforms, it first tries to import the C extension |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
90 then falls back to CFFI if that fails and raises ImportError if CFFI fails. |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
91 |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
92 To change the module import behavior, a ``PYTHON_ZSTANDARD_IMPORT_POLICY`` |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
93 environment variable can be set. The following values are accepted: |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
94 |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
95 default |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
96 The behavior described above. |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
97 cffi_fallback |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
98 Always try to import the C extension then fall back to CFFI if that |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
99 fails. |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
100 cext |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
101 Only attempt to import the C extension. |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
102 cffi |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
103 Only attempt to import the CFFI implementation. |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
104 |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
105 In addition, the ``zstandard`` module exports a ``backend`` attribute |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
106 containing the string name of the backend being used. It will be one |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
107 of ``cext`` or ``cffi`` (for *C extension* and *cffi*, respectively). |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
108 |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
109 The types, functions, and attributes exposed by the ``zstandard`` module |
31796
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
110 are documented in the sections below. |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
111 |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
112 .. note:: |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
113 |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
114 The documentation in this section makes references to various zstd |
37495
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
115 concepts and functionality. The source repository contains a |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
116 ``docs/concepts.rst`` file explaining these in more detail. |
30435
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
117 |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
118 ZstdCompressor |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
119 -------------- |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
120 |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
121 The ``ZstdCompressor`` class provides an interface for performing |
37495
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
122 compression operations. Each instance is essentially a wrapper around a |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
123 ``ZSTD_CCtx`` from the C API. |
30435
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
124 |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
125 Each instance is associated with parameters that control compression |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
126 behavior. These come from the following named arguments (all optional): |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
127 |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
128 level |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
129 Integer compression level. Valid values are between 1 and 22. |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
130 dict_data |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
131 Compression dictionary to use. |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
132 |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
133 Note: When using dictionary data and ``compress()`` is called multiple |
37495
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
134 times, the ``ZstdCompressionParameters`` derived from an integer |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
135 compression ``level`` and the first compressed data's size will be reused |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
136 for all subsequent operations. This may not be desirable if source data |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
137 size varies significantly. |
30435
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
138 compression_params |
37495
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
139 A ``ZstdCompressionParameters`` instance defining compression settings. |
30435
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
140 write_checksum |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
141 Whether a 4 byte checksum should be written with the compressed data. |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
142 Defaults to False. If True, the decompressor can verify that decompressed |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
143 data matches the original input data. |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
144 write_content_size |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
145 Whether the size of the uncompressed data will be written into the |
37495
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
146 header of compressed data. Defaults to True. The data will only be |
30435
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
147 written if the compressor knows the size of the input data. This is |
37495
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
148 often not true for streaming compression. |
30435
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
149 write_dict_id |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
150 Whether to write the dictionary ID into the compressed data. |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
151 Defaults to True. The dictionary ID is only written if a dictionary |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
152 is being used. |
31796
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
153 threads |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
154 Enables and sets the number of threads to use for multi-threaded compression |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
155 operations. Defaults to 0, which means to use single-threaded compression. |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
156 Negative values will resolve to the number of logical CPUs in the system. |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
157 Read below for more info on multi-threaded compression. This argument only |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
158 controls thread count for operations that operate on individual pieces of |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
159 data. APIs that spawn multiple threads for working on multiple pieces of |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
160 data have their own ``threads`` argument. |
30435
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
161 |
37495
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
162 ``compression_params`` is mutually exclusive with ``level``, ``write_checksum``, |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
163 ``write_content_size``, ``write_dict_id``, and ``threads``. |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
164 |
30822
b54a2984cdd4
zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30435
diff
changeset
|
165 Unless specified otherwise, assume that no two methods of ``ZstdCompressor`` |
b54a2984cdd4
zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30435
diff
changeset
|
166 instances can be called from multiple Python threads simultaneously. In other |
b54a2984cdd4
zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30435
diff
changeset
|
167 words, assume instances are not thread safe unless stated otherwise. |
b54a2984cdd4
zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30435
diff
changeset
|
168 |
37495
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
169 Utility Methods |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
170 ^^^^^^^^^^^^^^^ |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
171 |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
172 ``frame_progression()`` returns a 3-tuple containing the number of bytes |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
173 ingested, consumed, and produced by the current compression operation. |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
174 |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
175 ``memory_size()`` obtains the memory utilization of the underlying zstd |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
176 compression context, in bytes.:: |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
177 |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
178 cctx = zstd.ZstdCompressor() |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
179 memory = cctx.memory_size() |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
180 |
30435
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
181 Simple API |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
182 ^^^^^^^^^^ |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
183 |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
184 ``compress(data)`` compresses and returns data as a one-shot operation.:: |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
185 |
30822
b54a2984cdd4
zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30435
diff
changeset
|
186 cctx = zstd.ZstdCompressor() |
30435
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
187 compressed = cctx.compress(b'data to compress') |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
188 |
31796
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
189 The ``data`` argument can be any object that implements the *buffer protocol*. |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
190 |
37495
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
191 Stream Reader API |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
192 ^^^^^^^^^^^^^^^^^ |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
193 |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
194 ``stream_reader(source)`` can be used to obtain an object conforming to the |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
195 ``io.RawIOBase`` interface for reading compressed output as a stream:: |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
196 |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
197 with open(path, 'rb') as fh: |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
198 cctx = zstd.ZstdCompressor() |
40121
73fef626dae3
zstandard: vendor python-zstandard 0.10.1
Gregory Szorc <gregory.szorc@gmail.com>
parents:
37495
diff
changeset
|
199 reader = cctx.stream_reader(fh) |
73fef626dae3
zstandard: vendor python-zstandard 0.10.1
Gregory Szorc <gregory.szorc@gmail.com>
parents:
37495
diff
changeset
|
200 while True: |
73fef626dae3
zstandard: vendor python-zstandard 0.10.1
Gregory Szorc <gregory.szorc@gmail.com>
parents:
37495
diff
changeset
|
201 chunk = reader.read(16384) |
73fef626dae3
zstandard: vendor python-zstandard 0.10.1
Gregory Szorc <gregory.szorc@gmail.com>
parents:
37495
diff
changeset
|
202 if not chunk: |
73fef626dae3
zstandard: vendor python-zstandard 0.10.1
Gregory Szorc <gregory.szorc@gmail.com>
parents:
37495
diff
changeset
|
203 break |
73fef626dae3
zstandard: vendor python-zstandard 0.10.1
Gregory Szorc <gregory.szorc@gmail.com>
parents:
37495
diff
changeset
|
204 |
73fef626dae3
zstandard: vendor python-zstandard 0.10.1
Gregory Szorc <gregory.szorc@gmail.com>
parents:
37495
diff
changeset
|
205 # Do something with compressed chunk. |
73fef626dae3
zstandard: vendor python-zstandard 0.10.1
Gregory Szorc <gregory.szorc@gmail.com>
parents:
37495
diff
changeset
|
206 |
73fef626dae3
zstandard: vendor python-zstandard 0.10.1
Gregory Szorc <gregory.szorc@gmail.com>
parents:
37495
diff
changeset
|
207 Instances can also be used as context managers:: |
73fef626dae3
zstandard: vendor python-zstandard 0.10.1
Gregory Szorc <gregory.szorc@gmail.com>
parents:
37495
diff
changeset
|
208 |
73fef626dae3
zstandard: vendor python-zstandard 0.10.1
Gregory Szorc <gregory.szorc@gmail.com>
parents:
37495
diff
changeset
|
209 with open(path, 'rb') as fh: |
37495
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
210 with cctx.stream_reader(fh) as reader: |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
211 while True: |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
212 chunk = reader.read(16384) |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
213 if not chunk: |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
214 break |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
215 |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
216 # Do something with compressed chunk. |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
217 |
42070
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
218 When the context manager exits or ``close()`` is called, the stream is closed, |
40121
73fef626dae3
zstandard: vendor python-zstandard 0.10.1
Gregory Szorc <gregory.szorc@gmail.com>
parents:
37495
diff
changeset
|
219 underlying resources are released, and future operations against the compression |
73fef626dae3
zstandard: vendor python-zstandard 0.10.1
Gregory Szorc <gregory.szorc@gmail.com>
parents:
37495
diff
changeset
|
220 stream will fail. |
37495
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
221 |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
222 The ``source`` argument to ``stream_reader()`` can be any object with a |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
223 ``read(size)`` method or any object implementing the *buffer protocol*. |
30822
b54a2984cdd4
zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30435
diff
changeset
|
224 |
37495
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
225 ``stream_reader()`` accepts a ``size`` argument specifying how large the input |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
226 stream is. This is used to adjust compression parameters so they are |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
227 tailored to the source size.:: |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
228 |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
229 with open(path, 'rb') as fh: |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
230 cctx = zstd.ZstdCompressor() |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
231 with cctx.stream_reader(fh, size=os.stat(path).st_size) as reader: |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
232 ... |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
233 |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
234 If the ``source`` is a stream, you can specify how large ``read()`` requests |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
235 to that stream should be via the ``read_size`` argument. It defaults to |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
236 ``zstandard.COMPRESSION_RECOMMENDED_INPUT_SIZE``.:: |
30822
b54a2984cdd4
zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30435
diff
changeset
|
237 |
37495
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
238 with open(path, 'rb') as fh: |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
239 cctx = zstd.ZstdCompressor() |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
240 # Will perform fh.read(8192) when obtaining data to feed into the |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
241 # compressor. |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
242 with cctx.stream_reader(fh, read_size=8192) as reader: |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
243 ... |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
244 |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
245 The stream returned by ``stream_reader()`` is neither writable nor seekable |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
246 (even if the underlying source is seekable). ``readline()`` and |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
247 ``readlines()`` are not implemented because they don't make sense for |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
248 compressed data. ``tell()`` returns the number of compressed bytes |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
249 emitted so far. |
30822
b54a2984cdd4
zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30435
diff
changeset
|
250 |
30435
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
251 Streaming Input API |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
252 ^^^^^^^^^^^^^^^^^^^ |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
253 |
42070
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
254 ``stream_writer(fh)`` allows you to *stream* data into a compressor. |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
255 |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
256 Returned instances implement the ``io.RawIOBase`` interface. Only methods |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
257 that involve writing will do useful things. |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
258 |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
259 The argument to ``stream_writer()`` must have a ``write(data)`` method. As |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
260 compressed data is available, ``write()`` will be called with the compressed |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
261 data as its argument. Many common Python types implement ``write()``, including |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
262 open file handles and ``io.BytesIO``. |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
263 |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
264 The ``write(data)`` method is used to feed data into the compressor. |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
265 |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
266 The ``flush([flush_mode=FLUSH_BLOCK])`` method can be called to evict whatever |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
267 data remains within the compressor's internal state into the output object. This |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
268 may result in 0 or more ``write()`` calls to the output object. This method |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
269 accepts an optional ``flush_mode`` argument to control the flushing behavior. |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
270 Its value can be any of the ``FLUSH_*`` constants. |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
271 |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
272 Both ``write()`` and ``flush()`` return the number of bytes written to the |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
273 object's ``write()``. In many cases, small inputs do not accumulate enough |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
274 data to cause a write and ``write()`` will return ``0``. |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
275 |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
276 Calling ``close()`` will mark the stream as closed and subsequent I/O |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
277 operations will raise ``ValueError`` (per the documented behavior of |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
278 ``io.RawIOBase``). ``close()`` will also call ``close()`` on the underlying |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
279 stream if such a method exists. |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
280 |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
281 Typically usage is as follows:: |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
282 |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
283 cctx = zstd.ZstdCompressor(level=10) |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
284 compressor = cctx.stream_writer(fh) |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
285 |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
286 compressor.write(b'chunk 0\n') |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
287 compressor.write(b'chunk 1\n') |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
288 compressor.flush() |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
289 # Receiver will be able to decode ``chunk 0\nchunk 1\n`` at this point. |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
290 # Receiver is also expecting more data in the zstd *frame*. |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
291 |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
292 compressor.write(b'chunk 2\n') |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
293 compressor.flush(zstd.FLUSH_FRAME) |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
294 # Receiver will be able to decode ``chunk 0\nchunk 1\nchunk 2``. |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
295 # Receiver is expecting no more data, as the zstd frame is closed. |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
296 # Any future calls to ``write()`` at this point will construct a new |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
297 # zstd frame. |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
298 |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
299 Instances can be used as context managers. Exiting the context manager is |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
300 the equivalent of calling ``close()``, which is equivalent to calling |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
301 ``flush(zstd.FLUSH_FRAME)``:: |
30435
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
302 |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
303 cctx = zstd.ZstdCompressor(level=10) |
37495
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
304 with cctx.stream_writer(fh) as compressor: |
30435
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
305 compressor.write(b'chunk 0') |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
306 compressor.write(b'chunk 1') |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
307 ... |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
308 |
42070
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
309 .. important:: |
30435
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
310 |
42070
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
311 If ``flush(FLUSH_FRAME)`` is not called, emitted data doesn't constitute |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
312 a full zstd *frame* and consumers of this data may complain about malformed |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
313 input. It is recommended to use instances as a context manager to ensure |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
314 *frames* are properly finished. |
30895
c32454d69b85
zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30822
diff
changeset
|
315 |
30435
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
316 If the size of the data being fed to this streaming compressor is known, |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
317 you can declare it before compression begins:: |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
318 |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
319 cctx = zstd.ZstdCompressor() |
37495
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
320 with cctx.stream_writer(fh, size=data_len) as compressor: |
30435
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
321 compressor.write(chunk0) |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
322 compressor.write(chunk1) |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
323 ... |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
324 |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
325 Declaring the size of the source data allows compression parameters to |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
326 be tuned. And if ``write_content_size`` is used, it also results in the |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
327 content size being written into the frame header of the output data. |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
328 |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
329 The size of chunks being ``write()`` to the destination can be specified:: |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
330 |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
331 cctx = zstd.ZstdCompressor() |
37495
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
332 with cctx.stream_writer(fh, write_size=32768) as compressor: |
30435
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
333 ... |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
334 |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
335 To see how much memory is being used by the streaming compressor:: |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
336 |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
337 cctx = zstd.ZstdCompressor() |
37495
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
338 with cctx.stream_writer(fh) as compressor: |
30435
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
339 ... |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
340 byte_size = compressor.memory_size() |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
341 |
37495
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
342 Thte total number of bytes written so far are exposed via ``tell()``:: |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
343 |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
344 cctx = zstd.ZstdCompressor() |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
345 with cctx.stream_writer(fh) as compressor: |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
346 ... |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
347 total_written = compressor.tell() |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
348 |
42070
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
349 ``stream_writer()`` accepts a ``write_return_read`` boolean argument to control |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
350 the return value of ``write()``. When ``False`` (the default), ``write()`` returns |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
351 the number of bytes that were ``write()``en to the underlying object. When |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
352 ``True``, ``write()`` returns the number of bytes read from the input that |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
353 were subsequently written to the compressor. ``True`` is the *proper* behavior |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
354 for ``write()`` as specified by the ``io.RawIOBase`` interface and will become |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
355 the default value in a future release. |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
356 |
30435
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
357 Streaming Output API |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
358 ^^^^^^^^^^^^^^^^^^^^ |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
359 |
37495
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
360 ``read_to_iter(reader)`` provides a mechanism to stream data out of a |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
361 compressor as an iterator of data chunks.:: |
30435
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
362 |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
363 cctx = zstd.ZstdCompressor() |
37495
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
364 for chunk in cctx.read_to_iter(fh): |
30435
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
365 # Do something with emitted data. |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
366 |
37495
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
367 ``read_to_iter()`` accepts an object that has a ``read(size)`` method or |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
368 conforms to the buffer protocol. |
30435
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
369 |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
370 Uncompressed data is fetched from the source either by calling ``read(size)`` |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
371 or by fetching a slice of data from the object directly (in the case where |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
372 the buffer protocol is being used). The returned iterator consists of chunks |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
373 of compressed data. |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
374 |
30822
b54a2984cdd4
zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30435
diff
changeset
|
375 If reading from the source via ``read()``, ``read()`` will be called until |
b54a2984cdd4
zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30435
diff
changeset
|
376 it raises or returns an empty bytes (``b''``). It is perfectly valid for |
b54a2984cdd4
zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30435
diff
changeset
|
377 the source to deliver fewer bytes than were what requested by ``read(size)``. |
b54a2984cdd4
zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30435
diff
changeset
|
378 |
37495
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
379 Like ``stream_writer()``, ``read_to_iter()`` also accepts a ``size`` argument |
30435
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
380 declaring the size of the input stream:: |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
381 |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
382 cctx = zstd.ZstdCompressor() |
37495
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
383 for chunk in cctx.read_to_iter(fh, size=some_int): |
30435
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
384 pass |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
385 |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
386 You can also control the size that data is ``read()`` from the source and |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
387 the ideal size of output chunks:: |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
388 |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
389 cctx = zstd.ZstdCompressor() |
37495
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
390 for chunk in cctx.read_to_iter(fh, read_size=16384, write_size=8192): |
30435
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
391 pass |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
392 |
37495
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
393 Unlike ``stream_writer()``, ``read_to_iter()`` does not give direct control |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
394 over the sizes of chunks fed into the compressor. Instead, chunk sizes will |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
395 be whatever the object being read from delivers. These will often be of a |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
396 uniform size. |
30822
b54a2984cdd4
zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30435
diff
changeset
|
397 |
30435
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
398 Stream Copying API |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
399 ^^^^^^^^^^^^^^^^^^ |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
400 |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
401 ``copy_stream(ifh, ofh)`` can be used to copy data between 2 streams while |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
402 compressing it.:: |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
403 |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
404 cctx = zstd.ZstdCompressor() |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
405 cctx.copy_stream(ifh, ofh) |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
406 |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
407 For example, say you wish to compress a file:: |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
408 |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
409 cctx = zstd.ZstdCompressor() |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
410 with open(input_path, 'rb') as ifh, open(output_path, 'wb') as ofh: |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
411 cctx.copy_stream(ifh, ofh) |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
412 |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
413 It is also possible to declare the size of the source stream:: |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
414 |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
415 cctx = zstd.ZstdCompressor() |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
416 cctx.copy_stream(ifh, ofh, size=len_of_input) |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
417 |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
418 You can also specify how large the chunks that are ``read()`` and ``write()`` |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
419 from and to the streams:: |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
420 |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
421 cctx = zstd.ZstdCompressor() |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
422 cctx.copy_stream(ifh, ofh, read_size=32768, write_size=16384) |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
423 |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
424 The stream copier returns a 2-tuple of bytes read and written:: |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
425 |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
426 cctx = zstd.ZstdCompressor() |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
427 read_count, write_count = cctx.copy_stream(ifh, ofh) |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
428 |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
429 Compressor API |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
430 ^^^^^^^^^^^^^^ |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
431 |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
432 ``compressobj()`` returns an object that exposes ``compress(data)`` and |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
433 ``flush()`` methods. Each returns compressed data or an empty bytes. |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
434 |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
435 The purpose of ``compressobj()`` is to provide an API-compatible interface |
37495
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
436 with ``zlib.compressobj``, ``bz2.BZ2Compressor``, etc. This allows callers to |
30435
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
437 swap in different compressor objects while using the same API. |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
438 |
30822
b54a2984cdd4
zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30435
diff
changeset
|
439 ``flush()`` accepts an optional argument indicating how to end the stream. |
b54a2984cdd4
zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30435
diff
changeset
|
440 ``zstd.COMPRESSOBJ_FLUSH_FINISH`` (the default) ends the compression stream. |
b54a2984cdd4
zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30435
diff
changeset
|
441 Once this type of flush is performed, ``compress()`` and ``flush()`` can |
b54a2984cdd4
zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30435
diff
changeset
|
442 no longer be called. This type of flush **must** be called to end the |
b54a2984cdd4
zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30435
diff
changeset
|
443 compression context. If not called, returned data may be incomplete. |
b54a2984cdd4
zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30435
diff
changeset
|
444 |
b54a2984cdd4
zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30435
diff
changeset
|
445 A ``zstd.COMPRESSOBJ_FLUSH_BLOCK`` argument to ``flush()`` will flush a |
b54a2984cdd4
zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30435
diff
changeset
|
446 zstd block. Flushes of this type can be performed multiple times. The next |
b54a2984cdd4
zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30435
diff
changeset
|
447 call to ``compress()`` will begin a new zstd block. |
30435
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
448 |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
449 Here is how this API should be used:: |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
450 |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
451 cctx = zstd.ZstdCompressor() |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
452 cobj = cctx.compressobj() |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
453 data = cobj.compress(b'raw input 0') |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
454 data = cobj.compress(b'raw input 1') |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
455 data = cobj.flush() |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
456 |
30822
b54a2984cdd4
zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30435
diff
changeset
|
457 Or to flush blocks:: |
b54a2984cdd4
zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30435
diff
changeset
|
458 |
b54a2984cdd4
zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30435
diff
changeset
|
459 cctx.zstd.ZstdCompressor() |
b54a2984cdd4
zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30435
diff
changeset
|
460 cobj = cctx.compressobj() |
b54a2984cdd4
zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30435
diff
changeset
|
461 data = cobj.compress(b'chunk in first block') |
b54a2984cdd4
zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30435
diff
changeset
|
462 data = cobj.flush(zstd.COMPRESSOBJ_FLUSH_BLOCK) |
b54a2984cdd4
zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30435
diff
changeset
|
463 data = cobj.compress(b'chunk in second block') |
b54a2984cdd4
zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30435
diff
changeset
|
464 data = cobj.flush() |
b54a2984cdd4
zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30435
diff
changeset
|
465 |
30435
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
466 For best performance results, keep input chunks under 256KB. This avoids |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
467 extra allocations for a large output object. |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
468 |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
469 It is possible to declare the input size of the data that will be fed into |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
470 the compressor:: |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
471 |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
472 cctx = zstd.ZstdCompressor() |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
473 cobj = cctx.compressobj(size=6) |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
474 data = cobj.compress(b'foobar') |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
475 data = cobj.flush() |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
476 |
40121
73fef626dae3
zstandard: vendor python-zstandard 0.10.1
Gregory Szorc <gregory.szorc@gmail.com>
parents:
37495
diff
changeset
|
477 Chunker API |
73fef626dae3
zstandard: vendor python-zstandard 0.10.1
Gregory Szorc <gregory.szorc@gmail.com>
parents:
37495
diff
changeset
|
478 ^^^^^^^^^^^ |
73fef626dae3
zstandard: vendor python-zstandard 0.10.1
Gregory Szorc <gregory.szorc@gmail.com>
parents:
37495
diff
changeset
|
479 |
73fef626dae3
zstandard: vendor python-zstandard 0.10.1
Gregory Szorc <gregory.szorc@gmail.com>
parents:
37495
diff
changeset
|
480 ``chunker(size=None, chunk_size=COMPRESSION_RECOMMENDED_OUTPUT_SIZE)`` returns |
73fef626dae3
zstandard: vendor python-zstandard 0.10.1
Gregory Szorc <gregory.szorc@gmail.com>
parents:
37495
diff
changeset
|
481 an object that can be used to iteratively feed chunks of data into a compressor |
73fef626dae3
zstandard: vendor python-zstandard 0.10.1
Gregory Szorc <gregory.szorc@gmail.com>
parents:
37495
diff
changeset
|
482 and produce output chunks of a uniform size. |
73fef626dae3
zstandard: vendor python-zstandard 0.10.1
Gregory Szorc <gregory.szorc@gmail.com>
parents:
37495
diff
changeset
|
483 |
73fef626dae3
zstandard: vendor python-zstandard 0.10.1
Gregory Szorc <gregory.szorc@gmail.com>
parents:
37495
diff
changeset
|
484 The object returned by ``chunker()`` exposes the following methods: |
73fef626dae3
zstandard: vendor python-zstandard 0.10.1
Gregory Szorc <gregory.szorc@gmail.com>
parents:
37495
diff
changeset
|
485 |
73fef626dae3
zstandard: vendor python-zstandard 0.10.1
Gregory Szorc <gregory.szorc@gmail.com>
parents:
37495
diff
changeset
|
486 ``compress(data)`` |
73fef626dae3
zstandard: vendor python-zstandard 0.10.1
Gregory Szorc <gregory.szorc@gmail.com>
parents:
37495
diff
changeset
|
487 Feeds new input data into the compressor. |
73fef626dae3
zstandard: vendor python-zstandard 0.10.1
Gregory Szorc <gregory.szorc@gmail.com>
parents:
37495
diff
changeset
|
488 |
73fef626dae3
zstandard: vendor python-zstandard 0.10.1
Gregory Szorc <gregory.szorc@gmail.com>
parents:
37495
diff
changeset
|
489 ``flush()`` |
73fef626dae3
zstandard: vendor python-zstandard 0.10.1
Gregory Szorc <gregory.szorc@gmail.com>
parents:
37495
diff
changeset
|
490 Flushes all data currently in the compressor. |
73fef626dae3
zstandard: vendor python-zstandard 0.10.1
Gregory Szorc <gregory.szorc@gmail.com>
parents:
37495
diff
changeset
|
491 |
73fef626dae3
zstandard: vendor python-zstandard 0.10.1
Gregory Szorc <gregory.szorc@gmail.com>
parents:
37495
diff
changeset
|
492 ``finish()`` |
73fef626dae3
zstandard: vendor python-zstandard 0.10.1
Gregory Szorc <gregory.szorc@gmail.com>
parents:
37495
diff
changeset
|
493 Signals the end of input data. No new data can be compressed after this |
73fef626dae3
zstandard: vendor python-zstandard 0.10.1
Gregory Szorc <gregory.szorc@gmail.com>
parents:
37495
diff
changeset
|
494 method is called. |
73fef626dae3
zstandard: vendor python-zstandard 0.10.1
Gregory Szorc <gregory.szorc@gmail.com>
parents:
37495
diff
changeset
|
495 |
73fef626dae3
zstandard: vendor python-zstandard 0.10.1
Gregory Szorc <gregory.szorc@gmail.com>
parents:
37495
diff
changeset
|
496 ``compress()``, ``flush()``, and ``finish()`` all return an iterator of |
73fef626dae3
zstandard: vendor python-zstandard 0.10.1
Gregory Szorc <gregory.szorc@gmail.com>
parents:
37495
diff
changeset
|
497 ``bytes`` instances holding compressed data. The iterator may be empty. Callers |
73fef626dae3
zstandard: vendor python-zstandard 0.10.1
Gregory Szorc <gregory.szorc@gmail.com>
parents:
37495
diff
changeset
|
498 MUST iterate through all elements of the returned iterator before performing |
73fef626dae3
zstandard: vendor python-zstandard 0.10.1
Gregory Szorc <gregory.szorc@gmail.com>
parents:
37495
diff
changeset
|
499 another operation on the object. |
73fef626dae3
zstandard: vendor python-zstandard 0.10.1
Gregory Szorc <gregory.szorc@gmail.com>
parents:
37495
diff
changeset
|
500 |
73fef626dae3
zstandard: vendor python-zstandard 0.10.1
Gregory Szorc <gregory.szorc@gmail.com>
parents:
37495
diff
changeset
|
501 All chunks emitted by ``compress()`` will have a length of ``chunk_size``. |
73fef626dae3
zstandard: vendor python-zstandard 0.10.1
Gregory Szorc <gregory.szorc@gmail.com>
parents:
37495
diff
changeset
|
502 |
73fef626dae3
zstandard: vendor python-zstandard 0.10.1
Gregory Szorc <gregory.szorc@gmail.com>
parents:
37495
diff
changeset
|
503 ``flush()`` and ``finish()`` may return a final chunk smaller than |
73fef626dae3
zstandard: vendor python-zstandard 0.10.1
Gregory Szorc <gregory.szorc@gmail.com>
parents:
37495
diff
changeset
|
504 ``chunk_size``. |
73fef626dae3
zstandard: vendor python-zstandard 0.10.1
Gregory Szorc <gregory.szorc@gmail.com>
parents:
37495
diff
changeset
|
505 |
73fef626dae3
zstandard: vendor python-zstandard 0.10.1
Gregory Szorc <gregory.szorc@gmail.com>
parents:
37495
diff
changeset
|
506 Here is how the API should be used:: |
73fef626dae3
zstandard: vendor python-zstandard 0.10.1
Gregory Szorc <gregory.szorc@gmail.com>
parents:
37495
diff
changeset
|
507 |
73fef626dae3
zstandard: vendor python-zstandard 0.10.1
Gregory Szorc <gregory.szorc@gmail.com>
parents:
37495
diff
changeset
|
508 cctx = zstd.ZstdCompressor() |
73fef626dae3
zstandard: vendor python-zstandard 0.10.1
Gregory Szorc <gregory.szorc@gmail.com>
parents:
37495
diff
changeset
|
509 chunker = cctx.chunker(chunk_size=32768) |
73fef626dae3
zstandard: vendor python-zstandard 0.10.1
Gregory Szorc <gregory.szorc@gmail.com>
parents:
37495
diff
changeset
|
510 |
73fef626dae3
zstandard: vendor python-zstandard 0.10.1
Gregory Szorc <gregory.szorc@gmail.com>
parents:
37495
diff
changeset
|
511 with open(path, 'rb') as fh: |
73fef626dae3
zstandard: vendor python-zstandard 0.10.1
Gregory Szorc <gregory.szorc@gmail.com>
parents:
37495
diff
changeset
|
512 while True: |
73fef626dae3
zstandard: vendor python-zstandard 0.10.1
Gregory Szorc <gregory.szorc@gmail.com>
parents:
37495
diff
changeset
|
513 in_chunk = fh.read(32768) |
73fef626dae3
zstandard: vendor python-zstandard 0.10.1
Gregory Szorc <gregory.szorc@gmail.com>
parents:
37495
diff
changeset
|
514 if not in_chunk: |
73fef626dae3
zstandard: vendor python-zstandard 0.10.1
Gregory Szorc <gregory.szorc@gmail.com>
parents:
37495
diff
changeset
|
515 break |
73fef626dae3
zstandard: vendor python-zstandard 0.10.1
Gregory Szorc <gregory.szorc@gmail.com>
parents:
37495
diff
changeset
|
516 |
73fef626dae3
zstandard: vendor python-zstandard 0.10.1
Gregory Szorc <gregory.szorc@gmail.com>
parents:
37495
diff
changeset
|
517 for out_chunk in chunker.compress(in_chunk): |
73fef626dae3
zstandard: vendor python-zstandard 0.10.1
Gregory Szorc <gregory.szorc@gmail.com>
parents:
37495
diff
changeset
|
518 # Do something with output chunk of size 32768. |
73fef626dae3
zstandard: vendor python-zstandard 0.10.1
Gregory Szorc <gregory.szorc@gmail.com>
parents:
37495
diff
changeset
|
519 |
73fef626dae3
zstandard: vendor python-zstandard 0.10.1
Gregory Szorc <gregory.szorc@gmail.com>
parents:
37495
diff
changeset
|
520 for out_chunk in chunker.finish(): |
73fef626dae3
zstandard: vendor python-zstandard 0.10.1
Gregory Szorc <gregory.szorc@gmail.com>
parents:
37495
diff
changeset
|
521 # Do something with output chunks that finalize the zstd frame. |
73fef626dae3
zstandard: vendor python-zstandard 0.10.1
Gregory Szorc <gregory.szorc@gmail.com>
parents:
37495
diff
changeset
|
522 |
73fef626dae3
zstandard: vendor python-zstandard 0.10.1
Gregory Szorc <gregory.szorc@gmail.com>
parents:
37495
diff
changeset
|
523 The ``chunker()`` API is often a better alternative to ``compressobj()``. |
73fef626dae3
zstandard: vendor python-zstandard 0.10.1
Gregory Szorc <gregory.szorc@gmail.com>
parents:
37495
diff
changeset
|
524 |
73fef626dae3
zstandard: vendor python-zstandard 0.10.1
Gregory Szorc <gregory.szorc@gmail.com>
parents:
37495
diff
changeset
|
525 ``compressobj()`` will emit output data as it is available. This results in a |
73fef626dae3
zstandard: vendor python-zstandard 0.10.1
Gregory Szorc <gregory.szorc@gmail.com>
parents:
37495
diff
changeset
|
526 *stream* of output chunks of varying sizes. The consistency of the output chunk |
73fef626dae3
zstandard: vendor python-zstandard 0.10.1
Gregory Szorc <gregory.szorc@gmail.com>
parents:
37495
diff
changeset
|
527 size with ``chunker()`` is more appropriate for many usages, such as sending |
73fef626dae3
zstandard: vendor python-zstandard 0.10.1
Gregory Szorc <gregory.szorc@gmail.com>
parents:
37495
diff
changeset
|
528 compressed data to a socket. |
73fef626dae3
zstandard: vendor python-zstandard 0.10.1
Gregory Szorc <gregory.szorc@gmail.com>
parents:
37495
diff
changeset
|
529 |
73fef626dae3
zstandard: vendor python-zstandard 0.10.1
Gregory Szorc <gregory.szorc@gmail.com>
parents:
37495
diff
changeset
|
530 ``compressobj()`` may also perform extra memory reallocations in order to |
73fef626dae3
zstandard: vendor python-zstandard 0.10.1
Gregory Szorc <gregory.szorc@gmail.com>
parents:
37495
diff
changeset
|
531 dynamically adjust the sizes of the output chunks. Since ``chunker()`` output |
73fef626dae3
zstandard: vendor python-zstandard 0.10.1
Gregory Szorc <gregory.szorc@gmail.com>
parents:
37495
diff
changeset
|
532 chunks are all the same size (except for flushed or final chunks), there is |
73fef626dae3
zstandard: vendor python-zstandard 0.10.1
Gregory Szorc <gregory.szorc@gmail.com>
parents:
37495
diff
changeset
|
533 less memory allocation overhead. |
73fef626dae3
zstandard: vendor python-zstandard 0.10.1
Gregory Szorc <gregory.szorc@gmail.com>
parents:
37495
diff
changeset
|
534 |
31796
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
535 Batch Compression API |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
536 ^^^^^^^^^^^^^^^^^^^^^ |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
537 |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
538 (Experimental. Not yet supported in CFFI bindings.) |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
539 |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
540 ``multi_compress_to_buffer(data, [threads=0])`` performs compression of multiple |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
541 inputs as a single operation. |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
542 |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
543 Data to be compressed can be passed as a ``BufferWithSegmentsCollection``, a |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
544 ``BufferWithSegments``, or a list containing byte like objects. Each element of |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
545 the container will be compressed individually using the configured parameters |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
546 on the ``ZstdCompressor`` instance. |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
547 |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
548 The ``threads`` argument controls how many threads to use for compression. The |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
549 default is ``0`` which means to use a single thread. Negative values use the |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
550 number of logical CPUs in the machine. |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
551 |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
552 The function returns a ``BufferWithSegmentsCollection``. This type represents |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
553 N discrete memory allocations, eaching holding 1 or more compressed frames. |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
554 |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
555 Output data is written to shared memory buffers. This means that unlike |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
556 regular Python objects, a reference to *any* object within the collection |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
557 keeps the shared buffer and therefore memory backing it alive. This can have |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
558 undesirable effects on process memory usage. |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
559 |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
560 The API and behavior of this function is experimental and will likely change. |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
561 Known deficiencies include: |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
562 |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
563 * If asked to use multiple threads, it will always spawn that many threads, |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
564 even if the input is too small to use them. It should automatically lower |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
565 the thread count when the extra threads would just add overhead. |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
566 * The buffer allocation strategy is fixed. There is room to make it dynamic, |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
567 perhaps even to allow one output buffer per input, facilitating a variation |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
568 of the API to return a list without the adverse effects of shared memory |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
569 buffers. |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
570 |
30435
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
571 ZstdDecompressor |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
572 ---------------- |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
573 |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
574 The ``ZstdDecompressor`` class provides an interface for performing |
37495
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
575 decompression. It is effectively a wrapper around the ``ZSTD_DCtx`` type from |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
576 the C API. |
30435
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
577 |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
578 Each instance is associated with parameters that control decompression. These |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
579 come from the following named arguments (all optional): |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
580 |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
581 dict_data |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
582 Compression dictionary to use. |
37495
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
583 max_window_size |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
584 Sets an uppet limit on the window size for decompression operations in |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
585 kibibytes. This setting can be used to prevent large memory allocations |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
586 for inputs using large compression windows. |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
587 format |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
588 Set the format of data for the decoder. By default, this is |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
589 ``zstd.FORMAT_ZSTD1``. It can be set to ``zstd.FORMAT_ZSTD1_MAGICLESS`` to |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
590 allow decoding frames without the 4 byte magic header. Not all decompression |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
591 APIs support this mode. |
30435
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
592 |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
593 The interface of this class is very similar to ``ZstdCompressor`` (by design). |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
594 |
30822
b54a2984cdd4
zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30435
diff
changeset
|
595 Unless specified otherwise, assume that no two methods of ``ZstdDecompressor`` |
b54a2984cdd4
zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30435
diff
changeset
|
596 instances can be called from multiple Python threads simultaneously. In other |
b54a2984cdd4
zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30435
diff
changeset
|
597 words, assume instances are not thread safe unless stated otherwise. |
b54a2984cdd4
zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30435
diff
changeset
|
598 |
37495
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
599 Utility Methods |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
600 ^^^^^^^^^^^^^^^ |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
601 |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
602 ``memory_size()`` obtains the size of the underlying zstd decompression context, |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
603 in bytes.:: |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
604 |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
605 dctx = zstd.ZstdDecompressor() |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
606 size = dctx.memory_size() |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
607 |
30435
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
608 Simple API |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
609 ^^^^^^^^^^ |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
610 |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
611 ``decompress(data)`` can be used to decompress an entire compressed zstd |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
612 frame in a single operation.:: |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
613 |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
614 dctx = zstd.ZstdDecompressor() |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
615 decompressed = dctx.decompress(data) |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
616 |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
617 By default, ``decompress(data)`` will only work on data written with the content |
37495
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
618 size encoded in its header (this is the default behavior of |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
619 ``ZstdCompressor().compress()`` but may not be true for streaming compression). If |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
620 compressed data without an embedded content size is seen, ``zstd.ZstdError`` will |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
621 be raised. |
30435
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
622 |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
623 If the compressed data doesn't have its content size embedded within it, |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
624 decompression can be attempted by specifying the ``max_output_size`` |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
625 argument.:: |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
626 |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
627 dctx = zstd.ZstdDecompressor() |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
628 uncompressed = dctx.decompress(data, max_output_size=1048576) |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
629 |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
630 Ideally, ``max_output_size`` will be identical to the decompressed output |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
631 size. |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
632 |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
633 If ``max_output_size`` is too small to hold the decompressed data, |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
634 ``zstd.ZstdError`` will be raised. |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
635 |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
636 If ``max_output_size`` is larger than the decompressed data, the allocated |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
637 output buffer will be resized to only use the space required. |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
638 |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
639 Please note that an allocation of the requested ``max_output_size`` will be |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
640 performed every time the method is called. Setting to a very large value could |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
641 result in a lot of work for the memory allocator and may result in |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
642 ``MemoryError`` being raised if the allocation fails. |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
643 |
37495
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
644 .. important:: |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
645 |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
646 If the exact size of decompressed data is unknown (not passed in explicitly |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
647 and not stored in the zstandard frame), for performance reasons it is |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
648 encouraged to use a streaming API. |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
649 |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
650 Stream Reader API |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
651 ^^^^^^^^^^^^^^^^^ |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
652 |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
653 ``stream_reader(source)`` can be used to obtain an object conforming to the |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
654 ``io.RawIOBase`` interface for reading decompressed output as a stream:: |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
655 |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
656 with open(path, 'rb') as fh: |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
657 dctx = zstd.ZstdDecompressor() |
40121
73fef626dae3
zstandard: vendor python-zstandard 0.10.1
Gregory Szorc <gregory.szorc@gmail.com>
parents:
37495
diff
changeset
|
658 reader = dctx.stream_reader(fh) |
73fef626dae3
zstandard: vendor python-zstandard 0.10.1
Gregory Szorc <gregory.szorc@gmail.com>
parents:
37495
diff
changeset
|
659 while True: |
73fef626dae3
zstandard: vendor python-zstandard 0.10.1
Gregory Szorc <gregory.szorc@gmail.com>
parents:
37495
diff
changeset
|
660 chunk = reader.read(16384) |
73fef626dae3
zstandard: vendor python-zstandard 0.10.1
Gregory Szorc <gregory.szorc@gmail.com>
parents:
37495
diff
changeset
|
661 if not chunk: |
73fef626dae3
zstandard: vendor python-zstandard 0.10.1
Gregory Szorc <gregory.szorc@gmail.com>
parents:
37495
diff
changeset
|
662 break |
73fef626dae3
zstandard: vendor python-zstandard 0.10.1
Gregory Szorc <gregory.szorc@gmail.com>
parents:
37495
diff
changeset
|
663 |
73fef626dae3
zstandard: vendor python-zstandard 0.10.1
Gregory Szorc <gregory.szorc@gmail.com>
parents:
37495
diff
changeset
|
664 # Do something with decompressed chunk. |
37495
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
665 |
40121
73fef626dae3
zstandard: vendor python-zstandard 0.10.1
Gregory Szorc <gregory.szorc@gmail.com>
parents:
37495
diff
changeset
|
666 The stream can also be used as a context manager:: |
37495
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
667 |
40121
73fef626dae3
zstandard: vendor python-zstandard 0.10.1
Gregory Szorc <gregory.szorc@gmail.com>
parents:
37495
diff
changeset
|
668 with open(path, 'rb') as fh: |
73fef626dae3
zstandard: vendor python-zstandard 0.10.1
Gregory Szorc <gregory.szorc@gmail.com>
parents:
37495
diff
changeset
|
669 dctx = zstd.ZstdDecompressor() |
73fef626dae3
zstandard: vendor python-zstandard 0.10.1
Gregory Szorc <gregory.szorc@gmail.com>
parents:
37495
diff
changeset
|
670 with dctx.stream_reader(fh) as reader: |
73fef626dae3
zstandard: vendor python-zstandard 0.10.1
Gregory Szorc <gregory.szorc@gmail.com>
parents:
37495
diff
changeset
|
671 ... |
73fef626dae3
zstandard: vendor python-zstandard 0.10.1
Gregory Szorc <gregory.szorc@gmail.com>
parents:
37495
diff
changeset
|
672 |
73fef626dae3
zstandard: vendor python-zstandard 0.10.1
Gregory Szorc <gregory.szorc@gmail.com>
parents:
37495
diff
changeset
|
673 When used as a context manager, the stream is closed and the underlying |
73fef626dae3
zstandard: vendor python-zstandard 0.10.1
Gregory Szorc <gregory.szorc@gmail.com>
parents:
37495
diff
changeset
|
674 resources are released when the context manager exits. Future operations against |
73fef626dae3
zstandard: vendor python-zstandard 0.10.1
Gregory Szorc <gregory.szorc@gmail.com>
parents:
37495
diff
changeset
|
675 the stream will fail. |
37495
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
676 |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
677 The ``source`` argument to ``stream_reader()`` can be any object with a |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
678 ``read(size)`` method or any object implementing the *buffer protocol*. |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
679 |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
680 If the ``source`` is a stream, you can specify how large ``read()`` requests |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
681 to that stream should be via the ``read_size`` argument. It defaults to |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
682 ``zstandard.DECOMPRESSION_RECOMMENDED_INPUT_SIZE``.:: |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
683 |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
684 with open(path, 'rb') as fh: |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
685 dctx = zstd.ZstdDecompressor() |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
686 # Will perform fh.read(8192) when obtaining data for the decompressor. |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
687 with dctx.stream_reader(fh, read_size=8192) as reader: |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
688 ... |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
689 |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
690 The stream returned by ``stream_reader()`` is not writable. |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
691 |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
692 The stream returned by ``stream_reader()`` is *partially* seekable. |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
693 Absolute and relative positions (``SEEK_SET`` and ``SEEK_CUR``) forward |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
694 of the current position are allowed. Offsets behind the current read |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
695 position and offsets relative to the end of stream are not allowed and |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
696 will raise ``ValueError`` if attempted. |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
697 |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
698 ``tell()`` returns the number of decompressed bytes read so far. |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
699 |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
700 Not all I/O methods are implemented. Notably missing is support for |
42070
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
701 ``readline()``, ``readlines()``, and linewise iteration support. This is |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
702 because streams operate on binary data - not text data. If you want to |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
703 convert decompressed output to text, you can chain an ``io.TextIOWrapper`` |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
704 to the stream:: |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
705 |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
706 with open(path, 'rb') as fh: |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
707 dctx = zstd.ZstdDecompressor() |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
708 stream_reader = dctx.stream_reader(fh) |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
709 text_stream = io.TextIOWrapper(stream_reader, encoding='utf-8') |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
710 |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
711 for line in text_stream: |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
712 ... |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
713 |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
714 The ``read_across_frames`` argument to ``stream_reader()`` controls the |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
715 behavior of read operations when the end of a zstd *frame* is encountered. |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
716 When ``False`` (the default), a read will complete when the end of a |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
717 zstd *frame* is encountered. When ``True``, a read can potentially |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
718 return data spanning multiple zstd *frames*. |
30435
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
719 |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
720 Streaming Input API |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
721 ^^^^^^^^^^^^^^^^^^^ |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
722 |
42070
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
723 ``stream_writer(fh)`` allows you to *stream* data into a decompressor. |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
724 |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
725 Returned instances implement the ``io.RawIOBase`` interface. Only methods |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
726 that involve writing will do useful things. |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
727 |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
728 The argument to ``stream_writer()`` is typically an object that also implements |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
729 ``io.RawIOBase``. But any object with a ``write(data)`` method will work. Many |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
730 common Python types conform to this interface, including open file handles |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
731 and ``io.BytesIO``. |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
732 |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
733 Behavior is similar to ``ZstdCompressor.stream_writer()``: compressed data |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
734 is sent to the decompressor by calling ``write(data)`` and decompressed |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
735 output is written to the underlying stream by calling its ``write(data)`` |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
736 method.:: |
30435
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
737 |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
738 dctx = zstd.ZstdDecompressor() |
42070
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
739 decompressor = dctx.stream_writer(fh) |
30435
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
740 |
42070
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
741 decompressor.write(compressed_data) |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
742 ... |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
743 |
30435
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
744 |
30895
c32454d69b85
zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30822
diff
changeset
|
745 Calls to ``write()`` will return the number of bytes written to the output |
c32454d69b85
zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30822
diff
changeset
|
746 object. Not all inputs will result in bytes being written, so return values |
c32454d69b85
zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30822
diff
changeset
|
747 of ``0`` are possible. |
c32454d69b85
zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30822
diff
changeset
|
748 |
42070
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
749 Like the ``stream_writer()`` compressor, instances can be used as context |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
750 managers. However, context managers add no extra special behavior and offer |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
751 little to no benefit to being used. |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
752 |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
753 Calling ``close()`` will mark the stream as closed and subsequent I/O operations |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
754 will raise ``ValueError`` (per the documented behavior of ``io.RawIOBase``). |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
755 ``close()`` will also call ``close()`` on the underlying stream if such a |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
756 method exists. |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
757 |
30435
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
758 The size of chunks being ``write()`` to the destination can be specified:: |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
759 |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
760 dctx = zstd.ZstdDecompressor() |
37495
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
761 with dctx.stream_writer(fh, write_size=16384) as decompressor: |
30435
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
762 pass |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
763 |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
764 You can see how much memory is being used by the decompressor:: |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
765 |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
766 dctx = zstd.ZstdDecompressor() |
37495
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
767 with dctx.stream_writer(fh) as decompressor: |
30435
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
768 byte_size = decompressor.memory_size() |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
769 |
42070
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
770 ``stream_writer()`` accepts a ``write_return_read`` boolean argument to control |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
771 the return value of ``write()``. When ``False`` (the default)``, ``write()`` |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
772 returns the number of bytes that were ``write()``en to the underlying stream. |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
773 When ``True``, ``write()`` returns the number of bytes read from the input. |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
774 ``True`` is the *proper* behavior for ``write()`` as specified by the |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
775 ``io.RawIOBase`` interface and will become the default in a future release. |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
776 |
30435
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
777 Streaming Output API |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
778 ^^^^^^^^^^^^^^^^^^^^ |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
779 |
37495
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
780 ``read_to_iter(fh)`` provides a mechanism to stream decompressed data out of a |
30435
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
781 compressed source as an iterator of data chunks.:: |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
782 |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
783 dctx = zstd.ZstdDecompressor() |
37495
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
784 for chunk in dctx.read_to_iter(fh): |
30435
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
785 # Do something with original data. |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
786 |
37495
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
787 ``read_to_iter()`` accepts an object with a ``read(size)`` method that will |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
788 return compressed bytes or an object conforming to the buffer protocol that |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
789 can expose its data as a contiguous range of bytes. |
30435
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
790 |
37495
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
791 ``read_to_iter()`` returns an iterator whose elements are chunks of the |
30435
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
792 decompressed data. |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
793 |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
794 The size of requested ``read()`` from the source can be specified:: |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
795 |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
796 dctx = zstd.ZstdDecompressor() |
37495
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
797 for chunk in dctx.read_to_iter(fh, read_size=16384): |
30435
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
798 pass |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
799 |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
800 It is also possible to skip leading bytes in the input data:: |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
801 |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
802 dctx = zstd.ZstdDecompressor() |
37495
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
803 for chunk in dctx.read_to_iter(fh, skip_bytes=1): |
30435
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
804 pass |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
805 |
37495
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
806 .. tip:: |
30435
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
807 |
37495
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
808 Skipping leading bytes is useful if the source data contains extra |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
809 *header* data. Traditionally, you would need to create a slice or |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
810 ``memoryview`` of the data you want to decompress. This would create |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
811 overhead. It is more efficient to pass the offset into this API. |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
812 |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
813 Similarly to ``ZstdCompressor.read_to_iter()``, the consumer of the iterator |
30435
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
814 controls when data is decompressed. If the iterator isn't consumed, |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
815 decompression is put on hold. |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
816 |
37495
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
817 When ``read_to_iter()`` is passed an object conforming to the buffer protocol, |
30435
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
818 the behavior may seem similar to what occurs when the simple decompression |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
819 API is used. However, this API works when the decompressed size is unknown. |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
820 Furthermore, if feeding large inputs, the decompressor will work in chunks |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
821 instead of performing a single operation. |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
822 |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
823 Stream Copying API |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
824 ^^^^^^^^^^^^^^^^^^ |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
825 |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
826 ``copy_stream(ifh, ofh)`` can be used to copy data across 2 streams while |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
827 performing decompression.:: |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
828 |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
829 dctx = zstd.ZstdDecompressor() |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
830 dctx.copy_stream(ifh, ofh) |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
831 |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
832 e.g. to decompress a file to another file:: |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
833 |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
834 dctx = zstd.ZstdDecompressor() |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
835 with open(input_path, 'rb') as ifh, open(output_path, 'wb') as ofh: |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
836 dctx.copy_stream(ifh, ofh) |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
837 |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
838 The size of chunks being ``read()`` and ``write()`` from and to the streams |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
839 can be specified:: |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
840 |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
841 dctx = zstd.ZstdDecompressor() |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
842 dctx.copy_stream(ifh, ofh, read_size=8192, write_size=16384) |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
843 |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
844 Decompressor API |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
845 ^^^^^^^^^^^^^^^^ |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
846 |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
847 ``decompressobj()`` returns an object that exposes a ``decompress(data)`` |
37495
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
848 method. Compressed data chunks are fed into ``decompress(data)`` and |
30435
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
849 uncompressed output (or an empty bytes) is returned. Output from subsequent |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
850 calls needs to be concatenated to reassemble the full decompressed byte |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
851 sequence. |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
852 |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
853 The purpose of ``decompressobj()`` is to provide an API-compatible interface |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
854 with ``zlib.decompressobj`` and ``bz2.BZ2Decompressor``. This allows callers |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
855 to swap in different decompressor objects while using the same API. |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
856 |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
857 Each object is single use: once an input frame is decoded, ``decompress()`` |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
858 can no longer be called. |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
859 |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
860 Here is how this API should be used:: |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
861 |
37495
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
862 dctx = zstd.ZstdDecompressor() |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
863 dobj = dctx.decompressobj() |
30435
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
864 data = dobj.decompress(compressed_chunk_0) |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
865 data = dobj.decompress(compressed_chunk_1) |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
866 |
37495
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
867 By default, calls to ``decompress()`` write output data in chunks of size |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
868 ``DECOMPRESSION_RECOMMENDED_OUTPUT_SIZE``. These chunks are concatenated |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
869 before being returned to the caller. It is possible to define the size of |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
870 these temporary chunks by passing ``write_size`` to ``decompressobj()``:: |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
871 |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
872 dctx = zstd.ZstdDecompressor() |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
873 dobj = dctx.decompressobj(write_size=1048576) |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
874 |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
875 .. note:: |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
876 |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
877 Because calls to ``decompress()`` may need to perform multiple |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
878 memory (re)allocations, this streaming decompression API isn't as |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
879 efficient as other APIs. |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
880 |
42070
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
881 For compatibility with the standard library APIs, instances expose a |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
882 ``flush([length=None])`` method. This method no-ops and has no meaningful |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
883 side-effects, making it safe to call any time. |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
884 |
31796
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
885 Batch Decompression API |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
886 ^^^^^^^^^^^^^^^^^^^^^^^ |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
887 |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
888 (Experimental. Not yet supported in CFFI bindings.) |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
889 |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
890 ``multi_decompress_to_buffer()`` performs decompression of multiple |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
891 frames as a single operation and returns a ``BufferWithSegmentsCollection`` |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
892 containing decompressed data for all inputs. |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
893 |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
894 Compressed frames can be passed to the function as a ``BufferWithSegments``, |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
895 a ``BufferWithSegmentsCollection``, or as a list containing objects that |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
896 conform to the buffer protocol. For best performance, pass a |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
897 ``BufferWithSegmentsCollection`` or a ``BufferWithSegments``, as |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
898 minimal input validation will be done for that type. If calling from |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
899 Python (as opposed to C), constructing one of these instances may add |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
900 overhead cancelling out the performance overhead of validation for list |
37495
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
901 inputs.:: |
31796
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
902 |
37495
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
903 dctx = zstd.ZstdDecompressor() |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
904 results = dctx.multi_decompress_to_buffer([b'...', b'...']) |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
905 |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
906 The decompressed size of each frame MUST be discoverable. It can either be |
31796
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
907 embedded within the zstd frame (``write_content_size=True`` argument to |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
908 ``ZstdCompressor``) or passed in via the ``decompressed_sizes`` argument. |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
909 |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
910 The ``decompressed_sizes`` argument is an object conforming to the buffer |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
911 protocol which holds an array of 64-bit unsigned integers in the machine's |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
912 native format defining the decompressed sizes of each frame. If this argument |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
913 is passed, it avoids having to scan each frame for its decompressed size. |
37495
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
914 This frame scanning can add noticeable overhead in some scenarios.:: |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
915 |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
916 frames = [...] |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
917 sizes = struct.pack('=QQQQ', len0, len1, len2, len3) |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
918 |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
919 dctx = zstd.ZstdDecompressor() |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
920 results = dctx.multi_decompress_to_buffer(frames, decompressed_sizes=sizes) |
31796
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
921 |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
922 The ``threads`` argument controls the number of threads to use to perform |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
923 decompression operations. The default (``0``) or the value ``1`` means to |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
924 use a single thread. Negative values use the number of logical CPUs in the |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
925 machine. |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
926 |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
927 .. note:: |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
928 |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
929 It is possible to pass a ``mmap.mmap()`` instance into this function by |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
930 wrapping it with a ``BufferWithSegments`` instance (which will define the |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
931 offsets of frames within the memory mapped region). |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
932 |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
933 This function is logically equivalent to performing ``dctx.decompress()`` |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
934 on each input frame and returning the result. |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
935 |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
936 This function exists to perform decompression on multiple frames as fast |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
937 as possible by having as little overhead as possible. Since decompression is |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
938 performed as a single operation and since the decompressed output is stored in |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
939 a single buffer, extra memory allocations, Python objects, and Python function |
37495
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
940 calls are avoided. This is ideal for scenarios where callers know up front that |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
941 they need to access data for multiple frames, such as when *delta chains* are |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
942 being used. |
31796
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
943 |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
944 Currently, the implementation always spawns multiple threads when requested, |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
945 even if the amount of work to do is small. In the future, it will be smarter |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
946 about avoiding threads and their associated overhead when the amount of |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
947 work to do is small. |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
948 |
37495
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
949 Prefix Dictionary Chain Decompression |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
950 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
30895
c32454d69b85
zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30822
diff
changeset
|
951 |
c32454d69b85
zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30822
diff
changeset
|
952 ``decompress_content_dict_chain(frames)`` performs decompression of a list of |
37495
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
953 zstd frames produced using chained *prefix* dictionary compression. Such |
30895
c32454d69b85
zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30822
diff
changeset
|
954 a list of frames is produced by compressing discrete inputs where each |
37495
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
955 non-initial input is compressed with a *prefix* dictionary consisting of the |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
956 content of the previous input. |
30895
c32454d69b85
zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30822
diff
changeset
|
957 |
c32454d69b85
zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30822
diff
changeset
|
958 For example, say you have the following inputs:: |
c32454d69b85
zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30822
diff
changeset
|
959 |
c32454d69b85
zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30822
diff
changeset
|
960 inputs = [b'input 1', b'input 2', b'input 3'] |
c32454d69b85
zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30822
diff
changeset
|
961 |
c32454d69b85
zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30822
diff
changeset
|
962 The zstd frame chain consists of: |
c32454d69b85
zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30822
diff
changeset
|
963 |
c32454d69b85
zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30822
diff
changeset
|
964 1. ``b'input 1'`` compressed in standalone/discrete mode |
37495
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
965 2. ``b'input 2'`` compressed using ``b'input 1'`` as a *prefix* dictionary |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
966 3. ``b'input 3'`` compressed using ``b'input 2'`` as a *prefix* dictionary |
30895
c32454d69b85
zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30822
diff
changeset
|
967 |
c32454d69b85
zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30822
diff
changeset
|
968 Each zstd frame **must** have the content size written. |
c32454d69b85
zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30822
diff
changeset
|
969 |
37495
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
970 The following Python code can be used to produce a *prefix dictionary chain*:: |
30895
c32454d69b85
zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30822
diff
changeset
|
971 |
31796
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
972 def make_chain(inputs): |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
973 frames = [] |
30895
c32454d69b85
zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30822
diff
changeset
|
974 |
31796
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
975 # First frame is compressed in standalone/discrete mode. |
37495
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
976 zctx = zstd.ZstdCompressor() |
31796
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
977 frames.append(zctx.compress(inputs[0])) |
30895
c32454d69b85
zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30822
diff
changeset
|
978 |
37495
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
979 # Subsequent frames use the previous fulltext as a prefix dictionary |
31796
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
980 for i, raw in enumerate(inputs[1:]): |
37495
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
981 dict_data = zstd.ZstdCompressionDict( |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
982 inputs[i], dict_type=zstd.DICT_TYPE_RAWCONTENT) |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
983 zctx = zstd.ZstdCompressor(dict_data=dict_data) |
31796
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
984 frames.append(zctx.compress(raw)) |
30895
c32454d69b85
zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30822
diff
changeset
|
985 |
31796
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
986 return frames |
30895
c32454d69b85
zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30822
diff
changeset
|
987 |
c32454d69b85
zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30822
diff
changeset
|
988 ``decompress_content_dict_chain()`` returns the uncompressed data of the last |
c32454d69b85
zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30822
diff
changeset
|
989 element in the input chain. |
c32454d69b85
zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30822
diff
changeset
|
990 |
37495
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
991 |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
992 .. note:: |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
993 |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
994 It is possible to implement *prefix dictionary chain* decompression |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
995 on top of other APIs. However, this function will likely be faster - |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
996 especially for long input chains - as it avoids the overhead of instantiating |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
997 and passing around intermediate objects between C and Python. |
30895
c32454d69b85
zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30822
diff
changeset
|
998 |
31796
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
999 Multi-Threaded Compression |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1000 -------------------------- |
30435
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1001 |
31796
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1002 ``ZstdCompressor`` accepts a ``threads`` argument that controls the number |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1003 of threads to use for compression. The way this works is that input is split |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1004 into segments and each segment is fed into a worker pool for compression. Once |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1005 a segment is compressed, it is flushed/appended to the output. |
30435
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1006 |
37495
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1007 .. note:: |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1008 |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1009 These threads are created at the C layer and are not Python threads. So they |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1010 work outside the GIL. It is therefore possible to CPU saturate multiple cores |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1011 from Python. |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1012 |
31796
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1013 The segment size for multi-threaded compression is chosen from the window size |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1014 of the compressor. This is derived from the ``window_log`` attribute of a |
37495
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1015 ``ZstdCompressionParameters`` instance. By default, segment sizes are in the 1+MB |
31796
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1016 range. |
30435
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1017 |
31796
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1018 If multi-threaded compression is requested and the input is smaller than the |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1019 configured segment size, only a single compression thread will be used. If the |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1020 input is smaller than the segment size multiplied by the thread pool size or |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1021 if data cannot be delivered to the compressor fast enough, not all requested |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1022 compressor threads may be active simultaneously. |
30435
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1023 |
31796
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1024 Compared to non-multi-threaded compression, multi-threaded compression has |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1025 higher per-operation overhead. This includes extra memory operations, |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1026 thread creation, lock acquisition, etc. |
30435
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1027 |
31796
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1028 Due to the nature of multi-threaded compression using *N* compression |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1029 *states*, the output from multi-threaded compression will likely be larger |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1030 than non-multi-threaded compression. The difference is usually small. But |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1031 there is a CPU/wall time versus size trade off that may warrant investigation. |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1032 |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1033 Output from multi-threaded compression does not require any special handling |
37495
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1034 on the decompression side. To the decompressor, data generated with single |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1035 threaded compressor looks the same as data generated by a multi-threaded |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1036 compressor and does not require any special handling or additional resource |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1037 requirements. |
30435
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1038 |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1039 Dictionary Creation and Management |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1040 ---------------------------------- |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1041 |
37495
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1042 Compression dictionaries are represented with the ``ZstdCompressionDict`` type. |
30435
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1043 |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1044 Instances can be constructed from bytes:: |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1045 |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1046 dict_data = zstd.ZstdCompressionDict(data) |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1047 |
37495
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1048 It is possible to construct a dictionary from *any* data. If the data doesn't |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1049 begin with a magic header, it will be treated as a *prefix* dictionary. |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1050 *Prefix* dictionaries allow compression operations to reference raw data |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1051 within the dictionary. |
30895
c32454d69b85
zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30822
diff
changeset
|
1052 |
37495
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1053 It is possible to force the use of *prefix* dictionaries or to require a |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1054 dictionary header: |
30435
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1055 |
37495
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1056 dict_data = zstd.ZstdCompressionDict(data, |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1057 dict_type=zstd.DICT_TYPE_RAWCONTENT) |
30435
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1058 |
37495
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1059 dict_data = zstd.ZstdCompressionDict(data, |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1060 dict_type=zstd.DICT_TYPE_FULLDICT) |
30435
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1061 |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1062 You can see how many bytes are in the dictionary by calling ``len()``:: |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1063 |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1064 dict_data = zstd.train_dictionary(size, samples) |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1065 dict_size = len(dict_data) # will not be larger than ``size`` |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1066 |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1067 Once you have a dictionary, you can pass it to the objects performing |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1068 compression and decompression:: |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1069 |
37495
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1070 dict_data = zstd.train_dictionary(131072, samples) |
30435
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1071 |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1072 cctx = zstd.ZstdCompressor(dict_data=dict_data) |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1073 for source_data in input_data: |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1074 compressed = cctx.compress(source_data) |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1075 # Do something with compressed data. |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1076 |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1077 dctx = zstd.ZstdDecompressor(dict_data=dict_data) |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1078 for compressed_data in input_data: |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1079 buffer = io.BytesIO() |
37495
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1080 with dctx.stream_writer(buffer) as decompressor: |
30435
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1081 decompressor.write(compressed_data) |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1082 # Do something with raw data in ``buffer``. |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1083 |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1084 Dictionaries have unique integer IDs. You can retrieve this ID via:: |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1085 |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1086 dict_id = zstd.dictionary_id(dict_data) |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1087 |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1088 You can obtain the raw data in the dict (useful for persisting and constructing |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1089 a ``ZstdCompressionDict`` later) via ``as_bytes()``:: |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1090 |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1091 dict_data = zstd.train_dictionary(size, samples) |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1092 raw_data = dict_data.as_bytes() |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1093 |
37495
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1094 By default, when a ``ZstdCompressionDict`` is *attached* to a |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1095 ``ZstdCompressor``, each ``ZstdCompressor`` performs work to prepare the |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1096 dictionary for use. This is fine if only 1 compression operation is being |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1097 performed or if the ``ZstdCompressor`` is being reused for multiple operations. |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1098 But if multiple ``ZstdCompressor`` instances are being used with the dictionary, |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1099 this can add overhead. |
31796
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1100 |
37495
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1101 It is possible to *precompute* the dictionary so it can readily be consumed |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1102 by multiple ``ZstdCompressor`` instances:: |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1103 |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1104 d = zstd.ZstdCompressionDict(data) |
31796
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1105 |
37495
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1106 # Precompute for compression level 3. |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1107 d.precompute_compress(level=3) |
31796
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1108 |
37495
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1109 # Precompute with specific compression parameters. |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1110 params = zstd.ZstdCompressionParameters(...) |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1111 d.precompute_compress(compression_params=params) |
31796
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1112 |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1113 .. note:: |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1114 |
37495
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1115 When a dictionary is precomputed, the compression parameters used to |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1116 precompute the dictionary overwrite some of the compression parameters |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1117 specified to ``ZstdCompressor.__init__``. |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1118 |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1119 Training Dictionaries |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1120 ^^^^^^^^^^^^^^^^^^^^^ |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1121 |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1122 Unless using *prefix* dictionaries, dictionary data is produced by *training* |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1123 on existing data:: |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1124 |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1125 dict_data = zstd.train_dictionary(size, samples) |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1126 |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1127 This takes a target dictionary size and list of bytes instances and creates and |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1128 returns a ``ZstdCompressionDict``. |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1129 |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1130 The dictionary training mechanism is known as *cover*. More details about it are |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1131 available in the paper *Effective Construction of Relative Lempel-Ziv |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1132 Dictionaries* (authors: Liao, Petri, Moffat, Wirth). |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1133 |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1134 The cover algorithm takes parameters ``k` and ``d``. These are the |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1135 *segment size* and *dmer size*, respectively. The returned dictionary |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1136 instance created by this function has ``k`` and ``d`` attributes |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1137 containing the values for these parameters. If a ``ZstdCompressionDict`` |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1138 is constructed from raw bytes data (a content-only dictionary), the |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1139 ``k`` and ``d`` attributes will be ``0``. |
31796
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1140 |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1141 The segment and dmer size parameters to the cover algorithm can either be |
37495
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1142 specified manually or ``train_dictionary()`` can try multiple values |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1143 and pick the best one, where *best* means the smallest compressed data size. |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1144 This later mode is called *optimization* mode. |
31796
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1145 |
37495
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1146 If none of ``k``, ``d``, ``steps``, ``threads``, ``level``, ``notifications``, |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1147 or ``dict_id`` (basically anything from the underlying ``ZDICT_cover_params_t`` |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1148 struct) are defined, *optimization* mode is used with default parameter |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1149 values. |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1150 |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1151 If ``steps`` or ``threads`` are defined, then *optimization* mode is engaged |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1152 with explicit control over those parameters. Specifying ``threads=0`` or |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1153 ``threads=1`` can be used to engage *optimization* mode if other parameters |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1154 are not defined. |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1155 |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1156 Otherwise, non-*optimization* mode is used with the parameters specified. |
31796
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1157 |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1158 This function takes the following arguments: |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1159 |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1160 dict_size |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1161 Target size in bytes of the dictionary to generate. |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1162 samples |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1163 A list of bytes holding samples the dictionary will be trained from. |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1164 k |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1165 Parameter to cover algorithm defining the segment size. A reasonable range |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1166 is [16, 2048+]. |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1167 d |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1168 Parameter to cover algorithm defining the dmer size. A reasonable range is |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1169 [6, 16]. ``d`` must be less than or equal to ``k``. |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1170 dict_id |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1171 Integer dictionary ID for the produced dictionary. Default is 0, which uses |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1172 a random value. |
37495
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1173 steps |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1174 Number of steps through ``k`` values to perform when trying parameter |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1175 variations. |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1176 threads |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1177 Number of threads to use when trying parameter variations. Default is 0, |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1178 which means to use a single thread. A negative value can be specified to |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1179 use as many threads as there are detected logical CPUs. |
31796
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1180 level |
37495
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1181 Integer target compression level when trying parameter variations. |
31796
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1182 notifications |
37495
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1183 Controls writing of informational messages to ``stderr``. ``0`` (the |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1184 default) means to write nothing. ``1`` writes errors. ``2`` writes |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1185 progression info. ``3`` writes more details. And ``4`` writes all info. |
31796
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1186 |
30435
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1187 Explicit Compression Parameters |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1188 ------------------------------- |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1189 |
37495
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1190 Zstandard offers a high-level *compression level* that maps to lower-level |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1191 compression parameters. For many consumers, this numeric level is the only |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1192 compression setting you'll need to touch. |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1193 |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1194 But for advanced use cases, it might be desirable to tweak these lower-level |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1195 settings. |
30435
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1196 |
37495
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1197 The ``ZstdCompressionParameters`` type represents these low-level compression |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1198 settings. |
30435
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1199 |
37495
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1200 Instances of this type can be constructed from a myriad of keyword arguments |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1201 (defined below) for complete low-level control over each adjustable |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1202 compression setting. |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1203 |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1204 From a higher level, one can construct a ``ZstdCompressionParameters`` instance |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1205 given a desired compression level and target input and dictionary size |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1206 using ``ZstdCompressionParameters.from_level()``. e.g.:: |
30435
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1207 |
37495
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1208 # Derive compression settings for compression level 7. |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1209 params = zstd.ZstdCompressionParameters.from_level(7) |
30435
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1210 |
37495
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1211 # With an input size of 1MB |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1212 params = zstd.ZstdCompressionParameters.from_level(7, source_size=1048576) |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1213 |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1214 Using ``from_level()``, it is also possible to override individual compression |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1215 parameters or to define additional settings that aren't automatically derived. |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1216 e.g.:: |
30435
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1217 |
37495
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1218 params = zstd.ZstdCompressionParameters.from_level(4, window_log=10) |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1219 params = zstd.ZstdCompressionParameters.from_level(5, threads=4) |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1220 |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1221 Or you can define low-level compression settings directly:: |
30435
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1222 |
37495
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1223 params = zstd.ZstdCompressionParameters(window_log=12, enable_ldm=True) |
30435
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1224 |
37495
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1225 Once a ``ZstdCompressionParameters`` instance is obtained, it can be used to |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1226 configure a compressor:: |
30435
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1227 |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1228 cctx = zstd.ZstdCompressor(compression_params=params) |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1229 |
37495
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1230 The named arguments and attributes of ``ZstdCompressionParameters`` are as |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1231 follows: |
30435
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1232 |
37495
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1233 * format |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1234 * compression_level |
30895
c32454d69b85
zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30822
diff
changeset
|
1235 * window_log |
37495
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1236 * hash_log |
30895
c32454d69b85
zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30822
diff
changeset
|
1237 * chain_log |
c32454d69b85
zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30822
diff
changeset
|
1238 * search_log |
37495
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1239 * min_match |
30895
c32454d69b85
zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30822
diff
changeset
|
1240 * target_length |
42070
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
1241 * strategy |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
1242 * compression_strategy (deprecated: same as ``strategy``) |
37495
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1243 * write_content_size |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1244 * write_checksum |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1245 * write_dict_id |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1246 * job_size |
42070
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
1247 * overlap_log |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
1248 * overlap_size_log (deprecated: same as ``overlap_log``) |
37495
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1249 * force_max_window |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1250 * enable_ldm |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1251 * ldm_hash_log |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1252 * ldm_min_match |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1253 * ldm_bucket_size_log |
42070
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
1254 * ldm_hash_rate_log |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
1255 * ldm_hash_every_log (deprecated: same as ``ldm_hash_rate_log``) |
37495
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1256 * threads |
30895
c32454d69b85
zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30822
diff
changeset
|
1257 |
37495
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1258 Some of these are very low-level settings. It may help to consult the official |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1259 zstandard documentation for their behavior. Look for the ``ZSTD_p_*`` constants |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1260 in ``zstd.h`` (https://github.com/facebook/zstd/blob/dev/lib/zstd.h). |
30435
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1261 |
30895
c32454d69b85
zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30822
diff
changeset
|
1262 Frame Inspection |
c32454d69b85
zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30822
diff
changeset
|
1263 ---------------- |
c32454d69b85
zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30822
diff
changeset
|
1264 |
c32454d69b85
zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30822
diff
changeset
|
1265 Data emitted from zstd compression is encapsulated in a *frame*. This frame |
c32454d69b85
zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30822
diff
changeset
|
1266 begins with a 4 byte *magic number* header followed by 2 to 14 bytes describing |
c32454d69b85
zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30822
diff
changeset
|
1267 the frame in more detail. For more info, see |
c32454d69b85
zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30822
diff
changeset
|
1268 https://github.com/facebook/zstd/blob/master/doc/zstd_compression_format.md. |
c32454d69b85
zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30822
diff
changeset
|
1269 |
c32454d69b85
zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30822
diff
changeset
|
1270 ``zstd.get_frame_parameters(data)`` parses a zstd *frame* header from a bytes |
c32454d69b85
zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30822
diff
changeset
|
1271 instance and return a ``FrameParameters`` object describing the frame. |
c32454d69b85
zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30822
diff
changeset
|
1272 |
c32454d69b85
zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30822
diff
changeset
|
1273 Depending on which fields are present in the frame and their values, the |
c32454d69b85
zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30822
diff
changeset
|
1274 length of the frame parameters varies. If insufficient bytes are passed |
c32454d69b85
zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30822
diff
changeset
|
1275 in to fully parse the frame parameters, ``ZstdError`` is raised. To ensure |
c32454d69b85
zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30822
diff
changeset
|
1276 frame parameters can be parsed, pass in at least 18 bytes. |
c32454d69b85
zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30822
diff
changeset
|
1277 |
c32454d69b85
zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30822
diff
changeset
|
1278 ``FrameParameters`` instances have the following attributes: |
c32454d69b85
zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30822
diff
changeset
|
1279 |
c32454d69b85
zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30822
diff
changeset
|
1280 content_size |
c32454d69b85
zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30822
diff
changeset
|
1281 Integer size of original, uncompressed content. This will be ``0`` if the |
c32454d69b85
zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30822
diff
changeset
|
1282 original content size isn't written to the frame (controlled with the |
c32454d69b85
zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30822
diff
changeset
|
1283 ``write_content_size`` argument to ``ZstdCompressor``) or if the input |
c32454d69b85
zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30822
diff
changeset
|
1284 content size was ``0``. |
c32454d69b85
zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30822
diff
changeset
|
1285 |
c32454d69b85
zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30822
diff
changeset
|
1286 window_size |
c32454d69b85
zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30822
diff
changeset
|
1287 Integer size of maximum back-reference distance in compressed data. |
c32454d69b85
zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30822
diff
changeset
|
1288 |
c32454d69b85
zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30822
diff
changeset
|
1289 dict_id |
c32454d69b85
zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30822
diff
changeset
|
1290 Integer of dictionary ID used for compression. ``0`` if no dictionary |
c32454d69b85
zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30822
diff
changeset
|
1291 ID was used or if the dictionary ID was ``0``. |
c32454d69b85
zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30822
diff
changeset
|
1292 |
c32454d69b85
zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30822
diff
changeset
|
1293 has_checksum |
c32454d69b85
zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30822
diff
changeset
|
1294 Bool indicating whether a 4 byte content checksum is stored at the end |
c32454d69b85
zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30822
diff
changeset
|
1295 of the frame. |
c32454d69b85
zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30822
diff
changeset
|
1296 |
37495
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1297 ``zstd.frame_header_size(data)`` returns the size of the zstandard frame |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1298 header. |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1299 |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1300 ``zstd.frame_content_size(data)`` returns the content size as parsed from |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1301 the frame header. ``-1`` means the content size is unknown. ``0`` means |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1302 an empty frame. The content size is usually correct. However, it may not |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1303 be accurate. |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1304 |
30435
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1305 Misc Functionality |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1306 ------------------ |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1307 |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1308 estimate_decompression_context_size() |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1309 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1310 |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1311 Estimate the memory size requirements for a decompressor instance. |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1312 |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1313 Constants |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1314 --------- |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1315 |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1316 The following module constants/attributes are exposed: |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1317 |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1318 ZSTD_VERSION |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1319 This module attribute exposes a 3-tuple of the Zstandard version. e.g. |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1320 ``(1, 0, 0)`` |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1321 MAX_COMPRESSION_LEVEL |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1322 Integer max compression level accepted by compression functions |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1323 COMPRESSION_RECOMMENDED_INPUT_SIZE |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1324 Recommended chunk size to feed to compressor functions |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1325 COMPRESSION_RECOMMENDED_OUTPUT_SIZE |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1326 Recommended chunk size for compression output |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1327 DECOMPRESSION_RECOMMENDED_INPUT_SIZE |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1328 Recommended chunk size to feed into decompresor functions |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1329 DECOMPRESSION_RECOMMENDED_OUTPUT_SIZE |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1330 Recommended chunk size for decompression output |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1331 |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1332 FRAME_HEADER |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1333 bytes containing header of the Zstandard frame |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1334 MAGIC_NUMBER |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1335 Frame header as an integer |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1336 |
42070
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
1337 FLUSH_BLOCK |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
1338 Flushing behavior that denotes to flush a zstd block. A decompressor will |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
1339 be able to decode all data fed into the compressor so far. |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
1340 FLUSH_FRAME |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
1341 Flushing behavior that denotes to end a zstd frame. Any new data fed |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
1342 to the compressor will start a new frame. |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
1343 |
37495
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1344 CONTENTSIZE_UNKNOWN |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1345 Value for content size when the content size is unknown. |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1346 CONTENTSIZE_ERROR |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1347 Value for content size when content size couldn't be determined. |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1348 |
30435
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1349 WINDOWLOG_MIN |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1350 Minimum value for compression parameter |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1351 WINDOWLOG_MAX |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1352 Maximum value for compression parameter |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1353 CHAINLOG_MIN |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1354 Minimum value for compression parameter |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1355 CHAINLOG_MAX |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1356 Maximum value for compression parameter |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1357 HASHLOG_MIN |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1358 Minimum value for compression parameter |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1359 HASHLOG_MAX |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1360 Maximum value for compression parameter |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1361 SEARCHLOG_MIN |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1362 Minimum value for compression parameter |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1363 SEARCHLOG_MAX |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1364 Maximum value for compression parameter |
42070
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
1365 MINMATCH_MIN |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
1366 Minimum value for compression parameter |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
1367 MINMATCH_MAX |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
1368 Maximum value for compression parameter |
30435
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1369 SEARCHLENGTH_MIN |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1370 Minimum value for compression parameter |
42070
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
1371 |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
1372 Deprecated: use ``MINMATCH_MIN`` |
30435
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1373 SEARCHLENGTH_MAX |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1374 Maximum value for compression parameter |
42070
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
1375 |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
1376 Deprecated: use ``MINMATCH_MAX`` |
30435
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1377 TARGETLENGTH_MIN |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1378 Minimum value for compression parameter |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1379 STRATEGY_FAST |
30895
c32454d69b85
zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30822
diff
changeset
|
1380 Compression strategy |
30435
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1381 STRATEGY_DFAST |
30895
c32454d69b85
zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30822
diff
changeset
|
1382 Compression strategy |
30435
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1383 STRATEGY_GREEDY |
30895
c32454d69b85
zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30822
diff
changeset
|
1384 Compression strategy |
30435
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1385 STRATEGY_LAZY |
30895
c32454d69b85
zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30822
diff
changeset
|
1386 Compression strategy |
30435
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1387 STRATEGY_LAZY2 |
30895
c32454d69b85
zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30822
diff
changeset
|
1388 Compression strategy |
30435
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1389 STRATEGY_BTLAZY2 |
30895
c32454d69b85
zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30822
diff
changeset
|
1390 Compression strategy |
30435
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1391 STRATEGY_BTOPT |
30895
c32454d69b85
zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30822
diff
changeset
|
1392 Compression strategy |
37495
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1393 STRATEGY_BTULTRA |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1394 Compression strategy |
42070
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
1395 STRATEGY_BTULTRA2 |
675775c33ab6
zstandard: vendor python-zstandard 0.11
Gregory Szorc <gregory.szorc@gmail.com>
parents:
40121
diff
changeset
|
1396 Compression strategy |
37495
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1397 |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1398 FORMAT_ZSTD1 |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1399 Zstandard frame format |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1400 FORMAT_ZSTD1_MAGICLESS |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1401 Zstandard frame format without magic header |
30895
c32454d69b85
zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30822
diff
changeset
|
1402 |
c32454d69b85
zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30822
diff
changeset
|
1403 Performance Considerations |
c32454d69b85
zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30822
diff
changeset
|
1404 -------------------------- |
c32454d69b85
zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30822
diff
changeset
|
1405 |
c32454d69b85
zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30822
diff
changeset
|
1406 The ``ZstdCompressor`` and ``ZstdDecompressor`` types maintain state to a |
c32454d69b85
zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30822
diff
changeset
|
1407 persistent compression or decompression *context*. Reusing a ``ZstdCompressor`` |
c32454d69b85
zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30822
diff
changeset
|
1408 or ``ZstdDecompressor`` instance for multiple operations is faster than |
c32454d69b85
zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30822
diff
changeset
|
1409 instantiating a new ``ZstdCompressor`` or ``ZstdDecompressor`` for each |
c32454d69b85
zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30822
diff
changeset
|
1410 operation. The differences are magnified as the size of data decreases. For |
c32454d69b85
zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30822
diff
changeset
|
1411 example, the difference between *context* reuse and non-reuse for 100,000 |
c32454d69b85
zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30822
diff
changeset
|
1412 100 byte inputs will be significant (possiby over 10x faster to reuse contexts) |
37495
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1413 whereas 10 100,000,000 byte inputs will be more similar in speed (because the |
30895
c32454d69b85
zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30822
diff
changeset
|
1414 time spent doing compression dwarfs time spent creating new *contexts*). |
30435
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1415 |
31796
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1416 Buffer Types |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1417 ------------ |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1418 |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1419 The API exposes a handful of custom types for interfacing with memory buffers. |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1420 The primary goal of these types is to facilitate efficient multi-object |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1421 operations. |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1422 |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1423 The essential idea is to have a single memory allocation provide backing |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1424 storage for multiple logical objects. This has 2 main advantages: fewer |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1425 allocations and optimal memory access patterns. This avoids having to allocate |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1426 a Python object for each logical object and furthermore ensures that access of |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1427 data for objects can be sequential (read: fast) in memory. |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1428 |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1429 BufferWithSegments |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1430 ^^^^^^^^^^^^^^^^^^ |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1431 |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1432 The ``BufferWithSegments`` type represents a memory buffer containing N |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1433 discrete items of known lengths (segments). It is essentially a fixed size |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1434 memory address and an array of 2-tuples of ``(offset, length)`` 64-bit |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1435 unsigned native endian integers defining the byte offset and length of each |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1436 segment within the buffer. |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1437 |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1438 Instances behave like containers. |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1439 |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1440 ``len()`` returns the number of segments within the instance. |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1441 |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1442 ``o[index]`` or ``__getitem__`` obtains a ``BufferSegment`` representing an |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1443 individual segment within the backing buffer. That returned object references |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1444 (not copies) memory. This means that iterating all objects doesn't copy |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1445 data within the buffer. |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1446 |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1447 The ``.size`` attribute contains the total size in bytes of the backing |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1448 buffer. |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1449 |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1450 Instances conform to the buffer protocol. So a reference to the backing bytes |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1451 can be obtained via ``memoryview(o)``. A *copy* of the backing bytes can also |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1452 be obtained via ``.tobytes()``. |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1453 |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1454 The ``.segments`` attribute exposes the array of ``(offset, length)`` for |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1455 segments within the buffer. It is a ``BufferSegments`` type. |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1456 |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1457 BufferSegment |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1458 ^^^^^^^^^^^^^ |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1459 |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1460 The ``BufferSegment`` type represents a segment within a ``BufferWithSegments``. |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1461 It is essentially a reference to N bytes within a ``BufferWithSegments``. |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1462 |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1463 ``len()`` returns the length of the segment in bytes. |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1464 |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1465 ``.offset`` contains the byte offset of this segment within its parent |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1466 ``BufferWithSegments`` instance. |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1467 |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1468 The object conforms to the buffer protocol. ``.tobytes()`` can be called to |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1469 obtain a ``bytes`` instance with a copy of the backing bytes. |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1470 |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1471 BufferSegments |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1472 ^^^^^^^^^^^^^^ |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1473 |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1474 This type represents an array of ``(offset, length)`` integers defining segments |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1475 within a ``BufferWithSegments``. |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1476 |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1477 The array members are 64-bit unsigned integers using host/native bit order. |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1478 |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1479 Instances conform to the buffer protocol. |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1480 |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1481 BufferWithSegmentsCollection |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1482 ^^^^^^^^^^^^^^^^^^^^^^^^^^^^ |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1483 |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1484 The ``BufferWithSegmentsCollection`` type represents a virtual spanning view |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1485 of multiple ``BufferWithSegments`` instances. |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1486 |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1487 Instances are constructed from 1 or more ``BufferWithSegments`` instances. The |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1488 resulting object behaves like an ordered sequence whose members are the |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1489 segments within each ``BufferWithSegments``. |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1490 |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1491 ``len()`` returns the number of segments within all ``BufferWithSegments`` |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1492 instances. |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1493 |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1494 ``o[index]`` and ``__getitem__(index)`` return the ``BufferSegment`` at |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1495 that offset as if all ``BufferWithSegments`` instances were a single |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1496 entity. |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1497 |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1498 If the object is composed of 2 ``BufferWithSegments`` instances with the |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1499 first having 2 segments and the second have 3 segments, then ``b[0]`` |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1500 and ``b[1]`` access segments in the first object and ``b[2]``, ``b[3]``, |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1501 and ``b[4]`` access segments from the second. |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1502 |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1503 Choosing an API |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1504 =============== |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1505 |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1506 There are multiple APIs for performing compression and decompression. This is |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1507 because different applications have different needs and the library wants to |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1508 facilitate optimal use in as many use cases as possible. |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1509 |
37495
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1510 From a high-level, APIs are divided into *one-shot* and *streaming*: either you |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1511 are operating on all data at once or you operate on it piecemeal. |
31796
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1512 |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1513 The *one-shot* APIs are useful for small data, where the input or output |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1514 size is known. (The size can come from a buffer length, file size, or |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1515 stored in the zstd frame header.) A limitation of the *one-shot* APIs is that |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1516 input and output must fit in memory simultaneously. For say a 4 GB input, |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1517 this is often not feasible. |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1518 |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1519 The *one-shot* APIs also perform all work as a single operation. So, if you |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1520 feed it large input, it could take a long time for the function to return. |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1521 |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1522 The streaming APIs do not have the limitations of the simple API. But the |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1523 price you pay for this flexibility is that they are more complex than a |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1524 single function call. |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1525 |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1526 The streaming APIs put the caller in control of compression and decompression |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1527 behavior by allowing them to directly control either the input or output side |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1528 of the operation. |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1529 |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1530 With the *streaming input*, *compressor*, and *decompressor* APIs, the caller |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1531 has full control over the input to the compression or decompression stream. |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1532 They can directly choose when new data is operated on. |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1533 |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1534 With the *streaming ouput* APIs, the caller has full control over the output |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1535 of the compression or decompression stream. It can choose when to receive |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1536 new data. |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1537 |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1538 When using the *streaming* APIs that operate on file-like or stream objects, |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1539 it is important to consider what happens in that object when I/O is requested. |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1540 There is potential for long pauses as data is read or written from the |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1541 underlying stream (say from interacting with a filesystem or network). This |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1542 could add considerable overhead. |
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1543 |
37495
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1544 Thread Safety |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1545 ============= |
31796
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1546 |
37495
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1547 ``ZstdCompressor`` and ``ZstdDecompressor`` instances have no guarantees |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1548 about thread safety. Do not operate on the same ``ZstdCompressor`` and |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1549 ``ZstdDecompressor`` instance simultaneously from different threads. It is |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1550 fine to have different threads call into a single instance, just not at the |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1551 same time. |
31796
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1552 |
37495
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1553 Some operations require multiple function calls to complete. e.g. streaming |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1554 operations. A single ``ZstdCompressor`` or ``ZstdDecompressor`` cannot be used |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1555 for simultaneously active operations. e.g. you must not start a streaming |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1556 operation when another streaming operation is already active. |
31796
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1557 |
37495
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1558 The C extension releases the GIL during non-trivial calls into the zstd C |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1559 API. Non-trivial calls are notably compression and decompression. Trivial |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1560 calls are things like parsing frame parameters. Where the GIL is released |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1561 is considered an implementation detail and can change in any release. |
31796
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1562 |
37495
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1563 APIs that accept bytes-like objects don't enforce that the underlying object |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1564 is read-only. However, it is assumed that the passed object is read-only for |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1565 the duration of the function call. It is possible to pass a mutable object |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1566 (like a ``bytearray``) to e.g. ``ZstdCompressor.compress()``, have the GIL |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1567 released, and mutate the object from another thread. Such a race condition |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1568 is a bug in the consumer of python-zstandard. Most Python data types are |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1569 immutable, so unless you are doing something fancy, you don't need to |
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1570 worry about this. |
31796
e0dc40530c5a
zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
30895
diff
changeset
|
1571 |
30435
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1572 Note on Zstandard's *Experimental* API |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1573 ====================================== |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1574 |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1575 Many of the Zstandard APIs used by this module are marked as *experimental* |
37495
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1576 within the Zstandard project. |
30435
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1577 |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1578 It is unclear how Zstandard's C API will evolve over time, especially with |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1579 regards to this *experimental* functionality. We will try to maintain |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1580 backwards compatibility at the Python API level. However, we cannot |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1581 guarantee this for things not under our control. |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1582 |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1583 Since a copy of the Zstandard source code is distributed with this |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1584 module and since we compile against it, the behavior of a specific |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1585 version of this module should be constant for all of time. So if you |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1586 pin the version of this module used in your projects (which is a Python |
37495
b1fb341d8a61
zstandard: vendor python-zstandard 0.9.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
31796
diff
changeset
|
1587 best practice), you should be shielded from unwanted future changes. |
30435
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1588 |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1589 Donate |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1590 ====== |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1591 |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1592 A lot of time has been invested into this project by the author. |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1593 |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1594 If you find this project useful and would like to thank the author for |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1595 their work, consider donating some money. Any amount is appreciated. |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1596 |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1597 .. image:: https://www.paypalobjects.com/en_US/i/btn/btn_donate_LG.gif |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1598 :target: https://www.paypal.com/cgi-bin/webscr?cmd=_donations&business=gregory%2eszorc%40gmail%2ecom&lc=US&item_name=python%2dzstandard¤cy_code=USD&bn=PP%2dDonationsBF%3abtn_donate_LG%2egif%3aNonHosted |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1599 :alt: Donate via PayPal |
b86a448a2965
zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff
changeset
|
1600 |
42937
69de49c4e39c
zstandard: vendor python-zstandard 0.12
Gregory Szorc <gregory.szorc@gmail.com>
parents:
42070
diff
changeset
|
1601 .. |ci-status| image:: https://dev.azure.com/gregoryszorc/python-zstandard/_apis/build/status/indygreg.python-zstandard?branchName=master |
69de49c4e39c
zstandard: vendor python-zstandard 0.12
Gregory Szorc <gregory.szorc@gmail.com>
parents:
42070
diff
changeset
|
1602 :target: https://dev.azure.com/gregoryszorc/python-zstandard/_apis/build/status/indygreg.python-zstandard?branchName=master |