contrib/python-zstandard/README.rst
author Matt Harbison <matt_harbison@yahoo.com>
Sun, 16 Jul 2017 17:40:36 -0400
changeset 33544 4c4e95cae33a
parent 31799 e0dc40530c5a
child 37495 b1fb341d8a61
permissions -rw-r--r--
archive: use a templater to build the metadata file There are no visible changes here. I'm starting to wonder if adding the '+' to the 'node' line instead of a separate key line in 3047167733dc was the right thing to do. The '{node}' keyword never includes '+' elsewhere, and the way setup.py works, it would truncate it anyway. Additionally, the file is missing '{p2node}' when 'wdir()' merges are archived. I thought about adding an 'identify' line that would correspond to `hg id -n`. But the other nodes are the full 40 characters, and the output most useful for versioning is the short form. All of this cries out for customization via templating. (Although maybe having the short identify line by default is still a good idea.)
Ignore whitespace changes - Everywhere: Within whitespace: At end of lines:
30444
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
     1
================
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
     2
python-zstandard
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
     3
================
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
     4
30822
b54a2984cdd4 zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30444
diff changeset
     5
This project provides Python bindings for interfacing with the
b54a2984cdd4 zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30444
diff changeset
     6
`Zstandard <http://www.zstd.net>`_ compression library. A C extension
30924
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
     7
and CFFI interface are provided.
30444
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
     8
30924
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
     9
The primary goal of the project is to provide a rich interface to the
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
    10
underlying C API through a Pythonic interface while not sacrificing
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
    11
performance. This means exposing most of the features and flexibility
30444
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    12
of the C API while not sacrificing usability or safety that Python provides.
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    13
30822
b54a2984cdd4 zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30444
diff changeset
    14
The canonical home for this project is
b54a2984cdd4 zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30444
diff changeset
    15
https://github.com/indygreg/python-zstandard.
b54a2984cdd4 zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30444
diff changeset
    16
30444
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    17
|  |ci-status| |win-ci-status|
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    18
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    19
State of Project
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    20
================
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    21
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    22
The project is officially in beta state. The author is reasonably satisfied
31799
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
    23
that functionality works as advertised. **There will be some backwards
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
    24
incompatible changes before 1.0, probably in the 0.9 release.** This may
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
    25
involve renaming the main module from *zstd* to *zstandard* and renaming
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
    26
various types and methods. Pin the package version to prevent unwanted
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
    27
breakage when this change occurs!
30444
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    28
30924
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
    29
This project is vendored and distributed with Mercurial 4.1, where it is
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
    30
used in a production capacity.
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
    31
30444
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    32
There is continuous integration for Python versions 2.6, 2.7, and 3.3+
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    33
on Linux x86_x64 and Windows x86 and x86_64. The author is reasonably
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    34
confident the extension is stable and works as advertised on these
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    35
platforms.
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    36
31799
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
    37
The CFFI bindings are mostly feature complete. Where a feature is implemented
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
    38
in CFFI, unit tests run against both C extension and CFFI implementation to
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
    39
ensure behavior parity.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
    40
30444
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    41
Expected Changes
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    42
----------------
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    43
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    44
The author is reasonably confident in the current state of what's
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    45
implemented on the ``ZstdCompressor`` and ``ZstdDecompressor`` types.
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    46
Those APIs likely won't change significantly. Some low-level behavior
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    47
(such as naming and types expected by arguments) may change.
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    48
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    49
There will likely be arguments added to control the input and output
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    50
buffer sizes (currently, certain operations read and write in chunk
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    51
sizes using zstd's preferred defaults).
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    52
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    53
There should be an API that accepts an object that conforms to the buffer
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    54
interface and returns an iterator over compressed or decompressed output.
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    55
31799
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
    56
There should be an API that exposes an ``io.RawIOBase`` interface to
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
    57
compressor and decompressor streams, like how ``gzip.GzipFile`` from
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
    58
the standard library works (issue 13).
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
    59
30444
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    60
The author is on the fence as to whether to support the extremely
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    61
low level compression and decompression APIs. It could be useful to
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    62
support compression without the framing headers. But the author doesn't
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    63
believe it a high priority at this time.
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    64
31799
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
    65
There will likely be a refactoring of the module names. Currently,
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
    66
``zstd`` is a C extension and ``zstd_cffi`` is the CFFI interface.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
    67
This means that all code for the C extension must be implemented in
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
    68
C. ``zstd`` may be converted to a Python module so code can be reused
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
    69
between CFFI and C and so not all code in the C extension has to be C.
30444
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    70
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    71
Requirements
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    72
============
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    73
30924
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
    74
This extension is designed to run with Python 2.6, 2.7, 3.3, 3.4, 3.5, and
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
    75
3.6 on common platforms (Linux, Windows, and OS X). Only x86_64 is
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
    76
currently well-tested as an architecture.
30444
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    77
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    78
Installing
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    79
==========
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    80
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    81
This package is uploaded to PyPI at https://pypi.python.org/pypi/zstandard.
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    82
So, to install this package::
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    83
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    84
   $ pip install zstandard
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    85
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    86
Binary wheels are made available for some platforms. If you need to
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    87
install from a source distribution, all you should need is a working C
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    88
compiler and the Python development headers/libraries. On many Linux
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    89
distributions, you can install a ``python-dev`` or ``python-devel``
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    90
package to provide these dependencies.
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    91
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    92
Packages are also uploaded to Anaconda Cloud at
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    93
https://anaconda.org/indygreg/zstandard. See that URL for how to install
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    94
this package with ``conda``.
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    95
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    96
Performance
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    97
===========
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    98
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
    99
Very crude and non-scientific benchmarking (most benchmarks fall in this
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   100
category because proper benchmarking is hard) show that the Python bindings
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   101
perform within 10% of the native C implementation.
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   102
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   103
The following table compares the performance of compressing and decompressing
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   104
a 1.1 GB tar file comprised of the files in a Firefox source checkout. Values
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   105
obtained with the ``zstd`` program are on the left. The remaining columns detail
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   106
performance of various compression APIs in the Python bindings.
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   107
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   108
+-------+-----------------+-----------------+-----------------+---------------+
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   109
| Level | Native          | Simple          | Stream In       | Stream Out    |
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   110
|       | Comp / Decomp   | Comp / Decomp   | Comp / Decomp   | Comp          |
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   111
+=======+=================+=================+=================+===============+
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   112
|   1   | 490 / 1338 MB/s | 458 / 1266 MB/s | 407 / 1156 MB/s |  405 MB/s     |
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   113
+-------+-----------------+-----------------+-----------------+---------------+
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   114
|   2   | 412 / 1288 MB/s | 381 / 1203 MB/s | 345 / 1128 MB/s |  349 MB/s     |
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   115
+-------+-----------------+-----------------+-----------------+---------------+
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   116
|   3   | 342 / 1312 MB/s | 319 / 1182 MB/s | 285 / 1165 MB/s |  287 MB/s     |
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   117
+-------+-----------------+-----------------+-----------------+---------------+
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   118
|  11   |  64 / 1506 MB/s |  66 / 1436 MB/s |  56 / 1342 MB/s |   57 MB/s     |
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   119
+-------+-----------------+-----------------+-----------------+---------------+
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   120
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   121
Again, these are very unscientific. But it shows that Python is capable of
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   122
compressing at several hundred MB/s and decompressing at over 1 GB/s.
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   123
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   124
Comparison to Other Python Bindings
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   125
===================================
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   126
30924
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
   127
https://pypi.python.org/pypi/zstd is an alternate Python binding to
30444
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   128
Zstandard. At the time this was written, the latest release of that
30924
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
   129
package (1.1.2) only exposed the simple APIs for compression and decompression.
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
   130
This package exposes much more of the zstd API, including streaming and
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
   131
dictionary compression. This package also has CFFI support.
30444
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   132
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   133
Bundling of Zstandard Source Code
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   134
=================================
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   135
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   136
The source repository for this project contains a vendored copy of the
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   137
Zstandard source code. This is done for a few reasons.
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   138
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   139
First, Zstandard is relatively new and not yet widely available as a system
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   140
package. Providing a copy of the source code enables the Python C extension
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   141
to be compiled without requiring the user to obtain the Zstandard source code
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   142
separately.
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   143
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   144
Second, Zstandard has both a stable *public* API and an *experimental* API.
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   145
The *experimental* API is actually quite useful (contains functionality for
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   146
training dictionaries for example), so it is something we wish to expose to
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   147
Python. However, the *experimental* API is only available via static linking.
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   148
Furthermore, the *experimental* API can change at any time. So, control over
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   149
the exact version of the Zstandard library linked against is important to
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   150
ensure known behavior.
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   151
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   152
Instructions for Building and Testing
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   153
=====================================
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   154
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   155
Once you have the source code, the extension can be built via setup.py::
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   156
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   157
   $ python setup.py build_ext
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   158
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   159
We recommend testing with ``nose``::
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   160
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   161
   $ nosetests
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   162
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   163
A Tox configuration is present to test against multiple Python versions::
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   164
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   165
   $ tox
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   166
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   167
Tests use the ``hypothesis`` Python package to perform fuzzing. If you
31799
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   168
don't have it, those tests won't run. Since the fuzzing tests take longer
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   169
to execute than normal tests, you'll need to opt in to running them by
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   170
setting the ``ZSTD_SLOW_TESTS`` environment variable. This is set
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   171
automatically when using ``tox``.
30444
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   172
31799
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   173
The ``cffi`` Python package needs to be installed in order to build the CFFI
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   174
bindings. If it isn't present, the CFFI bindings won't be built.
30444
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   175
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   176
To create a virtualenv with all development dependencies, do something
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   177
like the following::
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   178
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   179
  # Python 2
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   180
  $ virtualenv venv
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   181
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   182
  # Python 3
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   183
  $ python3 -m venv venv
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   184
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   185
  $ source venv/bin/activate
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   186
  $ pip install cffi hypothesis nose tox
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   187
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   188
API
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   189
===
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   190
31799
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   191
The compiled C extension provides a ``zstd`` Python module. The CFFI
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   192
bindings provide a ``zstd_cffi`` module. Both provide an identical API
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   193
interface. The types, functions, and attributes exposed by these modules
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   194
are documented in the sections below.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   195
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   196
.. note::
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   197
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   198
   The documentation in this section makes references to various zstd
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   199
   concepts and functionality. The ``Concepts`` section below explains
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   200
   these concepts in more detail.
30444
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   201
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   202
ZstdCompressor
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   203
--------------
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   204
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   205
The ``ZstdCompressor`` class provides an interface for performing
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   206
compression operations.
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   207
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   208
Each instance is associated with parameters that control compression
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   209
behavior. These come from the following named arguments (all optional):
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   210
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   211
level
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   212
   Integer compression level. Valid values are between 1 and 22.
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   213
dict_data
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   214
   Compression dictionary to use.
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   215
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   216
   Note: When using dictionary data and ``compress()`` is called multiple
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   217
   times, the ``CompressionParameters`` derived from an integer compression
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   218
   ``level`` and the first compressed data's size will be reused for all
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   219
   subsequent operations. This may not be desirable if source data size
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   220
   varies significantly.
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   221
compression_params
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   222
   A ``CompressionParameters`` instance (overrides the ``level`` value).
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   223
write_checksum
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   224
   Whether a 4 byte checksum should be written with the compressed data.
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   225
   Defaults to False. If True, the decompressor can verify that decompressed
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   226
   data matches the original input data.
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   227
write_content_size
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   228
   Whether the size of the uncompressed data will be written into the
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   229
   header of compressed data. Defaults to False. The data will only be
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   230
   written if the compressor knows the size of the input data. This is
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   231
   likely not true for streaming compression.
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   232
write_dict_id
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   233
   Whether to write the dictionary ID into the compressed data.
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   234
   Defaults to True. The dictionary ID is only written if a dictionary
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   235
   is being used.
31799
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   236
threads
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   237
   Enables and sets the number of threads to use for multi-threaded compression
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   238
   operations. Defaults to 0, which means to use single-threaded compression.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   239
   Negative values will resolve to the number of logical CPUs in the system.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   240
   Read below for more info on multi-threaded compression. This argument only
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   241
   controls thread count for operations that operate on individual pieces of
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   242
   data. APIs that spawn multiple threads for working on multiple pieces of
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   243
   data have their own ``threads`` argument.
30444
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   244
30822
b54a2984cdd4 zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30444
diff changeset
   245
Unless specified otherwise, assume that no two methods of ``ZstdCompressor``
b54a2984cdd4 zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30444
diff changeset
   246
instances can be called from multiple Python threads simultaneously. In other
b54a2984cdd4 zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30444
diff changeset
   247
words, assume instances are not thread safe unless stated otherwise.
b54a2984cdd4 zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30444
diff changeset
   248
30444
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   249
Simple API
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   250
^^^^^^^^^^
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   251
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   252
``compress(data)`` compresses and returns data as a one-shot operation.::
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   253
30822
b54a2984cdd4 zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30444
diff changeset
   254
   cctx = zstd.ZstdCompressor()
30444
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   255
   compressed = cctx.compress(b'data to compress')
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   256
31799
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   257
The ``data`` argument can be any object that implements the *buffer protocol*.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   258
30822
b54a2984cdd4 zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30444
diff changeset
   259
Unless ``compression_params`` or ``dict_data`` are passed to the
b54a2984cdd4 zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30444
diff changeset
   260
``ZstdCompressor``, each invocation of ``compress()`` will calculate the
b54a2984cdd4 zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30444
diff changeset
   261
optimal compression parameters for the configured compression ``level`` and
b54a2984cdd4 zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30444
diff changeset
   262
input data size (some parameters are fine-tuned for small input sizes).
b54a2984cdd4 zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30444
diff changeset
   263
b54a2984cdd4 zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30444
diff changeset
   264
If a compression dictionary is being used, the compression parameters
b54a2984cdd4 zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30444
diff changeset
   265
determined from the first input's size will be reused for subsequent
b54a2984cdd4 zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30444
diff changeset
   266
operations.
b54a2984cdd4 zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30444
diff changeset
   267
b54a2984cdd4 zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30444
diff changeset
   268
There is currently a deficiency in zstd's C APIs that makes it difficult
b54a2984cdd4 zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30444
diff changeset
   269
to round trip empty inputs when ``write_content_size=True``. Attempting
b54a2984cdd4 zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30444
diff changeset
   270
this will raise a ``ValueError`` unless ``allow_empty=True`` is passed
b54a2984cdd4 zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30444
diff changeset
   271
to ``compress()``.
b54a2984cdd4 zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30444
diff changeset
   272
30444
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   273
Streaming Input API
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   274
^^^^^^^^^^^^^^^^^^^
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   275
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   276
``write_to(fh)`` (which behaves as a context manager) allows you to *stream*
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   277
data into a compressor.::
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   278
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   279
   cctx = zstd.ZstdCompressor(level=10)
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   280
   with cctx.write_to(fh) as compressor:
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   281
       compressor.write(b'chunk 0')
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   282
       compressor.write(b'chunk 1')
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   283
       ...
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   284
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   285
The argument to ``write_to()`` must have a ``write(data)`` method. As
30822
b54a2984cdd4 zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30444
diff changeset
   286
compressed data is available, ``write()`` will be called with the compressed
30444
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   287
data as its argument. Many common Python types implement ``write()``, including
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   288
open file handles and ``io.BytesIO``.
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   289
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   290
``write_to()`` returns an object representing a streaming compressor instance.
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   291
It **must** be used as a context manager. That object's ``write(data)`` method
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   292
is used to feed data into the compressor.
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   293
30822
b54a2984cdd4 zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30444
diff changeset
   294
A ``flush()`` method can be called to evict whatever data remains within the
b54a2984cdd4 zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30444
diff changeset
   295
compressor's internal state into the output object. This may result in 0 or
b54a2984cdd4 zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30444
diff changeset
   296
more ``write()`` calls to the output object.
b54a2984cdd4 zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30444
diff changeset
   297
30924
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
   298
Both ``write()`` and ``flush()`` return the number of bytes written to the
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
   299
object's ``write()``. In many cases, small inputs do not accumulate enough
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
   300
data to cause a write and ``write()`` will return ``0``.
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
   301
30444
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   302
If the size of the data being fed to this streaming compressor is known,
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   303
you can declare it before compression begins::
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   304
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   305
   cctx = zstd.ZstdCompressor()
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   306
   with cctx.write_to(fh, size=data_len) as compressor:
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   307
       compressor.write(chunk0)
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   308
       compressor.write(chunk1)
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   309
       ...
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   310
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   311
Declaring the size of the source data allows compression parameters to
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   312
be tuned. And if ``write_content_size`` is used, it also results in the
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   313
content size being written into the frame header of the output data.
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   314
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   315
The size of chunks being ``write()`` to the destination can be specified::
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   316
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   317
    cctx = zstd.ZstdCompressor()
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   318
    with cctx.write_to(fh, write_size=32768) as compressor:
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   319
        ...
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   320
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   321
To see how much memory is being used by the streaming compressor::
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   322
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   323
    cctx = zstd.ZstdCompressor()
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   324
    with cctx.write_to(fh) as compressor:
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   325
        ...
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   326
        byte_size = compressor.memory_size()
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   327
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   328
Streaming Output API
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   329
^^^^^^^^^^^^^^^^^^^^
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   330
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   331
``read_from(reader)`` provides a mechanism to stream data out of a compressor
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   332
as an iterator of data chunks.::
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   333
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   334
   cctx = zstd.ZstdCompressor()
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   335
   for chunk in cctx.read_from(fh):
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   336
        # Do something with emitted data.
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   337
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   338
``read_from()`` accepts an object that has a ``read(size)`` method or conforms
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   339
to the buffer protocol. (``bytes`` and ``memoryview`` are 2 common types that
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   340
provide the buffer protocol.)
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   341
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   342
Uncompressed data is fetched from the source either by calling ``read(size)``
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   343
or by fetching a slice of data from the object directly (in the case where
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   344
the buffer protocol is being used). The returned iterator consists of chunks
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   345
of compressed data.
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   346
30822
b54a2984cdd4 zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30444
diff changeset
   347
If reading from the source via ``read()``, ``read()`` will be called until
b54a2984cdd4 zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30444
diff changeset
   348
it raises or returns an empty bytes (``b''``). It is perfectly valid for
b54a2984cdd4 zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30444
diff changeset
   349
the source to deliver fewer bytes than were what requested by ``read(size)``.
b54a2984cdd4 zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30444
diff changeset
   350
30444
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   351
Like ``write_to()``, ``read_from()`` also accepts a ``size`` argument
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   352
declaring the size of the input stream::
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   353
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   354
    cctx = zstd.ZstdCompressor()
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   355
    for chunk in cctx.read_from(fh, size=some_int):
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   356
        pass
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   357
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   358
You can also control the size that data is ``read()`` from the source and
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   359
the ideal size of output chunks::
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   360
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   361
    cctx = zstd.ZstdCompressor()
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   362
    for chunk in cctx.read_from(fh, read_size=16384, write_size=8192):
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   363
        pass
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   364
30822
b54a2984cdd4 zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30444
diff changeset
   365
Unlike ``write_to()``, ``read_from()`` does not give direct control over the
b54a2984cdd4 zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30444
diff changeset
   366
sizes of chunks fed into the compressor. Instead, chunk sizes will be whatever
b54a2984cdd4 zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30444
diff changeset
   367
the object being read from delivers. These will often be of a uniform size.
b54a2984cdd4 zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30444
diff changeset
   368
30444
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   369
Stream Copying API
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   370
^^^^^^^^^^^^^^^^^^
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   371
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   372
``copy_stream(ifh, ofh)`` can be used to copy data between 2 streams while
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   373
compressing it.::
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   374
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   375
   cctx = zstd.ZstdCompressor()
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   376
   cctx.copy_stream(ifh, ofh)
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   377
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   378
For example, say you wish to compress a file::
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   379
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   380
   cctx = zstd.ZstdCompressor()
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   381
   with open(input_path, 'rb') as ifh, open(output_path, 'wb') as ofh:
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   382
       cctx.copy_stream(ifh, ofh)
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   383
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   384
It is also possible to declare the size of the source stream::
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   385
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   386
   cctx = zstd.ZstdCompressor()
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   387
   cctx.copy_stream(ifh, ofh, size=len_of_input)
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   388
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   389
You can also specify how large the chunks that are ``read()`` and ``write()``
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   390
from and to the streams::
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   391
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   392
   cctx = zstd.ZstdCompressor()
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   393
   cctx.copy_stream(ifh, ofh, read_size=32768, write_size=16384)
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   394
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   395
The stream copier returns a 2-tuple of bytes read and written::
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   396
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   397
   cctx = zstd.ZstdCompressor()
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   398
   read_count, write_count = cctx.copy_stream(ifh, ofh)
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   399
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   400
Compressor API
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   401
^^^^^^^^^^^^^^
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   402
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   403
``compressobj()`` returns an object that exposes ``compress(data)`` and
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   404
``flush()`` methods. Each returns compressed data or an empty bytes.
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   405
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   406
The purpose of ``compressobj()`` is to provide an API-compatible interface
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   407
with ``zlib.compressobj`` and ``bz2.BZ2Compressor``. This allows callers to
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   408
swap in different compressor objects while using the same API.
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   409
30822
b54a2984cdd4 zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30444
diff changeset
   410
``flush()`` accepts an optional argument indicating how to end the stream.
b54a2984cdd4 zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30444
diff changeset
   411
``zstd.COMPRESSOBJ_FLUSH_FINISH`` (the default) ends the compression stream.
b54a2984cdd4 zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30444
diff changeset
   412
Once this type of flush is performed, ``compress()`` and ``flush()`` can
b54a2984cdd4 zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30444
diff changeset
   413
no longer be called. This type of flush **must** be called to end the
b54a2984cdd4 zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30444
diff changeset
   414
compression context. If not called, returned data may be incomplete.
b54a2984cdd4 zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30444
diff changeset
   415
b54a2984cdd4 zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30444
diff changeset
   416
A ``zstd.COMPRESSOBJ_FLUSH_BLOCK`` argument to ``flush()`` will flush a
b54a2984cdd4 zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30444
diff changeset
   417
zstd block. Flushes of this type can be performed multiple times. The next
b54a2984cdd4 zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30444
diff changeset
   418
call to ``compress()`` will begin a new zstd block.
30444
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   419
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   420
Here is how this API should be used::
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   421
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   422
   cctx = zstd.ZstdCompressor()
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   423
   cobj = cctx.compressobj()
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   424
   data = cobj.compress(b'raw input 0')
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   425
   data = cobj.compress(b'raw input 1')
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   426
   data = cobj.flush()
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   427
30822
b54a2984cdd4 zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30444
diff changeset
   428
Or to flush blocks::
b54a2984cdd4 zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30444
diff changeset
   429
b54a2984cdd4 zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30444
diff changeset
   430
   cctx.zstd.ZstdCompressor()
b54a2984cdd4 zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30444
diff changeset
   431
   cobj = cctx.compressobj()
b54a2984cdd4 zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30444
diff changeset
   432
   data = cobj.compress(b'chunk in first block')
b54a2984cdd4 zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30444
diff changeset
   433
   data = cobj.flush(zstd.COMPRESSOBJ_FLUSH_BLOCK)
b54a2984cdd4 zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30444
diff changeset
   434
   data = cobj.compress(b'chunk in second block')
b54a2984cdd4 zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30444
diff changeset
   435
   data = cobj.flush()
b54a2984cdd4 zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30444
diff changeset
   436
30444
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   437
For best performance results, keep input chunks under 256KB. This avoids
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   438
extra allocations for a large output object.
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   439
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   440
It is possible to declare the input size of the data that will be fed into
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   441
the compressor::
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   442
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   443
   cctx = zstd.ZstdCompressor()
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   444
   cobj = cctx.compressobj(size=6)
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   445
   data = cobj.compress(b'foobar')
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   446
   data = cobj.flush()
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   447
31799
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   448
Batch Compression API
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   449
^^^^^^^^^^^^^^^^^^^^^
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   450
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   451
(Experimental. Not yet supported in CFFI bindings.)
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   452
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   453
``multi_compress_to_buffer(data, [threads=0])`` performs compression of multiple
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   454
inputs as a single operation.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   455
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   456
Data to be compressed can be passed as a ``BufferWithSegmentsCollection``, a
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   457
``BufferWithSegments``, or a list containing byte like objects. Each element of
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   458
the container will be compressed individually using the configured parameters
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   459
on the ``ZstdCompressor`` instance.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   460
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   461
The ``threads`` argument controls how many threads to use for compression. The
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   462
default is ``0`` which means to use a single thread. Negative values use the
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   463
number of logical CPUs in the machine.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   464
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   465
The function returns a ``BufferWithSegmentsCollection``. This type represents
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   466
N discrete memory allocations, eaching holding 1 or more compressed frames.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   467
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   468
Output data is written to shared memory buffers. This means that unlike
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   469
regular Python objects, a reference to *any* object within the collection
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   470
keeps the shared buffer and therefore memory backing it alive. This can have
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   471
undesirable effects on process memory usage.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   472
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   473
The API and behavior of this function is experimental and will likely change.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   474
Known deficiencies include:
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   475
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   476
* If asked to use multiple threads, it will always spawn that many threads,
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   477
  even if the input is too small to use them. It should automatically lower
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   478
  the thread count when the extra threads would just add overhead.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   479
* The buffer allocation strategy is fixed. There is room to make it dynamic,
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   480
  perhaps even to allow one output buffer per input, facilitating a variation
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   481
  of the API to return a list without the adverse effects of shared memory
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   482
  buffers.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   483
30444
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   484
ZstdDecompressor
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   485
----------------
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   486
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   487
The ``ZstdDecompressor`` class provides an interface for performing
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   488
decompression.
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   489
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   490
Each instance is associated with parameters that control decompression. These
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   491
come from the following named arguments (all optional):
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   492
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   493
dict_data
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   494
   Compression dictionary to use.
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   495
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   496
The interface of this class is very similar to ``ZstdCompressor`` (by design).
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   497
30822
b54a2984cdd4 zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30444
diff changeset
   498
Unless specified otherwise, assume that no two methods of ``ZstdDecompressor``
b54a2984cdd4 zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30444
diff changeset
   499
instances can be called from multiple Python threads simultaneously. In other
b54a2984cdd4 zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30444
diff changeset
   500
words, assume instances are not thread safe unless stated otherwise.
b54a2984cdd4 zstd: vendor python-zstandard 0.6.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30444
diff changeset
   501
30444
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   502
Simple API
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   503
^^^^^^^^^^
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   504
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   505
``decompress(data)`` can be used to decompress an entire compressed zstd
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   506
frame in a single operation.::
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   507
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   508
    dctx = zstd.ZstdDecompressor()
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   509
    decompressed = dctx.decompress(data)
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   510
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   511
By default, ``decompress(data)`` will only work on data written with the content
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   512
size encoded in its header. This can be achieved by creating a
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   513
``ZstdCompressor`` with ``write_content_size=True``. If compressed data without
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   514
an embedded content size is seen, ``zstd.ZstdError`` will be raised.
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   515
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   516
If the compressed data doesn't have its content size embedded within it,
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   517
decompression can be attempted by specifying the ``max_output_size``
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   518
argument.::
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   519
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   520
    dctx = zstd.ZstdDecompressor()
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   521
    uncompressed = dctx.decompress(data, max_output_size=1048576)
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   522
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   523
Ideally, ``max_output_size`` will be identical to the decompressed output
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   524
size.
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   525
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   526
If ``max_output_size`` is too small to hold the decompressed data,
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   527
``zstd.ZstdError`` will be raised.
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   528
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   529
If ``max_output_size`` is larger than the decompressed data, the allocated
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   530
output buffer will be resized to only use the space required.
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   531
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   532
Please note that an allocation of the requested ``max_output_size`` will be
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   533
performed every time the method is called. Setting to a very large value could
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   534
result in a lot of work for the memory allocator and may result in
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   535
``MemoryError`` being raised if the allocation fails.
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   536
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   537
If the exact size of decompressed data is unknown, it is **strongly**
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   538
recommended to use a streaming API.
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   539
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   540
Streaming Input API
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   541
^^^^^^^^^^^^^^^^^^^
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   542
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   543
``write_to(fh)`` can be used to incrementally send compressed data to a
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   544
decompressor.::
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   545
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   546
    dctx = zstd.ZstdDecompressor()
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   547
    with dctx.write_to(fh) as decompressor:
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   548
        decompressor.write(compressed_data)
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   549
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   550
This behaves similarly to ``zstd.ZstdCompressor``: compressed data is written to
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   551
the decompressor by calling ``write(data)`` and decompressed output is written
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   552
to the output object by calling its ``write(data)`` method.
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   553
30924
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
   554
Calls to ``write()`` will return the number of bytes written to the output
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
   555
object. Not all inputs will result in bytes being written, so return values
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
   556
of ``0`` are possible.
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
   557
30444
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   558
The size of chunks being ``write()`` to the destination can be specified::
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   559
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   560
    dctx = zstd.ZstdDecompressor()
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   561
    with dctx.write_to(fh, write_size=16384) as decompressor:
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   562
        pass
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   563
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   564
You can see how much memory is being used by the decompressor::
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   565
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   566
    dctx = zstd.ZstdDecompressor()
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   567
    with dctx.write_to(fh) as decompressor:
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   568
        byte_size = decompressor.memory_size()
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   569
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   570
Streaming Output API
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   571
^^^^^^^^^^^^^^^^^^^^
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   572
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   573
``read_from(fh)`` provides a mechanism to stream decompressed data out of a
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   574
compressed source as an iterator of data chunks.:: 
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   575
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   576
    dctx = zstd.ZstdDecompressor()
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   577
    for chunk in dctx.read_from(fh):
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   578
        # Do something with original data.
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   579
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   580
``read_from()`` accepts a) an object with a ``read(size)`` method that will
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   581
return  compressed bytes b) an object conforming to the buffer protocol that
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   582
can expose its data as a contiguous range of bytes. The ``bytes`` and
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   583
``memoryview`` types expose this buffer protocol.
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   584
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   585
``read_from()`` returns an iterator whose elements are chunks of the
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   586
decompressed data.
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   587
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   588
The size of requested ``read()`` from the source can be specified::
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   589
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   590
    dctx = zstd.ZstdDecompressor()
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   591
    for chunk in dctx.read_from(fh, read_size=16384):
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   592
        pass
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   593
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   594
It is also possible to skip leading bytes in the input data::
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   595
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   596
    dctx = zstd.ZstdDecompressor()
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   597
    for chunk in dctx.read_from(fh, skip_bytes=1):
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   598
        pass
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   599
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   600
Skipping leading bytes is useful if the source data contains extra
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   601
*header* data but you want to avoid the overhead of making a buffer copy
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   602
or allocating a new ``memoryview`` object in order to decompress the data.
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   603
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   604
Similarly to ``ZstdCompressor.read_from()``, the consumer of the iterator
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   605
controls when data is decompressed. If the iterator isn't consumed,
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   606
decompression is put on hold.
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   607
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   608
When ``read_from()`` is passed an object conforming to the buffer protocol,
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   609
the behavior may seem similar to what occurs when the simple decompression
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   610
API is used. However, this API works when the decompressed size is unknown.
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   611
Furthermore, if feeding large inputs, the decompressor will work in chunks
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   612
instead of performing a single operation.
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   613
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   614
Stream Copying API
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   615
^^^^^^^^^^^^^^^^^^
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   616
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   617
``copy_stream(ifh, ofh)`` can be used to copy data across 2 streams while
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   618
performing decompression.::
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   619
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   620
    dctx = zstd.ZstdDecompressor()
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   621
    dctx.copy_stream(ifh, ofh)
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   622
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   623
e.g. to decompress a file to another file::
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   624
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   625
    dctx = zstd.ZstdDecompressor()
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   626
    with open(input_path, 'rb') as ifh, open(output_path, 'wb') as ofh:
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   627
        dctx.copy_stream(ifh, ofh)
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   628
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   629
The size of chunks being ``read()`` and ``write()`` from and to the streams
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   630
can be specified::
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   631
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   632
    dctx = zstd.ZstdDecompressor()
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   633
    dctx.copy_stream(ifh, ofh, read_size=8192, write_size=16384)
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   634
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   635
Decompressor API
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   636
^^^^^^^^^^^^^^^^
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   637
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   638
``decompressobj()`` returns an object that exposes a ``decompress(data)``
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   639
methods. Compressed data chunks are fed into ``decompress(data)`` and
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   640
uncompressed output (or an empty bytes) is returned. Output from subsequent
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   641
calls needs to be concatenated to reassemble the full decompressed byte
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   642
sequence.
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   643
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   644
The purpose of ``decompressobj()`` is to provide an API-compatible interface
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   645
with ``zlib.decompressobj`` and ``bz2.BZ2Decompressor``. This allows callers
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   646
to swap in different decompressor objects while using the same API.
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   647
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   648
Each object is single use: once an input frame is decoded, ``decompress()``
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   649
can no longer be called.
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   650
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   651
Here is how this API should be used::
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   652
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   653
   dctx = zstd.ZstdDeompressor()
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   654
   dobj = cctx.decompressobj()
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   655
   data = dobj.decompress(compressed_chunk_0)
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   656
   data = dobj.decompress(compressed_chunk_1)
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   657
31799
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   658
Batch Decompression API
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   659
^^^^^^^^^^^^^^^^^^^^^^^
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   660
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   661
(Experimental. Not yet supported in CFFI bindings.)
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   662
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   663
``multi_decompress_to_buffer()`` performs decompression of multiple
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   664
frames as a single operation and returns a ``BufferWithSegmentsCollection``
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   665
containing decompressed data for all inputs.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   666
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   667
Compressed frames can be passed to the function as a ``BufferWithSegments``,
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   668
a ``BufferWithSegmentsCollection``, or as a list containing objects that
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   669
conform to the buffer protocol. For best performance, pass a
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   670
``BufferWithSegmentsCollection`` or a ``BufferWithSegments``, as
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   671
minimal input validation will be done for that type. If calling from
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   672
Python (as opposed to C), constructing one of these instances may add
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   673
overhead cancelling out the performance overhead of validation for list
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   674
inputs.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   675
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   676
The decompressed size of each frame must be discoverable. It can either be
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   677
embedded within the zstd frame (``write_content_size=True`` argument to
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   678
``ZstdCompressor``) or passed in via the ``decompressed_sizes`` argument.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   679
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   680
The ``decompressed_sizes`` argument is an object conforming to the buffer
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   681
protocol which holds an array of 64-bit unsigned integers in the machine's
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   682
native format defining the decompressed sizes of each frame. If this argument
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   683
is passed, it avoids having to scan each frame for its decompressed size.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   684
This frame scanning can add noticeable overhead in some scenarios.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   685
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   686
The ``threads`` argument controls the number of threads to use to perform
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   687
decompression operations. The default (``0``) or the value ``1`` means to
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   688
use a single thread. Negative values use the number of logical CPUs in the
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   689
machine.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   690
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   691
.. note::
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   692
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   693
   It is possible to pass a ``mmap.mmap()`` instance into this function by
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   694
   wrapping it with a ``BufferWithSegments`` instance (which will define the
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   695
   offsets of frames within the memory mapped region).
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   696
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   697
This function is logically equivalent to performing ``dctx.decompress()``
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   698
on each input frame and returning the result.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   699
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   700
This function exists to perform decompression on multiple frames as fast
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   701
as possible by having as little overhead as possible. Since decompression is
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   702
performed as a single operation and since the decompressed output is stored in
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   703
a single buffer, extra memory allocations, Python objects, and Python function
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   704
calls are avoided. This is ideal for scenarios where callers need to access
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   705
decompressed data for multiple frames.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   706
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   707
Currently, the implementation always spawns multiple threads when requested,
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   708
even if the amount of work to do is small. In the future, it will be smarter
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   709
about avoiding threads and their associated overhead when the amount of
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   710
work to do is small.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   711
30924
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
   712
Content-Only Dictionary Chain Decompression
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
   713
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
   714
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
   715
``decompress_content_dict_chain(frames)`` performs decompression of a list of
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
   716
zstd frames produced using chained *content-only* dictionary compression. Such
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
   717
a list of frames is produced by compressing discrete inputs where each
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
   718
non-initial input is compressed with a *content-only* dictionary consisting
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
   719
of the content of the previous input.
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
   720
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
   721
For example, say you have the following inputs::
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
   722
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
   723
   inputs = [b'input 1', b'input 2', b'input 3']
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
   724
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
   725
The zstd frame chain consists of:
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
   726
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
   727
1. ``b'input 1'`` compressed in standalone/discrete mode
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
   728
2. ``b'input 2'`` compressed using ``b'input 1'`` as a *content-only* dictionary
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
   729
3. ``b'input 3'`` compressed using ``b'input 2'`` as a *content-only* dictionary
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
   730
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
   731
Each zstd frame **must** have the content size written.
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
   732
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
   733
The following Python code can be used to produce a *content-only dictionary
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
   734
chain*::
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
   735
31799
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   736
    def make_chain(inputs):
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   737
        frames = []
30924
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
   738
31799
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   739
        # First frame is compressed in standalone/discrete mode.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   740
        zctx = zstd.ZstdCompressor(write_content_size=True)
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   741
        frames.append(zctx.compress(inputs[0]))
30924
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
   742
31799
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   743
        # Subsequent frames use the previous fulltext as a content-only dictionary
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   744
        for i, raw in enumerate(inputs[1:]):
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   745
            dict_data = zstd.ZstdCompressionDict(inputs[i])
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   746
            zctx = zstd.ZstdCompressor(write_content_size=True, dict_data=dict_data)
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   747
            frames.append(zctx.compress(raw))
30924
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
   748
31799
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   749
        return frames
30924
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
   750
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
   751
``decompress_content_dict_chain()`` returns the uncompressed data of the last
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
   752
element in the input chain.
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
   753
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
   754
It is possible to implement *content-only dictionary chain* decompression
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
   755
on top of other Python APIs. However, this function will likely be significantly
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
   756
faster, especially for long input chains, as it avoids the overhead of
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
   757
instantiating and passing around intermediate objects between C and Python.
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
   758
31799
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   759
Multi-Threaded Compression
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   760
--------------------------
30444
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   761
31799
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   762
``ZstdCompressor`` accepts a ``threads`` argument that controls the number
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   763
of threads to use for compression. The way this works is that input is split
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   764
into segments and each segment is fed into a worker pool for compression. Once
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   765
a segment is compressed, it is flushed/appended to the output.
30444
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   766
31799
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   767
The segment size for multi-threaded compression is chosen from the window size
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   768
of the compressor. This is derived from the ``window_log`` attribute of a
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   769
``CompressionParameters`` instance. By default, segment sizes are in the 1+MB
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   770
range.
30444
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   771
31799
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   772
If multi-threaded compression is requested and the input is smaller than the
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   773
configured segment size, only a single compression thread will be used. If the
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   774
input is smaller than the segment size multiplied by the thread pool size or
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   775
if data cannot be delivered to the compressor fast enough, not all requested
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   776
compressor threads may be active simultaneously.
30444
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   777
31799
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   778
Compared to non-multi-threaded compression, multi-threaded compression has
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   779
higher per-operation overhead. This includes extra memory operations,
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   780
thread creation, lock acquisition, etc.
30444
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   781
31799
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   782
Due to the nature of multi-threaded compression using *N* compression
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   783
*states*, the output from multi-threaded compression will likely be larger
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   784
than non-multi-threaded compression. The difference is usually small. But
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   785
there is a CPU/wall time versus size trade off that may warrant investigation.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   786
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   787
Output from multi-threaded compression does not require any special handling
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   788
on the decompression side. In other words, any zstd decompressor should be able
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   789
to consume data produced with multi-threaded compression.
30444
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   790
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   791
Dictionary Creation and Management
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   792
----------------------------------
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   793
31799
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   794
Compression dictionaries are represented as the ``ZstdCompressionDict`` type.
30444
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   795
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   796
Instances can be constructed from bytes::
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   797
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   798
   dict_data = zstd.ZstdCompressionDict(data)
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   799
30924
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
   800
It is possible to construct a dictionary from *any* data. Unless the
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
   801
data begins with a magic header, the dictionary will be treated as
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
   802
*content-only*. *Content-only* dictionaries allow compression operations
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
   803
that follow to reference raw data within the content. For one use of
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
   804
*content-only* dictionaries, see
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
   805
``ZstdDecompressor.decompress_content_dict_chain()``.
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
   806
30444
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   807
More interestingly, instances can be created by *training* on sample data::
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   808
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   809
   dict_data = zstd.train_dictionary(size, samples)
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   810
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   811
This takes a list of bytes instances and creates and returns a
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   812
``ZstdCompressionDict``.
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   813
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   814
You can see how many bytes are in the dictionary by calling ``len()``::
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   815
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   816
   dict_data = zstd.train_dictionary(size, samples)
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   817
   dict_size = len(dict_data)  # will not be larger than ``size``
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   818
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   819
Once you have a dictionary, you can pass it to the objects performing
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   820
compression and decompression::
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   821
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   822
   dict_data = zstd.train_dictionary(16384, samples)
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   823
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   824
   cctx = zstd.ZstdCompressor(dict_data=dict_data)
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   825
   for source_data in input_data:
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   826
       compressed = cctx.compress(source_data)
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   827
       # Do something with compressed data.
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   828
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   829
   dctx = zstd.ZstdDecompressor(dict_data=dict_data)
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   830
   for compressed_data in input_data:
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   831
       buffer = io.BytesIO()
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   832
       with dctx.write_to(buffer) as decompressor:
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   833
           decompressor.write(compressed_data)
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   834
       # Do something with raw data in ``buffer``.
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   835
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   836
Dictionaries have unique integer IDs. You can retrieve this ID via::
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   837
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   838
   dict_id = zstd.dictionary_id(dict_data)
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   839
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   840
You can obtain the raw data in the dict (useful for persisting and constructing
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   841
a ``ZstdCompressionDict`` later) via ``as_bytes()``::
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   842
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   843
   dict_data = zstd.train_dictionary(size, samples)
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   844
   raw_data = dict_data.as_bytes()
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   845
31799
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   846
The following named arguments to ``train_dictionary`` can also be used
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   847
to further control dictionary generation.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   848
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   849
selectivity
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   850
   Integer selectivity level. Default is 9. Larger values yield more data in
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   851
   dictionary.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   852
level
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   853
   Integer compression level. Default is 6.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   854
dict_id
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   855
   Integer dictionary ID for the produced dictionary. Default is 0, which
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   856
   means to use a random value.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   857
notifications
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   858
   Controls writing of informational messages to ``stderr``. ``0`` (the
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   859
   default) means to write nothing. ``1`` writes errors. ``2`` writes
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   860
   progression info. ``3`` writes more details. And ``4`` writes all info.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   861
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   862
Cover Dictionaries
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   863
^^^^^^^^^^^^^^^^^^
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   864
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   865
An alternate dictionary training mechanism named *cover* is also available.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   866
More details about this training mechanism are available in the paper
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   867
*Effective Construction of Relative Lempel-Ziv Dictionaries* (authors:
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   868
Liao, Petri, Moffat, Wirth).
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   869
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   870
To use this mechanism, use ``zstd.train_cover_dictionary()`` instead of
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   871
``zstd.train_dictionary()``. The function behaves nearly the same except
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   872
its arguments are different and the returned dictionary will contain ``k``
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   873
and ``d`` attributes reflecting the parameters to the cover algorithm.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   874
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   875
.. note::
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   876
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   877
   The ``k`` and ``d`` attributes are only populated on dictionary
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   878
   instances created by this function. If a ``ZstdCompressionDict`` is
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   879
   constructed from raw bytes data, the ``k`` and ``d`` attributes will
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   880
   be ``0``.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   881
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   882
The segment and dmer size parameters to the cover algorithm can either be
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   883
specified manually or you can ask ``train_cover_dictionary()`` to try
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   884
multiple values and pick the best one, where *best* means the smallest
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   885
compressed data size.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   886
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   887
In manual mode, the ``k`` and ``d`` arguments must be specified or a
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   888
``ZstdError`` will be raised.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   889
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   890
In automatic mode (triggered by specifying ``optimize=True``), ``k``
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   891
and ``d`` are optional. If a value isn't specified, then default values for
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   892
both are tested.  The ``steps`` argument can control the number of steps
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   893
through ``k`` values. The ``level`` argument defines the compression level
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   894
that will be used when testing the compressed size. And ``threads`` can
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   895
specify the number of threads to use for concurrent operation.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   896
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   897
This function takes the following arguments:
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   898
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   899
dict_size
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   900
   Target size in bytes of the dictionary to generate.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   901
samples
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   902
   A list of bytes holding samples the dictionary will be trained from.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   903
k
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   904
   Parameter to cover algorithm defining the segment size. A reasonable range
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   905
   is [16, 2048+].
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   906
d
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   907
   Parameter to cover algorithm defining the dmer size. A reasonable range is
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   908
   [6, 16]. ``d`` must be less than or equal to ``k``.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   909
dict_id
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   910
   Integer dictionary ID for the produced dictionary. Default is 0, which uses
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   911
   a random value.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   912
optimize
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   913
   When true, test dictionary generation with multiple parameters.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   914
level
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   915
   Integer target compression level when testing compression with
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   916
   ``optimize=True``. Default is 1.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   917
steps
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   918
   Number of steps through ``k`` values to perform when ``optimize=True``.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   919
   Default is 32.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   920
threads
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   921
   Number of threads to use when ``optimize=True``. Default is 0, which means
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   922
   to use a single thread. A negative value can be specified to use as many
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   923
   threads as there are detected logical CPUs.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   924
notifications
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   925
   Controls writing of informational messages to ``stderr``. See the
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   926
   documentation for ``train_dictionary()`` for more.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
   927
30444
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   928
Explicit Compression Parameters
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   929
-------------------------------
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   930
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   931
Zstandard's integer compression levels along with the input size and dictionary
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   932
size are converted into a data structure defining multiple parameters to tune
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   933
behavior of the compression algorithm. It is possible to use define this
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   934
data structure explicitly to have lower-level control over compression behavior.
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   935
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   936
The ``zstd.CompressionParameters`` type represents this data structure.
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   937
You can see how Zstandard converts compression levels to this data structure
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   938
by calling ``zstd.get_compression_parameters()``. e.g.::
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   939
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   940
    params = zstd.get_compression_parameters(5)
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   941
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   942
This function also accepts the uncompressed data size and dictionary size
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   943
to adjust parameters::
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   944
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   945
    params = zstd.get_compression_parameters(3, source_size=len(data), dict_size=len(dict_data))
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   946
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   947
You can also construct compression parameters from their low-level components::
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   948
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   949
    params = zstd.CompressionParameters(20, 6, 12, 5, 4, 10, zstd.STRATEGY_FAST)
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   950
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   951
You can then configure a compressor to use the custom parameters::
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   952
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   953
    cctx = zstd.ZstdCompressor(compression_params=params)
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   954
30924
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
   955
The members/attributes of ``CompressionParameters`` instances are as follows::
30444
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   956
30924
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
   957
* window_log
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
   958
* chain_log
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
   959
* hash_log
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
   960
* search_log
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
   961
* search_length
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
   962
* target_length
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
   963
* strategy
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
   964
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
   965
This is the order the arguments are passed to the constructor if not using
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
   966
named arguments.
30444
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   967
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   968
You'll need to read the Zstandard documentation for what these parameters
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   969
do.
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
   970
30924
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
   971
Frame Inspection
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
   972
----------------
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
   973
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
   974
Data emitted from zstd compression is encapsulated in a *frame*. This frame
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
   975
begins with a 4 byte *magic number* header followed by 2 to 14 bytes describing
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
   976
the frame in more detail. For more info, see
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
   977
https://github.com/facebook/zstd/blob/master/doc/zstd_compression_format.md.
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
   978
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
   979
``zstd.get_frame_parameters(data)`` parses a zstd *frame* header from a bytes
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
   980
instance and return a ``FrameParameters`` object describing the frame.
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
   981
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
   982
Depending on which fields are present in the frame and their values, the
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
   983
length of the frame parameters varies. If insufficient bytes are passed
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
   984
in to fully parse the frame parameters, ``ZstdError`` is raised. To ensure
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
   985
frame parameters can be parsed, pass in at least 18 bytes.
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
   986
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
   987
``FrameParameters`` instances have the following attributes:
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
   988
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
   989
content_size
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
   990
   Integer size of original, uncompressed content. This will be ``0`` if the
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
   991
   original content size isn't written to the frame (controlled with the
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
   992
   ``write_content_size`` argument to ``ZstdCompressor``) or if the input
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
   993
   content size was ``0``.
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
   994
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
   995
window_size
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
   996
   Integer size of maximum back-reference distance in compressed data.
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
   997
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
   998
dict_id
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
   999
   Integer of dictionary ID used for compression. ``0`` if no dictionary
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
  1000
   ID was used or if the dictionary ID was ``0``.
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
  1001
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
  1002
has_checksum
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
  1003
   Bool indicating whether a 4 byte content checksum is stored at the end
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
  1004
   of the frame.
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
  1005
30444
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
  1006
Misc Functionality
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
  1007
------------------
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
  1008
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
  1009
estimate_compression_context_size(CompressionParameters)
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
  1010
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
  1011
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
  1012
Given a ``CompressionParameters`` struct, estimate the memory size required
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
  1013
to perform compression.
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
  1014
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
  1015
estimate_decompression_context_size()
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
  1016
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
  1017
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
  1018
Estimate the memory size requirements for a decompressor instance.
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
  1019
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
  1020
Constants
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
  1021
---------
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
  1022
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
  1023
The following module constants/attributes are exposed:
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
  1024
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
  1025
ZSTD_VERSION
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
  1026
    This module attribute exposes a 3-tuple of the Zstandard version. e.g.
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
  1027
    ``(1, 0, 0)``
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
  1028
MAX_COMPRESSION_LEVEL
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
  1029
    Integer max compression level accepted by compression functions
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
  1030
COMPRESSION_RECOMMENDED_INPUT_SIZE
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
  1031
    Recommended chunk size to feed to compressor functions
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
  1032
COMPRESSION_RECOMMENDED_OUTPUT_SIZE
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
  1033
    Recommended chunk size for compression output
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
  1034
DECOMPRESSION_RECOMMENDED_INPUT_SIZE
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
  1035
    Recommended chunk size to feed into decompresor functions
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
  1036
DECOMPRESSION_RECOMMENDED_OUTPUT_SIZE
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
  1037
    Recommended chunk size for decompression output
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
  1038
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
  1039
FRAME_HEADER
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
  1040
    bytes containing header of the Zstandard frame
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
  1041
MAGIC_NUMBER
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
  1042
    Frame header as an integer
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
  1043
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
  1044
WINDOWLOG_MIN
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
  1045
    Minimum value for compression parameter
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
  1046
WINDOWLOG_MAX
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
  1047
    Maximum value for compression parameter
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
  1048
CHAINLOG_MIN
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
  1049
    Minimum value for compression parameter
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
  1050
CHAINLOG_MAX
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
  1051
    Maximum value for compression parameter
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
  1052
HASHLOG_MIN
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
  1053
    Minimum value for compression parameter
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
  1054
HASHLOG_MAX
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
  1055
    Maximum value for compression parameter
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
  1056
SEARCHLOG_MIN
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
  1057
    Minimum value for compression parameter
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
  1058
SEARCHLOG_MAX
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
  1059
    Maximum value for compression parameter
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
  1060
SEARCHLENGTH_MIN
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
  1061
    Minimum value for compression parameter
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
  1062
SEARCHLENGTH_MAX
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
  1063
    Maximum value for compression parameter
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
  1064
TARGETLENGTH_MIN
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
  1065
    Minimum value for compression parameter
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
  1066
TARGETLENGTH_MAX
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
  1067
    Maximum value for compression parameter
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
  1068
STRATEGY_FAST
30924
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
  1069
    Compression strategy
30444
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
  1070
STRATEGY_DFAST
30924
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
  1071
    Compression strategy
30444
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
  1072
STRATEGY_GREEDY
30924
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
  1073
    Compression strategy
30444
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
  1074
STRATEGY_LAZY
30924
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
  1075
    Compression strategy
30444
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
  1076
STRATEGY_LAZY2
30924
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
  1077
    Compression strategy
30444
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
  1078
STRATEGY_BTLAZY2
30924
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
  1079
    Compression strategy
30444
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
  1080
STRATEGY_BTOPT
30924
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
  1081
    Compression strategy
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
  1082
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
  1083
Performance Considerations
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
  1084
--------------------------
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
  1085
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
  1086
The ``ZstdCompressor`` and ``ZstdDecompressor`` types maintain state to a
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
  1087
persistent compression or decompression *context*. Reusing a ``ZstdCompressor``
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
  1088
or ``ZstdDecompressor`` instance for multiple operations is faster than
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
  1089
instantiating a new ``ZstdCompressor`` or ``ZstdDecompressor`` for each
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
  1090
operation. The differences are magnified as the size of data decreases. For
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
  1091
example, the difference between *context* reuse and non-reuse for 100,000
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
  1092
100 byte inputs will be significant (possiby over 10x faster to reuse contexts)
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
  1093
whereas 10 1,000,000 byte inputs will be more similar in speed (because the
c32454d69b85 zstd: vendor python-zstandard 0.7.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30822
diff changeset
  1094
time spent doing compression dwarfs time spent creating new *contexts*).
30444
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
  1095
31799
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1096
Buffer Types
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1097
------------
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1098
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1099
The API exposes a handful of custom types for interfacing with memory buffers.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1100
The primary goal of these types is to facilitate efficient multi-object
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1101
operations.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1102
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1103
The essential idea is to have a single memory allocation provide backing
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1104
storage for multiple logical objects. This has 2 main advantages: fewer
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1105
allocations and optimal memory access patterns. This avoids having to allocate
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1106
a Python object for each logical object and furthermore ensures that access of
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1107
data for objects can be sequential (read: fast) in memory.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1108
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1109
BufferWithSegments
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1110
^^^^^^^^^^^^^^^^^^
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1111
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1112
The ``BufferWithSegments`` type represents a memory buffer containing N
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1113
discrete items of known lengths (segments). It is essentially a fixed size
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1114
memory address and an array of 2-tuples of ``(offset, length)`` 64-bit
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1115
unsigned native endian integers defining the byte offset and length of each
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1116
segment within the buffer.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1117
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1118
Instances behave like containers.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1119
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1120
``len()`` returns the number of segments within the instance.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1121
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1122
``o[index]`` or ``__getitem__`` obtains a ``BufferSegment`` representing an
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1123
individual segment within the backing buffer. That returned object references
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1124
(not copies) memory. This means that iterating all objects doesn't copy
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1125
data within the buffer.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1126
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1127
The ``.size`` attribute contains the total size in bytes of the backing
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1128
buffer.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1129
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1130
Instances conform to the buffer protocol. So a reference to the backing bytes
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1131
can be obtained via ``memoryview(o)``. A *copy* of the backing bytes can also
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1132
be obtained via ``.tobytes()``.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1133
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1134
The ``.segments`` attribute exposes the array of ``(offset, length)`` for
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1135
segments within the buffer. It is a ``BufferSegments`` type.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1136
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1137
BufferSegment
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1138
^^^^^^^^^^^^^
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1139
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1140
The ``BufferSegment`` type represents a segment within a ``BufferWithSegments``.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1141
It is essentially a reference to N bytes within a ``BufferWithSegments``.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1142
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1143
``len()`` returns the length of the segment in bytes.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1144
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1145
``.offset`` contains the byte offset of this segment within its parent
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1146
``BufferWithSegments`` instance.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1147
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1148
The object conforms to the buffer protocol. ``.tobytes()`` can be called to
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1149
obtain a ``bytes`` instance with a copy of the backing bytes.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1150
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1151
BufferSegments
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1152
^^^^^^^^^^^^^^
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1153
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1154
This type represents an array of ``(offset, length)`` integers defining segments
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1155
within a ``BufferWithSegments``.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1156
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1157
The array members are 64-bit unsigned integers using host/native bit order.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1158
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1159
Instances conform to the buffer protocol.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1160
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1161
BufferWithSegmentsCollection
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1162
^^^^^^^^^^^^^^^^^^^^^^^^^^^^
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1163
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1164
The ``BufferWithSegmentsCollection`` type represents a virtual spanning view
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1165
of multiple ``BufferWithSegments`` instances.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1166
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1167
Instances are constructed from 1 or more ``BufferWithSegments`` instances. The
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1168
resulting object behaves like an ordered sequence whose members are the
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1169
segments within each ``BufferWithSegments``.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1170
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1171
``len()`` returns the number of segments within all ``BufferWithSegments``
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1172
instances.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1173
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1174
``o[index]`` and ``__getitem__(index)`` return the ``BufferSegment`` at
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1175
that offset as if all ``BufferWithSegments`` instances were a single
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1176
entity.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1177
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1178
If the object is composed of 2 ``BufferWithSegments`` instances with the
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1179
first having 2 segments and the second have 3 segments, then ``b[0]``
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1180
and ``b[1]`` access segments in the first object and ``b[2]``, ``b[3]``,
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1181
and ``b[4]`` access segments from the second.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1182
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1183
Choosing an API
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1184
===============
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1185
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1186
There are multiple APIs for performing compression and decompression. This is
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1187
because different applications have different needs and the library wants to
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1188
facilitate optimal use in as many use cases as possible.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1189
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1190
From a high-level, APIs are divided into *one-shot* and *streaming*. See
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1191
the ``Concepts`` section for a description of how these are different at
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1192
the C layer.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1193
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1194
The *one-shot* APIs are useful for small data, where the input or output
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1195
size is known. (The size can come from a buffer length, file size, or
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1196
stored in the zstd frame header.) A limitation of the *one-shot* APIs is that
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1197
input and output must fit in memory simultaneously. For say a 4 GB input,
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1198
this is often not feasible.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1199
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1200
The *one-shot* APIs also perform all work as a single operation. So, if you
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1201
feed it large input, it could take a long time for the function to return.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1202
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1203
The streaming APIs do not have the limitations of the simple API. But the
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1204
price you pay for this flexibility is that they are more complex than a
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1205
single function call.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1206
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1207
The streaming APIs put the caller in control of compression and decompression
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1208
behavior by allowing them to directly control either the input or output side
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1209
of the operation.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1210
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1211
With the *streaming input*, *compressor*, and *decompressor* APIs, the caller
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1212
has full control over the input to the compression or decompression stream.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1213
They can directly choose when new data is operated on.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1214
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1215
With the *streaming ouput* APIs, the caller has full control over the output
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1216
of the compression or decompression stream. It can choose when to receive
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1217
new data.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1218
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1219
When using the *streaming* APIs that operate on file-like or stream objects,
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1220
it is important to consider what happens in that object when I/O is requested.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1221
There is potential for long pauses as data is read or written from the
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1222
underlying stream (say from interacting with a filesystem or network). This
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1223
could add considerable overhead.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1224
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1225
Concepts
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1226
========
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1227
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1228
It is important to have a basic understanding of how Zstandard works in order
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1229
to optimally use this library. In addition, there are some low-level Python
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1230
concepts that are worth explaining to aid understanding. This section aims to
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1231
provide that knowledge.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1232
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1233
Zstandard Frames and Compression Format
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1234
---------------------------------------
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1235
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1236
Compressed zstandard data almost always exists within a container called a
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1237
*frame*. (For the technically curious, see the
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1238
`specification <https://github.com/facebook/zstd/blob/3bee41a70eaf343fbcae3637b3f6edbe52f35ed8/doc/zstd_compression_format.md>_.)
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1239
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1240
The frame contains a header and optional trailer. The header contains a
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1241
magic number to self-identify as a zstd frame and a description of the
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1242
compressed data that follows.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1243
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1244
Among other things, the frame *optionally* contains the size of the
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1245
decompressed data the frame represents, a 32-bit checksum of the
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1246
decompressed data (to facilitate verification during decompression),
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1247
and the ID of the dictionary used to compress the data.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1248
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1249
Storing the original content size in the frame (``write_content_size=True``
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1250
to ``ZstdCompressor``) is important for performance in some scenarios. Having
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1251
the decompressed size stored there (or storing it elsewhere) allows
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1252
decompression to perform a single memory allocation that is exactly sized to
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1253
the output. This is faster than continuously growing a memory buffer to hold
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1254
output.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1255
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1256
Compression and Decompression Contexts
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1257
--------------------------------------
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1258
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1259
In order to perform a compression or decompression operation with the zstd
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1260
C API, you need what's called a *context*. A context essentially holds
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1261
configuration and state for a compression or decompression operation. For
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1262
example, a compression context holds the configured compression level.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1263
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1264
Contexts can be reused for multiple operations. Since creating and
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1265
destroying contexts is not free, there are performance advantages to
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1266
reusing contexts.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1267
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1268
The ``ZstdCompressor`` and ``ZstdDecompressor`` types are essentially
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1269
wrappers around these contexts in the zstd C API.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1270
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1271
One-shot And Streaming Operations
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1272
---------------------------------
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1273
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1274
A compression or decompression operation can either be performed as a
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1275
single *one-shot* operation or as a continuous *streaming* operation.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1276
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1277
In one-shot mode (the *simple* APIs provided by the Python interface),
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1278
**all** input is handed to the compressor or decompressor as a single buffer
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1279
and **all** output is returned as a single buffer.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1280
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1281
In streaming mode, input is delivered to the compressor or decompressor as
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1282
a series of chunks via multiple function calls. Likewise, output is
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1283
obtained in chunks as well.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1284
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1285
Streaming operations require an additional *stream* object to be created
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1286
to track the operation. These are logical extensions of *context*
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1287
instances.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1288
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1289
There are advantages and disadvantages to each mode of operation. There
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1290
are scenarios where certain modes can't be used. See the
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1291
``Choosing an API`` section for more.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1292
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1293
Dictionaries
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1294
------------
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1295
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1296
A compression *dictionary* is essentially data used to seed the compressor
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1297
state so it can achieve better compression. The idea is that if you are
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1298
compressing a lot of similar pieces of data (e.g. JSON documents or anything
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1299
sharing similar structure), then you can find common patterns across multiple
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1300
objects then leverage those common patterns during compression and
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1301
decompression operations to achieve better compression ratios.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1302
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1303
Dictionary compression is generally only useful for small inputs - data no
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1304
larger than a few kilobytes. The upper bound on this range is highly dependent
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1305
on the input data and the dictionary.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1306
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1307
Python Buffer Protocol
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1308
----------------------
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1309
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1310
Many functions in the library operate on objects that implement Python's
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1311
`buffer protocol <https://docs.python.org/3.6/c-api/buffer.html>`_.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1312
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1313
The *buffer protocol* is an internal implementation detail of a Python
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1314
type that allows instances of that type (objects) to be exposed as a raw
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1315
pointer (or buffer) in the C API. In other words, it allows objects to be
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1316
exposed as an array of bytes.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1317
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1318
From the perspective of the C API, objects implementing the *buffer protocol*
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1319
all look the same: they are just a pointer to a memory address of a defined
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1320
length. This allows the C API to be largely type agnostic when accessing their
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1321
data. This allows custom types to be passed in without first converting them
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1322
to a specific type.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1323
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1324
Many Python types implement the buffer protocol. These include ``bytes``
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1325
(``str`` on Python 2), ``bytearray``, ``array.array``, ``io.BytesIO``,
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1326
``mmap.mmap``, and ``memoryview``.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1327
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1328
``python-zstandard`` APIs that accept objects conforming to the buffer
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1329
protocol require that the buffer is *C contiguous* and has a single
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1330
dimension (``ndim==1``). This is usually the case. An example of where it
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1331
is not is a Numpy matrix type.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1332
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1333
Requiring Output Sizes for Non-Streaming Decompression APIs
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1334
-----------------------------------------------------------
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1335
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1336
Non-streaming decompression APIs require that either the output size is
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1337
explicitly defined (either in the zstd frame header or passed into the
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1338
function) or that a max output size is specified. This restriction is for
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1339
your safety.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1340
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1341
The *one-shot* decompression APIs store the decompressed result in a
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1342
single buffer. This means that a buffer needs to be pre-allocated to hold
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1343
the result. If the decompressed size is not known, then there is no universal
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1344
good default size to use. Any default will fail or will be highly sub-optimal
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1345
in some scenarios (it will either be too small or will put stress on the
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1346
memory allocator to allocate a too large block).
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1347
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1348
A *helpful* API may retry decompression with buffers of increasing size.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1349
While useful, there are obvious performance disadvantages, namely redoing
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1350
decompression N times until it works. In addition, there is a security
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1351
concern. Say the input came from highly compressible data, like 1 GB of the
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1352
same byte value. The output size could be several magnitudes larger than the
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1353
input size. An input of <100KB could decompress to >1GB. Without a bounds
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1354
restriction on the decompressed size, certain inputs could exhaust all system
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1355
memory. That's not good and is why the maximum output size is limited.
e0dc40530c5a zstd: vendor python-zstandard 0.8.0
Gregory Szorc <gregory.szorc@gmail.com>
parents: 30924
diff changeset
  1356
30444
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
  1357
Note on Zstandard's *Experimental* API
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
  1358
======================================
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
  1359
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
  1360
Many of the Zstandard APIs used by this module are marked as *experimental*
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
  1361
within the Zstandard project. This includes a large number of useful
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
  1362
features, such as compression and frame parameters and parts of dictionary
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
  1363
compression.
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
  1364
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
  1365
It is unclear how Zstandard's C API will evolve over time, especially with
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
  1366
regards to this *experimental* functionality. We will try to maintain
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
  1367
backwards compatibility at the Python API level. However, we cannot
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
  1368
guarantee this for things not under our control.
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
  1369
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
  1370
Since a copy of the Zstandard source code is distributed with this
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
  1371
module and since we compile against it, the behavior of a specific
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
  1372
version of this module should be constant for all of time. So if you
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
  1373
pin the version of this module used in your projects (which is a Python
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
  1374
best practice), you should be buffered from unwanted future changes.
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
  1375
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
  1376
Donate
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
  1377
======
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
  1378
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
  1379
A lot of time has been invested into this project by the author.
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
  1380
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
  1381
If you find this project useful and would like to thank the author for
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
  1382
their work, consider donating some money. Any amount is appreciated.
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
  1383
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
  1384
.. image:: https://www.paypalobjects.com/en_US/i/btn/btn_donate_LG.gif
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
  1385
    :target: https://www.paypal.com/cgi-bin/webscr?cmd=_donations&business=gregory%2eszorc%40gmail%2ecom&lc=US&item_name=python%2dzstandard&currency_code=USD&bn=PP%2dDonationsBF%3abtn_donate_LG%2egif%3aNonHosted
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
  1386
    :alt: Donate via PayPal
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
  1387
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
  1388
.. |ci-status| image:: https://travis-ci.org/indygreg/python-zstandard.svg?branch=master
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
  1389
    :target: https://travis-ci.org/indygreg/python-zstandard
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
  1390
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
  1391
.. |win-ci-status| image:: https://ci.appveyor.com/api/projects/status/github/indygreg/python-zstandard?svg=true
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
  1392
    :target: https://ci.appveyor.com/project/indygreg/python-zstandard
b86a448a2965 zstd: vendor python-zstandard 0.5.0
Gregory Szorc <gregory.szorc@gmail.com>
parents:
diff changeset
  1393
    :alt: Windows build status