Commit 615ed7f

Updates to respond to review

- Rewrite the digested/undigested dict section
- Add a section for exceptions
- wrap lines to ~80 chars
- clarify that an exception is raised if level is passed to a decompressor
- make quotes in docs for open vs ZstdFile consistent
- Remove currently and repeated "note"
1 parent 987bd27 commit 615ed7f

Doc/library/compression.zstd.rst

Lines changed: 70 additions & 77 deletions
@@ -13,9 +13,9 @@
 
 This module provides classes and functions for compressing and
 decompressing data using the Zstandard (or *zstd*) compression algorithm. Also
-included is a file interface that supports reading and writing the contents of ``.zst`` files.
-files created by the :program:`zstd` utility, as well as raw zstd compressed
-streams.
+included is a file interface that supports reading and writing the contents of
+``.zst`` files created by the :program:`zstd` utility, as well as raw zstd
+compressed streams.
 
 The :mod:`!compression.zstd` module contains:
 
@@ -31,6 +31,9 @@ The :mod:`!compression.zstd` module contains:
 :class:`Strategy` classes for setting advanced (de)compression parameters.
 
 
+Exceptions
+----------
+
 .. exception:: ZstdError
 
 This exception is raised when an error occurs during compression or
@@ -52,16 +55,17 @@ Reading and writing compressed files
 to read from or write to.
 
 The mode argument can be either ``'r'`` for reading (default), ``'w'`` for
-overwriting, ``'a'`` for appending, or ``'x'`` for exclusive creation. These can
-equivalently be given as ``'rb'``, ``'wb'``, ``'ab'``, and ``'xb'`` respectively. You may
-also open in text mode with ``'rt'``, ``'wt'``, ``'at'``, and ``'xt'`` respectively.
+overwriting, ``'a'`` for appending, or ``'x'`` for exclusive creation. These
+can equivalently be given as ``'rb'``, ``'wb'``, ``'ab'``, and ``'xb'``
+respectively. You may also open in text mode with ``'rt'``, ``'wt'``,
+``'at'``, and ``'xt'`` respectively.
 
 When opening a file for reading, the *options* argument can be a dictionary
 providing advanced decompression parameters; see
 :class:`DecompressionParameter` for detailed information about supported
 parameters. The *zstd_dict* argument is a :class:`ZstdDict` instance to be
-used during decompression. When opening a file for reading, the *level*
-argument should not be used.
+used during decompression. When opening a file for reading, if the *level*
+argument is passed a :exc:`!TypeError` will be raised.
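The mode and *level* behaviour described in this hunk can be illustrated with a minimal sketch. It assumes Python 3.14 or later (where ``compression.zstd`` exists) and uses an in-memory ``io.BytesIO`` as a stand-in for a real file; the helper names ``roundtrip_text`` and ``level_rejected_on_read`` are hypothetical and exist only for this illustration.

```python
import io

try:
    from compression import zstd  # available from Python 3.14
except ImportError:
    zstd = None  # older interpreter: the sketch below is skipped


def roundtrip_text(text):
    """Write text with mode 'wt', then read it back with mode 'rt'."""
    buf = io.BytesIO()  # stand-in for a file on disk
    with zstd.open(buf, 'wt', encoding='utf-8') as f:
        f.write(text)
    buf.seek(0)  # the wrapped file object is left open, so it can be rewound
    with zstd.open(buf, 'rt', encoding='utf-8') as f:
        return f.read()


def level_rejected_on_read():
    """Return True if passing *level* while reading raises TypeError."""
    blob = zstd.compress(b'payload')
    try:
        zstd.open(io.BytesIO(blob), 'rb', level=3)
    except TypeError:
        return True
    return False


if zstd is not None:
    restored = roundtrip_text('hello, zstd\n')
    rejected = level_rejected_on_read()
```

Catching ``TypeError`` explicitly, rather than a broad ``Exception``, keeps the sketch tied to the behaviour this commit documents.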
 
 When opening a file for writing, the *options* argument can be a dictionary
 providing advanced decompression parameters; see
@@ -76,11 +80,12 @@ Reading and writing compressed files
 *encoding*, *errors*, and *newline* parameters must not be provided.
 
 In text mode, a :class:`ZstdFile` object is created, and wrapped in an
-:class:`io.TextIOWrapper` instance with the specified encoding, error handling
-behavior, and line endings.
+:class:`io.TextIOWrapper` instance with the specified encoding, error
+handling behavior, and line endings.
 
 
-.. class:: ZstdFile(file, /, mode='r', *, level=None, options=None, zstd_dict=None)
+.. class:: ZstdFile(file, /, mode='r', *, level=None, options=None, \
+zstd_dict=None)
 
 Open a Zstandard-compressed file in binary mode.
 
@@ -91,20 +96,20 @@ Reading and writing compressed files
 wrapping an existing file object, the wrapped file will not be closed when
 the :class:`ZstdFile` is closed.
 
-The *mode* argument can be either ``"r"`` for reading (default), ``"w"`` for
-overwriting, ``"x"`` for exclusive creation, or ``"a"`` for appending. These
-can equivalently be given as ``"rb"``, ``"wb"``, ``"xb"`` and ``"ab"``
+The *mode* argument can be either ``'r'`` for reading (default), ``'w'`` for
+overwriting, ``'x'`` for exclusive creation, or ``'a'`` for appending. These
+can equivalently be given as ``'rb'``, ``'wb'``, ``'xb'`` and ``'ab'``
 respectively.
 
 If *file* is a file object (rather than an actual file name), a mode of
-``"w"`` does not truncate the file, and is instead equivalent to ``"a"``.
+``'w'`` does not truncate the file, and is instead equivalent to ``'a'``.
 
 When opening a file for reading, the *options* argument can be a dictionary
 providing advanced decompression parameters, see
 :class:`DecompressionParameter` for detailed information about supported
 parameters. The *zstd_dict* argument is a :class:`!ZstdDict` instance to be
-used during decompression. When opening a file for reading, the *level*
-argument should not be used.
+used during decompression. When opening a file for reading, if the *level*
+argument is passed a :exc:`!TypeError` will be raised.
 
 When opening a file for writing, the *options* argument can be a dictionary
 providing advanced decompression parameters, see
@@ -132,8 +137,8 @@ Reading and writing compressed files
 
 .. note:: While calling :meth:`peek` does not change the file position of
 the :class:`ZstdFile`, it may change the position of the underlying
-file object (for example, if the :class:`ZstdFile` was constructed by passing a
-file object for *filename*).
+file object (for example, if the :class:`ZstdFile` was constructed by
+passing a file object for *file*).
 
 .. attribute:: mode
 
@@ -307,7 +312,7 @@ Compressing and decompressing data in memory
 
 Data found after the end of the compressed stream.
 
-Before the end of the stream is reached, this will be ``b""``.
+Before the end of the stream is reached, this will be ``b''``.
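The attribute documented in this hunk (``unused_data`` on a decompressor object) can be seen in a short sketch. It assumes Python 3.14 or later and the module's ``ZstdDecompressor`` class; the helper ``split_trailing`` and the sample bytes are hypothetical.

```python
try:
    from compression import zstd  # available from Python 3.14
except ImportError:
    zstd = None  # older interpreter: the sketch below is skipped


def split_trailing(blob):
    """Decompress one frame and return (payload, bytes found after it)."""
    decompressor = zstd.ZstdDecompressor()
    payload = decompressor.decompress(blob)
    return payload, decompressor.unused_data


if zstd is not None:
    frame = zstd.compress(b'payload')

    # Feed only the first few bytes: the stream has not ended yet,
    # so unused_data is still empty.
    partial = zstd.ZstdDecompressor()
    partial.decompress(frame[:5])
    before_end = partial.unused_data

    # Feed the whole frame followed by unrelated bytes.
    payload, trailing = split_trailing(frame + b'trailing bytes')
```

Reading ``unused_data`` only after the end of the stream has been reached is what makes it useful for formats that append extra data after a compressed frame.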
 
 .. attribute:: needs_input
 
@@ -377,6 +382,40 @@ Zstandard dictionaries
 is an ordinary Zstandard dictionary, created from Zstandard functions,
 for example, :func:`train_dict` or the ``zstd`` CLI.
 
+When passing a :class:`!ZstdDict` to a function, the
+:attr:`!as_digested_dict` and :attr:`!as_undigested_dict` attributes can
+control how the dictionary is loaded by passing them as the ``zstd_dict``
+argument, for example, ``compress(data, zstd_dict=zd.as_digested_dict)``.
+Digesting a dictionary is a costly operation that occurs when loading a
+Zstandard dictionary. When making multiple calls to compression or
+decompression, passing a digested dictionary will reduce the overhead of
+loading the dictionary.
+
+.. list-table:: Difference for compression
+:widths: 10 14 10
+:header-rows: 1
+
+* -
+- Digested dictionary
+- Undigested dictionary
+* - Advanced parameters of the compressor which may be overridden by
+the dictionary's parameters
+- ``window_log``, ``hash_log``, ``chain_log``, ``search_log``,
+``min_match``, ``target_length``, ``strategy``,
+``enable_long_distance_matching``, ``ldm_hash_log``,
+``ldm_min_match``, ``ldm_bucket_size_log``, ``ldm_hash_rate_log``,
+and some non-public parameters.
+- None
+* - :class:`!ZstdDict` internally caches the dictionary
+- Yes. It's faster when loading a digested dictionary again with the
+same compression level.
+- No. If you wish to load an undigested dictionary multiple times,
+consider reusing a compressor object.
+
+If passing a :class:`!ZstdDict` without any attribute, an undigested
+dictionary is passed by default when compressing and a digested dictionary
+is passed by default when decompressing.
+
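The ``zstd_dict=zd.as_digested_dict`` pattern from the added paragraph can be sketched as follows. This assumes Python 3.14 or later. To avoid training a dictionary, the sketch builds a raw-content dictionary with ``ZstdDict(..., is_raw=True)``, which is a choice made for this illustration and not something the commit describes; the helper ``compress_many`` and the sample records are hypothetical.

```python
try:
    from compression import zstd  # available from Python 3.14
except ImportError:
    zstd = None  # older interpreter: the sketch below is skipped


def compress_many(chunks, zd):
    """Compress several chunks, loading the dictionary in digested form."""
    digested = zd.as_digested_dict
    return [zstd.compress(chunk, zstd_dict=digested) for chunk in chunks]


if zstd is not None:
    # Hypothetical raw-content dictionary: bytes resembling the data to compress.
    zd = zstd.ZstdDict(b'{"user": "", "action": "", "status": "ok"}', is_raw=True)
    chunks = [
        b'{"user": "ada", "action": "login", "status": "ok"}',
        b'{"user": "bob", "action": "logout", "status": "ok"}',
    ]
    blobs = compress_many(chunks, zd)
    # Passing the ZstdDict itself uses a digested dictionary when decompressing.
    restored = [zstd.decompress(blob, zstd_dict=zd) for blob in blobs]
```

Per the table above, the digested form is cached inside the ``ZstdDict``, so repeated calls at the same compression level avoid digesting the dictionary again.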
 .. attribute:: dict_content
 
 The content of the Zstandard dictionary, a ``bytes`` object. It's the
@@ -407,53 +446,6 @@ Zstandard dictionaries
 
 Load as an undigested dictionary.
 
-Digesting a dictionary is a costly operation. These two attributes can
-control how the dictionary is loaded to the compressor, by passing them
-as the ``zstd_dict`` argument, for example,
-``compress(data, zstd_dict=zd.as_digested_dict)``.
-
-If don't use one of these attributes, an **undigested** dictionary is
-passed by default.
-
-.. list-table:: Difference for compression
-:widths: 12 12 12
-:header-rows: 1
-
-* -
-- | Digested
-| dictionary
-- | Undigested
-| dictionary
-* - | Some advanced
-| parameters of the
-| compressor may
-| be overridden
-| by dictionary's
-| parameters
-- | ``window_log``, ``hash_log``,
-| ``chain_log``, ``search_log``,
-| ``min_match``, ``target_length``,
-| ``strategy``,
-| ``enable_long_distance_matching``,
-| ``ldm_hash_log``, ``ldm_min_match``,
-| ``ldm_bucket_size_log``,
-| ``ldm_hash_rate_log``, and some
-| non-public parameters.
-- No
-* - | ZstdDict internally
-| caches the dictionary
-- | Yes. It's faster when
-| loading a digested
-| dictionary again with the same
-| compression level.
-- | No. If you wish to load an undigested
-| dictionary multiple times,
-| consider reusing a
-| compressor object.
-
-A **digested** dictionary is used for decompression by default, which
-is faster when loaded multiple times.
-
 
 Advanced parameter control
 --------------------------
@@ -482,14 +474,14 @@ Advanced parameter control
 .. attribute:: compression_level
 
 A high-level means of setting other compression parameters that affect
-the speed and ratio of compressing data. Setting the level to zero uses the
-default :attr:`COMPRESSION_LEVEL_DEFAULT`.
+the speed and ratio of compressing data. Setting the level to zero uses
+the default :attr:`COMPRESSION_LEVEL_DEFAULT`.
 
 .. attribute:: window_log
 
 Maximum allowed back-reference distance the compressor can use when
-compressing data, expressed as power of two, ``1 << window_log`` bytes. This
-parameter greatly influences the memory usage of compression. Higher
+compressing data, expressed as power of two, ``1 << window_log`` bytes.
+This parameter greatly influences the memory usage of compression. Higher
 values require more memory but gain better compression values.
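The ``1 << window_log`` relationship is plain arithmetic and can be checked without the module. The helper name ``window_size_bytes`` and the sample values are chosen for this illustration only; they are not bounds taken from the library.

```python
def window_size_bytes(window_log):
    """Return the back-reference window size implied by a window_log value."""
    return 1 << window_log


# Each increment of window_log doubles the window (and the memory it needs).
sizes = {window_log: window_size_bytes(window_log) for window_log in (10, 20, 27)}
# sizes == {10: 1024, 20: 1048576, 27: 134217728}, i.e. 1 KiB, 1 MiB, 128 MiB
```

Because the size grows exponentially, small changes to ``window_log`` have a large effect on memory usage.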
 
 .. attribute:: hash_log
@@ -519,9 +511,9 @@ Advanced parameter control
 Minimum size of searched matches. Larger values increase compression and
 decompression speed, but decrease ratio. Note that Zstandard can still
 find matches of smaller size, it just tweaks its search algorithm to look
-for this size and larger. Note that currently, for all strategies
-< :attr:`~Strategy.btopt`, the effective minimum is ``4``, for all
-strategies > :attr:`~Strategy.fast`, the effective maximum is ``6``.
+for this size and larger. For all strategies < :attr:`~Strategy.btopt`,
+the effective minimum is ``4``, for all strategies
+> :attr:`~Strategy.fast`, the effective maximum is ``6``.
 
 .. attribute:: target_length
 
@@ -599,7 +591,7 @@ Advanced parameter control
 
 Select how many threads will be spawned to compress in parallel. When
 :attr:`!nb_workers` >= 1, enables multi-threaded compression, 1
-means "1-thread multi-threaded mode". More workers improve speed, but
+means "one-thread multi-threaded mode". More workers improve speed, but
 also increase memory usage and slightly reduce compression ratio.
 
 .. attribute:: job_size
@@ -655,8 +647,9 @@ Advanced parameter control
 
 .. note::
 
-The values of attributes of :class:`Strategy` are not necessarily stable
-between zstd versions. Only the ordering may be relied upon.
+The values of attributes of :class:`!Strategy` are not necessarily stable
+across zstd versions. Only the ordering of the attributes may be relied
+upon.
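Since only the ordering of ``Strategy`` members is stable, code should compare members with each other and never with literal integers. A minimal sketch, assuming Python 3.14 or later and assuming that the ``min_match`` bounds documented earlier in this diff act as a simple clamp; the helper ``effective_min_match`` is hypothetical and does not exist in the module.

```python
try:
    from compression.zstd import Strategy  # available from Python 3.14
except ImportError:
    Strategy = None  # older interpreter: the sketch below is skipped


def effective_min_match(min_match, strategy):
    """Clamp min_match using ordering comparisons between Strategy members."""
    if strategy < Strategy.btopt:
        min_match = max(min_match, 4)  # documented effective minimum
    if strategy > Strategy.fast:
        min_match = min(min_match, 6)  # documented effective maximum
    return min_match


if Strategy is not None:
    low = effective_min_match(3, Strategy.fast)
    high = effective_min_match(7, Strategy.btopt)
```

Comparisons such as ``strategy < Strategy.btopt`` keep working even if the underlying integer values change between zstd versions.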
 
 The following strategies are available:
 