Merge pull request #6094 from radarhere/decoder

radarhere · web-flow · commit 397a9409959b · 2022-03-07T08:01:31.000+11:00
Improved codec documentation
diff --git a/docs/handbook/writing-your-own-image-plugin.rst b/docs/handbook/writing-your-own-image-plugin.rst
@@ -123,8 +123,12 @@ The ``tile`` attribute
 To be able to read the file as well as just identifying it, the ``tile``
 attribute must also be set. This attribute consists of a list of tile
 descriptors, where each descriptor specifies how data should be loaded to a
-given region in the image. In most cases, only a single descriptor is used,
-covering the full image.
+given region in the image.
+
+In most cases, only a single descriptor is used, covering the full image.
+:py:class:`.PsdImagePlugin.PsdImageFile` uses multiple tiles to combine
+channels within a single layer, given that the channels are stored separately,
+one after the other.
 
 The tile descriptor is a 4-tuple with the following contents::
 
@@ -324,42 +328,42 @@ The fields are used as follows:
     Whether the first line in the image is the top line on the screen (1), or
     the bottom line (-1). If omitted, the orientation defaults to 1.
 
-.. _file-decoders:
+.. _file-codecs:
 
-Writing Your Own File Decoder in C
-==================================
+Writing Your Own File Codec in C
+================================
 
-There are 3 stages in a file decoder's lifetime:
+There are 3 stages in a file codec's lifetime:
 
-1. Setup: Pillow looks for a function in the decoder registry, falling
-   back to a function named ``[decodername]_decoder`` on the internal
-   core image object.  That function is called with the ``args`` tuple
-   from the ``tile`` setup in the ``_open`` method.
+1. Setup: Pillow looks for a function in the decoder or encoder registry,
+   falling back to a function named ``[codecname]_decoder`` or
+   ``[codecname]_encoder`` on the internal core image object. That function is
+   called with the ``args`` tuple from the ``tile``.
 
-2. Decoding: The decoder's decode function is repeatedly called with
-   chunks of image data.
+2. Transforming: The codec's ``decode`` or ``encode`` function is repeatedly
+   called with chunks of image data.
 
-3. Cleanup: If the decoder has registered a cleanup function, it will
-   be called at the end of the decoding process, even if there was an
+3. Cleanup: If the codec has registered a cleanup function, it will
+   be called at the end of the transformation process, even if there was an
    exception raised.
 
 
 Setup
 -----
 
-The current conventions are that the decoder setup function is named
-``PyImaging_[Decodername]DecoderNew`` and defined in ``decode.c``. The
-python binding for it is named ``[decodername]_decoder`` and is setup
-from within the ``_imaging.c`` file in the codecs section of the
-function array.
+The current conventions are that the codec setup function is named
+``PyImaging_[codecname]DecoderNew`` or ``PyImaging_[codecname]EncoderNew``
+and defined in ``decode.c`` or ``encode.c``. The Python binding for it is
+named ``[codecname]_decoder`` or ``[codecname]_encoder`` and is set up from
+within the ``_imaging.c`` file in the codecs section of the function array.
 
-The setup function needs to call ``PyImaging_DecoderNew`` and at the
-very least, set the ``decode`` function pointer. The fields of
-interest in this object are:
+The setup function needs to call ``PyImaging_DecoderNew`` or
+``PyImaging_EncoderNew`` and at the very least, set the ``decode`` or
+``encode`` function pointer. The fields of interest in this object are:
 
-**decode**
-  Function pointer to the decode function, which has access to
-  ``im``, ``state``, and the buffer of data to be added to the image.
+**decode**/**encode**
+  Function pointer to the decode or encode function, which has access to
+  ``im``, ``state``, and the buffer of data to be transformed.
 
 **cleanup**
   Function pointer to the cleanup function, has access to ``state``.
@@ -369,36 +373,34 @@ interest in this object are:
 
 **state**
   An ImagingCodecStateInstance, will be set by Pillow. The ``context``
-  member is an opaque struct that can be used by the decoder to store
+  member is an opaque struct that can be used by the codec to store
   any format specific state or options.
 
-**pulls_fd**
-  **EXPERIMENTAL** -- **WARNING**, interface may change. If set to 1,
-  ``state->fd`` will be a pointer to the Python file like object.  The
-  decoder may use the functions in ``codec_fd.c`` to read directly
-  from the file like object rather than have the data pushed through a
-  buffer.  Note that this implementation may be refactored until this
-  warning is removed.
+**pulls_fd**/**pushes_fd**
+  If the decoder has ``pulls_fd`` or the encoder has ``pushes_fd`` set to 1,
+  ``state->fd`` will be a pointer to the Python file like object. The codec may
+  use the functions in ``codec_fd.c`` to read or write directly with the file
+  like object rather than have the data pushed through a buffer.
 
   .. versionadded:: 3.3.0
 
 
-Decoding
---------
+Transforming
+------------
 
-The decode function is called with the target (core) image, the
-decoder state structure, and a buffer of data to be decoded.
+The decode or encode function is called with the target (core) image, the codec
+state structure, and a buffer of data to be transformed.
 
-**Experimental** -- If ``pulls_fd`` is set, then the decode function
-is called once, with an empty buffer. It is the decoder's
-responsibility to decode the entire tile in that one call.  The rest of
-this section only applies if ``pulls_fd`` is not set.
+It is the codec's responsibility to pull as much data as possible out of the
+buffer and return the number of bytes consumed. The next call to the codec will
+include the previous unconsumed tail. The codec function will be called
+multiple times as the data processed.
 
-It is the decoder's responsibility to pull as much data as possible
-out of the buffer and return the number of bytes consumed. The next
-call to the decoder will include the previous unconsumed tail. The
-decoder function will be called multiple times as the data is read
-from the file like object.
+Alternatively, if ``pulls_fd`` or ``pushes_fd`` is set, then the decode or
+encode function is called once, with an empty buffer. It is the codec's
+responsibility to transform the entire tile in that one call.  Using this will
+provide a codec with more freedom, but that freedom may mean increased memory
+usage if the entire tile is held in memory at once by the codec.
 
 If an error occurs, set ``state->errcode`` and return -1.
 
@@ -407,10 +409,9 @@ Return -1 on success, without setting the errcode.
 Cleanup
 -------
 
-The cleanup function is called after the decoder returns a negative
-value, or if there is a read error from the file. This function should
-free any allocated memory and release any resources from external
-libraries.
+The cleanup function is called after the codec returns a negative
+value, or if there is an error. This function should free any allocated
+memory and release any resources from external libraries.
 
 .. _file-codecs-py:
 
@@ -425,11 +426,32 @@ They should be registered using :py:meth:`PIL.Image.register_decoder` and
 the file codecs, there are three stages in the lifetime of a
 Python-based file codec:
 
-1. Setup: Pillow looks for the decoder in the registry, then
+1. Setup: Pillow looks for the codec in the decoder or encoder registry, then
    instantiates the class.
 
 2. Transforming: The instance's ``decode`` method is repeatedly called with
    a buffer of data to be interpreted, or the ``encode`` method is repeatedly
    called with the size of data to be output.
 
-3. Cleanup: The instance's ``cleanup`` method is called.
+   Alternatively, if the decoder's ``_pulls_fd`` property (or the encoder's
+   ``_pushes_fd`` property) is set to ``True``, then ``decode`` and ``encode``
+   will only be called once. In the decoder, ``self.fd`` can be used to access
+   the file-like object. Using this will provide a codec with more freedom, but
+   that freedom may mean increased memory usage if entire file is held in
+   memory at once by the codec.
+
+   In ``decode``, once the data has been interpreted, ``set_as_raw`` can be
+   used to populate the image.
+
+3. Cleanup: The instance's ``cleanup`` method is called once the transformation
+   is complete. This can be used to clean up any resources used by the codec.
+
+   If you set ``_pulls_fd`` or ``_pushes_fd`` to ``True`` however, then you
+   probably chose to perform any cleanup tasks  at the end of ``decode`` or
+   ``encode``.
+
+For an example :py:class:`PIL.ImageFile.PyDecoder`, see `DdsImagePlugin
+<https://github.com/python-pillow/Pillow/blob/main/docs/example/DdsImagePlugin.py>`_.
+For a plugin that uses both :py:class:`PIL.ImageFile.PyDecoder` and
+:py:class:`PIL.ImageFile.PyEncoder`, see `BlpImagePlugin
+<https://github.com/python-pillow/Pillow/blob/main/src/PIL/BlpImagePlugin.py>`_
diff --git a/src/PIL/Image.py b/src/PIL/Image.py
@@ -2780,9 +2780,9 @@ def frombytes(mode, size, data, decoder_name="raw", *args):
     In its simplest form, this function takes three arguments
     (mode, size, and unpacked pixel data).
 
-    You can also use any pixel decoder supported by PIL.  For more
+    You can also use any pixel decoder supported by PIL. For more
     information on available decoders, see the section
-    :ref:`Writing Your Own File Decoder <file-decoders>`.
+    :ref:`Writing Your Own File Codec <file-codecs>`.
 
     Note that this function decodes pixel data only, not entire images.
     If you have an entire image in a string, wrap it in a
diff --git a/src/PIL/ImageFile.py b/src/PIL/ImageFile.py
@@ -718,6 +718,9 @@ def encode(self, bufsize):
 
     def encode_to_pyfd(self):
         """
+        If ``pushes_fd`` is ``True``, then this method will be used,
+        and ``encode()`` will only be called once.
+
         :returns: A tuple of ``(bytes consumed, errcode)``.
             Err codes are from :data:`.ImageFile.ERRORS`.
         """