PEP 756: Remove Open Questions (#3968)

vstinner · web-flow · commit b6cf6d47f345 · 2024-09-17T15:34:14.000+02:00
diff --git a/peps/pep-0756.rst b/peps/pep-0756.rst
@@ -102,7 +102,12 @@ longer rationale.
 PyUnicode_Export()
 ------------------
 
-API: ``int32_t PyUnicode_Export(PyObject *unicode, int32_t requested_formats, Py_buffer *view)``.
+API::
+
+    int32_t PyUnicode_Export(
+        PyObject *unicode,
+        int32_t requested_formats,
+        Py_buffer *view)
 
 Export the contents of the *unicode* string in one of the *requested_formats*.
 
@@ -116,6 +121,10 @@ The contents of the buffer are valid until they are released.
 
 The buffer is read-only and must not be modified.
 
+The ``view->len`` member must be used to get the string length. The
+buffer should end with a trailing NUL character, but relying on it is
+not recommended, since the string can contain embedded NUL characters.
+
 *unicode* and *view* must not be NULL.
 
 Available formats:
@@ -152,14 +161,18 @@ needed. There are cases when a copy is needed, *O*\ (*n*) complexity:
 * If only UTF-8 is requested: the string is encoded to UTF-8 at the
   first call, and then the encoded UTF-8 string is cached.
 
-To have an *O*\ (1) complexity on CPython and PyPy, it's recommended to
+To get the best performance on CPython and PyPy, it's recommended to
 support these 4 formats::
 
     (PyUnicode_FORMAT_UCS1 \
      | PyUnicode_FORMAT_UCS2 \
      | PyUnicode_FORMAT_UCS4 \
      | PyUnicode_FORMAT_UTF8)
 
+PyPy uses UTF-8 natively, so the ``PyUnicode_FORMAT_UTF8`` format is
+recommended there. The export requires a memory copy, since PyPy ``str``
+objects can be moved in memory (PyPy uses a moving garbage collector).
+
 
 Py_buffer format and item size
 ------------------------------
@@ -181,7 +194,12 @@ Export format               Buffer format       Item size
 PyUnicode_Import()
 ------------------
 
-API: ``PyObject* PyUnicode_Import(const void *data, Py_ssize_t nbytes, int32_t format)``.
+API::
+
+    PyObject* PyUnicode_Import(
+        const void *data,
+        Py_ssize_t nbytes,
+        int32_t format)
 
 Create a Unicode string object from a buffer in a supported format.
 
@@ -224,10 +242,6 @@ example, the UTF-8 format uses the ``surrogatepass`` error handler.
 
 Embedded NUL characters are allowed: they can be imported and exported.
 
-An exported string does not end with a trailing NUL character: the
-``PyUnicode_Export()`` caller must use ``Py_buffer.len`` to get the
-string length.
-
 
 Implementation
 ==============
@@ -242,19 +256,6 @@ There is no impact on the backward compatibility, only new C API
 functions are added.
 
 
-Open Questions
-==============
-
-* Should we guarantee that the exported buffer always ends with a NUL
-  character? Is it possible to implement it in *O*\ (1) complexity
-  in all Python implementations?
-* Is it ok to allow surrogate characters?
-* Should we add a flag to disallow embedded NUL characters? It would
-  have an *O*\ (*n*) complexity.
-* Should we add a flag to disallow surrogate characters? It would
-  have an *O*\ (*n*) complexity.
-
-
 Usage of PEP 393 C APIs
 =======================