Commit b6cf6d4

PEP 756: Remove Open Questions (#3968)
1 parent 80f7aad commit b6cf6d4

1 file changed

+21
-20
lines changed

peps/pep-0756.rst

Lines changed: 21 additions & 20 deletions
@@ -102,7 +102,12 @@ longer rationale.
 PyUnicode_Export()
 ------------------
 
-API: ``int32_t PyUnicode_Export(PyObject *unicode, int32_t requested_formats, Py_buffer *view)``.
+API::
+
+    int32_t PyUnicode_Export(
+        PyObject *unicode,
+        int32_t requested_formats,
+        Py_buffer *view)
 
 Export the contents of the *unicode* string in one of the *requested_formats*.
 
@@ -116,6 +121,10 @@ The contents of the buffer are valid until they are released.
 
 The buffer is read-only and must not be modified.
 
+The ``view->len`` member must be used to get the string length. The
+buffer should end with a trailing NUL character, but it's not
+recommended to rely on that because of embedded NUL characters.
+
 *unicode* and *view* must not be NULL.
 
 Available formats:
@@ -152,14 +161,18 @@ needed. There are cases when a copy is needed, *O*\ (*n*) complexity:
 * If only UTF-8 is requested: the string is encoded to UTF-8 at the
   first call, and then the encoded UTF-8 string is cached.
 
-To have an *O*\ (1) complexity on CPython and PyPy, it's recommended to
+To get the best performance on CPython and PyPy, it's recommended to
 support these 4 formats::
 
     (PyUnicode_FORMAT_UCS1 \
      | PyUnicode_FORMAT_UCS2 \
      | PyUnicode_FORMAT_UCS4 \
      | PyUnicode_FORMAT_UTF8)
 
+PyPy uses UTF-8 natively and so the ``PyUnicode_FORMAT_UTF8`` format is
+recommended. It requires a memory copy, since PyPy ``str`` objects can
+be moved in memory (PyPy uses a moving garbage collector).
+
 
 Py_buffer format and item size
 ------------------------------
@@ -181,7 +194,12 @@ Export format Buffer format Item size
 PyUnicode_Import()
 ------------------
 
-API: ``PyObject* PyUnicode_Import(const void *data, Py_ssize_t nbytes, int32_t format)``.
+API::
+
+    PyObject* PyUnicode_Import(
+        const void *data,
+        Py_ssize_t nbytes,
+        int32_t format)
 
 Create a Unicode string object from a buffer in a supported format.
 
@@ -224,10 +242,6 @@ example, the UTF-8 format uses the ``surrogatepass`` error handler.
 
 Embedded NUL characters are allowed: they can be imported and exported.
 
-An exported string does not end with a trailing NUL character: the
-``PyUnicode_Export()`` caller must use ``Py_buffer.len`` to get the
-string length.
-
 
 Implementation
 ==============
@@ -242,19 +256,6 @@ There is no impact on the backward compatibility, only new C API
 functions are added.
 
 
-Open Questions
-==============
-
-* Should we guarantee that the exported buffer always ends with a NUL
-  character? Is it possible to implement it in *O*\ (1) complexity
-  in all Python implementations?
-* Is it ok to allow surrogate characters?
-* Should we add a flag to disallow embedded NUL characters? It would
-  have an *O*\ (*n*) complexity.
-* Should we add a flag to disallow surrogate characters? It would
-  have an *O*\ (*n*) complexity.
-
-
 Usage of PEP 393 C APIs
 =======================
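
The paragraph this commit adds about ``view->len`` can be illustrated with a minimal C sketch that uses only the standard library. It does not call any PEP 756 function: `sample_data`, `sample_len`, `length_from_nul` and `count_byte` are hypothetical names, with `sample_data` standing in for an exported UTF-8 buffer and `sample_len` for the value a caller would read from ``view->len``.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for exported UTF-8 data: the 5 bytes "ab\0cd",
   followed by a trailing NUL that is not counted in the length. */
static const char sample_data[] = {'a', 'b', '\0', 'c', 'd', '\0'};
static const size_t sample_len = 5;

/* Length as a NUL-terminated C API sees it: stops at the first NUL,
   so an embedded NUL truncates the string. */
static size_t length_from_nul(const char *data)
{
    return strlen(data);
}

/* Count occurrences of a byte using the explicit length, so embedded
   NUL bytes are ordinary data rather than an end marker. */
static size_t count_byte(const char *data, size_t len, char needle)
{
    size_t count = 0;
    for (size_t i = 0; i < len; i++) {
        if (data[i] == needle) {
            count++;
        }
    }
    return count;
}
```

With this sample, `length_from_nul(sample_data)` stops at the embedded NUL and reports 2, while iterating over `sample_len` bytes reaches all 5, which is why the explicit length is the safer choice.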
