21 changes: 16 additions & 5 deletions Doc/c-api/unicode.rst
@@ -1726,10 +1726,6 @@ They all return ``NULL`` or ``-1`` if an exception occurs.
from user input, prefer calling :c:func:`PyUnicode_FromString` and
:c:func:`PyUnicode_InternInPlace` directly.

.. impl-detail::

Strings interned this way are made :term:`immortal`.


.. c:function:: unsigned int PyUnicode_CHECK_INTERNED(PyObject *str)

@@ -1806,9 +1802,24 @@ object.

See also :c:func:`PyUnicodeWriter_DecodeUTF8Stateful`.

.. c:function:: int PyUnicodeWriter_WriteASCII(PyUnicodeWriter *writer, const char *str, Py_ssize_t size)

Write the ASCII string *str* into *writer*.

*size* is the string length in bytes. If *size* is equal to ``-1``, call
``strlen(str)`` to get the string length.

*str* must only contain ASCII characters. The behavior is undefined if
*str* contains non-ASCII characters.

On success, return ``0``.
On error, set an exception, leave the writer unchanged, and return ``-1``.

.. versionadded:: next
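
The *size* rules documented above can be illustrated with a small pure-Python model. This is only a sketch: `ModelWriter` is a hypothetical stand-in, not the C API, and it raises on non-ASCII or invalid sizes where the real function leaves the behavior undefined or unspecified here. The calls mirror the `test_ascii` case added in this PR.

```python
class ModelWriter:
    """Toy stand-in for PyUnicodeWriter (hypothetical, not the real C API)."""

    def __init__(self):
        self._parts = []

    def write_ascii(self, data: bytes, size: int = -1) -> None:
        if size < -1:
            # Assumption of this model only; the C docs do not cover this case.
            raise ValueError("size must be -1 or non-negative")
        if size == -1:
            # Mimic strlen(): stop at the first NUL byte, if there is one.
            size = data.find(b"\x00")
            if size == -1:
                size = len(data)
        chunk = data[:size]
        # The C function does not validate its input. The model raises
        # so that misuse is visible instead of undefined.
        if not chunk.isascii():
            raise ValueError("non-ASCII input")
        # Append only after validation, so a failure leaves the writer unchanged.
        self._parts.append(chunk.decode("ascii"))

    def finish(self) -> str:
        return "".join(self._parts)


writer = ModelWriter()
writer.write_ascii(b"Hello ", -1)               # size taken from the data
writer.write_ascii(b"", 0)                      # empty write is a no-op
writer.write_ascii(b"Python! <truncated>", 6)   # only the first 6 bytes
print(writer.finish())
```

With the three writes shown, the model produces `Hello Python`, the same string the PR's test expects from the real writer.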

.. c:function:: int PyUnicodeWriter_WriteWideChar(PyUnicodeWriter *writer, const wchar_t *str, Py_ssize_t size)

Writer the wide string *str* into *writer*.
Write the wide string *str* into *writer*.

*size* is a number of wide characters. If *size* is equal to ``-1``, call
``wcslen(str)`` to get the string length.
52 changes: 36 additions & 16 deletions Doc/library/ctypes.rst
@@ -2031,35 +2031,55 @@ Utility functions
pointer.


.. function:: create_string_buffer(init_or_size, size=None)
.. function:: create_string_buffer(init, size=None)
create_string_buffer(size)

This function creates a mutable character buffer. The returned object is a
ctypes array of :class:`c_char`.

*init_or_size* must be an integer which specifies the size of the array, or a
bytes object which will be used to initialize the array items.
If *size* is given (and not ``None``), it must be an :class:`int`.
It specifies the size of the returned array.

If a bytes object is specified as first argument, the buffer is made one item
larger than its length so that the last element in the array is a NUL
termination character. An integer can be passed as second argument which allows
specifying the size of the array if the length of the bytes should not be used.
If the *init* argument is given, it must be :class:`bytes`. It is used
to initialize the array items. Bytes not initialized this way are
set to zero (NUL).

If *size* is not given (or if it is ``None``), the buffer is made one element
larger than *init*, effectively adding a NUL terminator.

If both arguments are given, *size* must not be less than ``len(init)``.

.. warning::

If *size* is equal to ``len(init)``, a NUL terminator is
not added. Do not treat such a buffer as a C string.

For example::

>>> bytes(create_string_buffer(2))
b'\x00\x00'
>>> bytes(create_string_buffer(b'ab'))
b'ab\x00'
>>> bytes(create_string_buffer(b'ab', 2))
b'ab'
>>> bytes(create_string_buffer(b'ab', 4))
b'ab\x00\x00'
>>> bytes(create_string_buffer(b'abcdef', 2))
Traceback (most recent call last):
...
ValueError: byte string too long

.. audit-event:: ctypes.create_string_buffer init,size ctypes.create_string_buffer
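
The warning about `size == len(init)` can be made concrete with a short sketch. The helpers `is_nul_terminated` and `c_string_buffer` are hypothetical names introduced here for illustration; they are not part of ctypes.

```python
from ctypes import create_string_buffer


def is_nul_terminated(buf):
    """Return True if the last element of a c_char array is NUL."""
    return len(buf) > 0 and buf.raw[-1:] == b"\x00"


def c_string_buffer(init, size=None):
    """Create a buffer that always has room for a NUL terminator.

    Hypothetical helper: it rejects the size == len(init) case that
    create_string_buffer() itself accepts.
    """
    if size is not None and size <= len(init):
        raise ValueError("size must be greater than len(init) to fit a NUL")
    return create_string_buffer(init, size)


exact = create_string_buffer(b"ab", 2)    # no room for a terminator
padded = create_string_buffer(b"ab", 4)   # two trailing NUL bytes
default = create_string_buffer(b"ab")     # one trailing NUL byte

print(len(exact), is_nul_terminated(exact))
print(len(padded), is_nul_terminated(padded))
print(len(default), is_nul_terminated(default))
```

Only `exact` lacks a trailing NUL, so it is the one buffer here that should not be handed to C code expecting a C string. A wrapper like `c_string_buffer` is one way to rule that case out at the call site.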


.. function:: create_unicode_buffer(init_or_size, size=None)
.. function:: create_unicode_buffer(init, size=None)
create_unicode_buffer(size)

This function creates a mutable unicode character buffer. The returned object is
a ctypes array of :class:`c_wchar`.

*init_or_size* must be an integer which specifies the size of the array, or a
string which will be used to initialize the array items.

If a string is specified as first argument, the buffer is made one item
larger than the length of the string so that the last element in the array is a
NUL termination character. An integer can be passed as second argument which
allows specifying the size of the array if the length of the string should not
be used.
The function takes the same arguments as :func:`~create_string_buffer` except
*init* must be a string and *size* counts :class:`c_wchar`.

.. audit-event:: ctypes.create_unicode_buffer init,size ctypes.create_unicode_buffer
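
A brief sketch of the same argument rules applied to `create_unicode_buffer`, assuming a short ASCII-only string so that each character occupies one `c_wchar` on every platform:

```python
from ctypes import c_wchar, create_unicode_buffer, sizeof

text = "ab"

default = create_unicode_buffer(text)     # len(text) + 1 elements
exact = create_unicode_buffer(text, 2)    # no NUL terminator
blank = create_unicode_buffer(4)          # four NUL wide characters

print(len(default), repr(default.value))
print(len(exact), repr(exact.value))
print(len(blank), repr(blank.value))

# *size* counts c_wchar elements, so the byte size depends on the platform.
print(sizeof(default) == len(default) * sizeof(c_wchar))
```

As with `create_string_buffer`, the `exact` buffer has no terminator and should not be treated as a C wide string.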

15 changes: 15 additions & 0 deletions Doc/whatsnew/3.15.rst
@@ -138,6 +138,14 @@ Deprecated
Removed
=======

ctypes
------

* Removed the undocumented function :func:`!ctypes.SetPointerType`,
which has been deprecated since Python 3.13.
(Contributed by Bénédikt Tran in :gh:`133866`.)
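
The tests in this PR replace `ctypes.SetPointerType(lpcell, cell)` with `lpcell.set_type(cell)`. For new code, a self-referential structure can also be written with the incomplete-types pattern from the ctypes documentation, which needs neither call. A minimal sketch, assuming a linked-list node similar to the `cell` structure in the test suite (the class name `Cell` is illustrative):

```python
from ctypes import POINTER, Structure, c_char_p, pointer


class Cell(Structure):
    """Linked-list node; fields are assigned after the class exists."""


# POINTER(Cell) can be built now that the name Cell is bound.
Cell._fields_ = [("name", c_char_p), ("next", POINTER(Cell))]

c1 = Cell()
c1.name = b"foo"
c2 = Cell()
c2.name = b"bar"
c1.next = pointer(c2)
c2.next = pointer(c1)

# Walk the two-node cycle for a few steps.
names = []
node = c1
for _ in range(4):
    names.append(node.name)
    node = node.next[0]

print(names)
```

Assigning `_fields_` after the class statement avoids the string-based forward reference that `SetPointerType` was used to resolve.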


http.server
-----------

@@ -218,6 +226,13 @@ New features
functions as replacements for :c:func:`PySys_GetObject`.
(Contributed by Serhiy Storchaka in :gh:`108512`.)

* Add :c:func:`PyUnicodeWriter_WriteASCII` function to write an ASCII string
into a :c:type:`PyUnicodeWriter`. The function is faster than
:c:func:`PyUnicodeWriter_WriteUTF8`, but its behavior is undefined if the
input string contains non-ASCII characters.
(Contributed by Victor Stinner in :gh:`133968`.)


Porting to Python 3.15
----------------------

4 changes: 4 additions & 0 deletions Include/cpython/unicodeobject.h
@@ -478,6 +478,10 @@ PyAPI_FUNC(int) PyUnicodeWriter_WriteUTF8(
PyUnicodeWriter *writer,
const char *str,
Py_ssize_t size);
PyAPI_FUNC(int) PyUnicodeWriter_WriteASCII(
PyUnicodeWriter *writer,
const char *str,
Py_ssize_t size);
PyAPI_FUNC(int) PyUnicodeWriter_WriteWideChar(
PyUnicodeWriter *writer,
const wchar_t *str,
6 changes: 0 additions & 6 deletions Lib/ctypes/__init__.py
@@ -379,12 +379,6 @@ def create_unicode_buffer(init, size=None):
return buf
raise TypeError(init)


def SetPointerType(pointer, cls):
import warnings
warnings._deprecated("ctypes.SetPointerType", remove=(3, 15))
pointer.set_type(cls)

def ARRAY(typ, len):
return typ * len

7 changes: 7 additions & 0 deletions Lib/test/test_capi/test_unicode.py
@@ -1776,6 +1776,13 @@ def test_utf8(self):
self.assertEqual(writer.finish(),
"ascii-latin1=\xE9-euro=\u20AC.")

def test_ascii(self):
writer = self.create_writer(0)
writer.write_ascii(b"Hello ", -1)
writer.write_ascii(b"", 0)
writer.write_ascii(b"Python! <truncated>", 6)
self.assertEqual(writer.finish(), "Hello Python")

def test_invalid_utf8(self):
writer = self.create_writer(0)
with self.assertRaises(UnicodeDecodeError):
10 changes: 3 additions & 7 deletions Lib/test/test_ctypes/test_incomplete.py
@@ -1,6 +1,5 @@
import ctypes
import unittest
import warnings
from ctypes import Structure, POINTER, pointer, c_char_p

# String-based "incomplete pointers" were implemented in ctypes 0.6.3 (2003, when
@@ -21,9 +20,7 @@ class cell(Structure):
_fields_ = [("name", c_char_p),
("next", lpcell)]

with warnings.catch_warnings():
warnings.simplefilter('ignore', DeprecationWarning)
ctypes.SetPointerType(lpcell, cell)
lpcell.set_type(cell)

self.assertIs(POINTER(cell), lpcell)

@@ -50,10 +47,9 @@ class cell(Structure):
_fields_ = [("name", c_char_p),
("next", lpcell)]

with self.assertWarns(DeprecationWarning):
ctypes.SetPointerType(lpcell, cell)

lpcell.set_type(cell)
self.assertIs(POINTER(cell), lpcell)


if __name__ == '__main__':
unittest.main()
@@ -0,0 +1,4 @@
Add :c:func:`PyUnicodeWriter_WriteASCII` function to write an ASCII string
into a :c:type:`PyUnicodeWriter`. The function is faster than
:c:func:`PyUnicodeWriter_WriteUTF8`, but its behavior is undefined if the
input string contains non-ASCII characters. Patch by Victor Stinner.
@@ -0,0 +1,3 @@
Remove the undocumented function :func:`!ctypes.SetPointerType`,
which has been deprecated since Python 3.13.
Patch by Bénédikt Tran.
10 changes: 5 additions & 5 deletions Modules/_json.c
@@ -1476,13 +1476,13 @@ encoder_listencode_obj(PyEncoderObject *s, PyUnicodeWriter *writer,
int rv;

if (obj == Py_None) {
return PyUnicodeWriter_WriteUTF8(writer, "null", 4);
return PyUnicodeWriter_WriteASCII(writer, "null", 4);
}
else if (obj == Py_True) {
return PyUnicodeWriter_WriteUTF8(writer, "true", 4);
return PyUnicodeWriter_WriteASCII(writer, "true", 4);
}
else if (obj == Py_False) {
return PyUnicodeWriter_WriteUTF8(writer, "false", 5);
return PyUnicodeWriter_WriteASCII(writer, "false", 5);
}
else if (PyUnicode_Check(obj)) {
PyObject *encoded = encoder_encode_string(s, obj);
@@ -1649,7 +1649,7 @@ encoder_listencode_dict(PyEncoderObject *s, PyUnicodeWriter *writer,

if (PyDict_GET_SIZE(dct) == 0) {
/* Fast path */
return PyUnicodeWriter_WriteUTF8(writer, "{}", 2);
return PyUnicodeWriter_WriteASCII(writer, "{}", 2);
}

if (s->markers != Py_None) {
@@ -1753,7 +1753,7 @@ encoder_listencode_list(PyEncoderObject *s, PyUnicodeWriter *writer,
return -1;
if (PySequence_Fast_GET_SIZE(s_fast) == 0) {
Py_DECREF(s_fast);
return PyUnicodeWriter_WriteUTF8(writer, "[]", 2);
return PyUnicodeWriter_WriteASCII(writer, "[]", 2);
}

if (s->markers != Py_None) {
2 changes: 1 addition & 1 deletion Modules/_ssl.c
@@ -563,7 +563,7 @@ fill_and_set_sslerror(_sslmodulestate *state,
goto fail;
}
}
if (PyUnicodeWriter_WriteUTF8(writer, "] ", 2) < 0) {
if (PyUnicodeWriter_WriteASCII(writer, "] ", 2) < 0) {
goto fail;
}
}
22 changes: 22 additions & 0 deletions Modules/_testcapi/unicode.c
@@ -332,6 +332,27 @@ writer_write_utf8(PyObject *self_raw, PyObject *args)
}


static PyObject*
writer_write_ascii(PyObject *self_raw, PyObject *args)
{
WriterObject *self = (WriterObject *)self_raw;
if (writer_check(self) < 0) {
return NULL;
}

char *str;
Py_ssize_t size;
if (!PyArg_ParseTuple(args, "yn", &str, &size)) {
return NULL;
}

if (PyUnicodeWriter_WriteASCII(self->writer, str, size) < 0) {
return NULL;
}
Py_RETURN_NONE;
}


static PyObject*
writer_write_widechar(PyObject *self_raw, PyObject *args)
{
@@ -513,6 +534,7 @@ writer_finish(PyObject *self_raw, PyObject *Py_UNUSED(args))
static PyMethodDef writer_methods[] = {
{"write_char", _PyCFunction_CAST(writer_write_char), METH_VARARGS},
{"write_utf8", _PyCFunction_CAST(writer_write_utf8), METH_VARARGS},
{"write_ascii", _PyCFunction_CAST(writer_write_ascii), METH_VARARGS},
{"write_widechar", _PyCFunction_CAST(writer_write_widechar), METH_VARARGS},
{"write_ucs4", _PyCFunction_CAST(writer_write_ucs4), METH_VARARGS},
{"write_str", _PyCFunction_CAST(writer_write_str), METH_VARARGS},
6 changes: 3 additions & 3 deletions Objects/genericaliasobject.c
@@ -65,7 +65,7 @@ ga_repr_items_list(PyUnicodeWriter *writer, PyObject *p)

for (Py_ssize_t i = 0; i < len; i++) {
if (i > 0) {
if (PyUnicodeWriter_WriteUTF8(writer, ", ", 2) < 0) {
if (PyUnicodeWriter_WriteASCII(writer, ", ", 2) < 0) {
return -1;
}
}
@@ -109,7 +109,7 @@ ga_repr(PyObject *self)
}
for (Py_ssize_t i = 0; i < len; i++) {
if (i > 0) {
if (PyUnicodeWriter_WriteUTF8(writer, ", ", 2) < 0) {
if (PyUnicodeWriter_WriteASCII(writer, ", ", 2) < 0) {
goto error;
}
}
@@ -126,7 +126,7 @@ ga_repr(PyObject *self)
}
if (len == 0) {
// for something like tuple[()] we should print a "()"
if (PyUnicodeWriter_WriteUTF8(writer, "()", 2) < 0) {
if (PyUnicodeWriter_WriteASCII(writer, "()", 2) < 0) {
goto error;
}
}
4 changes: 2 additions & 2 deletions Objects/typevarobject.c
@@ -192,7 +192,7 @@ constevaluator_call(PyObject *self, PyObject *args, PyObject *kwargs)
for (Py_ssize_t i = 0; i < PyTuple_GET_SIZE(value); i++) {
PyObject *item = PyTuple_GET_ITEM(value, i);
if (i > 0) {
if (PyUnicodeWriter_WriteUTF8(writer, ", ", 2) < 0) {
if (PyUnicodeWriter_WriteASCII(writer, ", ", 2) < 0) {
PyUnicodeWriter_Discard(writer);
return NULL;
}
@@ -273,7 +273,7 @@ _Py_typing_type_repr(PyUnicodeWriter *writer, PyObject *p)
}

if (p == (PyObject *)&_PyNone_Type) {
return PyUnicodeWriter_WriteUTF8(writer, "None", 4);
return PyUnicodeWriter_WriteASCII(writer, "None", 4);
}

if ((rc = PyObject_HasAttrWithError(p, &_Py_ID(__origin__))) > 0 &&
14 changes: 14 additions & 0 deletions Objects/unicodeobject.c
@@ -14083,6 +14083,20 @@ _PyUnicodeWriter_WriteASCIIString(_PyUnicodeWriter *writer,
return 0;
}


int
PyUnicodeWriter_WriteASCII(PyUnicodeWriter *writer,
const char *str,
Py_ssize_t size)
{
assert(writer != NULL);
_Py_AssertHoldsTstate();

_PyUnicodeWriter *priv_writer = (_PyUnicodeWriter*)writer;
return _PyUnicodeWriter_WriteASCIIString(priv_writer, str, size);
}


int
PyUnicodeWriter_WriteUTF8(PyUnicodeWriter *writer,
const char *str,
8 changes: 4 additions & 4 deletions Objects/unionobject.c
@@ -290,7 +290,7 @@ union_repr(PyObject *self)
}

for (Py_ssize_t i = 0; i < len; i++) {
if (i > 0 && PyUnicodeWriter_WriteUTF8(writer, " | ", 3) < 0) {
if (i > 0 && PyUnicodeWriter_WriteASCII(writer, " | ", 3) < 0) {
goto error;
}
PyObject *p = PyTuple_GET_ITEM(alias->args, i);
@@ -300,12 +300,12 @@ union_repr(PyObject *self)
}

#if 0
PyUnicodeWriter_WriteUTF8(writer, "|args=", 6);
PyUnicodeWriter_WriteASCII(writer, "|args=", 6);
PyUnicodeWriter_WriteRepr(writer, alias->args);
PyUnicodeWriter_WriteUTF8(writer, "|h=", 3);
PyUnicodeWriter_WriteASCII(writer, "|h=", 3);
PyUnicodeWriter_WriteRepr(writer, alias->hashable_args);
if (alias->unhashable_args) {
PyUnicodeWriter_WriteUTF8(writer, "|u=", 3);
PyUnicodeWriter_WriteASCII(writer, "|u=", 3);
PyUnicodeWriter_WriteRepr(writer, alias->unhashable_args);
}
#endif