Description
Feature or enhancement
Proposal:
At the moment, the Py_buffer view onto an empty bytearray() or array.array(typecode) has arbitrary alignment. This is fine in C, since the buffers have length 0. On Linux x86-64, with at least CPython 3.14 and with 3.15a1 built from source, I found that the buffer backing an empty bytearray reliably sits at an odd address. For example, given the script
import array
from ctypes import *

# ctypes mirror of the C-level Py_buffer struct.
class py_buffer(Structure):
    _fields_ = [
        ("buf", c_void_p),
        ("obj", py_object),
        ("len", c_ssize_t),
        ("itemsize", c_ssize_t),
        ("readonly", c_int),
        ("ndim", c_int),
        ("format", c_char_p),
        ("shape", POINTER(c_ssize_t)),
        ("strides", POINTER(c_ssize_t)),
        ("suboffsets", POINTER(c_ssize_t)),
        ("internal", c_void_p),
    ]

def buf_ptr(buffer):
    """Return the address of the data pointer exposed by the buffer protocol."""
    buf = py_buffer()
    # Make the call outside of `assert` so it still runs under `python -O`.
    status = pythonapi.PyObject_GetBuffer(py_object(buffer), byref(buf), 0)
    assert status == 0
    ptr = buf.buf
    pythonapi.PyBuffer_Release(byref(buf))
    return ptr

ptr = buf_ptr(bytearray())
print(f"bytearray() : {ptr:#x}")
ptr = buf_ptr(bytes())
print(f"bytes() : {ptr:#x}")
for t in "bBwhHiIlLqQfd":
    ptr = buf_ptr(array.array(t))
    print(f"array.array('{t}'): {ptr:#x}")

I got, for example
bytearray() : 0x586d06546869
bytes() : 0x586d064fe218
array.array('b'): 0x7ba46a224b99
array.array('B'): 0x7ba46a224b99
array.array('w'): 0x7ba46a224b99
array.array('h'): 0x7ba46a224b99
array.array('H'): 0x7ba46a224b99
array.array('i'): 0x7ba46a224b99
array.array('I'): 0x7ba46a224b99
array.array('l'): 0x7ba46a224b99
array.array('L'): 0x7ba46a224b99
array.array('q'): 0x7ba46a224b99
array.array('Q'): 0x7ba46a224b99
array.array('f'): 0x7ba46a224b99
array.array('d'): 0x7ba46a224b99
Some other languages (Rust, for example) have stricter alignment requirements for producing a slice-like object from a raw pointer, even if the number of elements you're allowed to read from it is 0. Concretely, in Rust any &[T] has to be backed by a non-null pointer that's aligned for type T, even if the slice is empty.
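To make the alignment requirement concrete, here is a minimal sketch that checks the two addresses from the sample output above against a few power-of-two alignments. The helper name `is_aligned` and the constants are mine, introduced only for illustration.

```python
def is_aligned(address: int, alignment: int) -> bool:
    """Return True if `address` is a multiple of `alignment` (a power of two)."""
    if alignment <= 0 or alignment & (alignment - 1):
        raise ValueError("alignment must be a positive power of two")
    return address % alignment == 0


# Addresses copied from the sample output above.
BYTEARRAY_PTR = 0x586D06546869
BYTES_PTR = 0x586D064FE218

for name, address in [("bytearray()", BYTEARRAY_PTR), ("bytes()", BYTES_PTR)]:
    ok = [align for align in (1, 2, 4, 8) if is_aligned(address, align)]
    print(f"{name}: aligned for {ok}")
```

The odd bytearray address only satisfies an alignment of 1, so it would be acceptable for a byte slice but not for a slice of any wider element type.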
It would be convenient for interop if the default empty buffers of bytearray and array.array were aligned for any type, so that this fast-path default case doesn't need special handling when taking out direct views onto the data buffer.
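For reference, the special handling a consumer currently needs looks roughly like the sketch below. It is written in Python only to illustrate the logic; the function name `pointer_for_slice` is hypothetical, and using the alignment itself as the placeholder address is an assumption of this sketch rather than anything CPython provides.

```python
def pointer_for_slice(address: int, length: int, alignment: int) -> int:
    """Return an address usable by a consumer that requires `alignment`."""
    if address != 0 and address % alignment == 0:
        # Already non-null and aligned: use the buffer's own pointer.
        return address
    if length == 0:
        # Nothing will ever be read, so substitute a non-null, aligned
        # placeholder (assumption: the alignment value itself).
        return alignment
    raise ValueError("non-empty buffer is not suitably aligned")


# The odd address from the empty bytearray gets replaced by the placeholder.
print(hex(pointer_for_slice(0x586D06546869, 0, 8)))
```

If the empty buffers were already aligned for any type, the first branch would always be taken and the zero-length branch could be dropped.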
I came across this in numpy/numpy#30062, where the pickle protocol 5 implementation for an empty array causes an empty bytearray to be loaded from the pickle stream (assuming inline, i.e. in-band, buffers) and used directly to back the NumPy array, which was detectable in a Rust extension module I own.
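The pickle side of this can be seen without NumPy. The sketch below is a stand-in for the path described above, not the NumPy code itself: it wraps an empty writable buffer in a `pickle.PickleBuffer`, pickles it with protocol 5 and no `buffer_callback` (so the data travels in-band), and loads it back. The helper name `roundtrip_in_band` is mine.

```python
import pickle


def roundtrip_in_band(buffer):
    """Pickle `buffer` as a protocol-5 PickleBuffer with in-band data."""
    # No buffer_callback is passed, so the buffer is serialized in-band.
    payload = pickle.dumps(pickle.PickleBuffer(buffer), protocol=5)
    return pickle.loads(payload)


restored = roundtrip_in_band(bytearray())
print(type(restored).__name__, len(restored))
```

A writable in-band buffer comes back as a bytearray, so a consumer that wraps the result without copying ends up pointing at whatever address the empty bytearray exposes.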
Has this already been discussed elsewhere?
This is a minor feature, which does not need previous discussion elsewhere
Links to previous discussion of this feature:
No response