|
| 1 | +====== |
| 2 | +Arrays |
| 3 | +====== |
| 4 | + |
| 5 | +To get the full power out of a C++ extension, you will often need to pass arrays of data between Python and C++. |
| 6 | +Libpy has native support for integrating with numpy, the most popular ndarray library for Python. |
| 7 | + |
| 8 | +Libpy supports receiving arrays as views so that no data needs to be copied. |
| 9 | +Libpy array views can also be const to guarantee that the underlying data isn't mutated. |
| 10 | +Libpy also supports creating Numpy arrays as views over C++ containers without copying the underlying data. |
| 11 | + |
| 12 | +``py::array_view`` |
| 13 | +================== |
| 14 | + |
| 15 | +Libpy can accept numpy arrays, or generally any buffer-like object, through a :cpp:class:`py::ndarray_view`. |
| 16 | +:cpp:class:`py::ndarray_view` is a template type which takes as a parameter the C++ type of the elements of the array and the number of dimensions. |
| 17 | +For example: ``py::ndarray_view<std::int32_t, 3>`` is a view of a 3d array of signed 32 bit integers. |
| 18 | +The element type and the number of dimensions of a :cpp:class:`py::ndarray_view` are fixed at compile time, but the shape is determined at runtime. |
| 19 | + |
| 20 | +As a convenience, :cpp:type:`py::array_view` is an alias of :cpp:class:`py::ndarray_view` for one dimensional arrays. |
| 21 | + |
| 22 | +Shape and Strides |
| 23 | +----------------- |
| 24 | + |
| 25 | +Like numpy, an array view is composed of three parts: |
| 26 | + |
| 27 | +- shape :: ``std::array<std::size_t>`` |
| 28 | +- strides :: ``std::array<std::int64_t>`` |
| 29 | +- buffer :: ``(const) std::byte*`` |
| 30 | + |
| 31 | +The shape array contains the number of elements along each axis. |
| 32 | +For example: ``{2, 3}`` would be an array with 2 rows and 3 columns. |
| 33 | + |
| 34 | +The strides array contains the number of bytes needed to move one step along each axis. |
| 35 | +For example: given a ``{2, 3}`` shaped array of 4 byte elements, then strides of ``{12, 4}`` would be a C-contiguous array because the rows are contiguous. |
| 36 | +Given the same ``{2, 3}`` shaped array of 4 byte elements, strides of ``{4, 8}`` would be a Fortran-contiguous array because the columns are contiguous. |
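The stride values quoted above can be reproduced with a small standalone sketch. The helpers ``c_strides`` and ``f_strides`` below are hypothetical names introduced for illustration; they are not part of libpy and only show the arithmetic for a given shape and element size.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not part of libpy): byte strides of a C-contiguous
// (row-major) array, where the last axis varies fastest.
template<std::size_t N>
std::array<std::int64_t, N> c_strides(const std::array<std::size_t, N>& shape,
                                      std::int64_t itemsize) {
    std::array<std::int64_t, N> out{};
    std::int64_t step = itemsize;
    // Walk the axes from last to first, growing the step by each axis length.
    for (std::size_t ix = N; ix-- > 0;) {
        out[ix] = step;
        step *= static_cast<std::int64_t>(shape[ix]);
    }
    return out;
}

// Hypothetical helper (not part of libpy): byte strides of a
// Fortran-contiguous (column-major) array, where the first axis varies fastest.
template<std::size_t N>
std::array<std::int64_t, N> f_strides(const std::array<std::size_t, N>& shape,
                                      std::int64_t itemsize) {
    std::array<std::int64_t, N> out{};
    std::int64_t step = itemsize;
    // Walk the axes from first to last, growing the step by each axis length.
    for (std::size_t ix = 0; ix < N; ++ix) {
        out[ix] = step;
        step *= static_cast<std::int64_t>(shape[ix]);
    }
    return out;
}
```

For a ``{2, 3}`` shape with 4 byte elements, ``c_strides`` yields ``{12, 4}`` and ``f_strides`` yields ``{4, 8}``, matching the two layouts described above.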
| 37 | + |
| 38 | +The buffer must be a ``(const) std::byte*`` and not a ``(const) T*`` |
| 39 | + |
| 40 | +Non-contiguous views |
| 41 | +-------------------- |
| 42 | + |
| 43 | +Array views do not need to view contiguous arrays. |
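| 44 | +For example, given a C-contiguous ``{4, 5}`` array of 2 byte values, we could take a view of the first column by producing a one dimensional array view with shape ``{4}`` and strides ``{10}``: each row holds 5 elements of 2 bytes, so consecutive elements of a column are 10 bytes apart. |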
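To make the column example concrete, here is a minimal sketch in plain C++ that walks the first column of a ``{4, 5}`` array of 2 byte values with a stride of 10 bytes. ``strided_at`` and ``first_column_sum`` are hypothetical helpers written for this illustration; they do not use libpy and only mimic what a strided view does when it is indexed.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>

// Hypothetical helper (not part of libpy): read element `ix` of a one
// dimensional strided view of int16_t values, given the start of the buffer
// and the stride in bytes.
std::int16_t strided_at(const std::byte* buffer, std::int64_t stride, std::size_t ix) {
    std::int16_t out;
    std::memcpy(&out, buffer + static_cast<std::int64_t>(ix) * stride, sizeof(out));
    return out;
}

// Sum the first column of a C-contiguous {4, 5} array of 2 byte values.
std::int64_t first_column_sum() {
    std::int16_t data[4][5];
    for (int r = 0; r < 4; ++r) {
        for (int c = 0; c < 5; ++c) {
            data[r][c] = static_cast<std::int16_t>(10 * r + c);
        }
    }

    // The column view starts at the first element and steps one full row
    // (5 elements * 2 bytes = 10 bytes) per index.
    const auto* buffer = reinterpret_cast<const std::byte*>(data);
    std::int64_t total = 0;
    for (std::size_t ix = 0; ix < 4; ++ix) {
        total += strided_at(buffer, 10, ix);
    }
    return total;
}
```

Starting the same walk 2 bytes into the buffer would view the second column instead; only the starting pointer changes, not the stride.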
| 45 | + |
| 46 | +Simple Array Input |
| 47 | +================== |
| 48 | + |
| 49 | +Let's write a function to sum an array: |
| 50 | + |
| 51 | +.. code-block:: c++ |
| 52 | + |
| 53 | + std::int64_t simple_sum(py::array_view<const std::int64_t> values) { |
| 54 | + std::int64_t out = 0; |
| 55 | + for (auto value : values) { |
| 56 | + out += value; |
| 57 | + } |
| 58 | + return out; |
| 59 | + } |
| 60 | + |
| 61 | +This function has one parameter, ``values``, which is a view over the data being summed. |
| 62 | +This parameter should be passed by value because it is only a view, and therefore small, like a :cpp:class:`std::string_view`. |
| 63 | + |
| 64 | +From C++ |
| 65 | +-------- |
| 66 | + |
| 67 | +:cpp:type:`py::array_view` has an implicit constructor from any type that exposes both ``data()`` and ``size()`` member functions, like :cpp:class:`std::vector`. |
| 68 | +This means we can call ``simple_sum`` directly from C++, for example: |
| 69 | + |
| 70 | +.. code-block:: c++ |
| 71 | + |
| 72 | + std::vector<std::int64_t> vs(100); |
| 73 | + std::iota(vs.begin(), vs.end(), 0); |
| 74 | + |
| 75 | + std::int64_t sum = simple_sum(vs); |
| 76 | + |
| 77 | +From Python |
| 78 | +----------- |
| 79 | + |
| 80 | +To call ``simple_sum`` from Python, we must first use :cpp:func:`py::autofunction` to adapt the function and then attach it to a module. |
| 81 | +For example: |
| 82 | + |
| 83 | +.. code-block:: c++ |
| 84 | + |
| 85 | + LIBPY_AUTOMODULE(libpy_tutorial, |
| 86 | + arrays, |
| 87 | + ({py::autofunction<simple_sum>("simple_sum")})) |
| 88 | + (py::borrowed_ref<>) { |
| 89 | + return false; |
| 90 | + } |
| 91 | + |
| 92 | +Now, we can import the function and pass it numpy arrays: |
| 93 | + |
| 94 | +.. ipython:: python |
| 95 | + |
| 96 | + import numpy as np |
| 97 | + from libpy_tutorial.arrays import simple_sum |
| 98 | + arr = np.arange(10); arr |
| 99 | + simple_sum(arr) |
| 100 | + |
| 101 | +Shallow Constness |
| 102 | +================= |
| 103 | + |
| 104 | +:cpp:class:`py::ndarray_view` implements shallow constness. |
| 105 | +Shallow constness means that a ``const py::ndarray_view`` still allows mutation of the underlying data, but does not allow the view itself to be changed to point at different data. |
| 106 | +In other words, a :cpp:class:`py::ndarray_view` acts like a pointer, not a reference. |
| 107 | +Just as one may have a ``const`` pointer to non-``const`` data, one may have a ``const`` view of mutable elements. |
| 108 | + |
| 109 | +To create an immutable view, the ``const`` must be injected into the viewed type. |
| 110 | +Instead of having a ``const`` view of ``int``, have a view of ``const int``. |
| 111 | + |
| 112 | +.. code-block:: c++ |
| 113 | + |
| 114 | + py::ndarray_view<T, n> // mutable elements |
| 115 | + const py::ndarray_view<T, n> // mutable elements |
| 116 | + py::ndarray_view<const T, n> // immutable elements |
| 117 | + |
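The pointer analogy can be spelled out with ordinary C++ pointers. The sketch below uses no libpy types; ``shallow_const_demo`` is a hypothetical function that only shows which operations each form of ``const`` permits.

```cpp
#include <cassert>

// Pointer analogy for shallow constness (plain C++, no libpy types).
int shallow_const_demo() {
    int a = 1;
    int b = 2;

    int* const fixed = &a;  // analogous to `const py::ndarray_view<T, n>`
    *fixed = 10;            // OK: the pointed-to data is mutable
    // fixed = &b;          // error: the pointer itself cannot be re-pointed

    const int* readonly = &a;  // analogous to `py::ndarray_view<const T, n>`
    // *readonly = 20;         // error: the pointed-to data is immutable
    readonly = &b;             // OK: the pointer itself may be re-pointed

    return a + *readonly;  // a was set to 10 through `fixed`; *readonly is b
}
```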
| 118 | + |
| 119 | +Freeze |
| 120 | +------ |
| 121 | + |
| 122 | +Given a mutable view, the :cpp:func:`py::ndarray_view::freeze` member function returns an immutable view over the same data. |
| 123 | +This is useful for ensuring that a particular component doesn't mutate a view that is otherwise mutable. |
| 124 | +:cpp:func:`py::ndarray_view::freeze` also exists for immutable views, where it is a no-op. |
| 125 | + |
| 126 | + |
| 127 | +``py::array_view`` extended interface |
| 128 | +===================================== |
| 129 | + |
| 130 | +:cpp:class:`py::ndarray_view` has the interface of a standard fixed-size C++ container, like :cpp:class:`std::array`. |
| 131 | +:cpp:class:`py::ndarray_view` does have a few additions to the standard member functions: |
| 132 | + |
| 133 | +Constructors |
| 134 | +------------ |
| 135 | + |
| 136 | +- :cpp:func:`py::ndarray_view::from_buffer_protocol` |
| 137 | +- :cpp:func:`py::ndarray_view::virtual_array` |
| 138 | + |
| 139 | +Extra Member Accessors |
| 140 | +---------------------- |
| 141 | + |
| 142 | +- :cpp:func:`py::ndarray_view::shape` |
| 143 | +- :cpp:func:`py::ndarray_view::strides` |
| 144 | +- :cpp:func:`py::ndarray_view::buffer` |
| 145 | +- :cpp:func:`py::ndarray_view::rank` |
| 146 | +- :cpp:func:`py::ndarray_view::ssize` |
| 147 | + |
| 148 | +Contiguity |
| 149 | +---------- |
| 150 | + |
| 151 | +Member functions that are helpers for checking if a view is over a contiguous array. |
| 152 | + |
| 153 | +- :cpp:func:`py::ndarray_view::is_c_contig` |
| 154 | +- :cpp:func:`py::ndarray_view::is_f_contig` |
| 155 | +- :cpp:func:`py::ndarray_view::is_contig` |
| 156 | + |
| 157 | +Derived Views |
| 158 | +------------- |
| 159 | + |
| 160 | +- :cpp:func:`py::ndarray_view::freeze` |
| 161 | +- :cpp:func:`py::ndarray_view::slice` |
| 162 | + |
| 163 | +Free Functions |
| 164 | +-------------- |
| 165 | + |
| 166 | +- :cpp:func:`py::for_each_unordered` |
| 167 | + |
| 168 | +Constructing Array Views |
| 169 | +======================== |
| 170 | + |
| 171 | +Ndarray views may be constructed from C++ in a few ways. |
| 172 | +The easiest way to get an ndarray view is to accept one as a parameter from a function which has been :cpp:func:`py::automethod` converted. |
| 173 | +Libpy will take care of type and dimensionality checking and extracting the buffer from the underlying Python object. |
| 174 | + |
| 175 | +From C++ |
| 176 | +-------- |
| 177 | + |
| 178 | +From Contiguous C++ Containers |
| 179 | +~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ |
| 180 | + |
| 181 | +One dimensional array views, or :cpp:type:`py::array_view` objects, may be constructed from any C++ object that exposes both ``data()`` and ``size()`` member functions. |
| 182 | +``data()`` must return a ``T*`` which points to an array of ``T`` elements of size ``size()``. |
| 183 | +Examples of containers that an array view can be implicitly constructed from are :cpp:class:`std::vector` and :cpp:class:`std::array`. |
| 184 | + |
| 185 | +Example Usage |
| 186 | +````````````` |
| 187 | + |
| 188 | +.. code-block:: c++ |
| 189 | + |
| 190 | + void from_vector() { |
| 191 | + std::vector vec = {1, 2, 3}; |
| 192 | + py::array_view view(vec); |
| 193 | + } |
| 194 | + |
| 195 | + void from_array() { |
| 196 | + std::array arr = {1, 2, 3}; |
| 197 | + py::array_view view(arr); |
| 198 | + } |
| 199 | + |
| 200 | +Low Level Constructor |
| 201 | +~~~~~~~~~~~~~~~~~~~~~ |
| 202 | + |
| 203 | +If one wishes to construct a view from C++ directly, the most fundamental constructor takes the buffer as a ``(const) std::byte*``, the shape array, and the strides array. |
| 204 | +It is the user's responsibility to ensure that the buffer is compatible with the provided shape and strides; no checking will or can be done. |
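Since the low level constructor cannot check its inputs, callers may want to validate them first. The following is a minimal sketch of one such check, assuming non-negative strides; ``required_bytes`` is a hypothetical helper and not part of libpy. It computes how many bytes a buffer must hold for every index of the view to stay in bounds.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical helper (not part of libpy): number of bytes a buffer must
// contain so that every element addressed by `shape` and `strides` is in
// bounds. Assumes all strides are non-negative.
template<std::size_t N>
std::size_t required_bytes(const std::array<std::size_t, N>& shape,
                           const std::array<std::int64_t, N>& strides,
                           std::size_t itemsize) {
    std::size_t last_offset = 0;
    for (std::size_t ix = 0; ix < N; ++ix) {
        if (shape[ix] == 0) {
            return 0;  // an empty view touches no memory
        }
        assert(strides[ix] >= 0);
        // Byte offset contributed by the last index along this axis.
        last_offset += (shape[ix] - 1) * static_cast<std::size_t>(strides[ix]);
    }
    // The last element starts at `last_offset` and occupies `itemsize` bytes.
    return last_offset + itemsize;
}
```

Comparing the result against the real size of the allocation before constructing the view catches the most common mismatch between a buffer and its shape and strides.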
| 205 | + |
| 206 | +From Buffer-like Objects |
| 207 | +~~~~~~~~~~~~~~~~~~~~~~~~ |
| 208 | + |
| 209 | +To construct an array view from a Python object that exports the buffer protocol, like a :class:`memoryview` or numpy array, there is a static member function :cpp:func:`py::ndarray_view::from_buffer_protocol`. |
| 210 | +Unlike a normal constructor, :cpp:func:`py::ndarray_view::from_buffer_protocol` returns a tuple of two parts: the array view instance and a :cpp:type:`py::buffer`. |
| 211 | +The :cpp:type:`py::buffer` is an RAII object which manages the lifetime of the underlying buffer which the view is over. |
| 212 | +The returned view is only valid as long as the paired :cpp:type:`py::buffer` is alive. |
| 213 | +Accessing data through the view outside the lifetime of the :cpp:type:`py::buffer` may trigger a use after free and is undefined behavior. |
| 214 | + |
| 215 | +:cpp:func:`py::ndarray_view::from_buffer_protocol` will check that the runtime type of the Python buffer matches the static type of the C++ array view. |
| 216 | +:cpp:func:`py::ndarray_view::from_buffer_protocol` will also check that the runtime dimensionality of the Python buffer matches the static dimensionality of the C++ array view. |
| 217 | + |
| 218 | +Virtual Array Views |
| 219 | +~~~~~~~~~~~~~~~~~~~ |
| 220 | + |
| 221 | +A virtual array view is a scalar which is broadcast to present as an array view. |
| 222 | +Concretely, a virtual array uses the ``buffer`` member to hold a pointer to a single value, and has strides of all zeros. |
| 223 | +By setting all of the strides to zero, this means that the single scalar can satisfy any shape. |
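The zero-stride mechanism described above can be sketched without libpy. ``strided_ints`` below is a hypothetical, minimal one dimensional view type written for this illustration; it is not libpy's implementation, but it indexes its buffer the same way: start pointer plus index times stride in bytes.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical minimal 1d strided view of const int (not libpy's type).
struct strided_ints {
    const std::byte* buffer;
    std::size_t size;
    std::int64_t stride;  // in bytes

    const int& operator[](std::size_t ix) const {
        return *reinterpret_cast<const int*>(
            buffer + static_cast<std::int64_t>(ix) * stride);
    }
};

// Sum every element visible through the view.
int sum_view(strided_ints view) {
    int out = 0;
    for (std::size_t ix = 0; ix < view.size; ++ix) {
        out += view[ix];
    }
    return out;
}

// Present `scalar` as an array of `size` elements without allocating.
int broadcast_sum(int scalar, std::size_t size) {
    // A stride of 0 means every index resolves to the same int.
    strided_ints view{reinterpret_cast<const std::byte*>(&scalar), size, 0};
    return sum_view(view);
}
```

Because the stride is zero, the reported size can be anything: the view never reads past the single ``int`` it points at.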
| 224 | + |
| 225 | +A virtual array view is useful when one must satisfy an interface that requires an array view but would like to pass a constant value. |
| 226 | +A virtual array view is considerably more efficient than allocating an array and filling it with a constant. |
| 227 | +No memory must be allocated, and each access will go to the same cache line. |
| 228 | + |
| 229 | +Because all elements of the view share the same underlying memory, mutable virtual arrays can have unexpected results. |
| 230 | +If any value in the array view is mutated, all of the elements would change. |
| 231 | +This can have unexpected consequences when passing the views to functions that are not prepared for that behavior. |
| 232 | +For this reason, it is recommended to only use const virtual array views. |
| 233 | + |
| 234 | +Virtual array views do not copy nor move from the element being viewed. |
| 235 | +For that reason, the view must not outlive the element being broadcast. |
| 236 | + |
| 237 | +Example Usage |
| 238 | +````````````` |
| 239 | + |
| 240 | +.. code-block:: c++ |
| 241 | + |
| 242 | + // Library code |
| 243 | + |
| 244 | + /** A function which adds two array views, storing the result in the first |
| 245 | + array view. |
| 246 | + */ |
| 247 | + void add_inplace(py::array_view<int> a, py::array_view<const int> b) { |
| 248 | + std::transform(a.cbegin(), a.cend(), b.cbegin(), a.begin(), std::plus<>{}); |
| 249 | + } |
| 250 | + |
| 251 | + // User code |
| 252 | + |
| 253 | + /** The user defined function which wants to call `add_inplace` with a |
| 254 | + scalar. |
| 255 | + */ |
| 256 | + void f(py::array_view<int> a) { |
| 257 | + int rhs = 5; |
| 258 | + auto rhs_view = py::array_view<const int>::virtual_array(rhs, a.shape()); |
| 259 | + |
| 260 | + // `rhs_view` points to the same data as `rhs` |
| 261 | + assert(rhs_view.buffer() == reinterpret_cast<const std::byte*>(&rhs)); |
| 262 | + |
| 263 | + add_inplace(a, rhs_view); |
| 264 | + |
| 265 | + // ... |
| 266 | + } |
| 267 | + |
| 268 | +Here, it is critical not to use ``rhs_view`` after ``rhs`` has gone out of scope because the buffer points to the memory owned by ``rhs``. |
| 269 | + |
| 270 | +Type Erased Views |
| 271 | +================= |
| 272 | + |
| 273 | +A :cpp:class:`py::ndarray_view` normally has a static type for its elements; however, Python users of numpy arrays might not always think of arrays in this way. |
| 274 | +Libpy currently only supports exporting a single overload of a function, so some functions which could be written generically need to have a single signature which can accept arrays of any type. |
| 275 | +In addition to the restriction of having a single overload exposed, for some functions, adding a lot of template expansions to have static types doesn't meaningfully improve the performance to justify the increased compile times. |
| 276 | + |
| 277 | +To provide static type-erased values, there are types :cpp:class:`py::any_ref` and :cpp:class:`py::any_cref`. |
| 278 | +:cpp:class:`py::any_ref` values act like references, and :cpp:class:`py::any_cref` act like ``const`` references. |
| 279 | +Unlike a ``void*``, :cpp:class:`py::any_ref` and :cpp:class:`py::any_cref` hold a virtual method table which implements some basic functionality. |
| 280 | +The vtable for both type-erased reference types is a :cpp:class:`py::any_vtable`. |
| 281 | +:cpp:class:`py::any_vtable` supports constructing new values, copying, moving, checking equality, and getting the numpy dtype for the type. |
| 282 | +:cpp:class:`py::any_vtable` can also provide information about the type like the size and alignment. |
| 283 | + |
| 284 | +``py::array_view<py::any_ref>`` and ``py::array_view<py::any_cref>`` have more specific meaning than "view of an array of any ref objects". |
| 285 | +Instead, ``py::array_view<py::any_ref>`` and ``py::array_view<py::any_cref>`` are always homogeneous, meaning all of the elements are the same type. |
| 286 | +``py::array_view<py::any_ref>`` and ``py::array_view<py::any_cref>`` have the following members: |
| 287 | + |
| 288 | +- shape :: ``std::array<std::size_t>`` |
| 289 | +- strides :: ``std::array<std::int64_t>`` |
| 290 | +- buffer :: ``(const) std::byte*`` |
| 291 | +- vtable :: :cpp:class:`py::any_vtable` |
| 292 | + |
| 293 | +The shape and strides are the same as for a normal :cpp:class:`py::ndarray_view`. |
| 294 | +The buffer is now a pointer to an untyped block of data which should be interpreted based on the vtable. |
| 295 | +The vtable member encodes the type of the elements in the array and provides access to the operations on the elements. |
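To illustrate how a vtable lets code operate on an untyped buffer, here is a deliberately small sketch. ``mini_vtable``, ``vtable_for``, ``count_equal`` and ``count_equal_demo`` are all hypothetical names; this is a stand-in with far fewer operations than :cpp:class:`py::any_vtable` and it does not reflect that class's real interface.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical stand-in for a type-erased vtable. It is much smaller than
// `py::any_vtable` and is not its real interface.
struct mini_vtable {
    std::size_t size;
    std::size_t align;
    bool (*eq)(const void* a, const void* b);
};

// One static table per element type.
template<typename T>
const mini_vtable* vtable_for() {
    static const mini_vtable table{
        sizeof(T),
        alignof(T),
        [](const void* a, const void* b) {
            return *static_cast<const T*>(a) == *static_cast<const T*>(b);
        }};
    return &table;
}

// Count the elements equal to `needle` in a 1d type-erased strided buffer.
// The element type is known only through `vtable`.
std::size_t count_equal(const std::byte* buffer,
                        std::size_t size,
                        std::int64_t stride,
                        const mini_vtable* vtable,
                        const void* needle) {
    std::size_t out = 0;
    for (std::size_t ix = 0; ix < size; ++ix) {
        const std::byte* element = buffer + static_cast<std::int64_t>(ix) * stride;
        if (vtable->eq(element, needle)) {
            ++out;
        }
    }
    return out;
}

std::size_t count_equal_demo() {
    const double values[] = {1.5, 2.5, 1.5};
    const double needle = 1.5;
    return count_equal(reinterpret_cast<const std::byte*>(values),
                       3,
                       static_cast<std::int64_t>(sizeof(double)),
                       vtable_for<double>(),
                       &needle);
}
```

``count_equal`` is compiled once and works for any element type that has a table, which is the trade described above: a single signature and shorter compile times in exchange for an indirect call per element.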
| 296 | + |
| 297 | +Type Casting |
| 298 | +------------ |
| 299 | + |
| 300 | +For performance reasons, it is still useful to convert to a statically typed array view sometimes. |
| 301 | +There is a :cpp:func:`py::ndarray_view::cast` template member function which |