Commit c762cac

Merge pull request numpy#19865 from Mukulikaa/internals-doc-reorg
DOC: Moved NumPy Internals to Under-the-hood documentation for developers
2 parents 6375448 + 50850c0 commit c762cac

8 files changed

+954
-871
lines changed

doc/source/dev/alignment.rst

Lines changed: 113 additions & 0 deletions
@@ -0,0 +1,113 @@
.. currentmodule:: numpy

.. _alignment:

****************
Memory Alignment
****************

NumPy alignment goals
=====================

There are three use-cases related to memory alignment in NumPy (as of 1.14):

1. Creating :term:`structured datatypes <structured data type>` with
   :term:`fields <field>` aligned like in a C-struct.
2. Speeding up copy operations by using :class:`uint` assignment instead of
   ``memcpy``.
3. Guaranteeing safe aligned access for ufuncs/setitem/casting code.

NumPy uses two different forms of alignment to achieve these goals:
"True alignment" and "Uint alignment".

"True" alignment refers to the architecture-dependent alignment of an
equivalent C-type in C. For example, on x64 systems :attr:`float64` is
equivalent to ``double`` in C. On most systems, this has an alignment of either
4 or 8 bytes (and this can be controlled in GCC by the option
``-malign-double``). A variable is aligned in memory if its memory address is a
multiple of its alignment. On some systems (e.g. SPARC) memory alignment is
required; on others, it gives a speedup.
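
The definition above can be checked on the current platform. The following is
a minimal sketch, not part of NumPy: ``is_aligned`` is a hypothetical helper
that only restates the "multiple of its alignment" rule, and the printed values
depend on the architecture and compiler.

```python
import ctypes

import numpy as np

# "True" alignment that NumPy reports for float64 (C double).
np_align = np.dtype(np.float64).alignment

# Alignment of C double as reported by ctypes, shown for comparison.
c_align = ctypes.alignment(ctypes.c_double)

print("float64 true alignment (NumPy):", np_align)
print("double alignment (ctypes):     ", c_align)


def is_aligned(address, alignment):
    """Hypothetical helper: True if address is a multiple of alignment."""
    return address % alignment == 0


# A freshly allocated array normally starts at an aligned address.
a = np.zeros(4, dtype=np.float64)
print("data address aligned:", is_aligned(a.ctypes.data, np_align))
```
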

"Uint" alignment depends on the size of a datatype. It is defined to be the
"True alignment" of the uint used by NumPy's copy-code to copy the datatype, or
undefined/unaligned if there is no equivalent uint. Currently, NumPy uses
``uint8``, ``uint16``, ``uint32``, ``uint64``, and ``uint64`` to copy data of
size 1, 2, 4, 8, 16 bytes respectively (so a 16-byte item has the uint
alignment of ``uint64``), and all other sized datatypes cannot be uint-aligned.

For example, on a (typical Linux x64 GCC) system, the NumPy :attr:`complex64`
datatype is implemented as ``struct { float real, imag; }``. This has "true"
alignment of 4 and "uint" alignment of 8 (equal to the true alignment of
``uint64``).

Some cases where uint and true alignment are different (default GCC Linux):

======  =========  ========  ========
arch    type       true-aln  uint-aln
======  =========  ========  ========
x86_64  complex64  4         8
x86_64  float128   16        8
x86     float96    4         \-
======  =========  ========  ========
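
To compare the two kinds of alignment on a given machine, the sketch below
models "uint" alignment in Python. ``COPY_UINT`` and ``uint_alignment`` are
hypothetical names introduced here for illustration; they mirror the size
mapping described above and are not NumPy API. The printed values are
platform-dependent.

```python
import numpy as np

# Size in bytes -> unsigned integer type used for copying, as listed above.
# Sizes missing from this mapping have no uint alignment.
COPY_UINT = {
    1: np.uint8,
    2: np.uint16,
    4: np.uint32,
    8: np.uint64,
    16: np.uint64,
}


def uint_alignment(dtype):
    """Model of "uint" alignment: alignment of the copy uint, or None."""
    dtype = np.dtype(dtype)
    uint_type = COPY_UINT.get(dtype.itemsize)
    if uint_type is None:
        return None
    return np.dtype(uint_type).alignment


for name in ("float64", "complex64", "complex128", "longdouble"):
    dt = np.dtype(name)
    print(f"{name:<11} itemsize={dt.itemsize:<3} "
          f"true={dt.alignment:<3} uint={uint_alignment(dt)}")
```
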


Variables in NumPy which control and describe alignment
=======================================================

There are four relevant uses of the word ``align`` in NumPy:

* The :attr:`dtype.alignment` attribute (``descr->alignment`` in C). This is
  meant to reflect the "true alignment" of the type. It has arch-dependent
  default values for all datatypes, except for the structured types created
  with ``align=True`` as described below.
* The ``ALIGNED`` flag of an ndarray, computed in ``IsAligned`` and checked
  by :c:func:`PyArray_ISALIGNED`. This is computed from
  :attr:`dtype.alignment`.
  It is set to ``True`` if every item in the array is at a memory location
  consistent with :attr:`dtype.alignment`, which is the case if the
  ``data ptr`` and all strides of the array are multiples of that alignment.
* The ``align`` keyword of the dtype constructor, which only affects
  :ref:`structured_arrays`. If the structure's field offsets are not manually
  provided, NumPy determines offsets automatically. In that case,
  ``align=True`` pads the structure so that each field is "true" aligned in
  memory and sets :attr:`dtype.alignment` to be the largest of the field
  "true" alignments. This is like what C-structs usually do. Otherwise, if
  offsets or itemsize were manually provided, ``align=True`` simply checks that
  all the fields are "true" aligned and that the total itemsize is a multiple
  of the largest field alignment. In either case, :attr:`dtype.isalignedstruct`
  is also set to ``True``.
* ``IsUintAligned`` is used to determine if an ndarray is "uint aligned" in
  an analogous way to how ``IsAligned`` checks for true alignment.
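
The effect of the ``align`` keyword can be seen by building the same
structured dtype with and without it. This is a small sketch; the field names
``flag`` and ``value`` are made up for the example, and the aligned offsets
and itemsize depend on the platform's alignment of ``float64``.

```python
import numpy as np

fields = [("flag", np.uint8), ("value", np.float64)]

packed = np.dtype(fields)               # no padding between fields
aligned = np.dtype(fields, align=True)  # padded like a C struct

for label, dt in (("packed", packed), ("aligned", aligned)):
    value_offset = dt.fields["value"][1]
    print(f"{label:<8} itemsize={dt.itemsize:<3} "
          f"value offset={value_offset:<3} "
          f"alignment={dt.alignment} "
          f"isalignedstruct={dt.isalignedstruct}")

# With align=True, each field offset is a multiple of that field's alignment.
for name in aligned.names:
    field_dtype, offset = aligned.fields[name][:2]
    assert offset % field_dtype.alignment == 0
```
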

Consequences of alignment
=========================

Here is how the variables above are used:

1. Creating aligned structs: To know how to offset a field when
   ``align=True``, NumPy looks up ``field.dtype.alignment``. This includes
   fields that are nested structured arrays.
2. Ufuncs: If the ``ALIGNED`` flag of an array is ``False``, ufuncs will
   buffer/cast the array before evaluation. This is needed since ufunc inner
   loops access raw elements directly, which might fail on some archs if the
   elements are not true-aligned.
3. Getitem/setitem/copyswap functions: Similar to ufuncs, these functions
   generally have two code paths. If ``ALIGNED`` is ``False``, they will
   use a code path that buffers the arguments so they are true-aligned.
4. Strided copy code: Here, "uint alignment" is used instead. If the itemsize
   of an array is equal to 1, 2, 4, 8 or 16 bytes and the array is uint
   aligned, then NumPy will do ``*((uintN*)dst) = *((uintN*)src)`` for
   appropriate N. Otherwise, NumPy copies by doing ``memcpy(dst, src, N)``.
5. Nditer code: Since this often calls the strided copy code, it must
   check for "uint alignment".
6. Cast code: This checks for "true" alignment, as it does
   ``*dst = CASTFUNC(*src)`` if aligned. Otherwise, it does
   ``memmove(srcval, src); dstval = CASTFUNC(srcval); memmove(dst, dstval)``
   where dstval/srcval are aligned.
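
The behaviour described for ufuncs and setitem can be observed from Python by
building an array whose data pointer is deliberately misaligned. The sketch
below assumes that the true alignment of ``float64`` is greater than 1 (4 or 8
on common platforms); the buffer layout is chosen only for illustration.

```python
import numpy as np

itemsize = np.dtype(np.float64).itemsize
n = 4

# Over-allocate a byte buffer, then pick a start offset that puts the data
# pointer at an odd address, so it cannot be a multiple of 4 or 8.
buf = np.zeros(n * itemsize + 8, dtype=np.uint8)
start = (1 - buf.ctypes.data) % 8
misaligned = buf[start:start + n * itemsize].view(np.float64)
misaligned[:] = [1.0, 2.0, 3.0, 4.0]  # setitem on unaligned data

print("address % 8:", misaligned.ctypes.data % 8)  # 1 by construction
print("ALIGNED flag:", misaligned.flags.aligned)

# The ufunc still gives correct results even though the input is unaligned.
result = np.add(misaligned, 1.0)
print(result)

# np.require can be used to obtain an aligned copy explicitly.
fixed = np.require(misaligned, requirements=["ALIGNED"])
print("ALIGNED after np.require:", fixed.flags.aligned)
```
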

Note that the strided-copy and strided-cast code are deeply intertwined, and so
any arrays being processed by them must be both uint and true aligned, even
though the copy code only needs uint alignment and the cast code only needs
true alignment. If there is ever a big rewrite of this code, it would be good
to allow them to use different alignments.
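
The difference between the two requirements can be illustrated with a
``complex64`` array placed at an address that is 4 modulo 8. The functions
below are hypothetical Python models of the C-level checks, written only for
this example; they are not NumPy API. On a typical x86_64 Linux system the
array would be expected to be true-aligned but not uint-aligned; other
platforms may differ.

```python
import numpy as np


def is_multiple(arr, alignment):
    """True if the data pointer and all strides are multiples of alignment."""
    values = (arr.ctypes.data,) + arr.strides
    return all(v % alignment == 0 for v in values)


# Hypothetical models of the C-level checks, for illustration only.
def is_true_aligned(arr):
    return is_multiple(arr, arr.dtype.alignment)


def is_uint_aligned(arr):
    # complex64 is 8 bytes, so the copy code would use uint64 for it.
    return is_multiple(arr, np.dtype(np.uint64).alignment)


# Place three complex64 items at an address that is 4 modulo 8.
buf = np.zeros(3 * 8 + 16, dtype=np.uint8)
start = (4 - buf.ctypes.data) % 8
arr = buf[start:start + 3 * 8].view(np.complex64)

print("true aligned:", is_true_aligned(arr), "flag:", arr.flags.aligned)
print("uint aligned:", is_uint_aligned(arr))
```
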