diff --git a/posts/2024-05-22-multi-dim-arrays.qmd b/posts/2024-05-22-multi-dim-arrays.qmd new file mode 100644 index 0000000..932c534 --- /dev/null +++ b/posts/2024-05-22-multi-dim-arrays.qmd @@ -0,0 +1,286 @@ +--- +title: "Multi-dimensional array indexing" +description: "Recommendations to ensure that array data and + index/coordinate data can be consistently interpreted + across software using different indexing conventions." +author: + - name: "John Bogovic" + - name: "Davis Bennett" + - name: "Michael Innerberger" + - name: "Mark Kittisopikul" + - name: "Stephan Saalfeld" + - name: "Virginia Scarlett" +date: "5/10/2024" +--- + +## Summary and recommendations + +Conventions for indexing and storing multi-dimensional data vary across programming languages and software libraries. +Data sharing across software with different conventions is technically simple *so long as the convention used is clearly +documented.* For example, one software library might store and save point coordinates as `(x,y,z)`, while another +stores them as `(z,y,x)`. Interoperability is easy (just "reverse the numbers") only if it is possible to determine +what convention was used to produce the data. Users and software can determine what convention is used only if it is +clearly and explicitly communicated. + +To enable seamless data and metadata sharing, software and storage formats should clearly and explicitly communicate: + +1. the relationship of array indexes to memory layout +2. the semantic meaning of array dimensions +3. the relationship of coordinate data to array dimensions + +Ambiguities should be clarified by (1) explicitly documenting the array indexing convention (most often C-order or F-order), +and (2) referring to array dimensions and (3) coordinates with a consistent order and/or with labels that are +independent of order (e.g. `['x', 'y', 'z', 't', ...]` (see +[xarray](https://docs.xarray.dev/en/latest/getting-started-guide/why-xarray.html)). + +::: {.callout-tip collapse="true" appearance="minimal"} +# Examples + +These are provided as good examples, not as the only way to communicate the above information. + +#### Array size and dimension interpretation + +> 3D displacement fields are stored as 4D arrays, whose dimensions are ordered `[3,X,Y,Z]` where the first dimension contains +> displacement vector components, and varies fastest in memory (i.e. F-order). + +#### Point coordinates + +> Points are stored as 3D coordinates in a CSV file ordered `X,Y,Z` +> where the `X` dimension varies fastest in memory. + +::: + + +## Multi-dimensional array indexing and memory layout + +Elements of an n-dimensional array are indexed by an ordered n-tuple, each element of which indexes into one of the array's +dimensions. We will also call elements of this tuple "indexes." For example, the tuple `(i,j,k)` indexes a three-dimensional +array where `i` is the "first" index, and `k` is the "last" index. Here, we will consider only the non-negative integers as +valid indexes for arrays, though different contexts may use a different index set. + +Multi-dimensional arrays are often stored as one-dimensional (1D), or "flat," arrays that are interpreted or "reshaped" into +a multi-dimensional array by mapping the n-tuple of coordinates to a single index into the 1D array. The two most common +conventions for this mapping are C-order and F-order. + +In this article, we will refer to n-dimensional arrays as simply "arrays" and 1D arrays as "flat arrays." + +#### Reshaping arrays and stride + +One can think of reshaping a 1D array as a recursive process of grouping a number of adjacent elements. An n-dimensional array +can be reshaped to an (n+1)-dimensional array by grouping a number adjacent elements belonging to the same dimension. +The [*stride*](https://en.wikipedia.org/wiki/Stride_of_an_array) of each dimension can be used to communicate how indexes +relate to memory layout but more typical is to specify C-order or F-order (see below). + +::: {.callout-note collapse="true" appearance="minimal"} +# Note on "units" of stride + +In this article, we will express stride in terms of elements. That is, stride 1 means the adjacent element in memory, +no matter the size in bytes per element. + +In some other contexts, stride is expressed in terms of bytes. For example, an array containing elements of type `float32`, +the smallest stride possible would be 4 bytes: the next element is "4 bytes away". + +::: + +Other terms that are useful for communicating memory layout include: + +* The **"fastest"** or **"inner"** dimension is the dimension with a stride of 1. +* The **"slowest"** or **"outer"** dimension is the dimension with the largest stride. +* The **size** of a dimension is the number of grouped elements. + +The size of an n-dimensional array is described by a list of sizes per dimension. For example: `[3, 5, 7].` In this example, +the *first* dimension has size `3`, the *last* dimension has size `7`. + +::: {.callout-tip collapse="true" appearance="minimal"} +# Example + +Suppose we have this flat array: + +`[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23]` + + +and three dimensions having sizes 2, 3, and 4. Their strides are 1, 2, and 6 where `2*3 = 6`. There is no grouping to be done +for a dimensions of stride 1, so the first step is to join elements into groups of 2 (the second stride): + +`[(0, 1), (2, 3), (4, 5), (6, 7), (8, 9), (10, 11), (12, 13), (14, 15), (16, 17), (18, 19), (20, 21), (22, 23)]` + +Next, group elements of the new list (which are themselves groups) into groups of 3 (the largest stride). + +`[((0, 1), (2, 3), (4, 5)), ((6, 7), (8, 9), (10, 11)), ((12, 13), (14, 15), (16, 17)), ((18, 19), (20, 21), (22, 23))]` + +Notice: + +* The element adjacent to `0` in the inner group is `1`, hence stride `1`. +* The element adjacent to `0` in the intermediate grouping is `2`, hence stride `2`. +* The element adjacent to `0` in the outer grouping is `6`, hence stride `6`. +::: + + +### C- and F-order + +The terms C-order and F-order come from conventions for indexing arrays in the C and Fortran programming languages. + +* **C-order indexing:** the fastest dimension corresponds to the last index, the slowest dimension corresponds to the first index +* **F-order indexing:** the slowest dimension corresponds to the last index, the fastest dimension corresponds to the first index + + +::: {.callout-tip collapse="true" appearance="minimal"} +# Examples + +Using the same flat array as the above example: + +`[((0, 1), (2, 3), (4, 5)), ((6, 7), (8, 9), (10, 11)), ((12, 13), (14, 15), (16, 17)), ((18, 19), (20, 21), (22, 23))]` + +* The size of this array using C-order is: `[4, 3, 2]` +* The size of this array using F-order is: `[2, 3, 4]` +* The index of element `5` using C-order is: `[0][2][1]` +* The index of element `5` using F-order is: `[1][2][0]` + +::: + +If this array is indexed using C-order, then the last index has stride `1`. As a result, the middle index will have stride `7`, +and the *first* dimension will have stride `7*5 = 35`. + +Consider again an array of size `[3, 5, 7]`, but using F-order indexing. Again, the *first* dimension has size `3`, the *last* +dimension has size `7`. Now, however, using F-order, the *first* dimension will have stride `1`, the *second* dimension will +have stride `3`, and the *third* dimension will have stride `3*5 = 15`. + +**Recommendation:** Communicate the relationship of array indexes and sizes to memory layout. Consider using the language +in the examples below or similar. + +::: {.callout-tip collapse="true" appearance="minimal"} +# Examples + +These are provided as effective examples, not as the only way to communicate information. + +### Example 1 +> Array sizes and coordinates are stored in C-order. + +### Example 2 +> Coordinates are ordered `X,Y,Z` where the `X` is contiguous in memory. + +::: + +## Dimension semantics + +The dimensions of a multi-dimensional array sometimes come with additional semantics depending +on what data they store. We discuss interpreting arrays as matrices and images. + +### Matrices + +Matrices are often represented as a 2D array of numbers. Horizontal groupings of these numbers are called "rows" and vertical +groupings are called "columns." In mathematics, the entries of a matrix $A$ are denoted $a_{ij}$. + +**Recommendation:** Software should be clearly communicate when arrays represent matrices, and follow the *Matrix indexing +convention*. Software and documentation should use the terms "row-major" and "column-major" only when referring to matrices. + +::: {.callout-note appearance="minimal"} +# Matrix indexing convention + +The first index of a matrix ($i$) refers to rows, the second index ($j$) refers to columns. +::: + + +::: {.callout-tip collapse=true appearance="minimal"} +# Row- and column-major + +The terms row- and column-major derive for the storage of matrices. Defining these terms first depends on first agreeing which +index (first or last) refers to rows vs columns for matrices in mathematics. + +* **Consequence:** Given matrix indexing conventions, C-order storage is equivalent to "row-major". +* **Consequence:** Given matrix indexing conventions, F-order storage is equivalent to "column-major". + +#### example + +As a result of the *Matrix Indexing Convention* the size of a matrix with `2` rows and `3` columns is `[2, 3]` for both C- and +F-orderings. Consider: + +``` + column 0 column 1 column 2 + row 0 [ 0 1 2 ] + row 1 [ 3 4 5 ] +``` + +* The flat C-ordered array will be: `[0, 1, 2, 3, 4, 5]` +* The flat F-ordered array will be: `[0, 3, 1, 4, 2, 5]` + +To reiterate, the multi-dimensional indexes for both C- and F-order are: + +``` + column 0 column 1 column 2 + row 0 [ (0,0) (0,1) (0,2) ] + row 1 [ (1,0) (1,1) (1,2) ] +``` + +because, the row index is *always* the first index. + +::: + +For matrices, C- and F- order indexing will agree on the ordering of an array's indexes and dimensions, but will store the +arrays differently when flattened. This is because the matrix indexing convention attaches semantics (row/column) to +the index position (first/second). + +### Images + +Two-dimensional images are often stored as arrays where two of the array dimensions vary the horizontal and vertical positions of the +samples, and as a result these dimensions should be displayed horizontally and vertically, respectively. Most formats for +storing "natural" images store data such that the "horizontal axis" / rows have a smaller stride than the "vertical axis" / +columns. Note: while rows have smaller stride than columns, it is common for rows not to have stride 1, for example when using +"interleaved" color components, the color dimension will have a stride of 1. Biomedical images do not typically have +"horizontal" or "vertical" dimensions, but may have other semantics (e.g., anatomical, or related to the imaging system). + +**Recommendation:** Software and storage formats should be explicit about any semantics that are attached to dimensions and +not use stride as a proxy for or to imply semantics. Documentation may specify semantics in terms of fastest/slowest dimension, +but must explicitly communicate that fact. + +For images, C- and F-order will historically disagree on the ordering of an array's indexes and dimensions, but will store the +arrays the same way when flattened. This is because the convention for natural images historically +associates dimension semantics to the memory layout (the fastest dimension has horizontal semantics). + +## Coordinate data + +Coordinate data are data that refer to "locations" of the multidimensional array. They may be discrete, +and refer to specific samples, or continuous, referring to points "in-between" array locations. + +::: {.callout-tip collapse="true" appearance="minimal"} +# Coordinate data examples + +* Point annotations + * "Structure `A` is located at point `(x,y)`" +* Bounding boxes / ROIs + * "Crop the image to bounding box `(min_x, min_y, width, height)` +* Other collections of points + * "My neuron skeleton consists of points `[[z0, y0, x0], ..., [zN, yN, xN]]` + +::: + + +### cartesian coordinates + +In contrast to the matrix row/column index convention, cartesian coordinates label the horizontal and vertical dimensions `x` +and `y`, respectively. Referencing positions in the 2D plane is done using ordered two-tuples `(x,y)`, where `x` is +conventionally the first index and `y` is the second index. Using cartesian coordinates, varying the first dimensions varies +horizontal position, and varying the second dimension varies the vertical position. + +Applications and workflows that make use of image geometry most commonly use cartesian coordinates. + +**Recommendation:** Name your dimensions `x`, `y`, or `z` only if they are spatial. Name the "horizontal" and "vertical" +dimensions `x` and `y`, respectively if those semantics are relevant. + + +## References + +1) [hdf5 dataspaces](https://docs.hdfgroup.org/hdf5/develop/_h5_s__u_g.html#sec_dataspace) +2) [zarr arrays](https://zarr-specs.readthedocs.io/en/latest/v2/v2.0.html#arrays) +3) [nrrd axis ordering](https://teem.sourceforge.net/nrrd/format.html#general.4) +4) [n5 ordering discussion](https://github.com/saalfeldlab/n5/issues/31) +5) [multi-dimensional arrays in vigra](http://ukoethe.github.io/vigra/doc-release/vigranumpy/index.html#more-on-the-motivation-and-use-of-axistags) + + +## Related terms + +| C-order | F-order | +| --------------------- | ------------------------ | +| lexicographical order | co-lexicographical order | +| row-major | column-major | +| matrix indexing | cartesian indexing |