Skip to content
Open
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
286 changes: 286 additions & 0 deletions posts/2024-05-22-multi-dim-arrays.qmd
Original file line number Diff line number Diff line change
@@ -0,0 +1,286 @@
---
title: "Multi-dimensional array indexing"
description: "Recommendations to ensure that array data and
index/coordinate data can be consistently interpreted
across software using different indexing conventions."
author:
- name: "John Bogovic"
- name: "Davis Bennett"
- name: "Michael Innerberger"
- name: "Mark Kittisopikul"
- name: "Stephan Saalfeld"
- name: "Virginia Scarlett"
date: "5/10/2024"
---

## Summary and recommendations

Conventions for indexing and storing multi-dimensional data vary across programming languages and software libraries.
Data sharing across software with different conventions is technically simple *so long as the convention used is clearly
documented.* For example, one software library might store and save point coordinates as `(x,y,z)`, while another
stores them as `(z,y,x)`. Interoperability is easy (just "reverse the numbers") only if it is possible to determine
what convention was used to produce the data. Users and software can determine what convention is used only if it is
clearly and explicitly communicated.

To enable seamless data and metadata sharing, software and storage formats should clearly and explicitly communicate:

1. the relationship of array indexes to memory layout
2. the semantic meaning of array dimensions
3. the relationship of coordinate data to array dimensions

Ambiguities should be clarified by (1) explicitly documenting the array indexing convention (most often C-order or F-order),
and (2) referring to array dimensions and (3) coordinates with a consistent order and/or with labels that are
independent of order (e.g. `['x', 'y', 'z', 't', ...]` (see
[xarray](https://docs.xarray.dev/en/latest/getting-started-guide/why-xarray.html)).

::: {.callout-tip collapse="true" appearance="minimal"}
# Examples

These are provided as good examples, not as the only way to communicate the above information.

#### Array size and dimension interpretation

> 3D displacement fields are stored as 4D arrays, whose dimensions are ordered `[3,X,Y,Z]` where the first dimension contains
> displacement vector components, and varies fastest in memory (i.e. F-order).

#### Point coordinates

> Points are stored as 3D coordinates in a CSV file ordered `X,Y,Z`
> where the `X` dimension varies fastest in memory.

:::


## Multi-dimensional array indexing and memory layout

Elements of an n-dimensional array are indexed by an ordered n-tuple, each element of which indexes into one of the array's
dimensions. We will also call elements of this tuple "indexes." For example, the tuple `(i,j,k)` indexes a three-dimensional
array where `i` is the "first" index, and `k` is the "last" index. Here, we will consider only the non-negative integers as
valid indexes for arrays, though different contexts may use a different index set.

Multi-dimensional arrays are often stored as one-dimensional (1D), or "flat," arrays that are interpreted or "reshaped" into
a multi-dimensional array by mapping the n-tuple of coordinates to a single index into the 1D array. The two most common
conventions for this mapping are C-order and F-order.

In this article, we will refer to n-dimensional arrays as simply "arrays" and 1D arrays as "flat arrays."

#### Reshaping arrays and stride

One can think of reshaping a 1D array as a recursive process of grouping a number of adjacent elements. An n-dimensional array
can be reshaped to an (n+1)-dimensional array by grouping a number adjacent elements belonging to the same dimension.
The [*stride*](https://en.wikipedia.org/wiki/Stride_of_an_array) of each dimension can be used to communicate how indexes
relate to memory layout but more typical is to specify C-order or F-order (see below).

::: {.callout-note collapse="true" appearance="minimal"}
# Note on "units" of stride

In this article, we will express stride in terms of elements. That is, stride 1 means the adjacent element in memory,
no matter the size in bytes per element.

In some other contexts, stride is expressed in terms of bytes. For example, an array containing elements of type `float32`,
the smallest stride possible would be 4 bytes: the next element is "4 bytes away".

:::

Other terms that are useful for communicating memory layout include:

* The **"fastest"** or **"inner"** dimension is the dimension with a stride of 1.
* The **"slowest"** or **"outer"** dimension is the dimension with the largest stride.
* The **size** of a dimension is the number of grouped elements.

The size of an n-dimensional array is described by a list of sizes per dimension. For example: `[3, 5, 7].` In this example,
the *first* dimension has size `3`, the *last* dimension has size `7`.

::: {.callout-tip collapse="true" appearance="minimal"}
# Example

Suppose we have this flat array:

`[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16, 17, 18, 19, 20, 21, 22, 23]`


and three dimensions having sizes 2, 3, and 4. Their strides are 1, 2, and 6 where `2*3 = 6`. There is no grouping to be done
for a dimensions of stride 1, so the first step is to join elements into groups of 2 (the second stride):

`[(0, 1), (2, 3), (4, 5), (6, 7), (8, 9), (10, 11), (12, 13), (14, 15), (16, 17), (18, 19), (20, 21), (22, 23)]`

Next, group elements of the new list (which are themselves groups) into groups of 3 (the largest stride).

`[((0, 1), (2, 3), (4, 5)), ((6, 7), (8, 9), (10, 11)), ((12, 13), (14, 15), (16, 17)), ((18, 19), (20, 21), (22, 23))]`

Notice:

* The element adjacent to `0` in the inner group is `1`, hence stride `1`.
* The element adjacent to `0` in the intermediate grouping is `2`, hence stride `2`.
* The element adjacent to `0` in the outer grouping is `6`, hence stride `6`.
:::


### C- and F-order

The terms C-order and F-order come from conventions for indexing arrays in the C and Fortran programming languages.

* **C-order indexing:** the fastest dimension corresponds to the last index, the slowest dimension corresponds to the first index
* **F-order indexing:** the slowest dimension corresponds to the last index, the fastest dimension corresponds to the first index


::: {.callout-tip collapse="true" appearance="minimal"}
# Examples

Using the same flat array as the above example:

`[((0, 1), (2, 3), (4, 5)), ((6, 7), (8, 9), (10, 11)), ((12, 13), (14, 15), (16, 17)), ((18, 19), (20, 21), (22, 23))]`

* The size of this array using C-order is: `[4, 3, 2]`
* The size of this array using F-order is: `[2, 3, 4]`
* The index of element `5` using C-order is: `[0][2][1]`
* The index of element `5` using F-order is: `[1][2][0]`

:::

If this array is indexed using C-order, then the last index has stride `1`. As a result, the middle index will have stride `7`,
and the *first* dimension will have stride `7*5 = 35`.

Consider again an array of size `[3, 5, 7]`, but using F-order indexing. Again, the *first* dimension has size `3`, the *last*
dimension has size `7`. Now, however, using F-order, the *first* dimension will have stride `1`, the *second* dimension will
have stride `3`, and the *third* dimension will have stride `3*5 = 15`.

**Recommendation:** Communicate the relationship of array indexes and sizes to memory layout. Consider using the language
in the examples below or similar.

::: {.callout-tip collapse="true" appearance="minimal"}
# Examples

These are provided as effective examples, not as the only way to communicate information.

### Example 1
> Array sizes and coordinates are stored in C-order.

### Example 2
> Coordinates are ordered `X,Y,Z` where the `X` is contiguous in memory.

:::

## Dimension semantics

The dimensions of a multi-dimensional array sometimes come with additional semantics depending
on what data they store. We discuss interpreting arrays as matrices and images.

### Matrices

Matrices are often represented as a 2D array of numbers. Horizontal groupings of these numbers are called "rows" and vertical
groupings are called "columns." In mathematics, the entries of a matrix $A$ are denoted $a_{ij}$.

**Recommendation:** Software should be clearly communicate when arrays represent matrices, and follow the *Matrix indexing
convention*. Software and documentation should use the terms "row-major" and "column-major" only when referring to matrices.

::: {.callout-note appearance="minimal"}
# Matrix indexing convention

The first index of a matrix ($i$) refers to rows, the second index ($j$) refers to columns.
:::


::: {.callout-tip collapse=true appearance="minimal"}
# Row- and column-major

The terms row- and column-major derive for the storage of matrices. Defining these terms first depends on first agreeing which
index (first or last) refers to rows vs columns for matrices in mathematics.

* **Consequence:** Given matrix indexing conventions, C-order storage is equivalent to "row-major".
* **Consequence:** Given matrix indexing conventions, F-order storage is equivalent to "column-major".

#### example

As a result of the *Matrix Indexing Convention* the size of a matrix with `2` rows and `3` columns is `[2, 3]` for both C- and
F-orderings. Consider:

```
column 0 column 1 column 2
row 0 [ 0 1 2 ]
row 1 [ 3 4 5 ]
```

* The flat C-ordered array will be: `[0, 1, 2, 3, 4, 5]`
* The flat F-ordered array will be: `[0, 3, 1, 4, 2, 5]`

To reiterate, the multi-dimensional indexes for both C- and F-order are:

```
column 0 column 1 column 2
row 0 [ (0,0) (0,1) (0,2) ]
row 1 [ (1,0) (1,1) (1,2) ]
```

because, the row index is *always* the first index.

:::

For matrices, C- and F- order indexing will agree on the ordering of an array's indexes and dimensions, but will store the
arrays differently when flattened. This is because the matrix indexing convention attaches semantics (row/column) to
the index position (first/second).

### Images

Two-dimensional images are often stored as arrays where two of the array dimensions vary the horizontal and vertical positions of the
samples, and as a result these dimensions should be displayed horizontally and vertically, respectively. Most formats for
storing "natural" images store data such that the "horizontal axis" / rows have a smaller stride than the "vertical axis" /
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what is a natural image?

Copy link
Collaborator

@bogovicj bogovicj Jun 17, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Tough to define precisely, but roughly "an image of physical objects that a human might see using the unaided eye" It's a pretty common term in computer vision

Used here in contrast to "medical" or "microscopic" images, which are not "natural" images

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

might be useful here to clarify that "horizontal" and "vertical" are relative to the camera sensor, not the subject. Landscape and portrait images of the same subject should probably not be displayed based on the memory layout.

columns. Note: while rows have smaller stride than columns, it is common for rows not to have stride 1, for example when using
"interleaved" color components, the color dimension will have a stride of 1. Biomedical images do not typically have
"horizontal" or "vertical" dimensions, but may have other semantics (e.g., anatomical, or related to the imaging system).

**Recommendation:** Software and storage formats should be explicit about any semantics that are attached to dimensions and
not use stride as a proxy for or to imply semantics. Documentation may specify semantics in terms of fastest/slowest dimension,
but must explicitly communicate that fact.

For images, C- and F-order will historically disagree on the ordering of an array's indexes and dimensions, but will store the
arrays the same way when flattened. This is because the convention for natural images historically
associates dimension semantics to the memory layout (the fastest dimension has horizontal semantics).

## Coordinate data

Coordinate data are data that refer to "locations" of the multidimensional array. They may be discrete,
and refer to specific samples, or continuous, referring to points "in-between" array locations.

::: {.callout-tip collapse="true" appearance="minimal"}
# Coordinate data examples

* Point annotations
* "Structure `A` is located at point `(x,y)`"
* Bounding boxes / ROIs
* "Crop the image to bounding box `(min_x, min_y, width, height)`
* Other collections of points
* "My neuron skeleton consists of points `[[z0, y0, x0], ..., [zN, yN, xN]]`

:::


### cartesian coordinates

In contrast to the matrix row/column index convention, cartesian coordinates label the horizontal and vertical dimensions `x`
and `y`, respectively. Referencing positions in the 2D plane is done using ordered two-tuples `(x,y)`, where `x` is
conventionally the first index and `y` is the second index. Using cartesian coordinates, varying the first dimensions varies
horizontal position, and varying the second dimension varies the vertical position.

Applications and workflows that make use of image geometry most commonly use cartesian coordinates.

**Recommendation:** Name your dimensions `x`, `y`, or `z` only if they are spatial. Name the "horizontal" and "vertical"
dimensions `x` and `y`, respectively if those semantics are relevant.


## References

1) [hdf5 dataspaces](https://docs.hdfgroup.org/hdf5/develop/_h5_s__u_g.html#sec_dataspace)
2) [zarr arrays](https://zarr-specs.readthedocs.io/en/latest/v2/v2.0.html#arrays)
3) [nrrd axis ordering](https://teem.sourceforge.net/nrrd/format.html#general.4)
4) [n5 ordering discussion](https://github.com/saalfeldlab/n5/issues/31)
5) [multi-dimensional arrays in vigra](http://ukoethe.github.io/vigra/doc-release/vigranumpy/index.html#more-on-the-motivation-and-use-of-axistags)


## Related terms

| C-order | F-order |
| --------------------- | ------------------------ |
| lexicographical order | co-lexicographical order |
| row-major | column-major |
| matrix indexing | cartesian indexing |