Implement Array2D for non-owning 2D views of Host/Device memory
#14
Conversation
MET-24 Create universal container types
For containers like Array2D and FMBs, we'd like to have data structures that work either on CPU (host) or GPU (device). One idea is to define generic containers that use Thrust's device_vector or host_vector under the hood. For the purpose of this project, the location of the memory is defined at compile time (i.e. we can specify the location via a template parameter to the generic container type). As part of the deliverable, we should also write unit tests to ensure that we can allocate & deallocate on both device & host.
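A minimal sketch of this idea, assuming a hypothetical `MemorySpace` tag and `GenericVector` alias (neither name is from the ticket or the PR):

```cpp
#include <type_traits>
#include <thrust/host_vector.h>
#include <thrust/device_vector.h>

// Hypothetical names for illustration: select the Thrust vector type at
// compile time based on a memory-space tag.
enum class MemorySpace { Host, Device };

template <typename T, MemorySpace S>
using GenericVector = std::conditional_t<S == MemorySpace::Host,
                                         thrust::host_vector<T>,
                                         thrust::device_vector<T>>;

int main() {
    // Allocation lands on the host or the device depending on the tag.
    GenericVector<float, MemorySpace::Host> host_buf(1024);
    GenericVector<float, MemorySpace::Device> device_buf(1024);
    return 0;  // both buffers are deallocated on destruction
}
```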
Array2D for non-owning 2D views of Host/Device memory
| .def_rw("x", &Vec3D::x) | ||
| .def_rw("y", &Vec3D::y) | ||
| .def_rw("z", &Vec3D::z) | ||
| .def("__add__", &operator+) |
The change is needed because the CUDA header files also define operator+ in the global scope, so the compiler no longer has enough context to uniquely identify the right operator+ here.
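For illustration, one common way to sidestep such ambiguity is to bind a lambda instead of taking the address of the operator. This is a hedged sketch (assuming nanobind, which the `.def_rw` spelling suggests), not necessarily the exact fix in this diff:

```cpp
// Sketch: a lambda forces overload resolution on Vec3D arguments,
// avoiding the now-ambiguous address of a global operator+.
nb::class_<Vec3D>(m, "Vec3D")
    .def("__add__",
         [](const Vec3D& a, const Vec3D& b) { return a + b; });
```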
mugamma left a comment:
LGTM, although part of me wonders whether we should just use mdspan. I didn't know it existed!
```cpp
__host__ __device__ constexpr T& operator()(uint32_t row, uint32_t col) {
    return data_view_(row, col);
}

__host__ __device__ constexpr T operator()(uint32_t row, uint32_t col) const {
    return data_view_(row, col);
}
```
Is there a reason for using operator() instead of the more natural operator[]?
Good point! Sadly, prior to C++23, we aren't allowed to define operator[] with more than one argument. cuda::std::mdspan made a similar design choice: it defines operator() in earlier C++ standards and switches to operator[] in C++23 and above.
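For illustration, a hedged sketch of the difference (the `Grid` type here is hypothetical, not part of this PR):

```cpp
#include <cstddef>

// Hypothetical 2D accessor, for illustration only.
template <typename T>
struct Grid {
    T* data;
    std::size_t cols;

    // Pre-C++23: operator[] can only take a single argument, so 2D
    // indexing is conventionally exposed through operator().
    T& operator()(std::size_t r, std::size_t c) { return data[r * cols + c]; }

#if defined(__cpp_multidimensional_subscript)
    // C++23: operator[] may take multiple arguments, e.g. grid[r, c].
    T& operator[](std::size_t r, std::size_t c) { return data[r * cols + c]; }
#endif
};
```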
```cpp
auto data = TypeParam(rows * cols);
// create 2D view into the underlying data on host or device
auto array2d = Array2D(thrust::raw_pointer_cast(data.data()), rows, cols);
```
What's the purpose of thrust::raw_pointer_cast here?
This is needed because the data() method of thrust::host_vector and thrust::device_vector returns a custom pointer type, so the cast converts it to a raw pointer that mdspan can understand.
Alternatively, if we're expecting to use Thrust a lot, we could also define overloaded Array2D constructors specialized to Thrust pointer types, so we don't have to invoke this conversion explicitly everywhere.
I was also thinking about that while working on this PR haha (in my earlier commits I was defining …)
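A minimal sketch of what such an overload could look like, assuming the raw-pointer constructor signature implied by the test snippet above (the delegation detail is an assumption, not this PR's implementation):

```cpp
#include <cstdint>
#include <thrust/device_ptr.h>

template <typename T>
class Array2D {
public:
    // Existing raw-pointer constructor, as used in the tests above.
    Array2D(T* ptr, uint32_t rows, uint32_t cols);

    // Sketch: accept Thrust's fancy pointer directly, unwrap it, and
    // delegate to the raw-pointer constructor.
    Array2D(thrust::device_ptr<T> ptr, uint32_t rows, uint32_t cols)
        : Array2D(thrust::raw_pointer_cast(ptr), rows, cols) {}
};

// Usage (sketch): no explicit raw_pointer_cast at the call site.
//   thrust::device_vector<float> data(rows * cols);
//   auto view = Array2D<float>(data.data(), rows, cols);
```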
(Closes MET-47)

Summary of Changes

This PR addresses some of the suggestions that @mugamma brought up in #14. In particular, it defines the `operator[]` on `Array2D` to return a 1D view of a row, so we can use patterns like `array2d[i][j]` instead of `array2d(i, j)` to access an element (see the sketch after this description). Another nice thing about returning the 1D span is that we can also use a range-based for loop to go over the elements in a row, e.g.

```cpp
for (auto& val : array2d[row]) { /* do something with val */ }
```

You can find some example usages in the [included test file](https://github.com/probcomp/GenMetaBalls/pull/17/files#diff-92c53773082b537451d1c0e757c8ac6b5c6f85fd4bcd6e01bf82cade202575fb).

Another minor change in this PR is the refactoring of `Array2D` methods to use the new `CUDA_CALLABLE` macro that Arijit introduced recently.

Test Plan

To run the included unit tests:

```bash
pixi run test
```
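For reference, a hedged sketch of what the row-view accessor described above might look like; the `cuda::std::span` return type and the member names are assumptions based on the snippets quoted in this PR, not the exact implementation:

```cpp
// Sketch only: return a 1D span over one row of the 2D view, so that
// array2d[i][j] works and a row can be iterated with a range-based for.
CUDA_CALLABLE constexpr cuda::std::span<T> operator[](uint32_t row) {
    return {&data_view_(row, 0), data_view_.extent(1)};
}
```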
Summary of Changes

This PR introduces a new `Array2D` class for managing 2D views into contiguous arrays in host or device memory (powered by mdspan). Note that unlike typical container types, `Array2D` does not own the underlying buffer: it simply takes a pointer and constructs a 2D view of it. As such, it does not deallocate the underlying buffer when the object itself goes out of scope. This is an intentional design because we often need to pass "views" of the buffer around within device code, and we often need to keep many copies of the same view on the device (in parallel), but we can only deallocate once on the host side.

Aside: while implementing this PR, I found that CUDA std actually has an undocumented submdspan type (at `cuda::std::submdspan`). This could potentially be useful for creating multi-dimensional sub-views/non-contiguous views of the buffer as we start to implement tiling.

Test Plan
To run the unit tests included in this PR:
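```bash
pixi run test
```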