Skip to content

Conversation

@Sichao25
Copy link
Contributor

This PR debugs the issue where the simplified field could only compile and work in host space. The implementation maintains most APIs returning host space arrays even when executing in GPU. In the future, supporting GPU simplified fields APIs would be preferable.

A few additional issues were not addressed in this PR:

  • Some code may compile but not work on GPU if not included in the tests. (e.g. the point_cloud implementation)
  • Some compiler warnings remain unresolved.

These issues will be addressed in future PRs.

Copy link

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

This PR addresses compilation and runtime issues for simplified field operations to enable GPU execution, while maintaining host-space APIs. The implementation converts plain C++ lambdas to Kokkos-compatible lambdas (KOKKOS_LAMBDA and KOKKOS_CLASS_LAMBDA), refactors inline kernels into explicit functors for better GPU compatibility, and adds necessary memory space conversions between device and host. However, several memory space handling issues remain, particularly in the point_cloud adapter, and there are performance inefficiencies in the copy patterns used for host-device data transfers.

Key changes:

  • Converted plain C++ lambdas to KOKKOS_LAMBDA for device compatibility in test files and field evaluation code
  • Refactored inline Kokkos::parallel_for lambdas into named functors in omega_h and point_cloud adapters
  • Added host-device memory copies and explicit memory space handling throughout field operations

Reviewed changes

Copilot reviewed 13 out of 13 changed files in this pull request and generated 10 comments.

Show a summary per file
File Description
test/test_spline_interpolator.cpp Added fence for synchronization; updated Rank4View construction to use dynamic extents
test/test_field_interpolation.cpp Converted plain lambdas to KOKKOS_LAMBDA; added host memory conversions for field data
test/test_field_evaluation.cpp Converted plain lambdas to KOKKOS_LAMBDA; added host memory conversions for field data
test/test_field_copy.cpp Updated to use HostMemorySpace execution space for parallel operations
test/test_field_communication.cpp Added host copies of device arrays before host access; updated to use proper memory spaces
src/pcms/transfer_field2.h Updated make_const_array_view template parameters (inconsistently)
src/pcms/interpolator/spline_interpolator.hpp Converted lambdas to KOKKOS_CLASS_LAMBDA in functors; added debug printf for errors
src/pcms/adapter/point_cloud/point_cloud_layout.cpp Refactored initialization into functor; added make_const_array_view calls with memory space (but with incorrect memory space casting)
src/pcms/adapter/point_cloud/point_cloud.cpp Refactored coordinate copy into functor; updated memory space handling (with incorrect casting)
src/pcms/adapter/omega_h/omega_h_field_layout.h Added host-side copies of gids, owned, and dof_holder_coords
src/pcms/adapter/omega_h/omega_h_field_layout.cpp Refactored inline kernels into functors; added host memory synchronization with redundant copies
src/pcms/adapter/omega_h/omega_h_field2.h Changed dof_holder_data_ to HostMemorySpace
src/pcms/adapter/omega_h/omega_h_field2.cpp Added extensive host-device memory copies; refactored inline kernels into functors (with performance issues and memory access bugs)

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comment on lines 340 to 342
auto coordinates_host_tmp = Kokkos::create_mirror_view(coordinates_d);
Kokkos::deep_copy(coordinates_host_tmp, hint.coordinates_);
Kokkos::deep_copy(coordinates_d, coordinates_host_tmp);
Copy link

Copilot AI Nov 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Redundant copy operations. Lines 340-342 create a mirror view, copy from hint.coordinates_ (HostMemorySpace) to the mirror, then copy to device. This is unnecessary - you can directly deep_copy from host to device. Replace lines 338-342 with: Kokkos::View<Real**> coordinates_d("coordinates_d", hint.coordinates_.extent(0), hint.coordinates_.extent(1)); Kokkos::deep_copy(coordinates_d, hint.coordinates_);

Suggested change
auto coordinates_host_tmp = Kokkos::create_mirror_view(coordinates_d);
Kokkos::deep_copy(coordinates_host_tmp, hint.coordinates_);
Kokkos::deep_copy(coordinates_d, coordinates_host_tmp);
Kokkos::deep_copy(coordinates_d, hint.coordinates_);

Copilot uses AI. Check for mistakes.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Direct 2D deep_copy fails due to layout differences between views. The extra copy here follows the workaround from the Kokkos documentation for handling layout mismatches.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs a comment in place to explain what you are doing and link to the kokkos documentation.

Comment on lines 347 to 352
auto eval_results_tmp =
Kokkos::create_mirror_view(Kokkos::HostSpace(), eval_results);
Kokkos::deep_copy(eval_results_tmp, eval_results);
Kokkos::View<Real**, HostMemorySpace> eval_results_h(
"eval_results_h", eval_results_tmp.extent(0), eval_results_tmp.extent(1));
Kokkos::deep_copy(eval_results_h, eval_results_tmp);
Copy link

Copilot AI Nov 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Redundant copy operations. Lines 347-352 create a mirror, copy from device to mirror, create another host view, then copy mirror to host view. This is unnecessary - you can directly use the mirror view. Replace lines 347-352 with: auto eval_results_h = Kokkos::create_mirror_view_and_copy(Kokkos::HostSpace(), eval_results);

Suggested change
auto eval_results_tmp =
Kokkos::create_mirror_view(Kokkos::HostSpace(), eval_results);
Kokkos::deep_copy(eval_results_tmp, eval_results);
Kokkos::View<Real**, HostMemorySpace> eval_results_h(
"eval_results_h", eval_results_tmp.extent(0), eval_results_tmp.extent(1));
Kokkos::deep_copy(eval_results_h, eval_results_tmp);
auto eval_results_h = Kokkos::create_mirror_view_and_copy(Kokkos::HostSpace(), eval_results);

Copilot uses AI. Check for mistakes.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same issue regarding layout incompatible views.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is a repeating pattern, let's consider making a function called deep_copy_mismatch_layouts (or something along these lines). We can then put the documentation with the function. I think that will make this code much more obvious.

Comment on lines 249 to 252
auto dof_holder_coords_device_tmp =
Kokkos::create_mirror_view(Kokkos::HostSpace(), dof_holder_coords_);
Kokkos::deep_copy(dof_holder_coords_device_tmp, dof_holder_coords_);
Kokkos::deep_copy(dof_holder_coords_host_, dof_holder_coords_device_tmp);
Copy link

Copilot AI Nov 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Redundant copy operations. Lines 249-252 create a mirror view, copy from device to mirror, then copy mirror to dof_holder_coords_host_. This is unnecessary - you can directly deep_copy from device to host. Replace lines 249-252 with: Kokkos::deep_copy(dof_holder_coords_host_, dof_holder_coords_);

Suggested change
auto dof_holder_coords_device_tmp =
Kokkos::create_mirror_view(Kokkos::HostSpace(), dof_holder_coords_);
Kokkos::deep_copy(dof_holder_coords_device_tmp, dof_holder_coords_);
Kokkos::deep_copy(dof_holder_coords_host_, dof_holder_coords_device_tmp);
Kokkos::deep_copy(dof_holder_coords_host_, dof_holder_coords_);

Copilot uses AI. Check for mistakes.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same issue regarding layout incompatible views.

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

specify function as described above.

Comment on lines 44 to 63
Rank1View<const bool, HostMemorySpace> PointCloudLayout::GetOwned() const
{
return make_const_array_view(owned_);
return make_const_array_view<const Kokkos::View<bool*>, HostMemorySpace>(
owned_);
Copy link

Copilot AI Nov 25, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

owned_ is a device memory view (Kokkos::View<bool*>) but GetOwned() returns it as a HostMemorySpace view without a deep_copy. This creates an invalid memory space cast. Either owned_ should be declared as Kokkos::View<bool*, HostMemorySpace>, or a deep_copy to host memory should be performed before returning.

Copilot uses AI. Check for mistakes.
@jacobmerson
Copy link
Collaborator

@Sichao25 Take the copilot review with the appropriate grain of salt. I will go through this in detail over the next few days. I'm a bit crunched for time with proposal writing.

@Sichao25 Sichao25 mentioned this pull request Nov 26, 2025
coords(i, 0) = coordinates(i, 0);
coords(i, 1) = coordinates(i, 1);
});
auto coordinates_host_tmp = Kokkos::create_mirror_view(coords);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think these arrays should probably be stored at the class level so we aren't creating them lots of times, but we can leave that to a future optimization pass.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Created a issue including this optimization #233

Comment on lines 340 to 342
auto coordinates_host_tmp = Kokkos::create_mirror_view(coordinates_d);
Kokkos::deep_copy(coordinates_host_tmp, hint.coordinates_);
Kokkos::deep_copy(coordinates_d, coordinates_host_tmp);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs a comment in place to explain what you are doing and link to the kokkos documentation.

auto eval_results = mesh_field_->evaluate(coordinates_d, offsets_d);

auto eval_results_tmp =
Kokkos::create_mirror_view(Kokkos::HostSpace(), eval_results);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally we can move the construction of these views to be a one-time operation based on the field layout which will tell us how big the arrays will need to be.

Comment on lines 347 to 352
auto eval_results_tmp =
Kokkos::create_mirror_view(Kokkos::HostSpace(), eval_results);
Kokkos::deep_copy(eval_results_tmp, eval_results);
Kokkos::View<Real**, HostMemorySpace> eval_results_h(
"eval_results_h", eval_results_tmp.extent(0), eval_results_tmp.extent(1));
Kokkos::deep_copy(eval_results_h, eval_results_tmp);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is a repeating pattern, let's consider making a function called deep_copy_mismatch_layouts (or something along these lines). We can then put the documentation with the function. I think that will make this code much more obvious.

return owned_gids;
}

struct ComputeVertexCoordsFunctor
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add comment saying that this is a workaround to specify the parametric coordinates for MeshFields to be replaced when SCOREC/meshFields#70 is resolved.

}
};

struct ComputeEdgeCoordsFunctor
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please add comment saying that this is a workaround to specify the parametric coordinates for MeshFields to be replaced when SCOREC/meshFields#70 is resolved.

Comment on lines 249 to 252
auto dof_holder_coords_device_tmp =
Kokkos::create_mirror_view(Kokkos::HostSpace(), dof_holder_coords_);
Kokkos::deep_copy(dof_holder_coords_device_tmp, dof_holder_coords_);
Kokkos::deep_copy(dof_holder_coords_host_, dof_holder_coords_device_tmp);
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

specify function as described above.

gids_[i] = 0;
});
Kokkos::parallel_for(coords_.extent(0),
InitializeOwnedGidsFunctor(owned_, gids_));
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it will be more clear if we use fill and iota.
The fill algorithm is defined here: https://kokkos.org/kokkos-core-wiki/API/algorithms/std-algorithms/all/StdFill.html

It looks like there is no function called iota defined in kokko so we will need to define our own. Don't worry about being super general, just take the 1D array you have here and perform filling as you already do.

@jacobmerson
Copy link
Collaborator

@Sichao25 I think if you add this redev version you will get past the perfstubs issue: https://github.com/SCOREC/pcms/pull/178/files#diff-dd58fe9ad10c75a10b51fd76d06aa41cb591ca2181284244e270c65765464c4fR101

I think what's going on is that previously redev was not exporting perfstubs in its cmake config file.

@Sichao25
Copy link
Contributor Author

Sichao25 commented Dec 8, 2025

Thanks. I will add it and test.

@jacobmerson
Copy link
Collaborator

@Sichao25 I merged #232. Please merge with develop and see if that fixes this test case. Thanks!

@Sichao25
Copy link
Contributor Author

Sichao25 commented Dec 9, 2025

Just merged. Thanks!

@jacobmerson jacobmerson merged commit c6791b3 into SCOREC:develop Dec 9, 2025
4 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants