Skip to content

Refactor kernels to use dense device view#1978

Draft
upsj wants to merge 6 commits intodevice_accessorfrom
refactor_device_accessor
Draft

Refactor kernels to use dense device view#1978
upsj wants to merge 6 commits intodevice_accessorfrom
refactor_device_accessor

Conversation

@upsj
Copy link
Member

@upsj upsj commented Jan 23, 2026

Using https://github.com/upsj/ginkgo-refactor to refactor headers and source files

Still WIP, since some of the replacements were incorrect

Running this requires compiling LLVM with PR llvm/llvm-project#177442
and compiling ginkgo-refactor based on it and then running

# for core source files
clang-tidy --load=./build/libRefactorDenseToView.so --checks=-*,gko-refactor-core-dense-to-view --fix file.cpp
# for kernel source files *_kernels.cpp
clang-tidy --load=./build/libRefactorDenseToView.so --checks=-*,gko-refactor-kernels-dense-to-view --fix file.cpp
# for kernel header files *_kernels.hpp
clang-tidy --load=./build/libRefactorDenseToView.so --checks=-*,gko-refactor-kernel-decls-dense-to-view --fix file.hpp

@upsj upsj self-assigned this Jan 23, 2026
@ginkgo-bot ginkgo-bot added mod:core This is related to the core module. mod:cuda This is related to the CUDA module. mod:openmp This is related to the OpenMP module. mod:reference This is related to the reference module. type:solver This is related to the solvers type:preconditioner This is related to the preconditioners type:matrix-format This is related to the Matrix formats mod:hip This is related to the HIP module. type:stopping-criteria This is related to the stopping criteria labels Jan 23, 2026
@upsj
Copy link
Member Author

upsj commented Jan 24, 2026

Issues found so far:

  • some header files get mangled (these are few enough we can fix by hand, are those race conditions in parallel replacements? should be ignored by clang-tidy)
  • incorrect application of replacements to some kernel_launch headers
  • Dense::get_stride() replacement failing
  • need to add/replace kernel_launch mappings for device views
  • "missing" omp.h or hip_runtime.h headers preventing clang-tidy from running
  • dpcpp support (attempt compiling against icpx LLVM?)

@upsj
Copy link
Member Author

upsj commented Jan 26, 2026

The get_stride replacement was simply missing. Creating the compile_commands.json with clang as the host compiler should fix the missing header issues. I'll try to compile against dpcpp later. Avoiding replacements in header files and member functions should avoid the kernel_launch issues, also we could skip functions that don't have the executor as first parameter (those are kernels)

@MarcelKoch
Copy link
Member

Maybe it would make sense to add the clang formatting plugin to this PR? That could make it easier to talk about issues with the plugin. We could later decide if the plugin is also distributed in ginkgo (e.g. in the scripts dir) or not.

@upsj
Copy link
Member Author

upsj commented Jan 26, 2026

Not sure I follow, the clang-format step is just a call of clang-format afterwards. Since this is a one-time refactoring, I would keep it separate for now

@upsj upsj force-pushed the device_accessor branch from 2ef7c49 to a2b2791 Compare March 2, 2026 17:44
@upsj
Copy link
Member Author

upsj commented Mar 3, 2026

The automated refactoring needed some significant fixes (I'll use this as a running list until I push it)

  • Some macros used names like _type different from the template parameters they represented, which caused incorrect replacements. I fixed them manually
  • The restriction to functions that have an executor as the first argument was too strict, it excluded functions like solve_qy in cb_gmres_kernels.cpp
  • I needed to replace matrix::Dense by matrix::view::dense also in kernel_launch mapping functions, but I left the accessor in place, since it is a smaller type than the view, as it doesn't store the size. We can discuss this separately
  • Some replacements in macros didn't work properly and needed to be fixed manually
  • Since template deduction can deal correctly with the implicit conversion from Dense* to const Dense*, but the same isn't true for dense views, I needed to add a as_const function which casts explicitly.
  • The same is true at the call site, when the core code passes in a mutable pointer, but the kernel takes a const argument
  • Some pointers inside kernels could also be nullptr, I replaced them by std::optional

@upsj upsj mentioned this pull request Mar 4, 2026
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

mod:core This is related to the core module. mod:cuda This is related to the CUDA module. mod:hip This is related to the HIP module. mod:openmp This is related to the OpenMP module. mod:reference This is related to the reference module. type:matrix-format This is related to the Matrix formats type:preconditioner This is related to the preconditioners type:solver This is related to the solvers type:stopping-criteria This is related to the stopping criteria

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants