Simplify CUDA Code Generation, main branch (2025.09.19.) #1160
Made the code rely on basic CMake constructs instead of using a custom Python script.
stephenswat
left a comment
What? No. This is replacing a semi-robust solution by a really half-assed solution. If you insist on doing this all via CMake, do the following:
# Create a well formed postfix for the specialized filenames.
function(traccc_make_cuda_fname_postfix FNAME)
  set("${FNAME}" "")
  if(NOT "${DETECTOR_NAME}" STREQUAL "")
    set("${FNAME}" "${${FNAME}}.${DETECTOR_NAME}")
  endif()
  if(NOT "${BFIELD_NAME}" STREQUAL "")
    set("${FNAME}" "${${FNAME}}.${BFIELD_NAME}")
  endif()
  string(REPLACE "<scalar>" "" "${FNAME}" "${${FNAME}}")
  set("${FNAME}" "${${FNAME}}" PARENT_SCOPE)
endfunction()

# Helper macro for adding a kernel specialization to the build of traccc::cuda.
function(traccc_add_cuda_specialization TEMPLATE_FILE)
  traccc_make_cuda_fname_postfix(FNAME_POSTFIX)
  configure_file("${TEMPLATE_FILE}" "${TEMPLATE_FILE}${FNAME_POSTFIX}.cu")
  target_sources(traccc_cuda PRIVATE
    "${CMAKE_CURRENT_BINARY_DIR}/${TEMPLATE_FILE}${FNAME_POSTFIX}.cu")
endfunction()
The fact that this all relies on implicit variable capture needs to go; the bfield name and detector name need to be actual function arguments here. The target also needs to be a template argument so that we don't need a CUDA specialization for this.
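A minimal sketch of what this request could look like, assuming keyword arguments parsed with `cmake_parse_arguments`. The function name `traccc_add_kernel_specialization` and its keywords are hypothetical, not taken from the PR, and the sketch assumes the template files reference the types as `DETECTOR_NAME` and `BFIELD_NAME`:

```cmake
# Hypothetical variant of the helper: every input is an explicit, named
# argument instead of a variable captured from the calling scope.
function(traccc_add_kernel_specialization)
  cmake_parse_arguments(ARG ""
    "TARGET;TEMPLATE_FILE;DETECTOR_NAME;BFIELD_NAME" "" ${ARGN})

  # The target and the template file are mandatory.
  foreach(required TARGET TEMPLATE_FILE)
    if("${ARG_${required}}" STREQUAL "")
      message(FATAL_ERROR
        "traccc_add_kernel_specialization: ${required} is required")
    endif()
  endforeach()

  # Build the file name postfix from the arguments that were provided.
  set(postfix "")
  if(NOT "${ARG_DETECTOR_NAME}" STREQUAL "")
    string(APPEND postfix ".${ARG_DETECTOR_NAME}")
  endif()
  if(NOT "${ARG_BFIELD_NAME}" STREQUAL "")
    string(APPEND postfix ".${ARG_BFIELD_NAME}")
  endif()
  string(REPLACE "<scalar>" "" postfix "${postfix}")

  # configure_file() substitutes variables visible in this scope, so the
  # arguments are copied into the names the templates are assumed to use.
  set(DETECTOR_NAME "${ARG_DETECTOR_NAME}")
  set(BFIELD_NAME "${ARG_BFIELD_NAME}")
  set(output
    "${CMAKE_CURRENT_BINARY_DIR}/${ARG_TEMPLATE_FILE}${postfix}.cu")
  configure_file("${ARG_TEMPLATE_FILE}" "${output}")
  target_sources(${ARG_TARGET} PRIVATE "${output}")
endfunction()
```

A call would then spell out all inputs at the call site (the template path and the detector name here are placeholders):

```cmake
traccc_add_kernel_specialization(
  TARGET traccc_cuda
  TEMPLATE_FILE "src/example_kernel.cu.in"
  DETECTOR_NAME "example_detector"
  BFIELD_NAME "const_bfield_backend_t<scalar>")
```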
  if(NOT "${BFIELD_NAME}" STREQUAL "")
    set("${FNAME}" "${${FNAME}}.${BFIELD_NAME}")
  endif()
  string(REPLACE "<scalar>" "" "${FNAME}" "${${FNAME}}")
This CMake string manipulation should go, we need to resolve this all in C++ using template specialization.
This CMake manipulation is here to generate a functional file name. For nothing else. And since the file name is more or less irrelevant (it just needs to be something POSIX compatible), I don't understand why we'd need to do things differently.
foreach(BFIELD_NAME "const_bfield_backend_t<scalar>"
        "inhom_global_bfield_backend_t<scalar>"
        "inhom_texture_bfield_backend_t")
Bring back the list variable here, and as mentioned before the <scalar> template argument needs to go.
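For illustration, a loop driven by a list variable could look roughly like the sketch below. The variable name `TRACCC_BFIELD_TYPES` is hypothetical, and the type names are copied unchanged from the diff, so the `<scalar>` question is left open here:

```cmake
# Hypothetical list variable holding the magnetic field backend types.
set(TRACCC_BFIELD_TYPES
  "const_bfield_backend_t<scalar>"
  "inhom_global_bfield_backend_t<scalar>"
  "inhom_texture_bfield_backend_t")

# Iterate over the list instead of spelling the values out in foreach().
foreach(BFIELD_NAME IN LISTS TRACCC_BFIELD_TYPES)
  message(STATUS "Adding specialization for: ${BFIELD_NAME}")
endforeach()
```

Keeping the values in a named list means that other parts of the build can reuse or extend them without touching the loop.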
I'm inputting C++ types here. 🤔 I don't understand your objection. scalar is a valid replacement in your template files.
I can make the local CMake function more complicated if absolutely necessary. Though I don't see the code maintenance goal here. The functions/macros are defined, and then are used right away in the same place. For one very specific thing. The goal is not to introduce general, widely usable code here. Since at best it's only HIP that will also be able to make use of this. I also really don't understand why you think that this would be anything but more robust than the current code. 😕 The Python code has a lot of special



It is possible to simplify how the build creates the specialized kernels for CUDA, while making the code follow standard practices a bit more closely.
I removed the Python script that was doing all this, and switched to simply using configure_file(...). The CMake code still needed to be tweaked slightly to associate workable file names to the type names semi-automatically, but much of the logic of the Python script could be removed.
I tweaked the template files a bit so that their include statements would be easier to handle. While at it, I also renamed them to have the .in postfix, as is the convention for these types of files.