Port back ordered block id feature to rocPRIM rocm 7.0.x #2257
Closed
stanleytsang-amd wants to merge 6 commits into release/rocm-rel-7.0 from
Currently, on Windows, GTest cannot print 128-bit ints. We have a check in `test_utils::protected_assert_eq` that avoids calling `ASSERT_EQ` on 128-bit int values directly, since this will cause the values to be printed in the event of an error.

This check was relying on the `is_int128` alias, which was being set to `false_type` when `ROCPRIM_HAS_INT128_SUPPORT` was `false`. As a result, when 128-bit types were passed in, our check could not detect them and would fail to stop the printing.

In [rocprim/types.hpp](https://github.com/ROCm/rocm-libraries/blob/develop/projects/rocprim/rocprim/include/rocprim/types.hpp#L69), the types `rocprim::int128_t` and `rocprim::uint128_t` are now defined regardless of how `ROCPRIM_HAS_INT128_SUPPORT` is set. This means we no longer need to guard against usage of these types in our test code (we only need to use `ROCPRIM_HAS_INT128_SUPPORT` in cases where we're doing some operation that explicitly won't work on 128-bit ints).

This change removes the code that sets the `is_int128` alias to `false_type` when `ROCPRIM_HAS_INT128_SUPPORT` is not set. Doing this is enough to fix the check in `test_utils::protected_assert_eq`.
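To illustrate the idea, here is a minimal sketch of a trait that detects 128-bit types unconditionally, with no branch that forces `false_type` when 128-bit support is reported as unavailable. The two structs are hypothetical stand-ins for `rocprim::int128_t` and `rocprim::uint128_t`, and `can_print_on_failure` is an illustrative helper that does not exist in rocPRIM; the real `is_int128` alias in the test utilities may be written differently.

```cpp
#include <cassert>
#include <cstdint>
#include <type_traits>

// Hypothetical stand-ins for rocprim::int128_t and rocprim::uint128_t.
// The real definitions live in rocprim/types.hpp and may differ.
struct int128_t  { std::uint64_t lo; std::int64_t  hi; };
struct uint128_t { std::uint64_t lo; std::uint64_t hi; };

// Defined unconditionally: there is no preprocessor branch that turns
// this into false_type when 128-bit support is reported as unavailable.
template<class T>
struct is_int128
    : std::integral_constant<
          bool,
          std::is_same<typename std::remove_cv<T>::type, int128_t>::value
              || std::is_same<typename std::remove_cv<T>::type, uint128_t>::value>
{};

// Illustrative helper (an assumption, not rocPRIM API): true when a value
// of type T may be handed to an assertion that prints its operands.
template<class T>
constexpr bool can_print_on_failure()
{
    return !is_int128<T>::value;
}
```

With a trait like this, a 128-bit operand is always recognized, so the caller can route it to a comparison that never tries to print the value.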
Reverse UseGTestAssert condition on Windows
Recently, we added a check to see if GTest's ASSERT_EQ assertion
should be used within the assert_eq function. The code in the if/else
blocks that act on the results of this check was inverted (the "else"
code block should be the "if" block, and vice-versa).
This change fixes the issue by swapping the code blocks.
This change should be merged after #449.
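The corrected branch order can be sketched as follows. This is a simplified illustration, not the actual rocPRIM test code: `assert_eq_sketch` and `check_result` are hypothetical names, and the real function calls GTest assertions rather than returning a record. The point is only which branch belongs under the `UseGTestAssert` condition.

```cpp
#include <cassert>
#include <string>

// Hypothetical result record used only for this sketch; the real code
// would invoke GTest assertions instead of returning a value.
struct check_result
{
    bool        passed;
    std::string message;
};

template<bool UseGTestAssert, class T>
check_result assert_eq_sketch(const T& a, const T& b)
{
    if(UseGTestAssert)
    {
        // Printable types: this is where ASSERT_EQ(a, b) belongs,
        // because GTest can print both operands on failure.
        return {a == b, a == b ? "" : "values differ (operands printed)"};
    }
    else
    {
        // Non-printable types (for example 128-bit ints on Windows):
        // compare first and assert only on the boolean result.
        return {a == b, a == b ? "" : "values differ (operands not printed)"};
    }
}
```

Before the fix, the two bodies were effectively swapped, so the printing path was taken exactly for the types that cannot be printed.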
NB4444 (Contributor) approved these changes on Oct 24, 2025 and left a comment:
The backport looks fine. To summarize, it backports the new config launch system, the ordered block id code, and some small fixes.
amd-hsong approved these changes on Oct 24, 2025
As we add support for more GPU architectures, the size of the fat binary is growing. For some builds, it's large enough that we are running into linker errors because symbols are out of range.

This change adds a CMake option called `BUILD_OFFLOAD_COMPRESS` (defaults to `ON`), which enables/disables the `--offload-compress` compiler option. This option causes the compiler to compress the binary after it's generated. When the binary is run, it is decompressed in one shot on startup. The performance penalty for this decompression seems to be minimal. The reduction in size of the fat binary is significant for some targets (up to about 80% in some cases).

It also removes these obsolete default build targets: gfx803, gfx900.

Build rocPRIM for all default architectures. Ensure there are no linker errors.

No linker errors locally; will wait for CI completion to confirm there as well.

- [x] Look over the contributing guidelines at https://github.com/ROCm/ROCm/blob/develop/CONTRIBUTING.md#pull-requests.
DO NOT MERGE UNTIL PM APPROVAL GIVEN.
Motivation
See #1981 for the full explanation. This feature/fix needs to be ported back to ROCm 7.0.x.
Technical Details
This ports the 7.1 ordered block id changes back to the 7.0 codebase. I had to pull in some other (slightly unrelated) 7.1 code changes to make the port work.
I also had to cherry-pick 6c24aff to get the Windows job to pass.
Test Plan
Running CI.
Test Result
Submission Checklist