@JackAKirk
Contributor

This uses a PTX instruction to specialize syclcompat::reverse_bits for the NVPTX backend.

@JackAKirk JackAKirk requested a review from a team as a code owner October 22, 2024 12:32
@JackAKirk JackAKirk changed the title [SYCLCompat] Specialize reverse_bits for nvptx. [SYCLCompat] Specialize reverse_bits for nvptx Oct 22, 2024
@joeatodd joeatodd changed the title [SYCLCompat] Specialize reverse_bits for nvptx [SYCL][COMPAT Specialize reverse_bits for nvptx Oct 23, 2024
@joeatodd joeatodd changed the title [SYCL][COMPAT Specialize reverse_bits for nvptx [SYCL][COMPAT] Specialize reverse_bits for nvptx Oct 23, 2024
Contributor

@joeatodd joeatodd left a comment

Cheers @JackAKirk - just a wee request otherwise LGTM

template <typename T> inline T reverse_bits(T a) {
  static_assert(std::is_unsigned<T>::value && std::is_integral<T>::value,
                "unsigned integer required");
#if defined(__NVPTX__)

Could you add an if constexpr (std::is_same_v<T, unsigned>) here? In theory this API also accepts e.g. unsigned short and unsigned char, and I guess the PTX wouldn't work for those.

Of course, the ptx for those cases could be defined too, so that'd be an alternative.
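For illustration, the suggested guard might look roughly like the sketch below. This is not the code from the PR: the name reverse_bits_sketch, the inline brev.b32 assembly, and the loop-based fallback are all assumptions made for the example. Only the static_assert and the __NVPTX__ check come from the snippet under review.

```cpp
#include <cstdint>
#include <limits>
#include <type_traits>

// Hypothetical sketch; not the syclcompat implementation.
template <typename T> inline T reverse_bits_sketch(T a) {
  static_assert(std::is_unsigned<T>::value && std::is_integral<T>::value,
                "unsigned integer required");
#if defined(__NVPTX__)
  // Assumption: only plain `unsigned` is routed to the 32-bit PTX instruction.
  if constexpr (std::is_same_v<T, unsigned>) {
    unsigned r;
    asm("brev.b32 %0, %1;" : "=r"(r) : "r"(a));
    return r;
  } else
#endif
  {
    // Portable fallback (assumed): shift bits out of `a` and into `result`.
    T result = 0;
    for (int i = 0; i < std::numeric_limits<T>::digits; ++i) {
      result = static_cast<T>((result << 1) | (a & 1));
      a = static_cast<T>(a >> 1);
    }
    return result;
  }
}
```

With this shape, unsigned short and unsigned char never reach the PTX path and fall through to the generic loop on every backend.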

Contributor Author

Thanks for pointing this out. I've dealt with this using sizeof.
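The comment does not show how sizeof was used, so the following is only a guess at the general idea: dispatch on the width of T instead of its exact type. The name reverse_bits_by_size, the inline assembly, and the fallback loop are hypothetical and not taken from the merged change.

```cpp
#include <cstdint>
#include <limits>
#include <type_traits>

// Hypothetical sketch of a sizeof-based guard; not the merged code.
template <typename T> inline T reverse_bits_by_size(T a) {
  static_assert(std::is_unsigned<T>::value && std::is_integral<T>::value,
                "unsigned integer required");
#if defined(__NVPTX__)
  // Assumption: any 4-byte unsigned type may use the 32-bit PTX instruction.
  if constexpr (sizeof(T) == 4) {
    std::uint32_t r;
    asm("brev.b32 %0, %1;" : "=r"(r) : "r"(static_cast<std::uint32_t>(a)));
    return static_cast<T>(r);
  } else
#endif
  {
    // Portable fallback (assumed) for other widths and other backends.
    T result = 0;
    for (int i = 0; i < std::numeric_limits<T>::digits; ++i) {
      result = static_cast<T>((result << 1) | (a & 1));
      a = static_cast<T>(a >> 1);
    }
    return result;
  }
}
```

Compared with an exact-type check, a width check also covers other 4-byte unsigned types, such as unsigned long on platforms where it is 32 bits wide.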

Contributor

@joeatodd joeatodd left a comment

LGTM, cheers Jack

@ldrumm ldrumm merged commit be1679b into intel:sycl Oct 31, 2024
13 checks passed