Add env RLE by gonidelis · Pull Request #7908 · NVIDIA/cccl

gonidelis · 2026-03-05T21:02:45Z

copy-pr-bot · 2026-03-05T21:02:49Z

Auto-sync is disabled for draft pull requests in this repository. Workflows must be run manually.

Contributors can view more details about this message here.

bernhardmgruber · 2026-03-06T08:27:14Z

cub/cub/device/device_run_length_encode.cuh

+            typename ::cuda::std::enable_if_t<
+              ::cuda::std::is_integral_v<NumItemsT> && !::cuda::std::is_same_v<InputIteratorT, void*>,
+              int> = 0>


No need to constrain NumItemsT

Suggested change

-            typename ::cuda::std::enable_if_t<
-              ::cuda::std::is_integral_v<NumItemsT> && !::cuda::std::is_same_v<InputIteratorT, void*>,
-              int> = 0>
+            ::cuda::std::enable_if_t<
+              !::cuda::std::is_same_v<InputIteratorT, void*>,
+              int> = 0>
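For readers unfamiliar with this kind of constraint, here is a minimal sketch of what the reduced `enable_if_t` does, written with `std::` in place of `::cuda::std::` so it compiles without CCCL. The names `encode_env` and `accepts_env_call` are hypothetical and are not part of CUB; the sketch only shows that the overload stays callable for ordinary iterators and drops out of overload resolution when the first argument is a `void*`.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

// Hypothetical stand-in for an env overload: the only constraint left is that
// the first argument must not be a raw void* (NumItemsT is unconstrained).
template <typename InputIteratorT,
          typename NumItemsT,
          std::enable_if_t<!std::is_same_v<InputIteratorT, void*>, int> = 0>
constexpr bool encode_env(InputIteratorT, NumItemsT)
{
  return true;
}

// Hypothetical detection trait: true if encode_env(It, N) is a valid call.
template <typename It, typename N, typename = void>
struct accepts_env_call : std::false_type
{};

template <typename It, typename N>
struct accepts_env_call<It, N, std::void_t<decltype(encode_env(std::declval<It>(), std::declval<N>()))>>
    : std::true_type
{};

// Ordinary iterators are accepted, whatever the integer type of the count.
static_assert(accepts_env_call<const int*, int>::value);
static_assert(accepts_env_call<const int*, long long>::value);
// A void* first argument removes the overload from consideration.
static_assert(!accepts_env_call<void*, int>::value);
```

Whether the `void*` exclusion exists to keep calls that pass raw temporary storage away from the env overload is an assumption here; the thread does not spell out the reason.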

bernhardmgruber · 2026-03-06T08:29:10Z

cub/cub/device/device_run_length_encode.cuh

+            typename ::cuda::std::enable_if_t<
+              ::cuda::std::is_integral_v<NumItemsT> && !::cuda::std::is_same_v<InputIteratorT, void*>,
+              int> = 0>


gonidelis · 2026-03-10T01:33:21Z

cub/cub/device/device_select.cuh

 #include <cub/device/dispatch/dispatch_select_if.cuh>
 #include <cub/device/dispatch/dispatch_unique_by_key.cuh>

-#include <cuda/__execution/determinism.h>


This slipped in while I was working on a workaround. Let it pass; it's a bugfix.

github-actions · 2026-03-10T02:57:20Z

🥳 CI Workflow Results

🟩 Finished in 1h 21m: Pass: 100%/249 | Total: 3d 01h | Max: 1h 21m | Hits: 98%/156109

See results here.

miscco · 2026-03-10T08:28:01Z

cub/cub/device/device_run_length_encode.cuh

+    _CCCL_NVTX_RANGE_SCOPE("cub::DeviceRunLengthEncode::NonTrivialRuns");
+
+    using global_offset_t = detail::choose_signed_offset_t<NumItemsT>;
+    using equality_op     = ::cuda::std::equal_to<>;


I believe this is fine, because it always just returns a bool and does not promote integers
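The point about the transparent comparator can be checked in isolation. The sketch below uses `std::equal_to<>` as a stand-in for `::cuda::std::equal_to<>`, assuming both behave the same; `equal_result_t` and `same_key` are hypothetical helper names used only for this illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <type_traits>
#include <utility>

// Result type of calling the transparent comparator on a (T, U) pair.
template <typename T, typename U>
using equal_result_t = decltype(std::equal_to<>{}(std::declval<T>(), std::declval<U>()));

// The result is bool regardless of the operand widths, so no wider integer
// type leaks out of the comparison.
static_assert(std::is_same_v<equal_result_t<std::int8_t, std::int8_t>, bool>);
static_assert(std::is_same_v<equal_result_t<std::int16_t, std::int64_t>, bool>);

// Hypothetical helper that compares two narrow keys.
inline bool same_key(std::uint8_t a, std::uint8_t b)
{
  return std::equal_to<>{}(a, b);
}
```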

miscco · 2026-03-10T08:29:07Z

cub/cub/device/device_run_length_encode.cuh

+    _CCCL_NVTX_RANGE_SCOPE("cub::DeviceRunLengthEncode::Encode");
+
+    using equality_op              = ::cuda::std::equal_to<>;
+    using reduction_op             = ::cuda::std::plus<>;


Ditto: Should this rather be

Suggested change

-    using reduction_op             = ::cuda::std::plus<>;
+    using reduction_op             = ::cuda::std::plus<length_t>;

Otherwise this will always promote offset_t to a larger integer.

The non-env overload where I guess this implementation comes from also uses plus<>. I agree that that's maybe not what we want, since plus<length_t> does not promote and influence the accumulator type. But this should be addressed in a separate PR.
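The difference between the two functors can be sketched with the standard-library equivalents, assuming `::cuda::std::plus` follows the same rules as `std::plus`. The `length_t` alias below is a hypothetical 16-bit type chosen only to make the promotion visible, since the thread does not say what `length_t` is in the PR. `result_t` is likewise a helper invented for this example.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>
#include <type_traits>
#include <utility>

// Hypothetical narrow length type, used only for illustration.
using length_t = std::int16_t;

// Result type of applying Op to two values of type T.
template <typename Op, typename T>
using result_t = decltype(std::declval<Op>()(std::declval<T>(), std::declval<T>()));

// plus<> returns decltype(a + b): operands narrower than int are promoted to int.
static_assert(std::is_same_v<result_t<std::plus<>, length_t>, int>);

// plus<length_t> converts its result back to length_t, so the type is preserved.
static_assert(std::is_same_v<result_t<std::plus<length_t>, length_t>, length_t>);

// For operands that are already at least as wide as int, plus<> leaves the type alone.
static_assert(std::is_same_v<result_t<std::plus<>, std::int64_t>, std::int64_t>);
```

Under these rules the promotion only shows up for types narrower than `int` or for mixed operand types. An algorithm that deduces its accumulator type from the functor's result would therefore see different types for the two spellings in those cases.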

bernhardmgruber · 2026-03-10T09:32:28Z

cub/test/catch2_test_device_run_length_encode_env.cu

+  cudaStream_t custom_stream;
+  REQUIRE(cudaSuccess == cudaStreamCreate(&custom_stream));


Suggestion: Please use cuda::stream so it gets higher test coverage. If you want to pass a cudaStream_t you can always call .get() (I think) on the cuda::stream to get the raw underlying stream.

This suggestion applies generally for all env-overload PRs.

@gonidelis since the PR auto-merged, please create a note or a tracking issue to replace all manual stream creation by cuda::stream in our unit tests.

github-project-automation bot added this to CCCL Mar 5, 2026

github-project-automation bot moved this to Todo in CCCL Mar 5, 2026

cccl-authenticator-app bot moved this from Todo to In Progress in CCCL Mar 5, 2026

bernhardmgruber reviewed Mar 6, 2026

gonidelis added 2 commits March 9, 2026 16:57

Add env RLE

623aec3

Add missing test, fix licenses

40a08c5

gonidelis force-pushed the rle_env branch from 4630c3b to 40a08c5 Compare March 9, 2026 23:57

gonidelis marked this pull request as ready for review March 9, 2026 23:57

gonidelis requested a review from a team as a code owner March 9, 2026 23:57

gonidelis requested a review from pauleonix March 9, 2026 23:57

cccl-authenticator-app bot moved this from In Progress to In Review in CCCL Mar 9, 2026

gonidelis force-pushed the rle_env branch from 2e7290f to cb31c65 Compare March 10, 2026 00:03

gonidelis requested review from NaderAlAwar and srinivasyadav18 March 10, 2026 00:08

Remove any references to determinism

1de3972

gonidelis force-pushed the rle_env branch from cb31c65 to 1de3972 Compare March 10, 2026 00:10

Fix licenses?

c30d037

gonidelis commented Mar 10, 2026

gonidelis enabled auto-merge (squash) March 10, 2026 03:21

miscco reviewed Mar 10, 2026

bernhardmgruber approved these changes Mar 10, 2026

gonidelis merged commit defbe55 into NVIDIA:main Mar 10, 2026
268 checks passed

This was referenced Mar 13, 2026

Revamp enable_if constraints in all env overloads #8033

Open

Don't use cudaStream_t legacy in env RLE tests #8036

Merged

gonidelis merged 4 commits into NVIDIA:main from gonidelis:rle_env
