Commit 81d4be2

Add env SegmentedReduce (non fixed-size overloads) (#7795)
* Add env SegmentedReduce
* Add env overloads for DeviceSegmentedReduce ArgMin/ArgMax and refactor to common impl
  - Add private segmented_reduce_impl that centralizes determinism validation (static_assert rejecting gpu_to_gpu), dispatch_with_env, and tuning extraction, eliminating boilerplate across all env overloads
  - Refactor Reduce, Sum, Min, Max env overloads to delegate to segmented_reduce_impl
  - Add new env overloads for ArgMin and ArgMax with full documentation including literalinclude snippet tags
  - Rewrite env_api tests covering all 6 APIs (Reduce, Sum, Min, Max, ArgMin, ArgMax) with determinism and stream_ref acceptance tests
  - Unify _env.cu and _env_launch.cu into a single _env.cu test file with default env, launch wrapper, custom stream, and tuning tests
* Add env overloads for fixed size segment APIs
* add env api literalinclude example just for Reduce and remove non guaranteed api test
* Remove fixed_size_segmented_reduce_impl underlying function as it added extra redundant logic for no reason
* Add unit tests for fixed-seg-size overloads and argmin argmax
* Fix GCC 7 auto deduction in generic lambda for fixed-size ArgMin/ArgMax env overloads
* Use __query_result_or_t to query tuning environment
* Static assert on numeric_limits specialization
* reviews
* Use explicit types for plus/minimum/maximum in env overloads to avoid integer promotion
* Add cuda::stream-based env to all env API tests
* Turn stream.wait() to stream.sync()
* Remove fixed-size overloads to simplify PR
* Fix breaking change from specializations
* Sum(env...) was already there, reintroduce it with the same constraints
* Address review nits for DeviceSegmentedReduce env overloads
  - Add missing non-overlap precondition to env ArgMin and ArgMax docs
  - Reorder env tests: group all env tests before custom stream tests
  - Add not_guaranteed determinism test for Reduce env API
* Add not_guaranteed query
* Docs nits
* Add run_to_run and not_guaranteed api tests
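
For readers unfamiliar with what the six APIs named in this commit compute, the following is a minimal host-side sketch of segmented reduction semantics. It is not the CUB implementation and does not use the env overloads; `segmented_reduce_reference` and `segmented_argmin_reference` are hypothetical names introduced here for illustration. The sketch assumes each segment `s` covers the half-open range `[begin_offsets[s], end_offsets[s])`. The tie-breaking rule (first minimum wins) and the sentinel written for an empty segment are choices of this sketch and should be checked against the CUB documentation.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <utility>
#include <vector>

// Hypothetical CPU reference for Reduce/Sum/Min/Max style APIs.
// Reduces each segment [begin_offsets[s], end_offsets[s]) with `op`,
// starting from `init`. An empty segment yields `init`.
template <typename T, typename Op>
std::vector<T> segmented_reduce_reference(const std::vector<T>& input,
                                          const std::vector<std::size_t>& begin_offsets,
                                          const std::vector<std::size_t>& end_offsets,
                                          Op op,
                                          T init)
{
  assert(begin_offsets.size() == end_offsets.size());
  std::vector<T> out(begin_offsets.size(), init);
  for (std::size_t s = 0; s < begin_offsets.size(); ++s)
  {
    assert(begin_offsets[s] <= end_offsets[s] && end_offsets[s] <= input.size());
    T acc = init;
    for (std::size_t i = begin_offsets[s]; i < end_offsets[s]; ++i)
    {
      acc = op(acc, input[i]);
    }
    out[s] = acc;
  }
  return out;
}

// Hypothetical CPU reference for an ArgMin style API.
// Returns (index relative to the segment start, minimum value) per segment.
// Assumption of this sketch: an empty segment yields
// (1, std::numeric_limits<T>::max()); ties keep the first minimum.
template <typename T>
std::vector<std::pair<std::size_t, T>>
segmented_argmin_reference(const std::vector<T>& input,
                           const std::vector<std::size_t>& begin_offsets,
                           const std::vector<std::size_t>& end_offsets)
{
  static_assert(std::numeric_limits<T>::is_specialized,
                "T needs a numeric_limits specialization for the sentinel");
  assert(begin_offsets.size() == end_offsets.size());
  std::vector<std::pair<std::size_t, T>> out;
  out.reserve(begin_offsets.size());
  for (std::size_t s = 0; s < begin_offsets.size(); ++s)
  {
    assert(begin_offsets[s] <= end_offsets[s] && end_offsets[s] <= input.size());
    std::pair<std::size_t, T> best{1, std::numeric_limits<T>::max()};
    bool found = false;
    for (std::size_t i = begin_offsets[s]; i < end_offsets[s]; ++i)
    {
      // Strict comparison so the earliest minimum is kept on ties.
      if (!found || input[i] < best.second)
      {
        best  = {i - begin_offsets[s], input[i]};
        found = true;
      }
    }
    out.push_back(best);
  }
  return out;
}
```

A reference like this can serve as a baseline when comparing device results in a test; passing `std::plus`, a minimum functor, or a maximum functor as `op` covers the Sum, Min, and Max cases, and ArgMax follows from ArgMin by flipping the comparison and the sentinel.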
1 parent 50a1189 commit 81d4be2

3 files changed: +1587 -142 lines changed
