[client] prepare sdk_buffer_check_util for non-CUDA stream abstraction #49

Open
superleo wants to merge 2 commits into alibaba:main from superleo:gpu_stream_abstraction

@superleo superleo commented Mar 9, 2026

  • sdk_buffer_check_util.h: conditional cuda_util include, GpuStream_t alias, gpu_stream field
  • sdk_buffer_check_util.cc/.cu: use GpuStream_t and gpu_stream
  • sdk_buffer_check_util_test.cc, transfer_client_impl.cc: use gpu_stream
  • No behavior change for CUDA builds
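
As a rough illustration of the alias described above, the header change might look something like the following. This is only a sketch: `ExampleCheckCell` and `AttachStream` are hypothetical names, and `<cuda_runtime.h>` stands in for the project's own cuda_util include, whose exact path is not shown in this PR description.

```cpp
#include <cassert>

#ifdef USING_CUDA
#include <cuda_runtime.h>  // assumption: stands in for the project's cuda_util include
using GpuStream_t = cudaStream_t;
#else
using GpuStream_t = void*;  // placeholder so declarations still compile without CUDA
#endif

// Hypothetical stand-in for a structure carrying the new gpu_stream field.
struct ExampleCheckCell {
    GpuStream_t gpu_stream = nullptr;
};

// Stores a stream in a cell that does not have one yet.
inline void AttachStream(ExampleCheckCell& cell, GpuStream_t stream) {
    assert(cell.gpu_stream == nullptr && "cell already has a stream");
    cell.gpu_stream = stream;
}
```

In a CUDA build the field keeps its original `cudaStream_t` type, which is consistent with the "no behavior change" claim; in a non-CUDA build any object pointer converts to `void*`, so the compiler will not reject an unrelated pointer passed as a stream.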

@qoderai qoderai bot left a comment

👋 Review Summary

Nice refactor to abstract the CUDA stream type and thread it consistently through the buffer-check utilities and their callers; the change looks behavior-preserving for CUDA builds and keeps the SDK buffer-check features optional.

🛡️ Key Risks & Issues

  • GpuStream_t type safety: In sdk_buffer_check_util.h, GpuStream_t is aliased to cudaStream_t under USING_CUDA and void* otherwise. With the current build guards, all call sites that actually reach CUDA code still see GpuStream_t as cudaStream_t, so behavior is fine today. The concern is future usage: exposing GpuStream_t as void* in non-CUDA builds weakens type safety and could hide misconfigurations where CUDA-dependent code is compiled or linked incorrectly. Tightening the preprocessor guards around GpuStream_t-dependent declarations or making the abstraction an opaque, more strongly-typed handle would reduce future risk.
  • Stream lifetime and leaks: SdkBufferCheckPool cells own a gpu_stream created via cudaStreamCreateWithFlags, but the pool destructor only frees buffer resources and never destroys the streams. This is a pre-existing issue, not introduced by this PR, but the new abstraction is a natural place to clarify and enforce ownership and add proper cleanup. In long-running processes or scenarios with multiple create/destroy cycles, leaked streams could accumulate and impact stability.
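
One possible way to address the stream-leak point is a small move-only owner that destroys the stream in its destructor. The sketch below is not code from this PR: `ScopedGpuStream` is a hypothetical name, and the `cudaStreamNonBlocking` flag is an assumption, since the flags the pool actually passes to `cudaStreamCreateWithFlags` are not stated here.

```cpp
#include <cassert>
#include <utility>

#ifdef USING_CUDA
#include <cuda_runtime.h>
using GpuStream_t = cudaStream_t;
#else
using GpuStream_t = void*;
#endif

// Hypothetical move-only owner for the stream held by a pool cell.
class ScopedGpuStream {
public:
    ScopedGpuStream() = default;
    explicit ScopedGpuStream(GpuStream_t stream) : stream_(stream) {}  // takes ownership
    ~ScopedGpuStream() { Reset(); }

    ScopedGpuStream(const ScopedGpuStream&) = delete;
    ScopedGpuStream& operator=(const ScopedGpuStream&) = delete;

    ScopedGpuStream(ScopedGpuStream&& other) noexcept
        : stream_(std::exchange(other.stream_, nullptr)) {}
    ScopedGpuStream& operator=(ScopedGpuStream&& other) noexcept {
        if (this != &other) {
            Reset();
            stream_ = std::exchange(other.stream_, nullptr);
        }
        return *this;
    }

    // Precondition: the wrapper does not already own a stream.
    // Returns false on failure, and always in non-CUDA builds.
    bool Create() {
        assert(stream_ == nullptr && "Create() on a wrapper that already owns a stream");
#ifdef USING_CUDA
        if (cudaStreamCreateWithFlags(&stream_, cudaStreamNonBlocking) == cudaSuccess) {
            return true;
        }
        stream_ = nullptr;
#endif
        return false;
    }

    // Destroys the owned stream, if any, and leaves the wrapper empty.
    void Reset() {
#ifdef USING_CUDA
        if (stream_ != nullptr) {
            cudaStreamDestroy(stream_);
        }
#endif
        stream_ = nullptr;
    }

    GpuStream_t get() const { return stream_; }
    explicit operator bool() const { return stream_ != nullptr; }

private:
    GpuStream_t stream_ = nullptr;
};
```

If each pool cell held such a wrapper instead of a raw `gpu_stream`, the pool destructor would release streams without extra cleanup code, and a moved-from cell would not destroy the stream a second time.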

🧪 Verification Advice

  • For CUDA builds, it would be good to run the existing sdk_buffer_check_util tests plus any higher-level client tests that exercise TransferClientImpl with KVCM_SDK_CHECK enabled, to confirm no behavior change in hash/CRC outputs and no regressions under multi-threaded load.
  • If possible in CI, add or run a non-CUDA build configuration that compiles the client with USING_CUDA disabled, ensuring that the headers and new GpuStream_t alias don’t introduce build or link issues and that the buffer-check feature remains properly disabled.

💡 Thoughts & Suggestions

  • Consider centralizing the GpuStream_t definition and any related helper functions in a small abstraction layer (e.g., gpu_stream_util) that encapsulates creation/destruction and makes ownership semantics explicit. This would make it easier to plug in non-CUDA backends later without exposing void* to the rest of the codebase.
  • If you plan more backends, it may be worth documenting the expected lifecycle of gpu_stream within SdkBufferCheckPool and TransferClientImpl so future contributors don’t accidentally diverge from the intended model.
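
To make the abstraction-layer suggestion concrete, here is a minimal sketch of what a `gpu_stream_util` helper could look like if it used a distinct handle type rather than `void*`. Everything in it (the namespace contents, `Backend`, `GpuStream`, `Create`, `Destroy`) is hypothetical and not part of this PR, and the `cudaStreamNonBlocking` flag is an assumption.

```cpp
#include <cassert>
#include <cstdint>

#ifdef USING_CUDA
#include <cuda_runtime.h>
#endif

namespace gpu_stream_util {  // hypothetical; name borrowed from the suggestion above

enum class Backend : std::uint8_t { kNone, kCuda };

// A distinct struct type: unlike void*, unrelated pointers do not convert to it.
struct GpuStream {
    Backend backend = Backend::kNone;
#ifdef USING_CUDA
    cudaStream_t native = nullptr;
#endif
    bool valid() const { return backend != Backend::kNone; }
};

// Returns an invalid handle when no backend is compiled in or creation fails.
inline GpuStream Create() {
    GpuStream s;
#ifdef USING_CUDA
    if (cudaStreamCreateWithFlags(&s.native, cudaStreamNonBlocking) == cudaSuccess) {
        s.backend = Backend::kCuda;
    } else {
        s.native = nullptr;
    }
#endif
    return s;
}

// Destroys the underlying stream, if any; safe to call on an invalid handle.
inline void Destroy(GpuStream& s) {
#ifdef USING_CUDA
    if (s.backend == Backend::kCuda) {
        cudaStreamDestroy(s.native);
        s.native = nullptr;
    }
#endif
    s.backend = Backend::kNone;
    assert(!s.valid());  // postcondition: the handle is empty afterwards
}

}  // namespace gpu_stream_util
```

Callers would then check `valid()` instead of comparing against a raw pointer, and a new backend would add an enumerator and a branch in `Create` and `Destroy` without changing call sites.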

🤖 Generated by QoderView workflow run

@github-actions github-actions bot added the "ai reviewed" label (AI has reviewed this PR) Mar 10, 2026