ci: add hook consistent check #162

Open
halfcrazy wants to merge 1 commit into Project-HAMi:main from halfcrazy:hook-check

halfcrazy commented Mar 24, 2026

Summary

This PR adds a mandatory static consistency gate for CUDA hook registrations and fixes existing inconsistencies in the hook chain.

What’s included

  • Add a new static checker script:
    • hack/check_cuda_hook_consistency.py
  • Enforce consistency checks across:
    • src/cuda/hook.c (cuda_library_entry[])
    • src/include/libcuda_hook.h (cuda_override_enum_t)
    • src/libvgpu.c (DLSYM_HOOK_FUNC(...))
    • wrapper function definitions in src/cuda/*.c and src/libvgpu.c
  • Wire the checker into CI as a required gate:
    • add hook-registry-check job in .github/workflows/build-src.yml
    • make BuildHamiCoreLib depend on this job (needs)
  • Add local make target:
    • make check-cuda-hook-consistency
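The cross-file check listed above can be sketched roughly as follows. This is not the contents of hack/check_cuda_hook_consistency.py, only a minimal illustration. The snippet formats (`{.name = "..."}` table entries, `CUDA_OVERRIDE_ENUM(...)` enum members), the sample function lists, and the helper names `table_names`, `enum_names`, `dlsym_names`, and `missing_from` are assumptions made for the example. It also assumes the rule is that every table entry needs a matching enum member and dlsym hook.

```python
import re

# Assumed, simplified shapes of the three registries; the real files may differ.
HOOK_C = '''
entry_t cuda_library_entry[] = {
    {.name = "cuInit"},
    {.name = "cuDeviceGet"},
    {.name = "cuMemFreeAsync"},
};
'''
HOOK_H = '''
typedef enum {
    CUDA_OVERRIDE_ENUM(cuInit),
    CUDA_OVERRIDE_ENUM(cuDeviceGet),
    CUDA_OVERRIDE_ENUM(cuMemFreeAsync),
} cuda_override_enum_t;
'''
LIBVGPU_C = '''
    DLSYM_HOOK_FUNC(cuInit);
    DLSYM_HOOK_FUNC(cuDeviceGet);
'''

def table_names(text):
    """Function names registered in the hook table, in source order."""
    return re.findall(r'\.name\s*=\s*"(\w+)"', text)

def enum_names(text):
    """Function names wrapped in the (assumed) enum macro, in source order."""
    return re.findall(r'CUDA_OVERRIDE_ENUM\(\s*(\w+)\s*\)', text)

def dlsym_names(text):
    """Function names registered through DLSYM_HOOK_FUNC, in source order."""
    return re.findall(r'DLSYM_HOOK_FUNC\(\s*(\w+)\s*\)', text)

def missing_from(reference, other):
    """Names present in reference but absent from other, in reference order."""
    seen = set(other)
    return [name for name in reference if name not in seen]

table = table_names(HOOK_C)
print(missing_from(table, enum_names(HOOK_H)))      # the enum covers the table
print(missing_from(table, dlsym_names(LIBVGPU_C)))  # the dlsym list has a gap
```

In this sample the second call reports `cuMemFreeAsync`, the same kind of gap this PR closes in src/libvgpu.c. A real checker would also have to skip the `#define` of each macro and any commented-out entries, which these regexes do not handle.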

Consistency fixes included in this PR

  • Align enum order with hook table:
    • fix cuMemcpy2D_v2 / cuMemcpy2DUnaligned_v2 order mismatch in src/include/libcuda_hook.h
  • Complete dlsym hook registration:
    • add DLSYM_HOOK_FUNC(cuMemFreeAsync) in src/libvgpu.c
  • Remove duplicated dlsym hook entries in src/libvgpu.c:
    • duplicate cuDeviceGet
    • duplicate cuMemcpyDtoDAsync_v2
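The two kinds of problem fixed here, an order mismatch and duplicated entries, can each be detected with a few lines. The sketch below is illustrative only. The helper names `duplicates` and `first_order_mismatch` are hypothetical, and the sample lists are invented around the function names mentioned in this PR; which of the two cuMemcpy2D entries came first in the original enum is not stated, so the order shown is an assumption. Order presumably matters because the enum values are used as indices into the hook table.

```python
from collections import Counter

def duplicates(names):
    """Names that occur more than once, in order of first appearance."""
    counts = Counter(names)
    return [name for name, count in counts.items() if count > 1]

def first_order_mismatch(table, enum):
    """First position where table and enum disagree, or None if none do."""
    for index, (table_name, enum_name) in enumerate(zip(table, enum)):
        if table_name != enum_name:
            return index, table_name, enum_name
    return None

# Invented sample data; only the function names come from the PR description.
table = ["cuMemcpy", "cuMemcpy2D_v2", "cuMemcpy2DUnaligned_v2"]
enum = ["cuMemcpy", "cuMemcpy2DUnaligned_v2", "cuMemcpy2D_v2"]
dlsym = ["cuDeviceGet", "cuMemcpyDtoDAsync_v2",
         "cuDeviceGet", "cuMemcpyDtoDAsync_v2"]

print(first_order_mismatch(table, enum))
print(duplicates(dlsym))
```

`first_order_mismatch` only compares positions both lists share, so a length difference needs a separate check.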

Why

Adding a CUDA API currently requires synchronized updates across several files (the hook table, the override enum, the dlsym registrations, and the wrapper definitions). This is error-prone and can silently leave gaps, especially in the dlsym/cuGetProcAddress paths.
This PR turns that into an automated, deterministic CI gate to prevent regressions.

Test Plan

  • Run local check:
    • make check-cuda-hook-consistency
  • Verify checker passes on current tree:
    • python3 hack/check_cuda_hook_consistency.py
  • Verify no lint issues introduced in touched files
  • Verify CI workflow contains and runs hook-registry-check before build

Risk / Impact

  • Low runtime risk.
  • Main impact is process hardening:
    • PRs that introduce hook registration mismatches or duplicates will now fail fast in CI.
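A CI job fails when its command exits with a nonzero status, so a checker acting as a gate typically collects every problem and turns the result into an exit code. The following is a minimal sketch of that pattern, not the actual script; `run_checks` and the two sample checks are hypothetical names.

```python
import sys

def run_checks(checks):
    """Run every check, print all problems to stderr, return an exit code."""
    errors = []
    for name, check in checks:
        # Each check returns a list of human-readable problem descriptions.
        errors.extend(f"{name}: {message}" for message in check())
    for line in errors:
        print(line, file=sys.stderr)
    return 1 if errors else 0

def no_problems():
    return []

def one_problem():
    return ["cuDeviceGet registered twice"]

# A script would normally finish with: sys.exit(run_checks(checks))
status = run_checks([("table", no_problems), ("dlsym", one_problem)])
print(status)
```

Collecting all errors before exiting lets a contributor fix every mismatch in one pass instead of rerunning the job after each one.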

hami-robot bot commented Mar 24, 2026

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: halfcrazy
Once this PR has been reviewed and has the lgtm label, please assign archlitchi for approval. For more information see the Kubernetes Code Review Process.

The full list of commands accepted by this bot can be found here.

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

hami-robot bot commented Mar 24, 2026

Welcome @halfcrazy! It looks like this is your first PR to Project-HAMi/HAMi-core 🎉

hami-robot bot added the size/L label Mar 24, 2026
Signed-off-by: Yan Zhu <hackzhuyan@gmail.com>