Fix mixed-IR liveness for inline overload DCE #795

Open
cpcloud wants to merge 5 commits into NVIDIA:main from cpcloud:fix/issue-718-inline-overload-dce

cpcloud (Contributor) commented Feb 18, 2026

Summary

  • fix use/def and DCE liveness tracking for mixed numba.core and vendored CUDA IR nodes created by inline="always" overload inlining
  • preserve inlined overload dependencies so DCE no longer drops the live static_getitem feeding the return value in issue #718 ([BUG] KeyError with inline="always" and some interaction with DCE)
  • add a user-style regression test with a module-level overload and a kernel launch using cuda.local.array
  • refresh pixi.lock local package entries so numba-cuda variants resolve to 0.27.0

Related: the separate np.dtype signature compatibility fix was split into #797.

Fixes #718
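
To illustrate the failure mode described in the summary, here is a minimal toy model in plain Python. It does not use Numba. The class names `CoreVar` and `VendoredVar`, the `Assign` and `Return` statements, and both `uses_*` helpers are hypothetical stand-ins, assuming the bug comes from use collection that recognizes only one of two look-alike IR class families.

```python
from dataclasses import dataclass


# Hypothetical stand-ins for variables from numba.core IR and vendored CUDA IR.
@dataclass(frozen=True)
class CoreVar:
    name: str


@dataclass(frozen=True)
class VendoredVar:
    name: str


@dataclass
class Assign:
    target: str
    operands: tuple  # variables read by the right-hand side


@dataclass
class Return:
    value: object


def uses_strict(stmt):
    """Collect used names, recognizing only CoreVar (the buggy behaviour)."""
    operands = stmt.operands if isinstance(stmt, Assign) else (stmt.value,)
    return {v.name for v in operands if isinstance(v, CoreVar)}


def uses_compat(stmt):
    """Collect used names from any operand that carries a `name` attribute."""
    operands = stmt.operands if isinstance(stmt, Assign) else (stmt.value,)
    return {v.name for v in operands if hasattr(v, "name")}


def dead_code_eliminate(block, uses):
    """Backward pass that keeps an assignment only if its target is live."""
    live, kept = set(), []
    for stmt in reversed(block):
        if isinstance(stmt, Assign):
            if stmt.target not in live:
                continue  # target is never read later, so drop the statement
            live.discard(stmt.target)
        live |= uses(stmt)
        kept.append(stmt)
    return kept[::-1]


block = [
    Assign("arr", ()),
    Assign("item", (CoreVar("arr"),)),  # plays the role of the static_getitem
    Return(VendoredVar("item")),        # inlined code uses the other class family
]

strict = [s.target for s in dead_code_eliminate(block, uses_strict) if isinstance(s, Assign)]
compat = [s.target for s in dead_code_eliminate(block, uses_compat) if isinstance(s, Assign)]
print(strict)
print(compat)
```

With the strict collector the return's operand is never registered as a use, so both assignments are removed. The compatibility collector keeps them.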

rparolin added this to the numba-cuda backlog milestone Feb 18, 2026

cpcloud (Contributor, Author) commented Feb 18, 2026

/ok to test

brandon-b-miller added a commit that referenced this pull request Feb 20, 2026
## Summary
- align CUDA `np.dtype` overload parameters with NumPy (`dtype`,
`align`, `copy`) while avoiding unsupported `**kwargs` in typing
templates
- align CUDA `np.dot` and `np.vdot` overload parameter names with NumPy
(`a`, `b`, `out`) so signature-compatibility checks pass across NumPy
variants
- preserve existing lowering behavior for supported dtype and
BLAS-backed dot/vdot paths

Extracted from #795 to keep the issue-718 DCE fix focused.

---------

Co-authored-by: Cursor <cursoragent@cursor.com>
Co-authored-by: brandon-b-miller <brmiller@nvidia.com>
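
As a rough sketch of what a signature-compatibility check of this kind might compare, the snippet below uses `inspect.signature` on toy functions. The functions `reference_dot`, `overload_dot_old`, and `overload_dot_new`, and the helper `signatures_compatible`, are hypothetical. The real check in the project may compare more than parameter names.

```python
import inspect


def reference_dot(a, b, out=None):
    """Stand-in for the reference function whose signature must be matched."""


def overload_dot_old(x, y, out=None):
    """Hypothetical overload stub with mismatched parameter names."""


def overload_dot_new(a, b, out=None):
    """Hypothetical overload stub after aligning the parameter names."""


def parameter_names(func):
    # Parameter names in declaration order.
    return list(inspect.signature(func).parameters)


def signatures_compatible(reference, overload):
    # Assumption: compatibility means identical parameter names in order.
    return parameter_names(reference) == parameter_names(overload)


print(signatures_compatible(reference_dot, overload_dot_old))
print(signatures_compatible(reference_dot, overload_dot_new))
```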
cpcloud (Contributor, Author) commented Feb 20, 2026

/ok to test

cpcloud and others added 5 commits February 20, 2026 14:29

Issue NVIDIA#718 mixed IR node classes after inline overload expansion, which made use/def and liveness tracking miss live vars and allowed DCE to drop required expressions. Use compatibility var collection in analysis/DCE and add a user-level regression that exercises inline="always" with cuda.local.array.

Co-authored-by: Cursor <cursoragent@cursor.com>

Use a direct list_vars call with AttributeError fallback in compat_list_vars_node instead of pre-checking attributes, which keeps the compatibility path straightforward while preserving mixed-IR behavior.

Co-authored-by: Cursor <cursoragent@cursor.com>

Update local package records in pixi.lock so all matrix variants of numba-cuda reflect version 0.27.0 during pixi-managed test/build workflows.

Co-authored-by: Cursor <cursoragent@cursor.com>

Match the CUDA np.dtype overload parameters to NumPy's public signature so signature-compatibility checks pass on environments where dtype exposes align/copy/kwargs parameters.

Co-authored-by: Cursor <cursoragent@cursor.com>
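
The commit message about `compat_list_vars_node` describes calling `list_vars` directly and falling back on `AttributeError`. A minimal sketch of that pattern is shown below on toy classes. Only the helper's name and the try/except shape come from the commit message. The node classes and the fallback itself (walking instance attributes recursively) are assumptions made for illustration, not the project's implementation.

```python
class NodeWithListVars:
    """Hypothetical IR node that implements list_vars()."""

    def __init__(self, *names):
        self._names = list(names)

    def list_vars(self):
        return list(self._names)


class NodeWithoutListVars:
    """Hypothetical node that lacks list_vars() and only wraps a child."""

    def __init__(self, value):
        self.value = value


def compat_list_vars_node(node):
    """Return the variables referenced by a node, tolerating mixed classes."""
    try:
        # Fast path: call the method directly rather than pre-checking for it.
        return list(node.list_vars())
    except AttributeError:
        # Assumed fallback: recurse into instance attributes, if there are any.
        found = []
        for child in getattr(node, "__dict__", {}).values():
            found.extend(compat_list_vars_node(child))
        return found


wrapped = NodeWithoutListVars(NodeWithListVars("a", "b"))
print(compat_list_vars_node(wrapped))
```

One trade-off of this shape is that an `AttributeError` raised inside a node's own `list_vars` would also trigger the fallback, so it can hide unrelated errors.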
cpcloud (Contributor, Author) commented Feb 20, 2026

/ok to test

Development

Successfully merging this pull request may close these issues.

[BUG] KeyError with inline="always" and some interaction with DCE

3 participants