Releases · NVIDIA/numba-cuda

06 Apr 21:37

github-actions

v0.30.0

7928b8b

v0.30.0 Latest


What's Changed

Test cuDF third party test test_groupby_apply_return_reindexed_series by @mroeschke in #823
Move cuda-pathfinder dependency to core by @ZzEeKkAa in #835
Fix cache invalidation logic. by @tpn in #800
Implement launch config infrastructure. by @tpn in #804
Use cuda-pathfinder to locate nvdisasm by @Jlisowskyy in #842
Fix CABI calling convention: skip env global and escape mangled names by @isVoid in #844
Bump Version to 0.30.0 by @isVoid in #848

New Contributors

@mroeschke made their first contribution in #823

Full Changelog: v0.29.0...v0.30.0

Contributors

tpn, ZzEeKkAa, and 3 other contributors

Assets 17

numba_cuda-0.30.0-cp310-cp310-manylinux_2_24_aarch64.manylinux_2_28_aarch64.whl
sha256:9900ad3b93b97f986e76397c767cd688aeb4ae04c335c4074913cf1956d03ebb
1.77 MB 2026-04-06T21:37:03Z

numba_cuda-0.30.0-cp310-cp310-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl
sha256:177242ecb72fd72f0d9840fa105a660cfa35308c0bfb2a2a15dedbbd774b9681
1.77 MB 2026-04-06T21:37:04Z

numba_cuda-0.30.0-cp310-cp310-win_amd64.whl
sha256:18e2a01ea0a852b2233aca523d311b776b1917559fe3b7db7e7b3d404783c707
1.79 MB 2026-04-06T21:37:03Z

numba_cuda-0.30.0-cp311-cp311-manylinux_2_24_aarch64.manylinux_2_28_aarch64.whl
sha256:c09fc10248ffaf70e817d369ece2d8cfeffa192daaaf31d05e2fde8cdf8bed55
1.77 MB 2026-04-06T21:37:03Z

numba_cuda-0.30.0-cp311-cp311-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl
sha256:12d97e50c7e5f8fb708630abfe81e4726a04173ad2d2e1443a95a68510984033
1.77 MB 2026-04-06T21:37:03Z

numba_cuda-0.30.0-cp311-cp311-win_amd64.whl
sha256:369c8077a0b50a5840f5694e89daf0310a2b4cd42aa5b1a060aac8bdaa561757
1.79 MB 2026-04-06T21:37:04Z

numba_cuda-0.30.0-cp312-cp312-manylinux_2_24_aarch64.manylinux_2_28_aarch64.whl
sha256:10824ede5a5ad1caf5577008c27d2fe19bcdbe58a67fd5c6d37632e1040918c5
1.81 MB 2026-04-06T21:37:04Z

numba_cuda-0.30.0-cp312-cp312-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl
sha256:c39ae7f7a5cc0a6aecdd88a13fd76d59d2e755e9ad3da38a845530eb65ab2d8b
1.81 MB 2026-04-06T21:37:04Z

numba_cuda-0.30.0-cp312-cp312-win_amd64.whl
sha256:7c3300a480165a407e6c31489595e3d9ed465d72e7ef40973f2c79fccdec69d5
1.79 MB 2026-04-06T21:37:04Z

numba_cuda-0.30.0-cp313-cp313-manylinux_2_24_aarch64.manylinux_2_28_aarch64.whl
sha256:ef0f7daf89f958a4a96c3fbf14a9b163b49b85fb88581ffb360868c3cb9cf5fb
1.82 MB 2026-04-06T21:37:04Z

Source code (zip)
2026-04-06T21:30:36Z

Source code (tar.gz)
2026-04-06T21:30:36Z
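
Each wheel in the asset list is published with a `sha256:` digest, so a downloaded file can be checked against it before installing. The following is a minimal sketch using only the Python standard library. The helper names `sha256_of` and `matches_published` are hypothetical and not part of numba-cuda or any release tooling, and the sketch assumes the wheel has already been downloaded to a local path.

```python
import hashlib
from pathlib import Path


def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as fh:
        # Read fixed-size chunks so large files are never loaded whole.
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_published(path, published):
    """Compare a local file against a digest written as 'sha256:<hex>'.

    The 'sha256:' prefix is optional; comparison is case-insensitive.
    """
    expected = published.strip().lower().removeprefix("sha256:")
    return sha256_of(path) == expected


# Hypothetical usage, assuming the wheel sits in the current directory:
# ok = matches_published(
#     "numba_cuda-0.30.0-cp312-cp312-win_amd64.whl",
#     "sha256:7c3300a480165a407e6c31489595e3d9ed465d72e7ef40973f2c79fccdec69d5",
# )
```

A mismatch most likely means a truncated or corrupted download, in which case fetching the asset again is the usual first step.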

17 Mar 21:46

github-actions

v0.29.0

a8221c7

v0.29.0

What's Changed

Extend dbg.value coverage to loadvar for scalar kernel parameters by @jiel-nv in #813
Fix FP8 uint64 cast flake on Windows by @cpcloud in #829
Use dbg.declare for scalar kernel parameters by @cpcloud in #828
Fix mixed-IR liveness for inline overload DCE by @cpcloud in #795
Use cuda-python for nvvm bindings by @brandon-b-miller in #818
fix(ci): cudaRoundMode typing failure in FP8 test by @kaeun97 in #834
Support cuda_bindings FastEnum by @mdboom in #837
Support cuda.core.GraphBuilder as a kernel-launch stream by @Andy-Jost in #836
fix: normalize numpy integer types to python int to prevent overflow errors by @kaeun97 in #774
Bump Version to 0.29.0 by @isVoid in #838

New Contributors

@Andy-Jost made their first contribution in #836

Full Changelog: v0.28.2...v0.29.0

Contributors

mdboom, cpcloud, and 5 other contributors

Assets 17

03 Mar 18:17

github-actions

v0.28.2

e2e23ed

v0.28.2

What's Changed

build(deps): bump the actions-monthly group with 2 updates by @dependabot[bot] in #815
Swallow DLPack conversion error by @leofang in #825
Fix empty DW_AT_location for reused loop variables by @jiel-nv in #811
Bump version to 0.28.2 by @brandon-b-miller in #827

Full Changelog: v0.28.1...v0.28.2

Contributors

leofang, dependabot, and 2 other contributors

Assets 17

02 Mar 23:16

github-actions

v0.28.1

09531d6

v0.28.1

What's Changed

Support Interop with External Objects Via DLPack by @isVoid in #790
skip flaky test for now by @leofang in #820
Bump version to 0.28.1 by @leofang in #821

Full Changelog: v0.28.0...v0.28.1

Contributors

leofang and isVoid

Assets 17

02 Mar 16:54

gmarkall

v0.28.0

b2c6522

v0.28.0

What's Changed

Fix CI checks job to also fail on dependency failures by @kkraus14 in #777
Remove enums and unused ctypes code that required it by @brandon-b-miller in #775
Clear rtsys during context reset by @brandon-b-miller in #783
Fix pnputil to only restart NVIDIA display adapters by @kkraus14 in #779
Bump cudf version to 26.02 in thirdparty tests by @brandon-b-miller in #785
Remove unused cudf patch by @gmarkall in #786
Fix np.dtype overload signature drift by @cpcloud in #797
Use only public .handle to access cuda-core objects by @brandon-b-miller in #794
Remove most of drvapi.py in favor of direct cuda-python usage by @brandon-b-miller in #784
Find CUDA headers via pathfinder by @brandon-b-miller in #771
Fix incorrect DWARF type encodings for i8 and discriminator types by @jiel-nv in #806
Fix int64 elements DWARF encoding in UniTuple types by @jiel-nv in #807
Unpin nvjitlink from ctk version by @ZzEeKkAa in #809
Disable legalize return type for inline always by @ZzEeKkAa in #812
Fix mixed C ABI / Numba ABI internal-call lowering (#781, #789) by @isVoid in #782
Aligned NumPy dtypes caching by @Jlisowskyy in #792
fix(errors): restore CUDA exception hierarchy to avoid slow string compilation by @cpcloud in #796
Revert exception hierarchy unification (#634) and its fix (#796) by @cpcloud in #816
Bump version to 0.28.0 by @gmarkall in #817

New Contributors

@Jlisowskyy made their first contribution in #792

Full Changelog: v0.27.0...v0.28.0

Contributors

cpcloud, gmarkall, and 6 other contributors

Assets 17

05 Feb 23:30

github-actions

v0.27.0

ff49cc3

v0.27.0

What's Changed

remove super args by @cpcloud in #763
test(refactor): clean up run_in_subprocess by @cpcloud in #762
Disable automatic review trigger for Greptile by @gmarkall in #743
Enable apt proxy caching; skip hosted Windows builds by @kkraus14 in #766
build(deps): bump actions/setup-python from 6.1.0 to 6.2.0 in the actions-monthly group across 1 directory by @dependabot[bot] in #768
Add cuda-core to oldest tests by @brandon-b-miller in #769
Generate line info for PHI exporters in terminator block by @jiel-nv in #756
Move CallConv from CUDAContext to FunctionDescriptor by @isVoid in #717
feat: Add documentation for debugging Numba CUDA programs with CUDA GDB and VSCode by @mmason-nvidia in #665
Remove unused rtapi.py by @brandon-b-miller in #773
fix: fix boolean return type mismatch in C ABI wrapper by @kaeun97 in #770
Add CUDA FP8 type + conversion bindings (E5M2/E4M3/E8M0), HW-accel detection, and comprehensive tests by @isVoid in #686
Bump version to 0.27.0 by @gmarkall in #776

Full Changelog: v0.26.0...v0.27.0

Contributors

cpcloud, gmarkall, and 7 other contributors

Assets 17

30 Jan 14:57

github-actions

v0.26.0

500b41f

v0.26.0

What's Changed

Eliminate duplicate DWARF entries for boolean kernel parameters by @jiel-nv in #749
feat: accept cuda.core.Buffer and cuda.core.utils.StridedMemoryView as kernel inputs by @cpcloud in #751
Replace legacy wheels-build.yaml with build-wheel.yml in publish workflow [no-ci] by @kkraus14 in #760
MatMul test: Move from unittest to fully pytest by @maifeeulasad in #754
ci: move benchmarks to single function calls so that the units and results are easier to interpret by @cpcloud in #759
CI cleanup, extend no-ci skip logic, and add GitHub Release uploads by @kkraus14 in #761
bump pixi version and relock by @cpcloud in #757
Bump version to 0.26.0 by @kkraus14 in #764

New Contributors

@maifeeulasad made their first contribution in #754

Full Changelog: v0.25.0...v0.26.0

Contributors

cpcloud, kkraus14, and 2 other contributors

Assets 17

28 Jan 09:29

gmarkall

v0.25.0

bfa805a

v0.25.0

What's Changed

build(deps): bump the actions-monthly group across 1 directory with 8 updates by @dependabot[bot] in #704
chore(dev): build pixi using rattler by @cpcloud in #713
[feat] Initial version of the Numba CUDA GDB pretty-printer by @mmason-nvidia in #692
revert: chore(dev): build pixi using rattler (#713) by @cpcloud in #719
Fix DISubprogram line number to point to function definition line by @jiel-nv in #695
chore(deps): regenerate pixi lockfile by @cpcloud in #722
Disable per-PR nvmath tests + follow same test practice by @leofang in #723
Adding pixi run test and pixi run test-par support by @rparolin in #724
CI: Add CUDA 13.1 testing support by @Copilot in #705
Use pathfinder for dynamic libraries by @brandon-b-miller in #308
ci: remove rapids containers from conda ci by @cpcloud in #737
Pass the -numba-debug flag to libnvvm by @mmason-nvidia in #681
Fix compatibility with NumPy 2.4: np.trapz and np.in1d removed by @kkraus14 in #739
feat: users can pass shared_memory_carveout to @cuda.jit by @kaeun97 in #642
ci: run tests in parallel by @cpcloud in #740
fix: Fix race condition in CUDA Simulator by @ccam80 in #690
fix: enable flake8-bugbear lints and fix found problems by @cpcloud in #708
chore(deps): add cuda-pathfinder to pixi deps by @cpcloud in #741
Fix: Pass correct flags to linker when debugging in the presence of LTOIR code by @mmason-nvidia in #698
Fix missing line info in Jupyter notebooks by @jiel-nv in #742
Fix kernel return type in DISubroutineType debug metadata by @jiel-nv in #745
Fix prologue debug line info pointing to decorator instead of def line by @jiel-nv in #746
Fix max block size computation in forall by @brandon-b-miller in #744
feat: swap out internal device array usage with StridedMemoryView by @cpcloud in #703
Add Python 3.14 to the wheel publishing matrix by @gmarkall in #750

New Contributors

@mmason-nvidia made their first contribution in #692
@ccam80 made their first contribution in #690

Full Changelog: v0.24.0...v0.25.0

Contributors

cpcloud, gmarkall, and 9 other contributors

Assets 2

22 Jan 12:13

gmarkall

v0.24.0

c82ff48

v0.24.0

What's Changed

Set up a new VM-based CI infrastructure by @leofang in #604
chore(dev-deps): remove ipython and pyinstrument by @cpcloud in #670
Use rapidsai/sccache in CI by @trxcllnt in #674
Remove dangling references to NUMBA_CUDA_ENABLE_MINOR_VERSION_COMPATIBILITY by @brandon-b-miller in #675
Drop experimental from cuda.core namespace imports by @brandon-b-miller in #676
Remove customized address space tracking and address class emission in debug info by @jiel-nv in #669
Support python 3.14 by @brandon-b-miller in #599
fix: use freethreading-supported _PySet_NextItemRef where possible by @cpcloud in #682
chore(deps): bump deps in pixi lockfile by @cpcloud in #693
chore: perf lint by @cpcloud in #697
perf: let CAI fall through instead of calling from_cuda_array_interface by @cpcloud in #694
perf: remove some exception control flow and buffer-exception penalization for arrays by @cpcloud in #700
Fix test_wheel_deps_wheels.sh to actually uninstall nvvm and nvrtc packages for CUDA 13 by @brandon-b-miller in #701
Dropping bits in the old CI & Propagating recent changes from cuda-python by @leofang in #683
chore(deps): bump numba-cuda version and relock pixi by @cpcloud in #707
ci: remove redundant conda build in ci by @cpcloud in #711
ci: relock pixi by @cpcloud in #712
chore: disable locked flag to bypass prefix-dev/pixi#5256 by @cpcloud in #714
Add arch specific target support by @ZzEeKkAa in #549

New Contributors

@trxcllnt made their first contribution in #674

Full Changelog: v0.23.0...v0.24.0

Contributors

trxcllnt, cpcloud, and 4 other contributors

Assets 2

18 Dec 13:07

gmarkall

v0.23.0

1b8e3c0

v0.23.0

What's Changed

refactor: remove devicearray code to reduce complexity by @cpcloud in #600
chore: bump version in pixi.toml by @cpcloud in #641
feat: add set_shared_memory_carveout by @kaeun97 in #629
Migrate numba-cuda driver to use cuda.core.launch API by @rparolin in #609
refactor: cull dead linker objects by @cpcloud in #649
Add support for dependabot by @jpascucci-nv in #647
Fix false negative NRT link decision when NRT was previously toggled on by @brandon-b-miller in #650
test: fix bogus self argument to Context by @cpcloud in #656
Only run dependabot monthly and open fewer PRs by @jpascucci-nv in #658
feat: add print support for int64 tuples by @cpcloud in #663
Do not manually set DUMP_ASSEMBLY in nvjitlink tests by @brandon-b-miller in #662
Test RAPIDS 25.12 by @brandon-b-miller in #661
build(deps): bump actions/upload-artifact from 4 to 5 by @dependabot[bot] in #652
build(deps): bump actions/setup-python from 5.6.0 to 6.1.0 by @dependabot[bot] in #655
feat: allow printing nested tuples by @cpcloud in #667
Fix Issue #588: separate compilation of NVVM IR modules when generating debuginfo by @gmarkall in #591
Fix #624: Accept Numba IR nodes in all places Numba-CUDA IR nodes are expected by @gmarkall in #643
Capture global device arrays in kernels and device functions by @shwina in #666

New Contributors

@jpascucci-nv made their first contribution in #647
@dependabot[bot] made their first contribution in #652
@shwina made their first contribution in #666

Full Changelog: v0.22.1...v0.23.0

Contributors

cpcloud, gmarkall, and 6 other contributors

Assets 2
