[BUG] from_dtype caching does not account for types alignment returning invalid types #791

@Jlisowskyy

Describe the bug
The recently added LRU cache on from_dtype uses the NumPy dtype as the cache key. However, NumPy's equality comparison does not distinguish between aligned and unaligned struct types when their field layouts match. For example, these two dtypes compare equal:

f16x2 = np.dtype([("x", np.float16), ("y", np.float16)])
f16x2_aligned = np.dtype([("x", np.float16), ("y", np.float16)], align=True)
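
A quick check using only NumPy makes the distinction concrete. This is a minimal sketch that assumes a NumPy version in which structured dtypes with identical field names, types, and offsets compare equal regardless of the align flag, as this report describes. The isalignedstruct attribute is what still tells the two apart.

```python
import numpy as np

f16x2 = np.dtype([("x", np.float16), ("y", np.float16)])
f16x2_aligned = np.dtype([("x", np.float16), ("y", np.float16)], align=True)

# Equality ignores the align flag when the field layouts match
same = f16x2 == f16x2_aligned

# The flag is still recorded on each dtype object
flags = (f16x2.isalignedstruct, f16x2_aligned.isalignedstruct)
print(same, flags)
```

Any cache keyed on the dtype alone therefore cannot see the difference that isalignedstruct reports.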

This causes a caching collision: if an unaligned type is cached first, all subsequent lookups for the aligned variant incorrectly return the cached unaligned type. As a result, Numba treats aligned types as unaligned, so the PTX generator emits fine-grained memory accesses instead of fast aligned accesses, which degrades performance.
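
The collision can be simulated without NumPy or a GPU. The sketch below is illustrative only: FakeDtype and convert are hypothetical stand-ins (not NumPy or numba-cuda APIs) for a dtype whose equality and hash ignore alignment and for a cached conversion function.

```python
from functools import lru_cache


class FakeDtype:
    """Hypothetical stand-in for a struct dtype; not a NumPy class."""

    def __init__(self, fields, aligned=False):
        self.fields = tuple(fields)
        self.aligned = aligned

    # Like the dtypes in this report, equality and hashing ignore `aligned`
    def __eq__(self, other):
        return isinstance(other, FakeDtype) and self.fields == other.fields

    def __hash__(self):
        return hash(self.fields)


@lru_cache(maxsize=None)
def convert(dtype):
    """Hypothetical cached converter whose result depends on alignment."""
    return ("Record", dtype.fields, dtype.aligned)


unaligned = FakeDtype([("x", "f2"), ("y", "f2")])
aligned = FakeDtype([("x", "f2"), ("y", "f2")], aligned=True)

first = convert(unaligned)  # cache miss: stores the unaligned result
second = convert(aligned)   # cache hit on the equal key: alignment is lost
print(first, second)
```

The second call never runs the function body, so the aligned input receives the result computed for the unaligned one.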

Steps/Code to reproduce bug

  1. Prepare a clean python environment
  2. Execute pip install numba-cuda
  3. The code below demonstrates the behaviour:
import numba.cuda

def div_by_2(x):
    return x / 2

def main():
    sig = numba.cuda.types.complex128(numba.cuda.types.complex128)
    func = numba.cuda.compile(div_by_2, sig, device=True, abi="c")
    print(func)

if __name__ == "__main__":
    main()

Expected behavior
The script prints:

True
f1 signature: [(Array(Record([('x', {'type': float16, 'offset': 0, 'alignment': None, 'title': None, }), ('y', {'type': float16, 'offset': 2, 'alignment': None, 'title': None, })], 4, False), 1, 'C', False, aligned=False), Array(int64, 1, 'C', False, aligned=True))]
f2 signature: [(Array(Record([('x', {'type': float16, 'offset': 0, 'alignment': None, 'title': None, }), ('y', {'type': float16, 'offset': 2, 'alignment': None, 'title': None, })], 4, False), 1, 'C', False, aligned=False), Array(int64, 1, 'C', False, aligned=True))]

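Both signatures use the unaligned record type (Record(..., 4, False)). The expected behavior is that the aligned dtype maps to an aligned record type.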

Environment details:

  • Environment location: WSL on a local machine
  • Method of numba-cuda install: pip install
  • Python env:
numba-cuda==0.27.0
numba==0.63.1
Python 3.11.14

Additional context
numba_cuda/numba/cuda/np/numpy_support.py:124

Labels: bug
