Large LocalArray eltypes run into compiler heuristics #99

The following works:

```julia
function broken_kernel()
    c_frags = LocalArray{Tuple{16}, CUDA.WMMA.Fragment{16, 16, 16, 8, Float16, CUDA.WMMA.Unspecified, CUDA.WMMA.Accumulator}}(undef)

    frag = CUDA.WMMA.Fragment{16, 16, 16, 8, Float16, CUDA.WMMA.Unspecified, CUDA.WMMA.Accumulator}(ntuple(_->Float16(0), 8))
    setindex(c_frags, frag, 1)

    return
end

CUDA.code_llvm(GemmKernels.Kernel.broken_kernel, Tuple{})
```

But bumping the size parameter from `Tuple{16}` to `Tuple{64}` results in an `apply_iterate` call.
