ggml_vulkan: RADV crash on ggml_set_rows due to zero size buffer #14845

@netrunnereve

With very new versions of RADV from Mesa (commit f4ad6e6 from about a month ago and their latest master are affected, but 25.0.7 is fine), test-backend-ops aborts on a driver assertion triggered by a zero size buffer. AFAIK the assertion is relatively new and older versions don't have it.

The run fails on the test SET_ROWS(type=f32,ne=[256,5,1,3],nr23=[1,1],r=1,v=1) after the op was added to Vulkan in #14587. Basically everything where r=1 and v=1 fails, while everything else passes.

ggml_vk_op_f32((0x5abc2b82eb60, name=src (view), type=0, ne0=256, ne1=0, ne2=1, ne3=3, nb0=4, nb1=1024, nb2=1024, nb3=1024), (0x5abc2b82ecd0, name=view_of_rows, type=27, ne0=0, ne1=1, ne2=3, ne3=1, nb0=8, nb1=8, nb2=8, nb3=24), (0x5abc2b82ee40, name=out, type=0, ne0=256, ne1=5, ne2=1, ne3=3, nb0=4, nb1=1024, nb2=5120, nb3=5120), SET_ROWS, )
ggml_vk_sync_buffers()
ggml_vk_dispatch_pipeline(set_rows_f32, {(0x5abc2b5f7fb0, 23552, 0), (0x5abc2b5f7fb0, 30720, 0), (0x5abc2b5f7fb0, 4096, 15360), }, (0,1,1))
test-backend-ops: ../src/amd/vulkan/radv_descriptors.h:79: radv_write_buffer_descriptor_impl: Assertion `buffer->vk.size > 0 && range > 0' failed.
Aborted (core dumped)

Here RADV insists that the descriptor buffer range (what we call the vk_subbuffer size) must be greater than zero. I have no idea whether that's a Vulkan requirement, but either way src has ne1=0 and view_of_rows has ne0=0, so both tensors are empty and their subbuffers end up with zero size.
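
For what it's worth, the Vulkan valid usage rules for VkDescriptorBufferInfo do require range to be greater than 0 unless it is VK_WHOLE_SIZE, so the driver assertion looks consistent with the spec. A minimal sketch of the kind of check that would catch this before the descriptor write is below. It assumes the triples in the dispatch log are (buffer, offset, size); SubbufferRange and all_ranges_nonzero are hypothetical names for illustration, not part of ggml or RADV, and this is not a proposed patch.

```cpp
#include <cassert>
#include <cstdint>
#include <vector>

// Hypothetical mirror of the (offset, size) part of each triple printed by
// ggml_vk_dispatch_pipeline in the log above. Not a ggml type.
struct SubbufferRange {
    uint64_t offset;
    uint64_t size;
};

// Returns true only if every bound range is non-empty, i.e. none of them
// would trip an assertion of the form `range > 0` in the driver.
bool all_ranges_nonzero(const std::vector<SubbufferRange> & ranges) {
    // A dispatch with no bindings is outside the scope of this sketch.
    assert(!ranges.empty());
    for (const SubbufferRange & r : ranges) {
        if (r.size == 0) {
            return false;
        }
    }
    return true;
}
```

Fed the three ranges from the failing dispatch, {23552, 0}, {30720, 0} and {4096, 15360}, this returns false because the first two have size 0.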

In test-backend-ops src has ne1 and view_of_rows has ne0 set to r/2, which is 0 for r=1 since it's an integer division, so basically anything with r=1 and v=1 will fail here.
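
The shape arithmetic can be sketched as follows. This is a standalone illustration, not the test-backend-ops code: ViewShape, src_view_shape and nelements are hypothetical names, and it assumes that the r/2 only applies when v=1 (which matches the failures appearing only for v=1) and that the row dimension is r otherwise.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the four ne values of a tensor. Not a ggml type.
struct ViewShape {
    int64_t ne[4];
};

// Shape of src for a SET_ROWS test case. Assumption: with v set, dimension 1
// is r/2 (as described above); without it, dimension 1 is r.
ViewShape src_view_shape(int64_t ne0, int r, int64_t ne2, int64_t ne3, bool v) {
    assert(r >= 0);
    const int64_t rows = v ? r / 2 : r; // integer division, so 1/2 == 0
    return ViewShape{{ne0, rows, ne2, ne3}};
}

// Total number of elements; zero as soon as any dimension is zero.
int64_t nelements(const ViewShape & s) {
    return s.ne[0] * s.ne[1] * s.ne[2] * s.ne[3];
}
```

For the failing case (ne0=256, r=1, ne2=1, ne3=3, v=1) this gives ne1=0 and zero elements, matching the ne1=0 printed for src in the log, while r=2 with v=1 gives one row and a non-empty tensor.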
