ggml-webgpu: move from parameter buffer pool to single buffer with offsets by reeselevine · Pull Request #21278 · ggml-org/llama.cpp

reeselevine · 2026-04-01T18:46:55Z

Overview

Continuing some work to simplify the WebGPU backend scheduling and make it more asynchronous, I realized that we don't actually need a pool of parameter buffers. Instead, we can use a single buffer with multiple offset slots and cycle through them over a batch of submissions. This PR replaces the pool with a webgpu_param_arena and moves all operations to use it. Memset is special because it lives in the global context, but because it is now asynchronous it uses a single parameter buffer.
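For readers unfamiliar with the pattern, here is a minimal sketch of the "single buffer with offset slots" idea in plain C++, without any WebGPU calls. It is not the code from this PR: `param_arena_sketch`, its members, and the slot count are hypothetical names chosen for illustration. The 256-byte alignment in the usage note is also an assumption; it matches WebGPU's default minimum buffer offset alignment, but a real backend should query the device limits.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical illustration only; not the webgpu_param_arena from the PR.
// One large buffer is divided into fixed-stride slots, and each dispatch
// takes the next slot's byte offset instead of a buffer from a pool.
struct param_arena_sketch {
    std::size_t slot_stride;    // bytes between slot starts, rounded up to the alignment
    std::size_t num_slots;      // number of slots in the single backing buffer
    std::size_t next_slot = 0;  // index of the slot handed out next

    // Round n up to the nearest multiple of alignment.
    static std::size_t align_up(std::size_t n, std::size_t alignment) {
        assert(alignment > 0);
        return (n + alignment - 1) / alignment * alignment;
    }

    param_arena_sketch(std::size_t param_bytes, std::size_t n_slots, std::size_t alignment)
        : slot_stride(align_up(param_bytes, alignment)), num_slots(n_slots) {
        assert(n_slots > 0);
    }

    // Size the single backing buffer would need.
    std::size_t total_bytes() const { return slot_stride * num_slots; }

    // Return the byte offset of the next slot and advance, wrapping around.
    // The caller must make sure earlier work using a slot has completed
    // before the cycle reaches that slot again.
    std::size_t alloc_offset() {
        std::size_t offset = next_slot * slot_stride;
        next_slot = (next_slot + 1) % num_slots;
        return offset;
    }
};
```

With `param_arena_sketch arena(40, 4, 256)`, successive calls to `alloc_offset()` yield 0, 256, 512, 768, and then wrap back to 0. In this sketch, wrapping is only safe if the batch of in-flight submissions never exceeds the slot count before the queue is waited on.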

Requirements

I have read and agree with the contributing guidelines
AI usage disclosure: yes, to help refactor to use the arena and simplify gpu profile future handling

nikhilJain17

Nice! This removes a lot of the footguns we were previously dealing with in kernel submission and waiting by getting rid of individual future handles and just waiting on the whole queue.

reeselevine added 11 commits March 25, 2026 15:21

Work towards removing bitcast

f1eb80e

Move rest of existing types over

e9af481

Add timeout back to wait and remove synchronous set_tensor/memset_tensor

b3aa3be

move to unpackf16 for wider compatibility

67fe089

cleanup

e85e8bc

Remove deadlock condition in free_bufs

32ee70a

Merge remote-tracking branch 'upstream/master' into remove_bitcast

309ef1f

Start work on removing parameter buffer pools

1fc8b64

Simplify and optimize further

9592ed5

Merge remote-tracking branch 'upstream/master' into one-buffer

82008f3

simplify profile futures

d307d47

reeselevine requested a review from a team as a code owner April 1, 2026 18:46

nikhilJain17 approved these changes Apr 1, 2026

github-actions bot added the ggml (changes relating to the ggml tensor library for machine learning) and WebGPU labels Apr 1, 2026

Fix stride

a2c1d91

reeselevine wants to merge 12 commits into ggml-org:master from reeselevine:one-buffer
