ggml-webgpu: move from parameter buffer pool to single buffer with offsets #21278

Open
reeselevine wants to merge 12 commits into ggml-org:master from reeselevine:one-buffer
@reeselevine
Contributor

Overview

While continuing work to simplify the WebGPU backend and make its scheduling more asynchronous, I realized that we don't actually need a pool of parameter buffers. Instead, we can use a single buffer divided into multiple offset slots and cycle through them over a batch of submissions. This PR replaces the pool with a webgpu_param_arena and moves all operations over to it. Memset is a special case because it lives in the global context, but because it is now asynchronous it uses a single parameter buffer.
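
For readers who want a concrete picture of the "single buffer with offset slots" idea, here is a minimal sketch. It is not the code in this PR: `param_arena_sketch` and its members are hypothetical names, there are no real WebGPU calls, and the 256-byte alignment in the usage note is only an assumption based on the default WebGPU minimum offset alignment for uniform and storage buffer bindings.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical sketch of a parameter arena: one backing buffer divided into
// fixed-size slots that are handed out in round-robin order.
// Not the actual webgpu_param_arena from this PR.
struct param_arena_sketch {
    size_t slot_size;      // bytes per slot, rounded up to the binding offset alignment
    size_t num_slots;      // how many submissions can be in flight before a slot is reused
    size_t next_slot = 0;  // index of the slot the next caller will receive

    param_arena_sketch(size_t param_bytes, size_t alignment, size_t slots)
        : slot_size((param_bytes + alignment - 1) / alignment * alignment),
          num_slots(slots) {
        assert(alignment > 0 && slots > 0);
    }

    // Size of the single backing buffer that would be allocated once up front.
    size_t total_size() const { return slot_size * num_slots; }

    // Returns the byte offset of the next slot and advances, wrapping around.
    // The caller is responsible for making sure the GPU has finished with a
    // slot before it comes around again (e.g. by waiting on the queue once
    // per batch of num_slots submissions).
    size_t alloc_offset() {
        size_t offset = next_slot * slot_size;
        next_slot = (next_slot + 1) % num_slots;
        return offset;
    }
};
```

With 40 bytes of parameters, an assumed alignment of 256, and 3 slots, the backing buffer is 768 bytes and successive calls yield offsets 0, 256, 512, then 0 again. Each operation would bind the same buffer at its own offset instead of acquiring a separate buffer from a pool.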

Requirements

  • I have read and agree with the contributing guidelines
  • AI usage disclosure: yes, to help refactor to use the arena and simplify gpu profile future handling

@reeselevine reeselevine requested a review from a team as a code owner April 1, 2026 18:46
Contributor

@nikhilJain17 nikhilJain17 left a comment

Nice! This removes a lot of the footguns we were previously dealing with in kernel submission and waiting by getting rid of individual future handles and just waiting on the whole queue.

@github-actions github-actions bot added the ggml (changes relating to the ggml tensor library for machine learning) and WebGPU labels Apr 1, 2026

2 participants