
Add GPU attention tuned block-size defaults and dispatch #2863

Draft

dlwh wants to merge 7 commits into main from feat/gpu-attention-block-size-defaults

Conversation

@dlwh dlwh commented Feb 18, 2026

Summary

  • Add GPU attention tuned block-size lookup table and expose it for dispatch.
  • Wire levanter.grug.attention to use inferred tuned block sizes by default and support partial user overrides.
  • Add pallas_mosaic attention kernel module and package exports.
  • Keep existing fallback behavior while enabling device-aware defaults for GPU tuning.
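To illustrate the dispatch described above, here is a minimal sketch of a tuned block-size lookup table with device-aware defaults and partial user overrides. All names here (`BlockSizes`, `_TUNED_BLOCK_SIZES`, `infer_block_sizes`, the specific block-size values) are hypothetical stand-ins, not the actual levanter API or tuned numbers:

```python
# Hypothetical sketch: per-device tuned block sizes with partial overrides.
# The table keys, dataclass fields, and values are illustrative assumptions.
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class BlockSizes:
    block_q: int
    block_kv: int


# Tuned defaults keyed by device kind; "default" is the fallback entry
# that preserves the existing (non-device-aware) behavior.
_TUNED_BLOCK_SIZES = {
    "h100": BlockSizes(block_q=128, block_kv=128),
    "a100": BlockSizes(block_q=64, block_kv=64),
    "default": BlockSizes(block_q=64, block_kv=32),
}


def infer_block_sizes(
    device_kind: str,
    block_q: Optional[int] = None,
    block_kv: Optional[int] = None,
) -> BlockSizes:
    """Look up tuned defaults for a device, letting callers override any field."""
    key = next(
        (k for k in _TUNED_BLOCK_SIZES
         if k != "default" and k in device_kind.lower()),
        "default",
    )
    tuned = _TUNED_BLOCK_SIZES[key]
    # Partial override: a None argument falls back to the tuned default.
    return BlockSizes(
        block_q=block_q if block_q is not None else tuned.block_q,
        block_kv=block_kv if block_kv is not None else tuned.block_kv,
    )
```

Under this sketch, an attention entry point would call `infer_block_sizes(jax.devices()[0].device_kind)` when the user passes no block sizes, and merge user-supplied values field by field otherwise.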

Testing

  • Not run in this pass.

Notes

  • Draft PR; intended as a base for iterative kernel and performance-tuning follow-up work.


claude bot commented Feb 18, 2026

Claude Code is working…

I'll analyze this and get back to you.

