Conversation

@Todobe (Contributor) commented on Dec 19, 2025

No description provided.

@gemini-code-assist (Contributor) commented

Warning: You have reached your daily quota limit. Please wait up to 24 hours and I will start processing your requests again!

@Todobe changed the title from "Optimize sinks attention" to "Optimize sinks attention for prefix cache" on Dec 19, 2025
@RuixuanZhang06 merged commit 5402f33 into sgl-project:main on Jan 12, 2026 (4 checks passed).

zhuyutong332 added a commit to zhuyutong332/sgl-kernel-npu that referenced this pull request on Jan 14, 2026:
* upstream/main:
  fix little batchsize and int8 quant on ci (sgl-project#302)
  optimize sinks attention (sgl-project#260)
  add swiglu_oai_triton (sgl-project#270)
  update tag to 2026.01.12 (sgl-project#312)
  feat:add performance compare (sgl-project#311)
  support add_gemma_rms_norm (sgl-project#310)
  optimize gdn gating and fused_qkvzba_split_reshape_cat (sgl-project#306)
  fix layout numTokensPerExpertTensor partial Initialization bug (sgl-project#303)
  Supplement A2 doc, software and hardware compatibility info (sgl-project#294)
  Added an environment variable to control whether to enable the Combine Ant Migration feature. (sgl-project#304)