[Feature] CacheBlend #458

wuhuxiao · 2025-12-03T01:52:54Z

Purpose

Context Independent Cache is all you need.
Use chunk cache hits to reduce time to first token (TTFT).
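To make the idea of a context-independent chunk cache concrete, here is a minimal sketch, not the code in this PR. It assumes each chunk is keyed by a hash of its own token IDs rather than by the prefix that precedes it, so the same chunk hits the cache wherever it appears in the prompt. `ChunkCache` and `fake_kv` are hypothetical names used for illustration only. The sketch leaves out everything a real implementation needs to make reused KV valid at a new position.

```python
import hashlib


class ChunkCache:
    """Toy in-memory store keyed by chunk content, not by prefix position."""

    def __init__(self):
        self._store = {}
        self.hits = 0
        self.misses = 0

    @staticmethod
    def key(token_ids):
        # Hash only the chunk's own tokens, so the key ignores surrounding context.
        data = ",".join(map(str, token_ids)).encode("utf-8")
        return hashlib.sha256(data).hexdigest()

    def get_or_compute(self, token_ids, compute_kv):
        k = self.key(token_ids)
        if k in self._store:
            self.hits += 1
        else:
            self.misses += 1
            self._store[k] = compute_kv(token_ids)
        return self._store[k]


def fake_kv(token_ids):
    # Hypothetical stand-in for the prefill pass that would produce real KV tensors.
    return [t * 2 for t in token_ids]


cache = ChunkCache()
doc_a, doc_b = [1, 2, 3], [4, 5, 6]

for chunk in (doc_a, doc_b):  # first request: A then B, both computed
    cache.get_or_compute(chunk, fake_kv)
for chunk in (doc_b, doc_a):  # second request: B then A, both reused
    cache.get_or_compute(chunk, fake_kv)

print(cache.hits, cache.misses)  # 2 2
```

A prefix-keyed cache would miss on the second request because the chunk order changed. Keying by chunk content is what lets the reordered request skip prefill for both chunks, which is where the TTFT saving would come from.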
What this PR does / why we need it?

Modifications

Does this PR introduce any user-facing change?

Test

ENABLE_SPARSE=True
DATA_DIR=/home/data/kv_cache
MODEL_PATH=/home/models/mistralai/Mistral-7B-Instruct-v0.2
BLEND_DATASET_PATH=/home/datasets/LongBench/data/2wikimqa.jsonl
python offline_inference_blend.py
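One practical note on the commands above: plain shell assignments on separate lines are not passed to a child process unless they are exported. The sketch below is one way to launch the script with the variables set explicitly. The helper `run_with_blend_env` is a hypothetical name, the values are copied from this Test section, and how `offline_inference_blend.py` actually reads them is not shown in this PR page.

```python
import os
import subprocess
import sys

# Values copied from the Test section; the paths are specific to the author's machine.
BLEND_ENV = {
    "ENABLE_SPARSE": "True",
    "DATA_DIR": "/home/data/kv_cache",
    "MODEL_PATH": "/home/models/mistralai/Mistral-7B-Instruct-v0.2",
    "BLEND_DATASET_PATH": "/home/datasets/LongBench/data/2wikimqa.jsonl",
}


def run_with_blend_env(argv, extra_env=None, capture=False):
    """Run argv as a child process with extra_env merged into the current environment."""
    merged = {**os.environ, **(BLEND_ENV if extra_env is None else extra_env)}
    return subprocess.run(
        argv, env=merged, capture_output=capture, text=True, check=True
    )


# Usage (assumes offline_inference_blend.py is in the working directory):
# run_with_blend_env([sys.executable, "offline_inference_blend.py"])
```

Passing the variables through `env` keeps the test configuration in one place and avoids depending on the state of the calling shell.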

How was this patch tested?

mag1c-h

Please submit your pull request to the develop branch; the v1 branch is no longer accepting pull requests.

blend ready

fed6b79

wuhuxiao requested review from HaoLi980405, Zbm1996, harrisonyhq, hek14, mag1c-h, qyh111, saki-daisuki, summer-ai007, wangwenxin0312, xwLearnsLLM, ygwpz, yxkyong and zbb200819 as code owners December 3, 2025 01:52

wuhuxiao force-pushed the dev-ucm-v1_whx branch from b6992ec to fed6b79 December 3, 2025 01:54

wuhuxiao added 4 commits December 3, 2025 09:59

fix version & env

99e60ff

add kernel golden test for blockwise_rope.py

1d3c774

format test

d1a65d5

format

1bc60cf

mag1c-h requested changes Dec 4, 2025

ygwpz deleted the branch ModelEngine-Group:dev-ucm-v1 December 4, 2025 03:08

ygwpz closed this Dec 4, 2025
