Conversation

@acardace
Contributor

Summary

Updates the precise-prefix-cache-scorer to perform tokenization in the scheduler and pass pre-computed tokens to GetPodScores, rather than delegating tokenization to the kv-cache indexer.

Related to llm-d/llm-d-kv-cache#244
Related to #530

Note: This PR depends on llm-d/llm-d-kv-cache#266 and must be merged after it.

Changes

  • Build: Update Makefile PYTHONPATH to reference llm-d-kv-cache module
  • Scorer: Tokenize the prompt in the scheduler, then pass the tokens to GetPodScores (see the sketch after this list)
  • Tests: Adapt to updated signatures and reuse tokenizer's built-in chat templater
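
For illustration, here is a minimal Go sketch of the resulting two-step flow. All type and method names other than GetPodScores are assumptions for the sake of the example; the actual API is defined in llm-d/llm-d-kv-cache#266.

```go
package scorer

import (
	"context"
	"fmt"
)

// TokenProcessor and Indexer are illustrative stand-ins for the real
// llm-d-kv-cache types; only GetPodScores is named in this PR.
type TokenProcessor interface {
	Tokenize(ctx context.Context, prompt, model string) ([]uint32, error)
}

type Indexer interface {
	GetPodScores(ctx context.Context, tokens []uint32, model string, pods []string) (map[string]float64, error)
}

type Scorer struct {
	tokens  TokenProcessor
	indexer Indexer
}

// Score implements the two-step flow: tokenize in the scheduler first,
// then pass the pre-computed tokens to GetPodScores instead of the raw
// prompt, so the indexer no longer tokenizes internally.
func (s *Scorer) Score(ctx context.Context, prompt, model string, pods []string) (map[string]float64, error) {
	toks, err := s.tokens.Tokenize(ctx, prompt, model)
	if err != nil {
		return nil, fmt.Errorf("tokenize: %w", err)
	}
	return s.indexer.GetPodScores(ctx, toks, model, pods)
}
```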

@elevran
Collaborator

elevran commented Jan 21, 2026

/hold for post 0.5

@github-actions github-actions bot added the hold label Jan 21, 2026
@elevran elevran moved this to In progress in llm-d-inference-scheduler Jan 21, 2026
@acardace
Contributor Author

@elevran what's the release cadence for llm-d-kv-cache? Of course, the corresponding PR in kv-cache must be merged and tagged before this one can be merged.

@elevran
Collaborator

elevran commented Jan 21, 2026

I believe it's six weeks, give or take; @vMaroon can give you a more exact answer. From the inference scheduler's point of view, the hold can be removed once we cut the 0.5 RC in the next few days.

@acardace acardace force-pushed the feat/getpodscores-with-token branch 2 times, most recently from 5a30a16 to 2859a30 on January 22, 2026 at 10:17
The new API separates tokenization from scoring, requiring explicit
token processor initialization and a two-step flow: tokenize first,
then get pod scores.

Signed-off-by: Antonio Cardace <[email protected]>
Adapt tests to the new llm-d-kv-cache API

Signed-off-by: Antonio Cardace <[email protected]>
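
Continuing the sketch above, the "explicit token processor initialization" the first commit describes might be wired up as shown below; this constructor is hypothetical and assumes the illustrative TokenProcessor and Indexer types from the earlier sketch.

```go
// Hypothetical wiring for the scorer: the token processor becomes an
// explicit dependency of the scorer instead of being hidden inside the
// kv-cache indexer.
func newPrecisePrefixCacheScorer(tp TokenProcessor, idx Indexer) *Scorer {
	return &Scorer{tokens: tp, indexer: idx}
}
```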
@acardace acardace force-pushed the feat/getpodscores-with-token branch from 2859a30 to f6a5830 on January 22, 2026 at 11:34
@elevran elevran added this to the v0.6 milestone Jan 22, 2026
@elevran elevran removed the hold label Jan 26, 2026
@elevran
Collaborator

elevran commented Jan 26, 2026

@acardace @vMaroon @kfswain
Does it make sense to do tokenization as part of the scorer, or should this be more of an "infra" service (perhaps as part of an explicit data-preparation phase)?

@acardace
Contributor Author

@acardace @vMaroon @kfswain Does it make sense to do tokenization as part of the scorer, or should this be more of an "infra" service (perhaps as part of an explicit data-preparation phase)?

My take is that this is just groundwork for moving tokenization into a service, possibly inside GAIE. I'm actually working on an RFC to introduce tokenization as a service inside the IGW.
