feat: Add multi-modality support for content blocks in PrecisePrefixCacheScorer by guygir · Pull Request #565 · llm-d/llm-d-inference-scheduler

guygir · 2026-01-14T22:07:06Z

IMPORTANT: Depends on llm-d/llm-d-kv-cache#255. This PR should not be merged until the kv-cache PR is merged first.

This PR extends the PrecisePrefixCacheScorer to preserve structured multi-modality content blocks (OpenAI API format) through the pipeline.
This is the first stage of multi-modality support and demonstrates only basic technical feasibility. It covers images only, not audio or video: GAIE already supports images, whereas audio and video support requires additional GAIE changes (these will be addressed in the next stage).

Changes:

Added convertContentToPreprocessingFormat() helper to convert GAIE's structured content blocks to OpenAI API format
Modified getScores() to preserve structured content instead of using raw text
Maintains backward compatibility with text-only content
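
The conversion described above can be sketched roughly as follows. This is a minimal illustration, not the PR's code: the `ContentBlock` and `Content` types and the `convertContent` function are hypothetical stand-ins (the real GAIE types and the actual convertContentToPreprocessingFormat() signature may differ). Only the output shape, OpenAI-style content parts with `type`, `text`, and `image_url.url` fields, follows the OpenAI chat format.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ContentBlock is a hypothetical stand-in for one parsed content block.
// The real GAIE type may have different fields.
type ContentBlock struct {
	Type     string // "text" or "image_url"
	Text     string
	ImageURL string
}

// Content is a hypothetical message content: raw text or structured blocks.
type Content struct {
	Raw    string
	Blocks []ContentBlock
}

// convertContent returns a plain string for text-only content (the
// backward-compatible path) and a list of OpenAI-style content parts
// when structured blocks are present.
func convertContent(c Content) any {
	if len(c.Blocks) == 0 {
		return c.Raw
	}
	parts := make([]map[string]any, 0, len(c.Blocks))
	for _, b := range c.Blocks {
		switch b.Type {
		case "text":
			parts = append(parts, map[string]any{"type": "text", "text": b.Text})
		case "image_url":
			parts = append(parts, map[string]any{
				"type":      "image_url",
				"image_url": map[string]any{"url": b.ImageURL},
			})
		}
		// Other block types (audio, video) are skipped in this sketch.
	}
	return parts
}

func main() {
	c := Content{Blocks: []ContentBlock{
		{Type: "text", Text: "Describe this image."},
		{Type: "image_url", ImageURL: "https://example.com/cat.png"},
	}}
	js, err := json.Marshal(convertContent(c))
	if err != nil {
		panic(err)
	}
	fmt.Println(string(js))
}
```

Returning a plain string when no blocks are present is one way to keep text-only requests on their existing path; whether the PR does exactly this is an assumption.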

What Works:

Multi-modal requests are correctly parsed and preserved from GAIE
Structured content blocks flow correctly through the scheduler
Chat template rendering works with structured content
Requests are forwarded correctly to vLLM
Backward compatible with text-only requests

Known Limitations (for current stage):

Tokenization may not match vLLM exactly: images are tokenized as text, not as vision tokens. This only affects merged-preprocessor models such as Qwen2-VL.
Block hashes may not match vLLM exactly because mm_hash is missing. Our hashes are consistent with each other but will not match vLLM's hashes for multimodal blocks. This is not an issue, just FYI.
These will be addressed in the next stage.
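
To illustrate why a missing mm_hash changes block hashes while leaving them self-consistent, here is a small sketch of chained block hashing. It is an assumption-laden toy: `hashBlock`, its SHA-256 scheme, and the `"mm_hash:example"` key are hypothetical, and the actual serialization and hash function used by vLLM and llm-d-kv-cache differ. It only shows that omitting one input yields a different, but still deterministic, hash.

```go
package main

import (
	"crypto/sha256"
	"encoding/binary"
	"fmt"
)

// hashBlock derives a block hash from the parent block's hash, the block's
// token IDs, and optional extra keys (for example a multimodal content hash).
// Hypothetical scheme for illustration only.
func hashBlock(parent [32]byte, tokens []uint32, extraKeys ...string) [32]byte {
	h := sha256.New()
	h.Write(parent[:])
	buf := make([]byte, 4)
	for _, t := range tokens {
		binary.LittleEndian.PutUint32(buf, t)
		h.Write(buf)
	}
	for _, k := range extraKeys {
		h.Write([]byte(k))
	}
	var out [32]byte
	copy(out[:], h.Sum(nil))
	return out
}

func main() {
	var root [32]byte // zero parent hash for the first block
	tokens := []uint32{101, 102, 103, 104}

	withoutKey := hashBlock(root, tokens)
	withKey := hashBlock(root, tokens, "mm_hash:example")

	// Same inputs always produce the same hash (self-consistency).
	fmt.Println("self-consistent:", withoutKey == hashBlock(root, tokens))
	// Adding the extra key produces a different hash for the same tokens.
	fmt.Println("matches hash with extra key:", withoutKey == withKey)
}
```

The first comparison holds and the second does not, which mirrors the limitation: a scorer that omits the extra key agrees with itself across requests but not with an engine that includes it.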

Testing:

Multi-modality conversion logic test (test/multi_modality/test_multimodality_conversion.go)

github-project-automation bot added this to llm-d-inference-scheduler Jan 14, 2026

github-actions bot requested review from elevran and shmuelk January 14, 2026 22:07

guygir added 4 commits January 15, 2026 00:13

feat: Add basic multi-modality support for image inputs in PrecisePre…

7445897

…fixCacheScorer, and a basic test script for e2e multi-modality inputs Signed-off-by: Guy Girmonsky <guygir@gmail.com>

fix: Remove redundant newline in fmt.Println call

75469db

Remove redundant \n from fmt.Println string literal (fmt.Println already adds newline) Signed-off-by: Guy Girmonsky <guygir@gmail.com>

refactor: Rename test file to reflect actual test scope

da032d2

Rename test_e2e_multimodality.go to test_multimodality_conversion.go. This test verifies conversion logic (GAIE parsing + format conversion), not the full end-to-end pipeline. Signed-off-by: Guy Girmonsky <guygir@gmail.com>

fix: Update test output messages to reflect conversion logic scope

78096a5

Update print statements to say 'conversion logic test' instead of 'end-to-end test' Signed-off-by: Guy Girmonsky <guygir@gmail.com>

guygir force-pushed the multi-modality branch from bb7a5e4 to 78096a5 January 14, 2026 22:17

github-actions bot requested review from kfswain and nilig January 14, 2026 22:17

elevran added this to the v0.6 milestone Jan 22, 2026

guygir wants to merge 4 commits into llm-d:main from guygir:multi-modality
