Take cached prompt tokens into account in prefill time calculation #184

irar2 · 2025-09-03T05:44:46Z

Closes #178

In addition, return an error if tokenization fails.
Minor code refactoring.
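
For readers skimming the thread, here is a rough sketch of what taking cached tokens into account could look like. It is not the code from this PR: `prefillMillis` and its parameters are hypothetical names, and the sketch assumes prefill time is a fixed overhead plus a per-token cost charged only for prompt tokens that were not found in the cache. Any randomization of the result (for example a standard deviation) is left out.

```go
package main

import "fmt"

// prefillMillis is a hypothetical helper, not the simulator's actual function.
// Assumption: prefill time = fixed overhead + per-token cost * uncached tokens.
func prefillMillis(overheadMs, perTokenMs, promptTokens, cachedTokens int) int {
	// Clamp the cached count so it never goes negative or exceeds the prompt length.
	if cachedTokens < 0 {
		cachedTokens = 0
	}
	if cachedTokens > promptTokens {
		cachedTokens = promptTokens
	}
	uncached := promptTokens - cachedTokens
	return overheadMs + perTokenMs*uncached
}

func main() {
	// Same prompt length, with and without a cache hit on half of the tokens.
	fmt.Println(prefillMillis(200, 2, 512, 0))   // 1224
	fmt.Println(prefillMillis(200, 2, 512, 256)) // 712
}
```

Under this assumption, a fully cached prompt costs only the fixed overhead, which is the limiting case worth covering in a test table.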

mayabar · 2025-09-03T08:29:23Z

pkg/llm-d-inference-sim/simulator_test.go

+			Entry("large overhead, 1024 tokens", 2000, 3000, 800, 1024, 0),
+			Entry("very long prompt", 150, 200, 70, 20000, 0),
+			Entry("medium overhead, 512 tokens, 256 cached", 200, 1000, 150, 512, 256),
+			Entry("large overhead, 1024 tokens, 128 cached", 2000, 3000, 800, 1024, 1008),


Typo in the name of the test.

Signed-off-by: Ira <[email protected]>

mayabar · 2025-09-03T08:45:30Z

/lgtm
/approve

Take cached prompt tokens into account in prefill time calculation

24f5d85

Signed-off-by: Ira <[email protected]>

irar2 requested a review from mayabar September 3, 2025 05:44

mayabar reviewed Sep 3, 2025


Review comments

a88a7df

Signed-off-by: Ira <[email protected]>

github-actions bot added the lgtm label Sep 3, 2025

github-actions bot approved these changes Sep 3, 2025


github-actions bot merged commit 639b40e into llm-d:main Sep 3, 2025
4 checks passed

irar2 deleted the cached branch October 22, 2025 12:14
