@irar2 irar2 commented Sep 3, 2025

Closes #178

In addition, returns an error if tokenization fails.
Includes a small code refactoring.

@irar2 irar2 requested a review from mayabar September 3, 2025 05:44
Entry("large overhead, 1024 tokens", 2000, 3000, 800, 1024, 0),
Entry("very long prompt", 150, 200, 70, 20000, 0),
Entry("medium overhead, 512 tokens, 256 cached", 200, 1000, 150, 512, 256),
Entry("large overhead, 1024 tokens, 128 cached", 2000, 3000, 800, 1024, 1008),

typo in name of the test

Signed-off-by: Ira <[email protected]>

mayabar commented Sep 3, 2025

/lgtm
/approve

@github-actions github-actions bot added the lgtm label Sep 3, 2025
@github-actions github-actions bot merged commit 639b40e into llm-d:main Sep 3, 2025
4 checks passed
@irar2 irar2 deleted the cached branch October 22, 2025 12:14

Development

Successfully merging this pull request may close these issues.

KV Cache simulation for calculation of request prefill time