Client-side prompt token count is inaccurate #75

@sjmonson

Description

The client-side tokenization in guidellm fails to account for the extra tokens added by the server's chat prompt template. There are two possible workarounds:

  1. Enable usage metrics in each request and let the server report how many prompt tokens there were.
  2. Use the /completions endpoint rather than /chat/completions, since the chat template is not applied on the /completions endpoint.
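For workaround 1, a minimal sketch of the idea (the response body below is a placeholder, not real server output; the `usage.prompt_tokens` field name follows the OpenAI-compatible API that guidellm targets): instead of tokenizing the prompt client-side, read the authoritative count from the server's response, which includes the chat-template tokens the client tokenizer cannot see.

```python
# Sketch of workaround 1: trust the server's usage metrics rather than
# counting tokens client-side. Field names follow the OpenAI-compatible
# /chat/completions response schema; the dict below is a stand-in for a
# real server reply, with illustrative values only.

def server_prompt_tokens(response: dict) -> int:
    """Read the server-reported prompt token count from a response body."""
    return response["usage"]["prompt_tokens"]

# Placeholder response body, shaped like an OpenAI-compatible reply.
response = {
    "choices": [{"message": {"role": "assistant", "content": "Hi!"}}],
    "usage": {"prompt_tokens": 23, "completion_tokens": 2, "total_tokens": 25},
}

# The server-side count includes tokens added by the chat prompt template,
# so it can exceed what a client-side tokenizer reports for the raw messages.
print(server_prompt_tokens(response))  # 23
```

For streaming requests, the OpenAI-compatible API returns usage only when asked (via `stream_options: {"include_usage": true}`), so workaround 1 would need that flag set on streamed requests.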

Labels

internal (filed by core contributor or associate)
