[None][fix] Set token IDs on request after router tokenization to avoid re-tokenization
KvCacheAwareRouter now sets prompt_token_ids (ChatCompletionRequest) or
replaces prompt with token IDs (CompletionRequest) after tokenizing,
so the downstream worker server skips redundant tokenization.
Also adds proper ChatCompletionRequest handling via apply_chat_template.
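
A minimal sketch of the idea, not the actual router code: the request classes, `ToyTokenizer`, `tokenize_in_router`, and `worker_token_ids` below are hypothetical stand-ins, and only the field names `prompt` and `prompt_token_ids` and the method name `apply_chat_template` come from this commit message. It illustrates how storing the token IDs on the request lets a downstream worker reuse them instead of tokenizing again.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Union


@dataclass
class CompletionRequest:  # hypothetical stand-in for the real request type
    prompt: Union[str, List[int]]


@dataclass
class ChatCompletionRequest:  # hypothetical stand-in for the real request type
    messages: List[Dict[str, str]]
    prompt_token_ids: Optional[List[int]] = None


class ToyTokenizer:
    """Hypothetical tokenizer: one token ID per UTF-8 byte, counts encode calls."""

    def __init__(self) -> None:
        self.encode_calls = 0

    def encode(self, text: str) -> List[int]:
        self.encode_calls += 1
        return list(text.encode("utf-8"))

    def apply_chat_template(self, messages: List[Dict[str, str]]) -> List[int]:
        # Assumed template: "<role>content\n" per message, then tokenized.
        text = "".join(f"<{m['role']}>{m['content']}\n" for m in messages)
        return self.encode(text)


def tokenize_in_router(request, tokenizer: ToyTokenizer) -> List[int]:
    """Tokenize once in the router and store the IDs on the request."""
    if isinstance(request, ChatCompletionRequest):
        if request.prompt_token_ids is None:
            request.prompt_token_ids = tokenizer.apply_chat_template(request.messages)
        return request.prompt_token_ids
    if isinstance(request.prompt, str):
        # Replace the text prompt with its token IDs.
        request.prompt = tokenizer.encode(request.prompt)
    return request.prompt


def worker_token_ids(request, tokenizer: ToyTokenizer) -> List[int]:
    """Worker side: reuse IDs set by the router, tokenize only if they are absent."""
    if isinstance(request, ChatCompletionRequest):
        if request.prompt_token_ids is not None:
            return request.prompt_token_ids
        return tokenizer.apply_chat_template(request.messages)
    if isinstance(request.prompt, str):
        return tokenizer.encode(request.prompt)
    return request.prompt
```

In this sketch, calling `tokenize_in_router` and then `worker_token_ids` on the same request invokes the tokenizer once rather than twice.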
Signed-off-by: Lizhi Zhou <1432185+reasonsolo@users.noreply.github.com>