Commit 0f66d99

server : use text_to_send instead of detokenized token
This commit adds a check to the server to avoid using the detokenized predicted token when the predicted token id is the same as the token id that the server is responding with. The motivation is to avoid a mismatch between the two token texts: the text_to_send token may include a leading whitespace character, but the detokenized token would not. Resolves: #11728
1 parent d2fe216 commit 0f66d99

1 file changed, +6 -2 lines changed


examples/server/server.cpp

Lines changed: 6 additions & 2 deletions
@@ -541,12 +541,16 @@ struct completion_token_output {
     json to_json(bool post_sampling_probs) const {
         json probs_for_token = json::array();
         for (const auto & p : probs) {
-            std::string txt(p.txt);
+            // If the predicted token id is the same as this.tok, then we use the text_to_send instead
+            // of the detokenized token. This is to avoid a mismatch between the text tokens where
+            // the text_to_send token may include a leading whitespace character but the detokenized
+            // token would not.
+            std::string txt = tok == p.tok ? text_to_send : p.txt;
             txt.resize(validate_utf8(txt));
             probs_for_token.push_back(json {
                 {"id", p.tok},
                 {"token", txt},
-                {"bytes", str_to_bytes(p.txt)},
+                {"bytes", str_to_bytes(txt)},
                 {
                     post_sampling_probs ? "prob" : "logprob",
                     post_sampling_probs ? p.prob : logarithm(p.prob)
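
The selection rule in this change can be illustrated in isolation. The following is a minimal sketch, not the server's actual code: `prob_info`, `token_output`, `display_text`, and `to_bytes` are hypothetical, simplified stand-ins, and the token ids and strings used with them are made up. It only shows that a candidate whose id matches the returned token reports `text_to_send`, and that the bytes are derived from the same string as the text.

```cpp
#include <string>
#include <vector>

// Hypothetical, simplified stand-ins for the server's types (assumptions, not the real definitions).
struct prob_info {
    int         tok;  // candidate token id
    std::string txt;  // candidate token detokenized on its own
};

struct token_output {
    int                    tok;           // id of the token being returned
    std::string            text_to_send;  // text actually sent to the client
    std::vector<prob_info> probs;         // top candidates for this position

    // Same rule as in the diff: the candidate that matches the returned
    // token reports text_to_send, every other candidate keeps its own text.
    std::string display_text(const prob_info & p) const {
        return tok == p.tok ? text_to_send : p.txt;
    }
};

// Hypothetical helper: bytes are taken from the chosen text, so the
// "token" and "bytes" fields cannot disagree.
std::vector<unsigned char> to_bytes(const std::string & s) {
    return std::vector<unsigned char>(s.begin(), s.end());
}
```

With a returned token of id 42 and `text_to_send` equal to `" Hello"`, a candidate `{42, "Hello"}` would report `" Hello"` (six bytes, including the leading space), while a candidate with a different id would keep its own detokenized text.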
