Commit b4efd77

server : add parse_special option to /tokenize endpoint (#14783)
1 parent 2be60cb commit b4efd77

2 files changed: 4 additions, 1 deletion


tools/server/README.md

Lines changed: 2 additions & 0 deletions
@@ -575,6 +575,8 @@ These words will not be included in the completion, so make sure to add them to
 
 `add_special`: (Optional) Boolean indicating if special tokens, i.e. `BOS`, should be inserted. Default: `false`
 
+`parse_special`: (Optional) Boolean indicating if special tokens should be tokenized. When `false` special tokens are treated as plaintext. Default: `true`
+
 `with_pieces`: (Optional) Boolean indicating whether to return token pieces along with IDs. Default: `false`
 
 **Response:**
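
To illustrate what the `parse_special` description means in practice, here is a minimal sketch using a toy tokenizer. Everything in it is an assumption made for illustration: `toy_tokenize`, the `<s>` marker, and the ID scheme are hypothetical and are not llama.cpp's tokenizer or vocabulary. It only shows the distinction the README text draws: with the flag on, a special-token string becomes a single token; with it off, the same string is tokenized as ordinary text.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical toy vocabulary (NOT llama.cpp's): one special token "<s>"
// with ID 1; every other byte maps to 1000 + its byte value.
static const std::string kSpecial   = "<s>";
static const int         kSpecialId = 1;

// Tokenize `text`. When parse_special is true, the literal "<s>" is emitted
// as the single special ID; when false, it is treated as plain text and
// split into per-byte IDs like any other characters.
std::vector<int> toy_tokenize(const std::string & text, bool parse_special) {
    std::vector<int> ids;
    std::size_t i = 0;
    while (i < text.size()) {
        if (parse_special && text.compare(i, kSpecial.size(), kSpecial) == 0) {
            ids.push_back(kSpecialId);
            i += kSpecial.size();
        } else {
            ids.push_back(1000 + static_cast<unsigned char>(text[i]));
            ++i;
        }
    }
    return ids;
}
```

In this sketch, `toy_tokenize("<s>hi", true)` yields three IDs (the special ID followed by two byte IDs), while `toy_tokenize("<s>hi", false)` yields five byte IDs because `<s>` is no longer recognized as special.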

tools/server/server.cpp

Lines changed: 2 additions & 1 deletion
@@ -4516,9 +4516,10 @@ int main(int argc, char ** argv) {
         json tokens_response = json::array();
         if (body.count("content") != 0) {
             const bool add_special = json_value(body, "add_special", false);
+            const bool parse_special = json_value(body, "parse_special", true);
             const bool with_pieces = json_value(body, "with_pieces", false);
 
-            llama_tokens tokens = tokenize_mixed(ctx_server.vocab, body.at("content"), add_special, true);
+            llama_tokens tokens = tokenize_mixed(ctx_server.vocab, body.at("content"), add_special, parse_special);
 
             if (with_pieces) {
                 for (const auto& token : tokens) {
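
The hunk above replaces a hard-coded `true` with a value read from the request body that defaults to `true`, so requests that omit `parse_special` should behave as before. The sketch below shows that defaulting pattern in isolation. It is a simplified stand-in, not the server's code: `FlagMap`, `flag_or`, `TokenizeOptions`, and `read_options` are hypothetical names, and a `std::map` of booleans takes the place of the JSON body.

```cpp
#include <map>
#include <string>

// Hypothetical stand-in for a parsed request body that holds only boolean flags.
using FlagMap = std::map<std::string, bool>;

// Return the flag stored under `key`, or `fallback` when the key is absent.
bool flag_or(const FlagMap & body, const std::string & key, bool fallback) {
    auto it = body.find(key);
    return it == body.end() ? fallback : it->second;
}

// The three /tokenize options named in the diff, with the same defaults.
struct TokenizeOptions {
    bool add_special;
    bool parse_special;
    bool with_pieces;
};

TokenizeOptions read_options(const FlagMap & body) {
    return {
        flag_or(body, "add_special",   false),
        flag_or(body, "parse_special", true),   // matches the previously hard-coded value
        flag_or(body, "with_pieces",   false),
    };
}
```

With an empty `FlagMap`, `read_options` returns `parse_special == true`; only a body that explicitly sets the flag to `false` changes it.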
