Commit 8829d45

Add truncation to count_tokens (#3561)
Align with OpenVINO Tokenizers which truncates inputs at model max length.
1 parent 36e60e3 commit 8829d45

1 file changed: +1 -1 lines changed

demos/benchmark/embeddings/benchmark_embeddings.py

Lines changed: 1 addition & 1 deletion
@@ -73,7 +73,7 @@ def count_tokens(docs, model):
     documents = docs.iter(batch_size=1)
     num_tokens = 0
     for request in documents:
-        num_tokens += len(tokenizer(request["text"],add_special_tokens=False)["input_ids"][0])
+        num_tokens += len(tokenizer(request["text"],add_special_tokens=False, truncation=True)["input_ids"][0])
     return num_tokens

 @dataclass
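
To illustrate why the one-line change affects the reported token count, here is a minimal sketch. It does not use the benchmark's real tokenizer: `toy_tokenizer`, `MODEL_MAX_LENGTH`, and the sample batches are hypothetical stand-ins that only mimic the call shape seen in the diff (a batch of one text in, `["input_ids"][0]` out). The assumption is that truncation caps each document at the model's maximum length, as the commit message describes.

```python
from typing import Dict, List

# Hypothetical limit for illustration; the real value comes from the model.
MODEL_MAX_LENGTH = 8


def toy_tokenizer(texts: List[str], truncation: bool = False,
                  max_length: int = MODEL_MAX_LENGTH) -> Dict[str, List[List[int]]]:
    """Whitespace stand-in for a tokenizer: one token id per word."""
    input_ids = []
    for text in texts:
        ids = list(range(len(text.split())))
        if truncation:
            # Cap the sequence at the (assumed) model max length.
            ids = ids[:max_length]
        input_ids.append(ids)
    return {"input_ids": input_ids}


def count_tokens(batches, truncation: bool) -> int:
    """Same loop shape as the benchmark's count_tokens, with a toy tokenizer."""
    num_tokens = 0
    for request in batches:
        encoded = toy_tokenizer(request["text"], truncation=truncation)
        num_tokens += len(encoded["input_ids"][0])
    return num_tokens


# Two batches of size 1: a 3-word document and a 20-word document.
batches = [{"text": ["a short document"]}, {"text": ["word " * 20]}]

untruncated = count_tokens(batches, truncation=False)
truncated = count_tokens(batches, truncation=True)
print(untruncated, truncated)
```

In this sketch the long document contributes only `MODEL_MAX_LENGTH` tokens once truncation is on, so the count reflects what a truncating server-side tokenizer would actually process rather than the raw document length.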
