Commit f0bd60a

[https://nvbugs/5684820][fix] fix the detokenizer issue for DeepSeek-v3.2 (#10106)
Signed-off-by: Fanrong Li <[email protected]>
1 parent 066b653 commit f0bd60a

File tree: 1 file changed (+7 −0 lines)

tensorrt_llm/tokenizer/tokenizer.py

Lines changed: 7 additions & 0 deletions
@@ -213,6 +213,13 @@ def trtllm_decode_incrementally(
 
         new_tokens = self.convert_ids_to_tokens(
             token_ids, skip_special_tokens=skip_special_tokens)
+        # filter out None tokens
+        if None in new_tokens:
+            logger.warning(
+                "An unexpected \"None\" token was generated. This may be caused by a generated token ID being out of the "
+                "tokenizer's vocabulary. Filtering out \"None\" tokens from the newly generated tokens."
+            )
+            new_tokens = [token for token in new_tokens if token is not None]
         pending_tokens.extend(new_tokens)
 
         curr_new_text = self.convert_tokens_to_string(
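
For context, here is a minimal sketch of the failure mode this commit guards against. It assumes a Hugging Face fast tokenizer (GPT-2 is used purely for illustration; the actual model is DeepSeek-V3.2): convert_ids_to_tokens() maps an out-of-vocabulary token ID to None, and a None entry later breaks convert_tokens_to_string(), which is why the new code filters None tokens before extending pending_tokens.

# Illustrative sketch only; not part of the commit. Assumes a fast tokenizer
# from the transformers library.
from transformers import AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")

# One valid token ID plus one ID far outside the vocabulary.
token_ids = [50, tokenizer.vocab_size + 123]

new_tokens = tokenizer.convert_ids_to_tokens(token_ids, skip_special_tokens=False)
print(new_tokens)  # the out-of-range ID comes back as None

# Without filtering, convert_tokens_to_string() would choke on the None entry,
# so drop None tokens first, mirroring the fix in this commit.
new_tokens = [token for token in new_tokens if token is not None]
print(tokenizer.convert_tokens_to_string(new_tokens))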
