Commit 5471f5a

vocab_size sanity check
1 parent 1221d94 commit 5471f5a

1 file changed: +1 -0 lines changed

convert_hf_to_gguf.py

Lines changed: 1 addition & 0 deletions
@@ -6425,6 +6425,7 @@ def set_vocab(self):
 
 # 3. Generate the tokens and toktypes lists
 vocab_size = self.hparams["vocab_size"]
+assert tokenizer.vocab_size == vocab_size
 special_tokens = tokenizer.special_tokens
 reverse_vocab = {id_ : encoded_tok for encoded_tok, id_ in {**vocab, **special_tokens}.items()}
 tokens: list[str] = []
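
The added assertion can be illustrated in isolation. The following is a minimal sketch, not code from convert_hf_to_gguf.py: `check_vocab_size` and `build_reverse_vocab` are hypothetical helpers, and the tokenizer is reduced to plain values and dicts. It raises `ValueError` instead of using `assert`, on the assumption that a check which survives `python -O` (which strips assert statements) may be preferable in some settings.

```python
def check_vocab_size(tokenizer_vocab_size: int, hparams: dict) -> int:
    """Return hparams["vocab_size"] if it matches the tokenizer's size.

    Hypothetical helper: mirrors the intent of the assertion in the diff,
    but raises ValueError with a descriptive message on mismatch.
    """
    expected = hparams["vocab_size"]
    if tokenizer_vocab_size != expected:
        raise ValueError(
            f"tokenizer reports {tokenizer_vocab_size} tokens, "
            f"but hparams['vocab_size'] is {expected}"
        )
    return expected


def build_reverse_vocab(vocab: dict, special_tokens: dict) -> dict:
    """Invert token -> id into id -> token, as the context line in the diff does.

    Special tokens are merged last, so they win if a key appears in both dicts.
    """
    return {id_: tok for tok, id_ in {**vocab, **special_tokens}.items()}


if __name__ == "__main__":
    # Toy data standing in for a real tokenizer and model config.
    vocab = {b"a": 0, b"b": 1}
    special_tokens = {"<eos>": 2}
    vocab_size = check_vocab_size(len(vocab) + len(special_tokens), {"vocab_size": 3})
    reverse_vocab = build_reverse_vocab(vocab, special_tokens)
    print(vocab_size, reverse_vocab)
```

Failing early on a mismatch keeps a later lookup into `reverse_vocab` from running against a vocabulary whose size disagrees with the model config.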
