1 file changed: +1 -1 lines changed
Submodule tokenizers updated (11 files):
- include/pytorch/tokenizers/bpe_tokenizer_base.h +2 -1
- src/hf_tokenizer.cpp +95 -43
- src/normalizer.cpp +2 -7
- src/pre_tokenizer.cpp +1 -2
- src/tekken.cpp +4 -2
- src/tiktoken.cpp +2 -1
- test/resources/hf_tokenizer_dir/special_tokens_map.json +16
- test/resources/hf_tokenizer_dir/tokenizer.json +152
- test/resources/hf_tokenizer_dir/tokenizer_config.json +42
- test/test_hf_tokenizer.cpp +15
- test/test_hf_tokenizer.py +20