Hello @wangkuiyi,
It seems this tokenizer only supports one special token "<|endoftext|>".
Does it support additional special tokens? For instance, the ones we added in special_tokens_map.json,
such as "<|user|>", "<|assistant|>", "<s>", "</s>", and "<unk>"?
Thanks!