Closed
Hi,
I am using your project and noticed that the current tokenizer works well only with English text. When I use it with Chinese (or other non-English languages), the results are unsatisfactory.
I would like to know:
- Is there a way to use tiktoken as the tokenizer in this project?
- If not, are there plans to support tiktoken or improve non-English language support in the tokenizer?
My use case involves a lot of multilingual text, so having a better tokenizer (like tiktoken, which handles multilingual text well) would be very helpful.
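To illustrate what I mean, here is a minimal sketch. Note the assumptions: `whitespace_tokenize` is a hypothetical stand-in for an English-oriented tokenizer, not this project's actual implementation, and `make_tiktoken_tokenizer` is a hypothetical helper name. The `cl100k_base` encoding is only one possible choice. The sketch also assumes the project could accept a tokenizer as a plain callable, which I have not verified.

```python
from typing import Callable, List


def whitespace_tokenize(text: str) -> List[str]:
    """Hypothetical stand-in for an English-oriented tokenizer: split on whitespace."""
    return text.split()


def make_tiktoken_tokenizer(encoding_name: str = "cl100k_base") -> Callable[[str], List[int]]:
    """Return a callable that maps text to tiktoken token ids.

    tiktoken is imported lazily so this module loads without it installed
    (install with: pip install tiktoken).
    """
    import tiktoken  # third-party dependency

    encoding = tiktoken.get_encoding(encoding_name)
    return encoding.encode


# Chinese text has no spaces between words, so a whitespace split
# returns the whole sentence as a single token:
#   whitespace_tokenize("我喜欢自然语言处理")  ->  ["我喜欢自然语言处理"]
#
# With tiktoken, the same sentence is broken into several subword token ids:
#   tokenize = make_tiktoken_tokenizer()
#   ids = tokenize("我喜欢自然语言处理")
```

If the tokenizer could be passed in as a callable like this, swapping in tiktoken (or any other tokenizer) would not require changes to the rest of the pipeline.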
Thank you for your work!