Thank you for your question. Yes, you can replace the default tokenizer. To do so, write a function that creates the custom tokenizer (which should implement the tokenizer API) and expose it as an entry point. The documentation describes how to use a custom tokenizer:

https://spacy.io/usage/linguistic-features#custom-tokenizer
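
As a rough illustration of the steps above, here is a minimal sketch assuming spaCy v3. The `WhitespaceTokenizer` class, the `create_whitespace_tokenizer` function, and the registered name `"whitespace_tokenizer"` are illustrative choices, not something prescribed by spaCy. The only hard requirement is that the tokenizer is a callable that takes a text and returns a `Doc`. This version splits on whitespace only and does not preserve runs of multiple spaces.

```python
import spacy
from spacy.tokens import Doc


class WhitespaceTokenizer:
    """Hypothetical tokenizer: splits on whitespace and nothing else."""

    def __init__(self, vocab):
        self.vocab = vocab

    def __call__(self, text):
        words = text.split()
        # Every token is followed by a space except the last one.
        spaces = [True] * len(words)
        if spaces:
            spaces[-1] = False
        return Doc(self.vocab, words=words, spaces=spaces)


# Register a function that creates the tokenizer, so it can be referenced
# by name (the name "whitespace_tokenizer" is an assumption for this sketch).
@spacy.registry.tokenizers("whitespace_tokenizer")
def create_whitespace_tokenizer():
    def create_tokenizer(nlp):
        return WhitespaceTokenizer(nlp.vocab)

    return create_tokenizer


nlp = spacy.blank("en")
# Replace the default tokenizer on an existing pipeline.
nlp.tokenizer = create_whitespace_tokenizer()(nlp)

doc = nlp("Let's keep tokens whitespace-delimited.")
tokens = [token.text for token in doc]
print(tokens)
```

Assigning to `nlp.tokenizer` is enough for quick experiments. Registering the creation function matters mainly when the tokenizer has to be referenced from a training config or shipped in a package.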

Answer selected by svlandeg