Help with finding "Parent" word of exceptions/contractions during tokenization #10725
I'm having an issue with the tokenization of contractions in English using spaCy. I know the normal behavior for a sentence like However, I want to be able to check the tokens and see that I also saw that So I know that during the tokenization process, it's identifying exceptions, I just need to access them as an attribute of the tokens. Any help would be appreciated.
Replies: 1 comment
The main `Tokenizer` used in the pipeline doesn't track this information at all (it would affect the performance), so there's no way to access/store this. `Tokenizer.explain` is a much slower implementation of the same algorithm just for debugging purposes.
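If the slower debugging path is acceptable, one possible workaround is to post-process the output of `nlp.tokenizer.explain` yourself. The sketch below is not a built-in spaCy feature: `group_special_cases` and `explain_parents` are hypothetical helper names, and the grouping assumes that pieces of one tokenizer exception are labelled `SPECIAL-1`, `SPECIAL-2`, ... in order, and that joining them reproduces the original string.

```python
from typing import Iterable, List, Tuple


def group_special_cases(
    explained: Iterable[Tuple[str, str]]
) -> List[Tuple[str, List[str]]]:
    """Group (label, text) pairs into (parent, pieces) pairs.

    Assumption: consecutive entries labelled SPECIAL-1, SPECIAL-2, ...
    come from a single tokenizer exception such as a contraction.
    """
    groups: List[Tuple[bool, List[str]]] = []
    for label, text in explained:
        special = label.startswith("SPECIAL-")
        # SPECIAL-2, SPECIAL-3, ... continue the exception opened by SPECIAL-1.
        if special and label != "SPECIAL-1" and groups and groups[-1][0]:
            groups[-1][1].append(text)
        else:
            groups.append((special, [text]))
    # The parent is rebuilt by joining the pieces of each group.
    return [("".join(pieces), pieces) for _, pieces in groups]


def explain_parents(text: str) -> List[Tuple[str, List[str]]]:
    """Run the debugging tokenizer on `text` and group exception pieces."""
    import spacy  # imported here so the helper above has no hard dependency

    nlp = spacy.blank("en")
    return group_special_cases(nlp.tokenizer.explain(text))


if __name__ == "__main__":
    for parent, pieces in explain_parents("I don't know."):
        print(parent, pieces)
```

This only recovers the parent string per group; it does not attach anything to the `Token` objects in a `Doc`, and because it relies on `Tokenizer.explain` it is better suited to inspection than to processing large volumes of text.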