Custom sentence function not working? #9974
[Edited to add info regarding the tokenizer special case] On spaCy 3.1.2 I've written a custom sentence function based on the example given at https://spacy.io/usage/linguistic-features#sbd-custom. Because I start a sentence on the special-case tokens YOLO. or YODO., the next token after the period (at i + 2) should obviously NOT start a new sentence, so I set it to False, according to the docs.
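For readers following along, here is a minimal sketch of what a tokenizer special case for these markers could look like. It is not the code from the original post: the blank English pipeline and the sample text are assumptions made for illustration, and only `YOLO.` and `YODO.` come from the thread.

```python
import spacy
from spacy.symbols import ORTH

# Assumption: a blank English pipeline, so no trained model is needed.
nlp = spacy.blank("en")
text = "Some intro text YOLO. Live once here"  # made-up sample text

# With the default rules, the trailing period is split off as its own token.
tokens_before = [t.text for t in nlp(text)]

# Register each marker, period included, as a single token.
for marker in ("YOLO.", "YODO."):
    nlp.tokenizer.add_special_case(marker, [{ORTH: marker}])

tokens_after = [t.text for t in nlp(text)]
print(tokens_before)
print(tokens_after)
```

Once the special case is in place, the marker and its period are one token, so the token that follows the marker sits at i + 1 relative to the marker.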
Next, my custom sentence function, which starts sentences on the special-case tokens.
However, when I run my pipeline, this is exactly what does not happen. The YOLO./YODO. tokens are picked up correctly, but the next token is not. See output below (not my real data, of course):
Any idea why the pipeline is not respecting my False setting on the next token?
Replies: 1 comment
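Solved: the problem was indexing. In the example given in the docs, the token following the ellipsis should start the new sentence, whereas in my example the token itself (YOLO. or YODO.) should start the sentence. Hence the sentence start has to be set on the token itself (index i) and False on the token right after it (index i + 1), not on i + 1 and i + 2.
Now it works.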
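The corrected indexing can be sketched roughly as follows. This is an illustration, not the code from the original post: the component name `set_marker_boundaries`, the blank English pipeline, and the sample text are all assumptions, and the markers are registered as tokenizer special cases so that each one is a single token.

```python
import spacy
from spacy.language import Language
from spacy.symbols import ORTH

MARKERS = {"YOLO.", "YODO."}  # special-case tokens named in the thread


@Language.component("set_marker_boundaries")  # hypothetical component name
def set_marker_boundaries(doc):
    for token in doc:
        if token.text in MARKERS:
            # The marker itself opens the sentence (index i) ...
            token.is_sent_start = True
            # ... and the token right after it (index i + 1) must not.
            if token.i + 1 < len(doc):
                doc[token.i + 1].is_sent_start = False
    return doc


# Assumption: a blank English pipeline, so no trained model is needed.
nlp = spacy.blank("en")
for marker in MARKERS:
    nlp.tokenizer.add_special_case(marker, [{ORTH: marker}])
nlp.add_pipe("set_marker_boundaries")

doc = nlp("Some intro text YOLO. Live once here YODO. Die once there")
sentences = [sent.text for sent in doc.sents]
print(sentences)
# ['Some intro text', 'YOLO. Live once here', 'YODO. Die once there']
```

In a pipeline with a dependency parser, the linked docs add the custom component before the parser (`before="parser"`), so the parser can take the preset boundaries into account.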