Why would a spaCy blank pipeline tokenize input? #13650
-
A pipeline is defined as a tokenizer and then zero or more processes that modify the
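To make this concrete, here is a minimal sketch, assuming spaCy v3 is installed. The example sentence and the "GPE" label are made up for illustration and are not from the original question. It shows that a blank pipeline has no components but still tokenizes, that the original string stays available on the Doc, and that char_span takes character offsets while returning a span of tokens.

```python
import spacy

# A blank pipeline has no components, but it still has a tokenizer.
nlp = spacy.blank("en")
print(nlp.pipe_names)  # []

# Illustrative sentence (an assumption, not from the thread).
text = "Apple is opening an office in Berlin."
doc = nlp(text)

# The Doc is a sequence of tokens, yet the original string is preserved.
print([token.text for token in doc])
print(doc.text == text)  # True

# char_span takes *character* offsets into doc.text and returns a Span
# made of the tokens that cover exactly those characters.
start = text.index("Berlin")
end = start + len("Berlin")
span = doc.char_span(start, end, label="GPE")

# Token indices (span.start, span.end) and character offsets
# (span.start_char, span.end_char) are both available on the Span.
print(span.text, span.start, span.end, span.start_char, span.end_char)
```

So the offsets you pass in are character positions. What comes back is a token-level Span, and span.start_char and span.end_char let you map it back to the string.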
-
I'm sorry... this is a counting error. I've either made a mistake in the annotations or the annotation is reporting incorrect numbers. The boundary I've set exceeds the length of the string.
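For anyone who hits the same kind of counting error, one way to catch it early is to check the offsets against the length of the text before building the span. This is only a sketch: checked_char_span is a hypothetical helper name, not part of spaCy, and the sample sentence and label are invented for illustration.

```python
import spacy


def checked_char_span(doc, start, end, label):
    """Hypothetical helper (not a spaCy API): validate character offsets
    before calling Doc.char_span, so bad annotations fail loudly."""
    n = len(doc.text)
    if not (0 <= start < end <= n):
        raise ValueError(
            f"offsets ({start}, {end}) fall outside the text of length {n}"
        )
    span = doc.char_span(start, end, label=label)
    if span is None:
        # With the default strict alignment, char_span returns None when
        # the offsets do not line up with token boundaries.
        raise ValueError(
            f"offsets ({start}, {end}) do not align with token boundaries"
        )
    return span


nlp = spacy.blank("en")
doc = nlp("I like Berlin.")  # illustrative sentence

good = checked_char_span(doc, 7, 13, "GPE")
print(good.text)

try:
    checked_char_span(doc, 7, 50, "GPE")  # end exceeds the string length
except ValueError as err:
    print(err)
```

Raising an error instead of silently skipping the annotation is a design choice here. It makes an off-by-N mistake in the annotation data show up at the first bad record.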
-
Hello;
I'm initializing a blank pipeline via nlp = spacy.blank("en") and feeding it text to label some spans. The Doc returned by the call to nlp tokenizes the input sentence rather than holding a plain string. What would a string contain to make a blank pipeline do this? This is an issue because when I then run char_span on the doc, the indexes into the text return tokens instead of the individual characters of the string.
Braden.