Preserve whitespace in sentence segmentation #10548
How to reproduce the behaviour
Output:
Notice that the space between two sentences is missing.
Output:
Notice that the single space between the first and second sentences is preserved, but the two spaces between the second and third sentences are collapsed into one. Is there a way to consistently preserve the whitespace? My understanding from #1707 is that whitespace should be preserved, but that is not happening.

Your Environment
Replies: 1 comment 1 reply
If you also want to get the trailing space for sentence spans, use `sent.text_with_ws` instead of `sent.text`. See the `Span` attributes here: https://spacy.io/api/span#attributes

It looks like the only change in spaCy since Ines's answer in the linked issue is that it's `text_with_ws` and not `text_with_ws_`.
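To illustrate the difference, here is a minimal sketch. It assumes spaCy v3 and uses a blank English pipeline with the rule-based `sentencizer`, which is not necessarily the pipeline from the original question. The sample text is made up for illustration.

```python
import spacy

# Assumption: a blank English pipeline plus the rule-based sentencizer,
# so no trained model has to be downloaded.
nlp = spacy.blank("en")
nlp.add_pipe("sentencizer")

# Made-up sample: one space after the first sentence, two after the second.
text = "This is one. This is two.  This is three."
doc = nlp(text)

# sent.text drops the whitespace that follows the last token of each sentence.
without_ws = "".join(sent.text for sent in doc.sents)

# sent.text_with_ws keeps it, so the pieces add up to the original text.
with_ws = "".join(sent.text_with_ws for sent in doc.sents)

print(repr(without_ws))
print(repr(with_ws))
print(with_ws == doc.text)
```

Because the sentences cover every token in the `Doc`, joining `text_with_ws` over `doc.sents` should give back `doc.text` exactly, whereas joining `sent.text` loses the whitespace after each sentence's final token.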