So it looks like you're applying extractive summarization on top of spaCy, using its sentence segmentation and POS tags. What's happening is that you're running up against the limits of extractive summarization: if your basic unit is the sentence, there's not much you can do with your example document. In fact, it would be reasonable to interpret the whole document as a single sentence, in which case the only thing an extractive summarizer can return is the document itself.

If you want to improve your results with minimal changes, you might look at using a custom Sentencizer to change how sentence boundaries are detected, possibly treating all colons and semicolons as sentence dividers. If you're focused on summarizing obituaries like this, though, I would hone…
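As a rough sketch of the Sentencizer idea above: spaCy's built-in rule-based `sentencizer` component accepts a `punct_chars` setting, so one way to try this is to pass a list that includes `:` and `;`. The helper name `split_on_clause_punct` and the sample text are made up for illustration (the sample is not the document from the question), and a blank English pipeline is assumed so that no trained model is needed.

```python
import spacy


def split_on_clause_punct(text):
    """Split text into sentence-like units, also breaking on ':' and ';'.

    Hypothetical helper for illustration; not part of spaCy itself.
    """
    # A blank pipeline has only a tokenizer, so no model download is required.
    nlp = spacy.blank("en")
    # Overriding punct_chars replaces the default list, so the usual
    # sentence-final marks are listed again alongside ':' and ';'.
    nlp.add_pipe(
        "sentencizer",
        config={"punct_chars": [".", "!", "?", ":", ";"]},
    )
    doc = nlp(text)
    return [sent.text.strip() for sent in doc.sents]


# Invented sample text, only to show the effect of the extra split characters.
sample = "Survived by: his wife Jane; his son Tom; and his daughter Ann."
units = split_on_clause_punct(sample)
for unit in units:
    print(unit)
```

With this kind of splitting, each clause becomes its own unit that a sentence-based extractive summarizer can score and select. If the pipeline already has a parser that sets sentence boundaries, the sentencizer would probably need to be added before it or used in place of it.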

Answer selected by the1gofer
Labels: usage (General spaCy usage), feat / sentencizer (Feature: Sentencizer, rule-based sentence segmenter)