probability of a sentence/N-gram #8252
-
Is there a way to compute the approximate probability of a sentence? The goal is to compute this for a group of sentences and find out which ones are the most or the least likely. For example, most news articles repeat someone else's analysis instead of doing original reporting and presenting new insights. Can we use spaCy/BERT to detect novelty?
-
We don't have any special feature for this, no.
I don't think "novelty", in terms of whether news is original reporting or not, has anything to do with the probability of a sentence in general. You probably want to look at document clustering or event detection.
There are some limited circumstances in which probability of a sentence, or the common stand-in of parser perplexity, can be useful, but it's usually more for detecting things like weird grammar than new information.
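If you still want a rough sentence score to experiment with, here is a minimal sketch of a bigram language model in plain Python. It is not a spaCy feature; the names `BigramModel`, `log_prob` and `perplexity` are made up for illustration, and the whitespace tokenizer and add-one smoothing are simplifying assumptions. In line with the point above, a score like this tells you how typical the wording is relative to the training sentences, not whether the content is new.

```python
import math
from collections import Counter


def tokenize(text):
    # Assumption: naive lowercase whitespace tokenization, for illustration only.
    return text.lower().split()


class BigramModel:
    """Hypothetical add-one smoothed bigram model (not part of spaCy)."""

    def __init__(self, sentences):
        self.context_counts = Counter()  # how often a token appears as a left context
        self.bigram_counts = Counter()   # how often (previous, current) pairs appear
        vocab = set()
        for sentence in sentences:
            tokens = ["<s>"] + tokenize(sentence) + ["</s>"]
            self.context_counts.update(tokens[:-1])
            self.bigram_counts.update(zip(tokens, tokens[1:]))
            vocab.update(tokens[1:])
        # +1 reserves probability mass for tokens never seen in training.
        self.vocab_size = len(vocab) + 1

    def log_prob(self, sentence):
        # Sum of log P(current | previous) with add-one smoothing.
        tokens = ["<s>"] + tokenize(sentence) + ["</s>"]
        total = 0.0
        for prev, cur in zip(tokens, tokens[1:]):
            numerator = self.bigram_counts[(prev, cur)] + 1
            denominator = self.context_counts[prev] + self.vocab_size
            total += math.log(numerator / denominator)
        return total

    def perplexity(self, sentence):
        # Length-normalized score; lower means more typical of the training data.
        n_predicted = len(tokenize(sentence)) + 1  # words plus the end marker
        return math.exp(-self.log_prob(sentence) / n_predicted)


# Toy training corpus (made-up sentences).
corpus = [
    "the cat sat on the mat",
    "the dog sat on the rug",
    "the cat chased the dog",
]
model = BigramModel(corpus)

candidates = ["the cat sat on the rug", "rug the on sat cat the"]
# Most typical sentence first.
ranked = sorted(candidates, key=model.perplexity)
for sentence in ranked:
    print(f"{model.perplexity(sentence):8.2f}  {sentence}")
```

Perplexity is used for ranking instead of raw log probability because raw probability always penalizes longer sentences. For real text you would need a much larger training corpus and a better model, but the ranking step stays the same.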