probability of a sentence/N-gram #8252

eip2016num1 · 2021-05-31T17:46:10Z

Is there a way to compute the approximate probability of a sentence?

The goal is to compute this for a group of sentences and find out which sentences are the most or the least likely.

For example, most news articles repeat someone else's analysis instead of doing original reporting and presenting new insights. Can we use spaCy/BERT to detect novelty?

Answered by polm

polm · 2021-06-01T04:05:14Z

> Is there a way to compute the approximate probability of a sentence?

We don't have any special feature for this, no.

> For example, most news articles repeat someone else's analysis, instead of doing original reporting and presenting new insights. Can we use spacy/BERT to detect novelty?

I don't think "novelty", in terms of whether news is original reporting or not, has anything to do with the probability of a sentence in general. You probably want to look at document clustering or event detection.
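As a rough illustration of the similarity-based direction (this is not a spaCy feature, and it is a much simpler stand-in for real document clustering or event detection), the sketch below scores each article by its highest bag-of-words cosine similarity to any earlier article. The function names and the whitespace tokenization are assumptions made for the example; a high score suggests the article mostly repeats something already seen.

```python
import math
from collections import Counter


def bow(text):
    """Bag-of-words vector from naive lowercase whitespace tokenization."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two Counter vectors (0.0 if either is empty)."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def max_similarity_to_earlier(docs):
    """For each document, the highest similarity to any earlier document.

    The first document has nothing to compare against, so it gets 0.0.
    """
    vecs = [bow(d) for d in docs]
    return [
        max((cosine(v, u) for u in vecs[:i]), default=0.0)
        for i, v in enumerate(vecs)
    ]


# Hypothetical toy data, ordered by publication time.
docs = [
    "central bank raises interest rates",
    "central bank raises interest rates again",
    "local team wins football final",
]
scores = max_similarity_to_earlier(docs)
```

Here the second article scores high because it nearly repeats the first, while the third shares no words with the earlier ones. In practice you would likely swap the raw counts for TF-IDF weights or embeddings.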

There are some limited circumstances in which the probability of a sentence, or the common stand-in of parser perplexity, can be useful, but it's usually for detecting things like weird grammar rather than new information.
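To make "probability of a sentence" concrete, here is a minimal sketch of a bigram model with add-one smoothing in plain Python. It does not use spaCy or BERT, and the helper names, the whitespace tokenization, and the toy corpus are all assumptions for illustration. It mainly shows why such scores track word order and grammar rather than new information.

```python
import math
from collections import Counter


def train_bigram(sentences):
    """Count tokens and adjacent token pairs, with sentence boundary markers."""
    unigrams, bigrams = Counter(), Counter()
    for s in sentences:
        tokens = ["<s>"] + s.lower().split() + ["</s>"]
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams


def sentence_logprob(sentence, unigrams, bigrams):
    """Natural-log probability of a sentence under add-one smoothing."""
    # Predictable tokens: every seen type except "<s>", plus one slot for
    # unseen words, which comes out to len(unigrams).
    vocab_size = len(unigrams)
    tokens = ["<s>"] + sentence.lower().split() + ["</s>"]
    logp = 0.0
    for prev, cur in zip(tokens, tokens[1:]):
        logp += math.log((bigrams[(prev, cur)] + 1) / (unigrams[prev] + vocab_size))
    return logp


def perplexity(sentence, unigrams, bigrams):
    """Per-token perplexity; lower means the model finds the sentence more likely."""
    n_predicted = len(sentence.split()) + 1  # words plus the end marker
    return math.exp(-sentence_logprob(sentence, unigrams, bigrams) / n_predicted)


# Hypothetical toy corpus; a real model needs far more text.
corpus = ["the cat sat on the mat", "the dog sat on the rug"]
unigrams, bigrams = train_bigram(corpus)

fluent = "the cat sat on the rug"
scrambled = "rug the on sat cat the"
fluent_lp = sentence_logprob(fluent, unigrams, bigrams)
scrambled_lp = sentence_logprob(scrambled, unigrams, bigrams)
```

The fluent sentence gets a higher log probability than the scrambled one even though both contain exactly the same words, so the score reflects how familiar the word order is, not whether the content is new.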

polm Jun 2, 2021

There is an issue for supporting this, #6872, though we aren't working on it at the moment.
