Recommend scoring hits with BM25(k1=0.9,b=0.4). by jpountz · Pull Request #46 · quickwit-oss/search-benchmark-game

jpountz · 2023-09-25T06:59:56Z

Currently, different engines use different parameters for BM25: Tantivy and Lucene use (k1=1.2,b=0.75) while PISA uses (k1=0.9,b=0.4). Robertson et al. initially suggested 1.2/0.75 as good defaults for BM25, but Trotman et al. later suggested that 0.9/0.4 would make better defaults, and this seems to be the consensus nowadays.
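
For readers less familiar with the two parameters, here is a minimal sketch of a per-term BM25 score, assuming the classic formulation with the `(k1 + 1)` factor in the numerator and a Lucene-style non-negative IDF. The function and argument names are illustrative and are not taken from any of the engines in the benchmark; individual engines may differ in details such as the IDF variant or whether the `(k1 + 1)` factor is kept.

```python
import math


def bm25_idf(doc_freq, doc_count):
    """Non-negative IDF: ln(1 + (N - n + 0.5) / (n + 0.5))."""
    return math.log(1.0 + (doc_count - doc_freq + 0.5) / (doc_freq + 0.5))


def bm25_term_score(tf, doc_len, avg_doc_len, doc_freq, doc_count, k1, b):
    """Score contribution of one query term in one document.

    k1 controls how quickly term frequency saturates;
    b controls how strongly document length is normalized.
    """
    length_norm = k1 * (1.0 - b + b * doc_len / avg_doc_len)
    return bm25_idf(doc_freq, doc_count) * tf * (k1 + 1.0) / (tf + length_norm)


if __name__ == "__main__":
    # Hypothetical posting: a term seen once in a document twice the average length.
    for k1, b in [(1.2, 0.75), (0.9, 0.4)]:
        score = bm25_term_score(
            tf=1, doc_len=200, avg_doc_len=100,
            doc_freq=1000, doc_count=1_000_000, k1=k1, b=b,
        )
        print(f"k1={k1}, b={b}: {score:.4f}")
```

In this formulation, a document of exactly average length with `tf=1` scores the bare IDF under both settings; the settings diverge for longer documents and higher term frequencies, with 0.9/0.4 penalizing long documents less.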

The ranking function matters because it affects which hits may be skipped via dynamic pruning, which in turn affects search performance.
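
To make the link between parameters and pruning concrete, here is a rough MaxScore-style sketch. It assumes the classic BM25 form, where `tf / (tf + norm)` stays below 1, so `idf * (k1 + 1)` is a valid (if loose) upper bound on a term's contribution. The helper names are hypothetical, and real engines typically rely on tighter bounds stored in the index rather than this global one.

```python
import math


def idf(doc_freq, doc_count):
    """Non-negative IDF: ln(1 + (N - n + 0.5) / (n + 0.5))."""
    return math.log(1.0 + (doc_count - doc_freq + 0.5) / (doc_freq + 0.5))


def term_upper_bound(doc_freq, doc_count, k1):
    """Loose bound on a term's BM25 contribution: idf * (k1 + 1)."""
    return idf(doc_freq, doc_count) * (k1 + 1.0)


def non_essential_terms(upper_bounds, threshold):
    """Return the lowest-impact terms whose summed bounds stay within threshold.

    A document matching only these terms cannot score above the threshold
    (the current k-th best score), so it never needs to be fully scored.
    """
    skipped, total = [], 0.0
    for term, bound in sorted(upper_bounds.items(), key=lambda kv: kv[1]):
        if total + bound > threshold:
            break
        total += bound
        skipped.append(term)
    return skipped


if __name__ == "__main__":
    doc_count = 1_000_000
    doc_freqs = {"the": 900_000, "bm25": 1_000}  # made-up statistics
    for k1 in (1.2, 0.9):
        bounds = {t: term_upper_bound(df, doc_count, k1) for t, df in doc_freqs.items()}
        print(k1, bounds)
```

Changing k1 and b shifts both the per-term bounds and the scores that set the threshold, so the set of documents an engine can skip depends on the parameters being benchmarked.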

Closes #45

jpountz · 2023-09-25T07:05:17Z

I believe that PISA does not require changes, though it would be nice to make the BM25 configuration more explicit in the query logic. What do you think, @amallia? I could use some help making that change as I'm not too familiar with the PISA API.

It looks like Tantivy supports configuring the ranking function, but I'm not proficient in Rust and could use some help there too.

fulmicoton

The PR does not make the change for Tantivy, does it?

jpountz · 2023-09-25T09:46:18Z

It does not indeed. I would like to change it but I am not familiar with Rust and unsure how to do it. I could use some help.

jpountz · 2023-10-01T12:22:19Z

As a counterpoint, @rmuir pointed me to the DFR paper which shows that BM25 with k1=1.2/b=0.75 happens to closely match the parameter-free I(n)L2 model, giving further evidence that k1=1.2/b=0.75 are good defaults for BM25.

jpountz · 2023-10-01T12:37:14Z

Separately I checked more search engines and IR toolkits:

- PISA, Anserini, JASS, ATIRE use 0.9/0.4
- Terrier, Vespa, Lucene, Tantivy use 1.2/0.75

So there doesn't really seem to be a consensus actually. The point from the DFR paper that theory meets practice with 1.2/0.75 is quite convincing. Unless I find more evidence that 0.9/0.4 is more effective, I am considering switching PISA to 1.2/0.75 instead of switching Lucene and Tantivy to 0.9/0.4.

Add note about phrases.

00f58c3

fulmicoton reviewed Sep 25, 2023

jpountz wants to merge 2 commits into quickwit-oss:master from jpountz:engine_guidelines
