Commit 80810bf

Apply suggestions from code review
1 parent ecb2d1d commit 80810bf

1 file changed

+11
-1
lines changed
content/develop/ai/search-and-query/advanced-concepts/scoring.md

Lines changed: 11 additions & 1 deletion
@@ -94,12 +94,22 @@ FT.SEARCH myIndex "foo" SCORER BM25STD
9494

9595
## BM25STD.NORM
9696

97-
A variation of `BM25STD`, where the scores are normalized by the minimum and maximum score.
97+
A variation of `BM25STD`, where the scores are normalized by the minimum and maximum scores.
98+
99+
`BM25STD.NORM` uses min–max normalization across the collection, making it more accurate in distinguishing documents when term frequency distributions vary significantly. Because it depends on global statistics, results adapt better to collection-specific characteristics, but this comes at a performance cost: min and max values must be computed and updated whenever the collection changes. This method is recommended when ranking precision is critical and the dataset is relatively stable.
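As a rough illustration of the min-max idea described above, the following Python sketch rescales a list of raw scores into the range 0 to 1. It assumes the textbook formula `(score - min) / (max - min)`; the function name `min_max_normalize` and the handling of the all-equal case are hypothetical and are not taken from the scorer's implementation.

```python
from typing import List


def min_max_normalize(scores: List[float]) -> List[float]:
    """Rescale raw scores to [0, 1] using textbook min-max normalization.

    Illustrative only: the real scorer may compute its minimum and
    maximum differently.
    """
    if not scores:
        return []
    lo, hi = min(scores), max(scores)
    if hi == lo:
        # Assumption: when every score is identical, map them all to 1.0
        # rather than dividing by zero.
        return [1.0 for _ in scores]
    return [(s - lo) / (hi - lo) for s in scores]


if __name__ == "__main__":
    # Hypothetical raw BM25-style scores for three documents.
    print(min_max_normalize([2.0, 4.0, 6.0]))
```

Because the result depends on the smallest and largest scores, adding or removing a document can change every normalized value, which is the cost the paragraph above refers to.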
98100

99101
## BM25STD.TANH
100102

101103
A variation of `BM25STD.NORM`, where the scores are normalized by the hyperbolic tangent function `tanh(x)`. `BM25STD.TANH` can take an optional argument, `BM25STD_TANH_FACTOR Y`, which is used to smooth the function and the score values. The default value for `Y` is 4.
102104

105+
`BM25STD.TANH` applies a smooth transformation using the `tanh(x/factor)` function, which avoids collection-dependent statistics and yields faster, more efficient scoring. While this makes it more scalable and consistent across different datasets, the trade-off is reduced accuracy in cases where min–max normalization provides sharper separation. This method is recommended when performance and throughput are prioritized over fine-grained ranking sensitivity.
106+
107+
The following example shows how to use `BM25STD_TANH_FACTOR Y` in a query.
108+
109+
```
110+
FT.SEARCH idx "term" SCORER BM25STD.TANH BM25STD_TANH_FACTOR 12 WITHSCORES
111+
```
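To show how the factor affects the result, here is a minimal Python sketch of the `tanh(x/factor)` transformation described above. It assumes the raw score is simply divided by the factor before `tanh` is applied; the function name `tanh_normalize` is hypothetical, and the real scorer may differ in detail.

```python
import math


def tanh_normalize(raw_score: float, factor: float = 4.0) -> float:
    """Squash a raw score with tanh(raw_score / factor).

    Illustrative only. The default of 4.0 mirrors the documented
    default for BM25STD_TANH_FACTOR.
    """
    if factor <= 0:
        raise ValueError("factor must be positive")
    return math.tanh(raw_score / factor)


if __name__ == "__main__":
    # Hypothetical raw scores; a larger factor flattens the curve,
    # so the same raw score maps to a smaller normalized value.
    for raw in (1.0, 4.0, 12.0):
        print(raw, tanh_normalize(raw, 4.0), tanh_normalize(raw, 12.0))
```

Under this assumption each score is transformed on its own, without reference to any other document's score.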
112+
103113
## DISMAX
104114

105115
A simple scorer that sums up the frequencies of matched terms. In the case of union clauses, it will give the maximum value of those matches. No other penalties or factors are applied.
