KNN Query visit_percentage #133753

john-wagster · 2025-08-28T19:26:37Z

Updates the knn query interface to include visit_percentage.

Usage example with a basic knn query (note that this PR updates retrievers as well). This is set up so that visit_percentage overrides num_candidates for bbq_disk:

curl -XPUT --header 'Content-Type: application/json' "http://localhost:9200/test" -d '{
  "mappings": {
    "properties": {
      "image-vector": {
        "type": "dense_vector",
        "dims": 3,
        "similarity": "l2_norm",
        "index_options": {
          "type": "bbq_disk"
        }
      }
    }
  }
}'

seq 1 1 | xargs -I % -P1 curl -XPOST --header 'Content-Type: application/json' "http://localhost:9200/test/_doc?refresh" -d '
    { "image-vector": [0.0127, 0.1230, 0.3929] }
'

curl -XGET --header 'Content-Type: application/json' "http://localhost:9200/test/_search" -d '{
  "query" : {
    "knn": {
      "field": "image-vector",
      "query_vector": [0.0127, 0.1230, 0.3929],
      "k": 10,
      "visit_percentage": 10.0
    }
  }
}' | python -mjson.tool
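
The search body above can also be assembled programmatically. The following is a minimal Python sketch and not part of this PR: the helper name `knn_query` is hypothetical, and the 0 to 100 range check is an assumption based on the parameter being a percentage, not something this thread confirms.

```python
import json


def knn_query(field, query_vector, k, visit_percentage=None, num_candidates=None):
    """Build a knn search body; optional parameters are omitted when None.

    Hypothetical helper for illustration only.
    """
    knn = {"field": field, "query_vector": list(query_vector), "k": k}
    if num_candidates is not None:
        knn["num_candidates"] = num_candidates
    if visit_percentage is not None:
        # Assumption: the value is a percentage, so restrict it to [0, 100].
        if not 0.0 <= visit_percentage <= 100.0:
            raise ValueError("visit_percentage must be between 0 and 100")
        knn["visit_percentage"] = float(visit_percentage)
    return {"query": {"knn": knn}}


# Same field, vector, k, and visit_percentage as the curl example.
body = knn_query("image-vector", [0.0127, 0.1230, 0.3929], k=10, visit_percentage=10.0)
print(json.dumps(body, indent=2))
```

The printed JSON can be passed as the `-d` payload of the `_search` request shown above.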

github-actions · 2025-08-28T22:32:16Z

🔍 Preview links for changed docs

john-wagster · 2025-08-29T01:41:19Z

docs/reference/elasticsearch/mapping-reference/dense-vector.md

 `confidence_interval`
 :   (Optional, float) Only applicable to `int8_hnsw`, `int4_hnsw`, `int8_flat`, and `int4_flat` index types. The confidence interval to use when quantizing the vectors. Can be any value between and including `0.90` and `1.0` or exactly `0`. When the value is `0`, this indicates that dynamic quantiles should be calculated for optimized quantization. When between `0.90` and `1.0`, this value restricts the values used when calculating the quantization thresholds. For example, a value of `0.95` will only use the middle 95% of the values when calculating the quantization thresholds (e.g. the highest and lowest 2.5% of values will be ignored). Defaults to `1/(dims + 1)` for `int8` quantized vectors and `0` for `int4` for dynamic quantile calculation.

+`default_visit_percentage` {applies_to}`stack: ga 9.1`


Since we didn't add these in a previous PR, I just added them here, but I could be convinced to pull the docs out to a separate PR.

@benwtrent how do you feel about the docs going in, on the belief that we are going to yank out the feature flag immediately? Would you be more comfortable with me pulling all of these out to a separate PR?

benwtrent

I read through again, looking good. Just need to test null cases.

I left some comments on ones that I found, but then I figured, eh, that is too many comments. So I stopped.

Basically, anything that is a "random input" deal, should test the null case. Additionally, the dense vector type tests should verify that when the value is passed, it actually is applied to the query.

The "refresh" popped up on the screen in the middle of the commenting. Sorry if any of them are on an old commit :(
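
The two testing requests above (cover the null case for randomized inputs, and verify that a supplied value is actually applied) can be sketched roughly as follows. This is an illustrative Python sketch, not the Java test code from this PR; `random_visit_percentage`, `build_knn`, and `check_seed` are hypothetical names.

```python
import random


def random_visit_percentage(rng):
    """Return None about half the time, otherwise a float in [0, 100]."""
    return None if rng.random() < 0.5 else rng.uniform(0.0, 100.0)


def build_knn(k, visit_percentage):
    """Build a query dict, omitting visit_percentage when it is None."""
    knn = {"k": k}
    if visit_percentage is not None:
        knn["visit_percentage"] = visit_percentage
    return knn


def check_seed(seed):
    """Check one seed; return True if it exercised the null case."""
    rng = random.Random(seed)
    value = random_visit_percentage(rng)
    knn = build_knn(10, value)
    if value is None:
        # Null case: the field must not appear at all.
        assert "visit_percentage" not in knn
    else:
        # Non-null case: the value passed in is the value on the query.
        assert knn["visit_percentage"] == value
    return value is None


# Run many seeds and confirm both branches were exercised.
outcomes = {check_seed(seed) for seed in range(200)}
assert outcomes == {True, False}
```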

server/src/internalClusterTest/java/org/elasticsearch/search/query/RescoreKnnVectorQueryIT.java

server/src/test/java/org/elasticsearch/index/mapper/vectors/DenseVectorFieldTypeTests.java

server/src/test/java/org/elasticsearch/search/retriever/KnnRetrieverBuilderParsingTests.java

server/src/test/java/org/elasticsearch/search/retriever/RankDocsRetrieverBuilderTests.java

...er/src/test/java/org/elasticsearch/search/vectors/AbstractKnnVectorQueryBuilderTestCase.java

server/src/test/java/org/elasticsearch/search/vectors/KnnSearchBuilderTests.java

test/framework/src/main/java/org/elasticsearch/search/RandomSearchRequestGenerator.java

john-wagster · 2025-09-03T19:09:07Z

@benwtrent

> Basically, anything that is a "random input" deal, should test the null case. Additionally, the dense vector type tests should verify that when the value is passed, it actually is applied to the query.

I think I got all of them? I'm double-checking here and reviewing the tests in general, but I believe I fixed up what you saw. I appreciate the feedback too; that was super helpful.

Still curious whether you think I should punt the docs update out of this PR, given that this is still behind a feature flag, or whether you feel we'll take the flag out shortly and it won't matter.

benwtrent · 2025-09-03T19:43:56Z

Let's remove the docs changes

john-wagster · 2025-09-03T19:54:40Z

Docs changes pulled out to #134082.

benwtrent

I would just make sure that the new tests you added/mutated all pass with various seeds.

I would also maybe do some release tests verification locally to make sure you handled the feature flag.

I am particularly wondering about things like the more general "RandomSearchRequestGenerator" change: if the value is added, and the generated object is then serialized to xcontent and parsed again (the parser doesn't get the field if the flag isn't set), will the parsed object be missing the value?

I don't know if we have tests that do that, but I know that by default release tests do not run in PRs.
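
The round-trip concern above can be illustrated with a small Python sketch. It is a simplified model and not Elasticsearch code: `to_xcontent` and `parse` are hypothetical stand-ins, and the assumption that a parser built without the flag rejects the unknown field (rather than silently dropping it) is mine.

```python
import json

FIELDS = ("field", "k", "visit_percentage")


def to_xcontent(request):
    """Serialize a request, omitting optional fields that are unset."""
    return json.dumps({k: v for k, v in request.items() if v is not None})


def parse(payload, flag_enabled):
    """Parse a request; visit_percentage is only known when the flag is on."""
    known = {"field", "k"}
    if flag_enabled:
        known.add("visit_percentage")
    raw = json.loads(payload)
    unknown = set(raw) - known
    if unknown:
        # Assumption: unknown fields are rejected, not ignored.
        raise ValueError(f"unknown field(s): {sorted(unknown)}")
    return {name: raw.get(name) for name in FIELDS}


original = {"field": "image-vector", "k": 10, "visit_percentage": 10.0}

# Flag on: the round trip reproduces the generated object.
flag_on = parse(to_xcontent(original), flag_enabled=True)

# Flag off: the same generated object no longer round-trips.
try:
    parse(to_xcontent(original), flag_enabled=False)
    flag_off_ok = True
except ValueError:
    flag_off_ok = False

# A generator that only sets the value when the flag is on avoids the mismatch.
gated = dict(original, visit_percentage=None)
gated_round_trip = parse(to_xcontent(gated), flag_enabled=False)
```

In this model the fix is to gate the random generator on the same flag as the parser, so a release build never produces a field it cannot read back.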

john-wagster · 2025-09-04T19:42:36Z

I've run all of the tests multiple times and on several seeds and have not seen any problems. I tried to break RandomSearchRequestGenerator and was not able to. I tried turning the feature flag on and off, and the behavior seems to be OK. The BWC tests seem to run OK (although, to be fair, configuring some of this from a release vs. snapshot standpoint is confusing). I think this is good.

pulled new optional param visit_percentage through the query logic

9147de6

elasticsearchmachine added the v9.2.0 label Aug 28, 2025

john-wagster added 2 commits August 28, 2025 14:27

spotless

2ab8989

Merge branch 'main' into query_visit_percentage

f7b2927

john-wagster added WIP >non-issue labels Aug 28, 2025

john-wagster and others added 6 commits August 28, 2025 16:50

iter

e947aed

Merge branch 'query_visit_percentage' of github.com:john-wagster/elasticsearch into query_visit_percentage

515ec99

Merge branch 'main' into query_visit_percentage

ade061f

[CI] Auto commit changes from spotless

08d4e45

docs

dc749b3

Merge branch 'query_visit_percentage' of github.com:john-wagster/elasticsearch into query_visit_percentage

3a55ff2

github-actions bot deployed to docs-preview August 28, 2025 22:31

[CI] Auto commit changes from spotless

af20b61

github-actions bot deployed to docs-preview August 28, 2025 22:39

john-wagster added 2 commits August 28, 2025 20:39

docs

e02367b

Merge branch 'query_visit_percentage' of github.com:john-wagster/elasticsearch into query_visit_percentage

da98068

github-actions bot deployed to docs-preview August 29, 2025 01:40

john-wagster commented Aug 29, 2025

iter

e282003

github-actions bot deployed to docs-preview August 29, 2025 02:18

john-wagster added 2 commits August 28, 2025 22:09

iter

98848b9

iter

43f161f

github-actions bot deployed to docs-preview August 29, 2025 19:00

[CI] Auto commit changes from spotless

7c304df

github-actions bot deployed to docs-preview August 29, 2025 19:09

john-wagster added 2 commits August 29, 2025 14:34

iter

f305de5

Merge branch 'main' into query_visit_percentage

c133b59

github-actions bot deployed to docs-preview August 29, 2025 19:35

benwtrent added >non-issue and removed >feature labels Sep 3, 2025

john-wagster and others added 5 commits September 3, 2025 10:21

Delete docs/changelog/133753.yaml

454982e

fixing nulls

6c24dd5

Merge branch 'main' into query_visit_percentage

b1c78e1

Merge branch 'main' into query_visit_percentage

6fb1c2f

Merge branch 'query_visit_percentage' of github.com:john-wagster/elasticsearch into query_visit_percentage

78a0ecb

benwtrent reviewed Sep 3, 2025

john-wagster added 3 commits September 3, 2025 14:04

improving tests with null checks and added diskbbq format to densevectorfieldtype test

3f20296

Merge branch 'main' into query_visit_percentage

d2ac7c7

spotless

84aeeeb

Merge branch 'main' into query_visit_percentage

c70e728

john-wagster added 2 commits September 3, 2025 14:49

removed docs changes to a separate PR

52ae456

Merge branch 'query_visit_percentage' of github.com:john-wagster/elasticsearch into query_visit_percentage

768facc

john-wagster added 3 commits September 3, 2025 14:55

merge

c7f1a4f

missed a couple Floats

cd47e8c

Merge branch 'main' into query_visit_percentage

bd5268f

benwtrent approved these changes Sep 3, 2025

john-wagster added 3 commits September 3, 2025 23:16

properly gate the parser ctors with the feature flag, and fix tests

16c9dbc

merge

180f149

merge

ff2164e

john-wagster merged commit 352b0d8 into elastic:main Sep 4, 2025
33 checks passed

craigtaverner mentioned this pull request Sep 6, 2025

Fix a number of release tests after KNN PRs merged #134261

Merged

elasticsearchmachine pushed a commit that referenced this pull request Sep 6, 2025

Fix a number of release tests after KNN PRs merged (#134261)

66d1a8e

After merging two KNN PRs, the release tests started failing. This fixes those tests. Original PRs: * #133806 * #133753

pquentin mentioned this pull request Oct 15, 2025

Specification fixes for 9.2 elastic/elasticsearch-specification#5372

Closed

9 tasks

john-wagster mentioned this pull request Oct 15, 2025

Adding visit_percentage for the new DiskBBQ elastic/elasticsearch-specification#5495

Merged
