Skip to content

knn search over nested dense_vector fields returns fewer than k hits #131749

@ChrisHegarty

Description

@ChrisHegarty

Several kNN search tests over nested fields are broken.

EDIT: the VectorNestedIT.java has been updated to now include coverage for this scenario.

Reproduction is most straightforward when there is more than one segment, e.g. apply this small change to ensure that the index has more than one segment, then run the test.

$ git diff rest-api-spec/src/yamlRestTest/resources/rest-api-spec/test/search.vectors/100_knn_nested_search.yml
diff --git a/rest-api-spec/src/yamlRestTest/resources/rest-api-spec/test/search.vectors/100_knn_nested_search.yml b/rest-api-spec/src/yamlRestTest/resources/rest-api-spec/test/search.vectors/100_knn_nested_search.yml
index d627be2fb15..832767a53c4 100644
--- a/rest-api-spec/src/yamlRestTest/resources/rest-api-spec/test/search.vectors/100_knn_nested_search.yml
+++ b/rest-api-spec/src/yamlRestTest/resources/rest-api-spec/test/search.vectors/100_knn_nested_search.yml
@@ -54,6 +54,12 @@ setup:
           - paragraph_id: 3
             vector: [0, 1.0, 0, 1.8, -15.0]
 
+  - do:
+      indices.refresh: {}
+  - do:
+      indices.flush:
+        index: test
+
   - do:
       index:
         index: test
$ ./gradlew ":rest-api-spec:yamlRestTest" --tests "org.elasticsearch.test.rest.ClientYamlTestSuiteIT" -Dtests.method="test {yaml=search.vectors/100_knn_nested_search/nested kNN search only}" -Druntime.java=24 --info 
REPRODUCE WITH: ./gradlew ":rest-api-spec:yamlRestTest" --tests "org.elasticsearch.test.rest.ClientYamlTestSuiteIT" -Dtests.method="test {yaml=search.vectors/100_knn_nested_search/nested kNN search only}" -Dtests.seed=BA82B9A7A9611F78 -Dtests.locale=sw -Dtests.timezone=America/Argentina/Ushuaia -Druntime.java=24

ClientYamlTestSuiteIT > test {yaml=search.vectors/100_knn_nested_search/nested kNN search only} FAILED
    java.lang.AssertionError: Failure at [search.vectors/100_knn_nested_search:94]: 
    Expected: <2>
         but: was <1>
        at __randomizedtesting.SeedInfo.seed([BA82B9A7A9611F78:32D6867D079D7280]:0)
        at org.elasticsearch.test.rest.yaml.ESClientYamlSuiteTestCase.executeSection(ESClientYamlSuiteTestCase.java:524)
        at org.elasticsearch.test.rest.yaml.ESClientYamlSuiteTestCase.test(ESClientYamlSuiteTestCase.java:484)
        at java.base/jdk.internal.reflect.DirectMethodHandleAccessor.invoke(DirectMethodHandleAccessor.java:104)
        at java.base/java.lang.reflect.Method.invoke(Method.java:565)
        at com.carrotsearch.randomizedtesting.RandomizedRunner.invoke(RandomizedRunner.java:1763)
        at com.carrotsearch.randomizedtesting.RandomizedRunner$8.evaluate(RandomizedRunner.java:946)
        at com.carrotsearch.randomizedtesting.RandomizedRunner$9.evaluate(RandomizedRunner.java:982)
        at com.carrotsearch.randomizedtesting.RandomizedRunner$10.evaluate(RandomizedRunner.java:996)
        at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
        at org.junit.rules.RunRules.evaluate(RunRules.java:20)
        at org.apache.lucene.tests.util.TestRuleSetupTeardownChained$1.evaluate(TestRuleSetupTeardownChained.java:48)
        at org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
        at org.apache.lucene.tests.util.TestRuleThreadAndTestName$1.evaluate(TestRuleThreadAndTestName.java:45)
        at org.apache.lucene.tests.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:60)
        at org.apache.lucene.tests.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:44)
        at org.junit.rules.RunRules.evaluate(RunRules.java:20)
        at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
        at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:390)
        at com.carrotsearch.randomizedtesting.ThreadLeakControl.forkTimeoutingTask(ThreadLeakControl.java:843)
        at com.carrotsearch.randomizedtesting.ThreadLeakControl$3.evaluate(ThreadLeakControl.java:490)
        at com.carrotsearch.randomizedtesting.RandomizedRunner.runSingleTest(RandomizedRunner.java:955)
        at com.carrotsearch.randomizedtesting.RandomizedRunner$5.evaluate(RandomizedRunner.java:840)
        at com.carrotsearch.randomizedtesting.RandomizedRunner$6.evaluate(RandomizedRunner.java:891)
        at com.carrotsearch.randomizedtesting.RandomizedRunner$7.evaluate(RandomizedRunner.java:902)
        at org.elasticsearch.test.cluster.local.DefaultLocalElasticsearchCluster$1.evaluate(DefaultLocalElasticsearchCluster.java:49)
        at org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
        at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
        at org.apache.lucene.tests.util.TestRuleStoreClassName$1.evaluate(TestRuleStoreClassName.java:38)
        at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
        at com.carrotsearch.randomizedtesting.rules.NoShadowingOrOverridesOnMethodsRule$1.evaluate(NoShadowingOrOverridesOnMethodsRule.java:40)
        at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
        at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
        at org.apache.lucene.tests.util.TestRuleAssertionsRequired$1.evaluate(TestRuleAssertionsRequired.java:53)
        at org.apache.lucene.tests.util.AbstractBeforeAfterRule$1.evaluate(AbstractBeforeAfterRule.java:43)
        at org.apache.lucene.tests.util.TestRuleMarkFailure$1.evaluate(TestRuleMarkFailure.java:44)
        at org.apache.lucene.tests.util.TestRuleIgnoreAfterMaxFailures$1.evaluate(TestRuleIgnoreAfterMaxFailures.java:60)
        at org.apache.lucene.tests.util.TestRuleIgnoreTestSuites$1.evaluate(TestRuleIgnoreTestSuites.java:47)
        at org.junit.rules.RunRules.evaluate(RunRules.java:20)
        at com.carrotsearch.randomizedtesting.rules.StatementAdapter.evaluate(StatementAdapter.java:36)
        at com.carrotsearch.randomizedtesting.ThreadLeakControl$StatementRunner.run(ThreadLeakControl.java:390)
        at com.carrotsearch.randomizedtesting.ThreadLeakControl.lambda$forkTimeoutingTask$0(ThreadLeakControl.java:850)
        at java.base/java.lang.Thread.run(Thread.java:1447)

        Caused by:
        java.lang.AssertionError: 
        Expected: <2>
             but: was <1>
            at org.hamcrest.MatcherAssert.assertThat(MatcherAssert.java:20)
            at org.junit.Assert.assertThat(Assert.java:964)
            at org.junit.Assert.assertThat(Assert.java:930)
            at org.elasticsearch.test.rest.yaml.section.MatchAssertion.doAssert(MatchAssertion.java:100)
            at org.elasticsearch.test.rest.yaml.section.Assertion.execute(Assertion.java:68)
            at org.elasticsearch.test.rest.yaml.ESClientYamlSuiteTestCase.executeSection(ESClientYamlSuiteTestCase.java:504)
            ... 41 more

Metadata

Metadata

Assignees

Labels

:Search Relevance/SearchCatch all for Search Relevance:Search Relevance/VectorsVector search>bugTeam:Search RelevanceMeta label for the Search Relevance team in Elasticsearchlucene_10_3_devTracking issue that arise during the development of Lucene 10.3priority:highA label for assessing bug priority to be used by ES engineers

Type

No type

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions