Merged
25 commits
d29e647
Support kNN filter on nested metadata
mayya-sharipova Oct 1, 2024
6e6abae
Update docs/changelog/113949.yaml
mayya-sharipova Oct 2, 2024
426db4d
Spotless
mayya-sharipova Oct 2, 2024
d9eb1e9
Address test failure
mayya-sharipova Oct 2, 2024
891b091
Fix test
mayya-sharipova Oct 2, 2024
923b6c0
Fix test failure
mayya-sharipova Oct 2, 2024
d9a13a9
Merge remote-tracking branch 'upstream/main' into knn_query_nested_fi…
mayya-sharipova Oct 2, 2024
9aa706f
Merge remote-tracking branch 'upstream/main' into knn_query_nested_fi…
mayya-sharipova Oct 25, 2024
9c56fff
Allow filters on both parent and nested metadata in the same knn query
mayya-sharipova Oct 25, 2024
afe3d8f
Merge remote-tracking branch 'upstream/main' into knn_query_nested_fi…
mayya-sharipova Jul 9, 2025
7bc25d3
Add documentation
mayya-sharipova Jul 9, 2025
e3b0251
Add pre-check for a query to be both on nested and parent metada field
mayya-sharipova Jul 9, 2025
a2f365e
Revert "Add pre-check for a query to be both on nested and parent met…
mayya-sharipova Jul 11, 2025
f423619
Add small changes
mayya-sharipova Jul 11, 2025
acbfccd
Add test for nested sibling docs
mayya-sharipova Jul 12, 2025
7cc0b09
Merge remote-tracking branch 'upstream/main' into knn_query_nested_fi…
mayya-sharipova Jul 12, 2025
1aa3605
Merge remote-tracking branch 'upstream/main' into knn_query_nested_fi…
mayya-sharipova Jul 29, 2025
5a818bc
Some adjustments
mayya-sharipova Jul 29, 2025
6c3230e
Merge remote-tracking branch 'upstream/main' into knn_query_nested_fi…
mayya-sharipova Jul 29, 2025
c0bb4ff
Update docs/changelog/113949.yaml
mayya-sharipova Jul 29, 2025
24e4943
Merge branch 'main' into knn_query_nested_filter
mayya-sharipova Jul 29, 2025
8f9cf54
Merge branch 'main' into knn_query_nested_filter
mayya-sharipova Jul 29, 2025
867b928
Merge branch 'main' into knn_query_nested_filter
mayya-sharipova Jul 29, 2025
cdf4e92
Merge branch 'main' into knn_query_nested_filter
mayya-sharipova Jul 29, 2025
d6d5f8e
Merge branch 'main' into knn_query_nested_filter
mayya-sharipova Jul 30, 2025
@@ -402,23 +402,24 @@ setup:
  - do:
      indices.refresh: {}

  - do:
      search:
        index: nested_text
        body:
          knn:
            field: paragraphs.vector
            query_vector: [1, 2]
            num_candidates: 10
            k: 10
            filter:
              bool:
                must_not:
                  exists:
                    field: publish_date

  - match: {hits.total.value: 1}
  - match: {hits.hits.0._id: "2"}
  # This query fails now, breaking change: as the filter matches both nested and non-nested docs .
  # - do:
  #     search:
  #       index: nested_text
  #       body:
  #         knn:
  #           field: paragraphs.vector
  #           query_vector: [1, 2]
  #           num_candidates: 10
  #           k: 10
  #           filter:
  #             bool:
  #               must_not:
  #                 exists:
  #                   field: publish_date
  #
  # - match: {hits.total.value: 1}
  # - match: {hits.hits.0._id: "2"}
---
"nested Knn search with required similarity appropriately filters inner_hits":
- requires:
@@ -561,3 +562,30 @@ setup:
- match: { hits.hits.0.inner_hits.nested.hits.hits.0.fields.nested.0.language.0: "FR" }
- close_to: { hits.hits.0._score: { value: 0.0043, error: 0.0001 } }


---
"Test filter on nested fields with filter on both nested and parent metadata is not allowed":
Member

Could we add a test/scenario where we are doing a filter on another nested field?

So, assume we have:

{
  "nested_1": {"properties": {"vector"....}},
  "nested_2": {"properties": {"product_name":....}}
}

Then the filter would be:

filter: [{ nested: { path: nested_2, query: { match: { nested_2.product_name: "FR" } } } }] or something (unsure if I got all the nesting right...)

Member

Basically, I think our "auto-nested check" should only apply when the filter matches the vector's nested context. Any other filter should be applied as a "top level" filter and return the relevant parent doc ids (I am unsure if this is exactly the case, but it seems to be right now).
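
One possible reading of this suggestion, as a rough sketch: decide a filter's scope from the field it targets, relative to the vector's nested path. The `FilterScopeSketch` class, the `Scope` enum, and the field-name prefix test are all assumptions made for illustration. The code in this PR classifies Lucene queries through `NestedHelper`, not field-name strings.

```java
// Toy model only: names here are hypothetical, not Elasticsearch APIs.
final class FilterScopeSketch {

    enum Scope { VECTOR_NESTED_CONTEXT, TOP_LEVEL }

    // Assumption: a field belongs to the vector's nested context when its
    // name starts with "<vectorNestedPath>.". Everything else (parent fields
    // and sibling nested fields) is treated as a top-level filter.
    static Scope scopeFor(String filterField, String vectorNestedPath) {
        return filterField.startsWith(vectorNestedPath + ".")
            ? Scope.VECTOR_NESTED_CONTEXT
            : Scope.TOP_LEVEL;
    }

    public static void main(String[] args) {
        // Same nested context as the vector field.
        System.out.println(scopeFor("nested_1.language", "nested_1"));
        // Sibling nested field: handled as a top-level filter in this sketch.
        System.out.println(scopeFor("nested_2.product_name", "nested_1"));
        // Plain parent field.
        System.out.println(scopeFor("name", "nested_1"));
    }
}
```

Under this reading, only the first call would go through the nested-specific check; the other two would resolve to parent doc ids first.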

Member

Of course, replicate such a test in the knn query tests if possible.

Contributor Author

@mayya-sharipova mayya-sharipova Jul 12, 2025

Thanks for bringing this up.

In acbfccd I've added a test for nested sibling fields. Currently:

  • For nested knn query, we can NOT filter on nested sibling fields, as the query DSL itself doesn't allow it.
  • For top level knn search, we can filter on nested sibling fields, which effectively addresses issue 128803.

Are you not happy with this behaviour? Are you saying we should NOT allow filters on sibling nested fields in nested knn search and should throw an exception instead, which would mean closing 128803 as something we are not going to implement?

Member

> For nested knn query, we can NOT filter on nested sibling fields, as the query DSL itself doesn't allow it.

This is weird. A separate nested field should be able to join up to the parent level to indicate which parents pass the filter, and then we join back down to the child level to filter on the vectors. I don't know how to express that in the DSL, but we should document it as a limitation.
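
The join described here can be pictured with a small toy model. This is only a sketch of the idea, not how Elasticsearch or Lucene implement block joins: the `SiblingJoinSketch` class, the `Child` type, and both helper methods are hypothetical names introduced for illustration.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.stream.Collectors;

// Toy model only: these types are hypothetical, not Elasticsearch or Lucene APIs.
final class SiblingJoinSketch {

    // A nested child document: its own id, the nested path it lives under,
    // the id of its parent document, and a single keyword value.
    static final class Child {
        final int id;
        final String path;
        final int parentId;
        final String value;

        Child(int id, String path, int parentId, String value) {
            this.id = id;
            this.path = path;
            this.parentId = parentId;
            this.value = value;
        }
    }

    // Step 1, join up: collect the parents that own a matching sibling child.
    static Set<Integer> parentsWithMatch(List<Child> docs, String siblingPath, String value) {
        return docs.stream()
            .filter(c -> c.path.equals(siblingPath) && c.value.equals(value))
            .map(c -> c.parentId)
            .collect(Collectors.toSet());
    }

    // Step 2, join down: keep only the vector-path children of those parents.
    static List<Integer> allowedVectorChildren(List<Child> docs, String vectorPath, Set<Integer> parents) {
        return docs.stream()
            .filter(c -> c.path.equals(vectorPath) && parents.contains(c.parentId))
            .map(c -> c.id)
            .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        List<Child> docs = Arrays.asList(
            new Child(0, "nested_1", 100, "vec-a"),
            new Child(1, "nested_2", 100, "FR"),
            new Child(2, "nested_1", 200, "vec-b"),
            new Child(3, "nested_2", 200, "EN")
        );
        Set<Integer> parents = parentsWithMatch(docs, "nested_2", "FR");
        // Only child 0 belongs to a parent whose nested_2 sibling matched "FR".
        System.out.println(allowedVectorChildren(docs, "nested_1", parents)); // prints [0]
    }
}
```

The two steps are kept as separate methods so the intermediate parent set, the part the comment says the DSL cannot currently express for a nested knn query, stays visible.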

  - requires:
      capabilities:
        - method: POST
          path: /_search
          capabilities: [ knn_filter_on_nested_fields ]
      test_runner_features: ["capabilities", "close_to"]
      reason: "Capability for filtering on nested fields required"

  - do:
      catch: /A filter in knn search might match both nested and non-nested documents, which is not allowed. Modify the filter to be either over the top-level or nested metadata./
      search:
        index: test
        body:
          _source: false
          knn:
            field: nested.vector
            query_vector: [ -0.5, 90.0, -10, 14.8, -156.0 ]
            k: 3
            filter:
              bool:
                must:
                  - match: { nested.language: "FR" }
                  - term: { name: "rabbit.jpg" }

@@ -24,6 +24,8 @@
import org.elasticsearch.index.mapper.NestedObjectMapper;
import org.elasticsearch.index.query.SearchExecutionContext;

import java.util.List;

/** Utility class to filter parent and children clauses when building nested
* queries. */
public final class NestedHelper {
@@ -164,4 +166,41 @@ private static boolean mightMatchNonNestedDocs(SearchExecutionContext searchExec
}
return true;
}

    /**
     * Returns true if the given query might match both nested documents and non-nested documents
     */
    public static boolean mightMatchMixedDocs(Query query, String nestedPath, SearchExecutionContext searchExecutionContext) {
        // First check if the individual query might match both types, such as match_all or wildcard or meta field queries
        boolean mightMatchNestedDocs = mightMatchNestedDocs(query, searchExecutionContext);
        boolean mightMatchNonNestedDocs = mightMatchNonNestedDocs(query, nestedPath, searchExecutionContext);
        if (mightMatchNestedDocs && mightMatchNonNestedDocs) {
            return true;
        }

        if (query instanceof final BooleanQuery bq) {
            List<BooleanClause> clauses = bq.clauses();
            if (clauses.isEmpty()) {
                return false;
            }
            boolean hasNested = false;
            boolean hasNonNested = false;
            for (BooleanClause clause : clauses) {
                Query clauseQuery = clause.query();
                boolean clauseMatchesNested = mightMatchNestedDocs(clauseQuery, searchExecutionContext);
                boolean clauseMatchesNonNested = mightMatchNonNestedDocs(clauseQuery, nestedPath, searchExecutionContext);
                if (clauseMatchesNested) {
                    hasNested = true;
                }
                if (clauseMatchesNonNested) {
                    hasNonNested = true;
                }
                if (hasNested && hasNonNested) {
                    return true;
                }
            }
        }

        return false;
    }
}
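
To make the clause loop easier to follow in isolation, here is a self-contained sketch of the same accumulate-and-short-circuit pattern. It is a simplification under stated assumptions: clauses are reduced to field names, and `targetsNested` uses a hypothetical prefix test in place of the real `mightMatchNestedDocs` / `mightMatchNonNestedDocs` checks, which inspect Lucene queries and can both be true for a single clause.

```java
import java.util.Arrays;
import java.util.List;

// Toy model only: MixedFilterSketch and its methods are hypothetical names.
final class MixedFilterSketch {

    // Assumption: a clause targets nested docs when its field is under
    // "<nestedPath>."; the real helper inspects Lucene queries instead.
    static boolean targetsNested(String field, String nestedPath) {
        return field.startsWith(nestedPath + ".");
    }

    // Same shape as the clause loop: return true as soon as both a nested
    // and a non-nested clause have been seen.
    static boolean mixesNestedAndParent(List<String> clauseFields, String nestedPath) {
        boolean hasNested = false;
        boolean hasNonNested = false;
        for (String field : clauseFields) {
            if (targetsNested(field, nestedPath)) {
                hasNested = true;
            } else {
                hasNonNested = true;
            }
            if (hasNested && hasNonNested) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        // Fields taken from the YAML test: one nested clause, one parent clause.
        System.out.println(mixesNestedAndParent(Arrays.asList("nested.language", "name"), "nested")); // prints true
        // Only nested clauses: not mixed.
        System.out.println(mixesNestedAndParent(Arrays.asList("nested.language"), "nested")); // prints false
    }
}
```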
@@ -556,6 +556,12 @@ protected Query doToQuery(SearchExecutionContext context) throws IOException {
            parentBitSet = context.bitsetFilter(parentFilter);
            ArrayList<Query> filterAdjusted = new ArrayList<>(filtersInitial.size());
            for (Query f : filtersInitial) {
                if (NestedHelper.mightMatchMixedDocs(f, parentPath, context)) {
                    throw new IllegalArgumentException(
                        "A filter in knn search might match both nested and non-nested documents, which is not allowed. "
                            + "Modify the filter to be either over the top-level or nested metadata."
                    );
                }
                // If filter matches non-nested docs, we assume this is a filter over parents docs,
                // so we will modify it accordingly: matching parents docs with join to its child docs
                if (NestedHelper.mightMatchNonNestedDocs(f, parentPath, context)) {