@jordan-powers
Contributor

@jordan-powers jordan-powers commented Jul 15, 2025

In #129126, we stopped double-storing match_only_text fields when they are part of a multi-field, instead extracting the value when needed from the appropriate multi-field mapper.

This introduced an edge case related to ignore_above on keyword fields. If the associated multi-field mapper is a keyword mapper, and if that keyword mapper has ignore_above specified, and a document triggers the ignore_above case, then the original value will be stored in <foo>._original instead of <foo>. In this case, the match_only_text mapper needs to look at the <foo>._original stored field.
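The edge case can be illustrated with a minimal sketch. This is not the Elasticsearch implementation: the class name, both method names, and the use of a plain `Map` to stand in for a document's stored fields are assumptions made for illustration. Only the `<foo>` / `<foo>._original` naming comes from the description above.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; a Map stands in for the stored fields of one document.
final class OriginalFallbackSketch {

    // Reads only <field>. Returns null when ignore_above was triggered,
    // because the value was stored under <field>._original instead.
    static List<Object> naiveFetch(Map<String, List<Object>> storedFields, String field) {
        return storedFields.get(field);
    }

    // Reads <field> and <field>._original, so ignored values are still found.
    static List<Object> fetchWithFallback(Map<String, List<Object>> storedFields, String field) {
        List<Object> result = new ArrayList<>();
        List<Object> regular = storedFields.get(field);
        if (regular != null) {
            result.addAll(regular);
        }
        List<Object> original = storedFields.get(field + "._original");
        if (original != null) {
            result.addAll(original);
        }
        return result;
    }

    public static void main(String[] args) {
        // A document whose keyword value exceeded ignore_above.
        Map<String, List<Object>> doc =
            Map.of("foo._original", List.<Object>of("a value longer than ignore_above"));
        System.out.println(naiveFetch(doc, "foo"));        // null
        System.out.println(fetchWithFallback(doc, "foo")); // [a value longer than ignore_above]
    }
}
```

Iterating over the `null` returned by the naive lookup is the kind of failure reported in the linked issue.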

Resolves #131298

@jordan-powers jordan-powers self-assigned this Jul 15, 2025
@jordan-powers jordan-powers added >non-issue auto-backport Automatically create backport pull requests when merged :StorageEngine/Mapping The storage related side of mappings v8.19.0 v9.1.0 v9.2.0 labels Jul 15, 2025
@elasticsearchmachine
Collaborator

Pinging @elastic/es-storage-engine (Team:StorageEngine)

    }
    return storedFieldFetcher(parentField);
} else if (parent.hasDocValues()) {
    var ifd = searchExecutionContext.getForField(parent, MappedFieldType.FielddataOperation.SEARCH);
Contributor

It seems like there can be a similar problem here because doc values won't include ignored values? Not sure if that matters though.

Contributor Author

Shoot, you're right. If the keyword field is store: false, doc_values: true, we don't error, but we completely omit the value from the match_only_text field results.

I'll work on a follow-up PR to address this.
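A toy model of the doc-values case discussed above, assuming only what the thread states: doc values do not include values ignored by ignore_above. The class and method names are hypothetical, and a plain list stands in for the field's doc values.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical toy model; not Elasticsearch code.
final class DocValuesIgnoreAboveSketch {

    // Keeps only the values that are not ignored, i.e. whose length
    // does not exceed ignoreAbove.
    static List<String> docValuesFor(List<String> values, int ignoreAbove) {
        List<String> kept = new ArrayList<>();
        for (String value : values) {
            if (value.length() <= ignoreAbove) {
                kept.add(value);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // The over-length value is dropped silently: no error, just no value.
        System.out.println(docValuesFor(List.of("much too long"), 5)); // []
    }
}
```

A reader that consults only this list gets an empty result rather than an exception, which matches the "we don't error, but we completely omit the value" behavior described in the comment.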

@jordan-powers jordan-powers merged commit 6f8be9c into elastic:main Jul 15, 2025
33 checks passed
jordan-powers added a commit to jordan-powers/elasticsearch that referenced this pull request Jul 15, 2025
@elasticsearchmachine
Collaborator

💔 Backport failed

Status Branch Result
8.19 Commit could not be cherrypicked due to conflicts
9.1

You can use sqren/backport to manually backport by running backport --upstream elastic/elasticsearch --pr 131314

@jordan-powers
Contributor Author

💚 All backports created successfully

Status Branch Result
8.19

jordan-powers added a commit that referenced this pull request Jul 15, 2025
…) (#131338)

(cherry picked from commit 6f8be9c)

# Conflicts:
#	server/src/main/java/org/elasticsearch/index/mapper/KeywordFieldMapper.java

if (names.length == 1) {
    return storedFields.get(names[0]);
}
return Arrays.stream(names).map(storedFields::get).filter(Objects::nonNull).flatMap(List::stream).toList();
Contributor

Nit: streams are not very efficient; it's better to avoid them for code that runs per document.
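One way the suggestion could be applied is a plain loop, sketched here under assumptions: `storedFields` is taken to be a `Map<String, List<Object>>`, and the class and method names are hypothetical. This is not the code that was eventually merged.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;

// Hypothetical sketch of a loop-based equivalent of the stream pipeline.
final class StoredFieldLoopSketch {

    static List<Object> collect(Map<String, List<Object>> storedFields, String[] names) {
        if (names.length == 1) {
            // Mirrors the reviewed snippet: may return null if the field is absent.
            return storedFields.get(names[0]);
        }
        List<Object> values = new ArrayList<>();
        for (String name : names) {
            List<Object> fieldValues = storedFields.get(name);
            if (fieldValues != null) { // same role as filter(Objects::nonNull)
                values.addAll(fieldValues);
            }
        }
        return values;
    }

    public static void main(String[] args) {
        Map<String, List<Object>> doc = Map.of(
            "foo", List.<Object>of("a"),
            "foo._original", List.<Object>of("b"));
        System.out.println(collect(doc, new String[] { "foo", "foo._original" })); // [a, b]
    }
}
```

One behavioral difference to keep in mind: `Stream.toList()` returns an unmodifiable list, while this loop returns a mutable `ArrayList`.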

jordan-powers added a commit that referenced this pull request Jul 17, 2025
In #131314 we fixed match_only_text fields with ignore_above keyword
multi-fields in the case that the keyword multi-field is stored. However,
the issue is still present if the keyword field is not stored, but instead
has doc values.

This patch fixes that case.
@jordan-powers jordan-powers deleted the fix_match_only_text_multi_fields_2 branch July 28, 2025 16:55
Labels

auto-backport Automatically create backport pull requests when merged >non-issue :StorageEngine/Mapping The storage related side of mappings Team:StorageEngine v8.19.0 v9.1.0 v9.2.0

Development

Successfully merging this pull request may close these issues.

NPE on search query phase Cannot invoke "java.util.List.iterator()" because "values" is null

5 participants