@martijnvg
Member

Change the match_only_text value fetcher to use SortedBinaryDocValues instead of interacting with the doc values API directly.

Going through the field data abstraction ensures that the right doc values type is used and the right conversions happen. Values of all field types are converted to strings.
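
To illustrate the idea, here is a minimal self-contained sketch. It is not the Elasticsearch code: StringDocValues, FetcherSketch, fromSortedSet, fromBinary, fromNumeric, and fetchValues are hypothetical names invented for this example, and plain maps stand in for the index. It only shows why reading through a single string-valued abstraction frees the fetcher from knowing how a field stores its doc values.

```java
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for a string-valued, per-document doc values view.
interface StringDocValues {
    // Values of the given document as strings; empty when the document has none.
    List<String> valuesFor(int docId);
}

final class FetcherSketch {

    // Field stored as a set of terms per document (SORTED_SET-like storage).
    static StringDocValues fromSortedSet(Map<Integer, List<String>> termsByDoc) {
        return docId -> termsByDoc.getOrDefault(docId, List.<String>of());
    }

    // Field stored as one raw UTF-8 byte array per document (BINARY-like storage).
    static StringDocValues fromBinary(Map<Integer, byte[]> bytesByDoc) {
        return docId -> {
            byte[] raw = bytesByDoc.get(docId);
            return raw == null ? List.<String>of() : List.of(new String(raw, StandardCharsets.UTF_8));
        };
    }

    // Numeric field: each value is converted to its string form.
    static StringDocValues fromNumeric(Map<Integer, Long> numbersByDoc) {
        return docId -> {
            Long value = numbersByDoc.get(docId);
            return value == null ? List.<String>of() : List.of(Long.toString(value));
        };
    }

    // The fetcher depends only on the abstraction, never on the storage type.
    static List<String> fetchValues(StringDocValues values, int docId) {
        return values.valuesFor(docId);
    }

    public static void main(String[] args) {
        StringDocValues commandLine = fromBinary(Map.of(0, "ls -la".getBytes(StandardCharsets.UTF_8)));
        StringDocValues tags = fromSortedSet(Map.of(0, List.of("a", "b")));
        StringDocValues pid = fromNumeric(Map.of(0, 42L));

        System.out.println(fetchValues(commandLine, 0)); // [ls -la]
        System.out.println(fetchValues(tags, 0));        // [a, b]
        System.out.println(fetchValues(pid, 0));         // [42]
    }
}
```

In this sketch the storage-specific logic lives in the adapters, so a field backed by binary doc values no longer breaks a caller that previously assumed sorted-set storage.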

Error that this PR is addressing:

Caused by: java.lang.IllegalStateException: unexpected docvalues type BINARY for field 'process.command_line' (expected one of [SORTED, SORTED_SET]). Re-index with correct docvalues type.
	at [email protected]/org.apache.lucene.index.DocValues.checkField(DocValues.java:218)
	at [email protected]/org.apache.lucene.index.DocValues.getSortedSet(DocValues.java:323)
	at [email protected]/org.elasticsearch.index.mapper.extras.MatchOnlyTextFieldMapper$MatchOnlyTextFieldType.lambda$docValuesFieldFetcher$3(MatchOnlyTextFieldMapper.java:296)
	at [email protected]/org.elasticsearch.index.mapper.extras.SourceConfirmedTextQuery$2$1.get(SourceConfirmedTextQuery.java:305)
	at [email protected]/org.apache.lucene.search.DisjunctionMaxQuery$DisjunctionMaxWeight$1.get(DisjunctionMaxQuery.java:156)
	at [email protected]/org.apache.lucene.search.BooleanScorerSupplier.opt(BooleanScorerSupplier.java:529)
	at [email protected]/org.apache.lucene.search.BooleanScorerSupplier.getInternal(BooleanScorerSupplier.java:145)
	at [email protected]/org.apache.lucene.search.BooleanScorerSupplier.get(BooleanScorerSupplier.java:117)
	at [email protected]/org.apache.lucene.search.BooleanScorerSupplier.requiredBulkScorer(BooleanScorerSupplier.java:377)
	at [email protected]/org.apache.lucene.search.BooleanScorerSupplier.booleanScorer(BooleanScorerSupplier.java:219)
	at [email protected]/org.apache.lucene.search.BooleanScorerSupplier.bulkScorer(BooleanScorerSupplier.java:177)
	at [email protected]/org.apache.lucene.search.Weight.bulkScorer(Weight.java:178)
	at [email protected]/org.elasticsearch.search.internal.ContextIndexSearcher.searchLeaf(ContextIndexSearcher.java:454)
	at [email protected]/org.apache.lucene.search.IndexSearcher.search(IndexSearcher.java:809)
	at [email protected]/org.elasticsearch.search.internal.ContextIndexSearcher.search(ContextIndexSearcher.java:387)
	at [email protected]/org.elasticsearch.search.internal.ContextIndexSearcher.lambda$search$3(ContextIndexSearcher.java:365)
	at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:328)
	at [email protected]/org.apache.lucene.search.TaskExecutor$Task.run(TaskExecutor.java:173)
	at [email protected]/org.apache.lucene.search.TaskExecutor.invokeAll(TaskExecutor.java:111)
	at [email protected]/org.elasticsearch.search.internal.ContextIndexSearcher.search(ContextIndexSearcher.java:369)
	at [email protected]/org.elasticsearch.search.internal.ContextIndexSearcher.search(ContextIndexSearcher.java:336)
	at [email protected]/org.elasticsearch.search.query.QueryPhase.addCollectorsAndSearch(QueryPhase.java:212)
	... 23 more
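
The trace shows the failure coming from a type check: the fetcher asked for sorted-set doc values (DocValues.getSortedSet), but the field was indexed with BINARY doc values. The following is a rough sketch of that kind of check, not Lucene's implementation; DocValuesKind, TypeCheckSketch, and checkKind are hypothetical names, and only the message format is copied from the error above.

```java
import java.util.Arrays;

// Hypothetical enum standing in for the doc values type recorded for a field.
enum DocValuesKind { NONE, NUMERIC, BINARY, SORTED, SORTED_NUMERIC, SORTED_SET }

final class TypeCheckSketch {

    // Throws when the field has doc values of a kind the caller cannot read.
    static void checkKind(String field, DocValuesKind actual, DocValuesKind... expected) {
        if (actual != DocValuesKind.NONE && !Arrays.asList(expected).contains(actual)) {
            throw new IllegalStateException(
                "unexpected docvalues type " + actual + " for field '" + field
                    + "' (expected one of " + Arrays.toString(expected)
                    + "). Re-index with correct docvalues type.");
        }
    }

    public static void main(String[] args) {
        try {
            // A caller that hard-codes sorted-set access fails on a BINARY field.
            checkKind("process.command_line", DocValuesKind.BINARY,
                DocValuesKind.SORTED, DocValuesKind.SORTED_SET);
        } catch (IllegalStateException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

A caller that hard-codes one doc values type fails this way as soon as the field uses another, which is what the change above avoids.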

(Marking as non-issue, given that this is a bug in an unreleased version of ES.)

@elasticsearchmachine
Collaborator

Pinging @elastic/es-storage-engine (Team:StorageEngine)

@not-napoleon added the auto-backport label (Automatically create backport pull requests when merged) on Jul 8, 2025
Member

@not-napoleon left a comment

This looks good to me. The test covers the scenario we're concerned about, and the solution makes sense.

Contributor

@jordan-powers left a comment

LGTM

not-napoleon added a commit that referenced this pull request Jul 8, 2025
… instead of interacting with doc values api directly. (#130854)

This pulls #130845 into the serverless fix branch for patch deployment.  Original description:

Change match_only_text's value fetcher to use SortedBinaryDocValues instead of interacting with doc values api directly.

This way, via field data abstraction, the right doc values type is used, and the right conversions happen. Values of all field types will get converted to strings.

Co-authored-by: Martijn van Groningen <[email protected]>
@martijnvg
Member Author

The serverless check failure looks unrelated:

EsqlClientYamlIT > test {p0=esql/80_text/NOT IN on text} FAILED
    java.lang.AssertionError: circuit breakers not reset to 0
    Expected a map containing
    estimated_size_in_bytes: expected <0> but was <80>
             estimated_size: expected "0b" but was "80b"
                   overhead: <1.0> unexpected but ok
        limit_size_in_bytes: <322122547> unexpected but ok
                 limit_size: "307.1mb" unexpected but ok
                    tripped: <0> unexpected but ok
        at __randomizedtesting.SeedInfo.seed([AF935BE020E1872D:27C7643A8E1DEAD5]:0)
        at org.elasticsearch.test.MapMatcher.assertMap(MapMatcher.java:85)
        at org.elasticsearch.xpack.esql.qa.rest.EsqlSpecTestCase.lambda$assertRequestBreakerEmpty$0(EsqlSpecTestCase.java:444)
        at org.elasticsearch.test.ESTestCase.assertBusy(ESTestCase.java:1552)
        at org.elasticsearch.test.ESTestCase.assertBusy(ESTestCase.java:1524)
        at org.elasticsearch.xpack.esql.qa.rest.EsqlSpecTestCase.assertRequestBreakerEmpty(EsqlSpecTestCase.java:436)
        at org.elasticsearch.xpack.esql.qa.mixed.EsqlClientYamlIT.assertRequestBreakerEmpty(EsqlClientYamlIT.java:41)

@martijnvg merged commit 5037d68 into elastic:main on Jul 9, 2025
8 checks passed
@elasticsearchmachine
Collaborator

💔 Backport failed

You can use sqren/backport to manually backport by running backport --upstream elastic/elasticsearch --pr 130845

martijnvg added a commit to martijnvg/elasticsearch that referenced this pull request Jul 9, 2025
… instead of interacting with doc values api directly. (elastic#130854)

This pulls elastic#130845 into the serverless fix branch for patch deployment.  Original description:

Change match_only_text's value fetcher to use SortedBinaryDocValues instead of interacting with doc values api directly.

This way, via field data abstraction, the right doc values type is used, and the right conversions happen. Values of all field types will get converted to strings.

Co-authored-by: Martijn van Groningen <[email protected]>
martijnvg added a commit that referenced this pull request Jul 9, 2025
… instead of interacting with doc values api directly. (#130895)

Backporting #130854 to 9.1 branch.

This pulls #130845 into the serverless fix branch for patch deployment.  Original description:

Change match_only_text's value fetcher to use SortedBinaryDocValues instead of interacting with doc values api directly.

This way, via field data abstraction, the right doc values type is used, and the right conversions happen. Values of all field types will get converted to strings.

Co-authored-by: Mark Tozzi <[email protected]>
martijnvg added a commit that referenced this pull request Jul 9, 2025
… instead of interacting with doc values api directly. (#130896)

Backporting #130854 to 8.19 branch.

This pulls #130845 into the serverless fix branch for patch deployment.  Original description:

Change match_only_text's value fetcher to use SortedBinaryDocValues instead of interacting with doc values api directly.

This way, via field data abstraction, the right doc values type is used, and the right conversions happen. Values of all field types will get converted to strings.

Co-authored-by: Mark Tozzi <[email protected]>
mridula-s109 pushed a commit to mridula-s109/elasticsearch that referenced this pull request Jul 17, 2025
… instead of interacting with doc values api directly. (elastic#130845)

This way, via field data abstraction, the right doc values type is used, and the right conversions happen. Values of all field types will get converted to strings.

Labels

auto-backport (Automatically create backport pull requests when merged)
>non-issue
:StorageEngine/Mapping (The storage related side of mappings)
Team:StorageEngine
v8.19.1
v9.1.1
v9.2.0

4 participants