[CLEAN] Synthetic Benchmark PR #137483 - Store keyword fields that trip ignore_above in binary doc values #27
base: base_pr_137483_20251204_6030
PR Compliance Guide 🔍
Below is a summary of compliance checks for this PR:
Compliance status legend: 🟢 Fully Compliant, 🟡 Partially Compliant, 🔴 Not Compliant, ⚪ Requires Further Human Verification, 🏷️ Compliance label
PR Code Suggestions ✨
Explore these optional code suggestions:
User description
Benchmark PR elastic#137483
Type: Clean (correct implementation)
Original PR Title: Store keyword fields that trip ignore_above in binary doc values
Original PR Description: https://github.com/elastic/logs-program/issues/13
Original PR URL: elastic#137483
PR Type
Enhancement
Description
- Store ignored keyword values in binary doc values instead of stored fields
- Add `BinaryDocValuesSyntheticFieldLoaderLayer` for reading binary doc values
- Implement `CustomBinaryDocValues` wrapper for decoding binary doc values
- Refactor synthetic source field fetchers to use binary doc values
- Deduplicate ignored values using `MultiValuedBinaryDocValuesField`

Diagram Walkthrough

File Walkthrough
MatchOnlyTextFieldMapper.java
Refactor to use binary doc values for ignored values
modules/mapper-extras/src/main/java/org/elasticsearch/index/mapper/extras/MatchOnlyTextFieldMapper.java
- Import changes: `BinaryDocValues`, `DocValues`, `ByteArrayStreamInput`, and `SortedBinaryDocValues`
- Change `parentFieldFetcher()` to use `ignoredValuesDocValuesFieldFetcher()` instead of `storedFieldFetcher()`
- Change `delegateFieldFetcher()` to simplify field name handling and use binary doc values
- Add `getValuesFromDocValues()` method to decode values from `SortedBinaryDocValues`
- Add `ignoredValuesDocValuesFieldFetcher()` method to read ignored values from binary doc values
- Add `CustomBinaryDocValues` inner class to wrap `BinaryDocValues` and expose the `SortedBinaryDocValues` interface

BinaryDocValuesSyntheticFieldLoaderLayer.java
New layer for reading binary doc values
server/src/main/java/org/elasticsearch/index/mapper/BinaryDocValuesSyntheticFieldLoaderLayer.java
- Implement the `CompositeSyntheticFieldLoader.DocValuesLayer` interface
- Decode values with `ByteArrayStreamInput` from the layout `[count][length1][value1][length2][value2]...`
- Write values to `XContentBuilder` and check value existence
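
To make the `[count][length1][value1][length2][value2]...` layout concrete, here is a minimal decoding sketch in plain Java. It is not the PR's code: the class `LengthPrefixedDecoderSketch` is hypothetical, it reads from a `ByteBuffer` rather than Elasticsearch's `ByteArrayStreamInput`, and it assumes fixed-width 4-byte integers and UTF-8 values, neither of which is stated in this walkthrough.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper for illustration only; not part of the PR.
final class LengthPrefixedDecoderSketch {

    // Decodes [count][length1][value1][length2][value2]...
    // ASSUMPTION: count and lengths are 4-byte big-endian ints and values are
    // UTF-8 strings. The real layer may use a different integer encoding.
    static List<String> decode(byte[] encoded) {
        ByteBuffer in = ByteBuffer.wrap(encoded);
        int count = in.getInt();              // number of entries that follow
        List<String> values = new ArrayList<>();
        for (int i = 0; i < count; i++) {
            int length = in.getInt();         // byte length of the next value
            byte[] bytes = new byte[length];
            in.get(bytes);                    // read exactly `length` bytes
            values.add(new String(bytes, StandardCharsets.UTF_8));
        }
        return values;
    }
}
```

Because each entry carries its own length, the reader never needs a delimiter and values may contain arbitrary bytes.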

KeywordFieldMapper.java
Store ignored values in binary doc values field
server/src/main/java/org/elasticsearch/index/mapper/KeywordFieldMapper.java
- Import changes: `ElasticsearchException`, `BytesStreamOutput`, and `LinkedHashSet`
- Replace `StoredField` with `MultiValuedBinaryDocValuesField` for storing ignored values
- Replace `StoredFieldLayer` with `BinaryDocValuesSyntheticFieldLoaderLayer` in synthetic source support
- Add `MultiValuedBinaryDocValuesField` inner class to store and encode multiple binary values
- Deduplicate values with `LinkedHashSet` and encode them as length-prefixed entries
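
The write side described above (deduplicate with a `LinkedHashSet`, then emit length-prefixed entries) can be sketched roughly as follows. This is an illustration under assumptions, not the PR's `MultiValuedBinaryDocValuesField`: the class name `MultiValueEncoderSketch` is hypothetical, a `ByteBuffer` stands in for `BytesStreamOutput`, and the count and lengths are written as fixed-width 4-byte ints, which this walkthrough does not confirm.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashSet;
import java.util.Set;

// Hypothetical helper for illustration only; not part of the PR.
final class MultiValueEncoderSketch {

    // LinkedHashSet drops duplicates while keeping first-insertion order.
    private final Set<String> uniqueValues = new LinkedHashSet<>();

    void add(String value) {
        uniqueValues.add(value);
    }

    int count() {
        return uniqueValues.size();
    }

    // Encodes as [count][length1][value1][length2][value2]...
    // ASSUMPTION: 4-byte big-endian ints and UTF-8 values.
    byte[] encode() {
        byte[][] encodedValues = new byte[uniqueValues.size()][];
        int size = Integer.BYTES;                        // room for the count
        int i = 0;
        for (String value : uniqueValues) {
            byte[] bytes = value.getBytes(StandardCharsets.UTF_8);
            encodedValues[i++] = bytes;
            size += Integer.BYTES + bytes.length;        // length prefix + payload
        }
        ByteBuffer out = ByteBuffer.allocate(size);
        out.putInt(encodedValues.length);
        for (byte[] bytes : encodedValues) {
            out.putInt(bytes.length);
            out.put(bytes);
        }
        return out.array();
    }
}
```

Adding the same value twice leaves a single entry in the encoded output, which is why the updated tests expect deduplicated arrays.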

NativeArrayIntegrationTestCase.java
Collapse single-element arrays in source
test/framework/src/main/java/org/elasticsearch/index/mapper/NativeArrayIntegrationTestCase.java
- Change `arrayToSource()` to collapse single-element arrays into scalar fields
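
The collapsing rule can be pictured with a small standalone sketch. The helper below is hypothetical and works on plain Java lists; the real `arrayToSource()` builds source content in the test framework, and how it treats empty arrays is not stated here, so this sketch simply leaves them unchanged.

```java
import java.util.List;

// Hypothetical helper for illustration only; not part of the PR.
final class SingleElementCollapseSketch {

    // A one-element list becomes a bare scalar; anything else stays a list.
    // ASSUMPTION: empty lists are returned as-is.
    static Object collapse(List<?> values) {
        return values.size() == 1 ? values.get(0) : values;
    }
}
```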

WildcardFieldMapper.java
Use shared binary doc values layer
x-pack/plugin/wildcard/src/main/java/org/elasticsearch/xpack/wildcard/mapper/WildcardFieldMapper.java
- Import changes: `BinaryDocValues`, `LeafReader`, `ByteArrayStreamInput`
- Import changes: `BinaryDocValuesSyntheticFieldLoaderLayer`
- Replace `WildcardSyntheticFieldLoader` with `BinaryDocValuesSyntheticFieldLoaderLayer`
- Remove the `WildcardSyntheticFieldLoader` inner class implementation

KeywordSyntheticSourceNativeArrayIntegrationTests.java
Update test for deduplicated ignored values
server/src/test/java/org/elasticsearch/index/mapper/KeywordSyntheticSourceNativeArrayIntegrationTests.java
- source
- Update `expectedArrayValues` to show deduplicated results

10_basic.yml
Update test data for deduplication
modules/mapper-extras/src/yamlRestTest/resources/rest-api-spec/test/match_only_text/10_basic.yml
- Add `"Apache Lucene"` to test input array