@martijnvg martijnvg commented Sep 18, 2025

The ordinal range encoding doesn't support multi-valued fields, but the check that prevents it from being used for multi-valued fields was missing, which resulted in multi-valued fields being encoded incorrectly.

The ordinal range encoding was introduced for primary sort fields only, to allow more efficient look-ahead / skip logic (#133018). It lets the compute engine determine more efficiently whether a constant block can be used for a range of matching doc ids.
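
To illustrate why this encoding only works with one value per document, here is a minimal sketch, not the actual codec code. It assumes every document has exactly one ordinal, that the field is the primary sort field (so ordinals are non-decreasing in doc id order), and that ordinals are dense. The class and method names (`OrdinalRangeSketch`, `canUseOrdinalRange`, `encode`, `ordinalAt`, `isConstant`) are hypothetical.

```java
import java.util.Arrays;

/** Hypothetical sketch; names and layout are assumptions, not the Elasticsearch codec. */
public class OrdinalRangeSketch {

    /** Guard: only a single-valued primary sort field may use the range encoding. */
    public static boolean canUseOrdinalRange(boolean primarySortField, int maxValuesPerDoc) {
        return primarySortField && maxValuesPerDoc <= 1;
    }

    /**
     * Stores, for each ordinal, the first doc id that carries it, followed by the
     * doc count as a sentinel. Requires non-decreasing, dense ordinals.
     */
    public static int[] encode(int[] ordinalPerDoc) {
        int numOrds = ordinalPerDoc.length == 0 ? 0 : ordinalPerDoc[ordinalPerDoc.length - 1] + 1;
        int[] starts = new int[numOrds + 1];
        int next = 0; // next ordinal whose start doc has not been recorded yet
        for (int doc = 0; doc < ordinalPerDoc.length; doc++) {
            if (ordinalPerDoc[doc] == next) {
                starts[next++] = doc;
            }
        }
        starts[numOrds] = ordinalPerDoc.length;
        return starts;
    }

    /** Finds the ordinal of a doc: the last run whose start doc is <= doc. */
    public static int ordinalAt(int[] starts, int doc) {
        int idx = Arrays.binarySearch(starts, doc);
        return idx >= 0 ? idx : -idx - 2;
    }

    /** True if every doc in [fromDoc, toDoc] shares one ordinal, so a constant block could be used. */
    public static boolean isConstant(int[] starts, int fromDoc, int toDoc) {
        return ordinalAt(starts, fromDoc) == ordinalAt(starts, toDoc);
    }

    public static void main(String[] args) {
        int[] starts = encode(new int[] {0, 0, 0, 1, 1, 2});
        System.out.println(Arrays.toString(starts));
        System.out.println(isConstant(starts, 0, 2));
        System.out.println(canUseOrdinalRange(true, 2));
    }
}
```

A document with several values has no single ordinal to place in a run, which is why the guard has to reject multi-valued fields before this encoding is chosen.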

Closes #134950

(marking as non-issue as this bug doesn't occur in a stateful release)

final RandomAccessInput addressesInput = data.randomAccessSlice(entry.addressesOffset, entry.addressesLength);
final LongValues addresses = DirectMonotonicReader.getInstance(entry.addressesMeta, addressesInput, merging);

assert entry.sortedOrdinals == null : "encoded ordinal range supports only one value per document";
Member Author

Invoking getValues(...) will result in an NPE if entry.sortedOrdinals != null.
Maybe we can return a singleton SortedNumericDocValues instance here, so that at least the first value of every document is available?
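
A rough sketch of the singleton idea, using hypothetical stand-in interfaces (`SingleValued`, `MultiValued`) rather than the real Lucene doc-values classes: a single-valued reader is exposed through a multi-valued view that always reports exactly one value per document. Lucene's `DocValues.singleton(...)` provides a wrapper of this kind for its own types; whether it fits at this spot in the codec is not confirmed here.

```java
/** Hypothetical sketch; these interfaces are stand-ins, not Lucene's doc-values API. */
public class SingletonViewSketch {

    /** Stand-in for a single-valued reader: one long per document. */
    public interface SingleValued {
        long valueAt(int doc);
    }

    /** Stand-in for a multi-valued reader. */
    public interface MultiValued {
        int valueCount(int doc);

        long valueAt(int doc, int index);
    }

    /** Wraps a single-valued reader so callers expecting multiple values see exactly one. */
    public static MultiValued singleton(SingleValued in) {
        return new MultiValued() {
            @Override
            public int valueCount(int doc) {
                return 1; // only the first value of each document is available
            }

            @Override
            public long valueAt(int doc, int index) {
                if (index != 0) {
                    throw new IndexOutOfBoundsException("singleton view has one value per doc, got index " + index);
                }
                return in.valueAt(doc);
            }
        };
    }

    public static void main(String[] args) {
        long[] firstValues = {7L, 7L, 9L};
        MultiValued view = singleton(doc -> firstValues[doc]);
        System.out.println(view.valueCount(2) + " value(s), first = " + view.valueAt(2, 0));
    }
}
```

The trade-off is the one raised in the comment: any additional values of a document that was already encoded this way stay unreadable, and only the first one is returned.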

@martijnvg martijnvg marked this pull request as ready for review September 18, 2025 11:22
@elasticsearchmachine
Collaborator

Pinging @elastic/es-storage-engine (Team:StorageEngine)

@dnhatn
Member

dnhatn commented Sep 18, 2025

@martijnvg Thanks for fixing this - this is the correct solution. However, indexes with multi-valued fields that have already been encoded with ordinal range will remain broken. Should we also support reading them?

@martijnvg
Member Author

Thanks Nhat for taking a look.

> However, indexes with multi-valued fields that have already been encoded with ordinal range will remain broken. Should we also support reading them?

Yes, I was thinking about this too: #134979 (comment)
However, we can only read the first value for each document and return it as a singleton sorted set. Do you think that is ok?

Member

@dnhatn dnhatn left a comment

LGTM, thanks Martijn!

@martijnvg martijnvg enabled auto-merge (squash) September 18, 2025 17:18
@martijnvg martijnvg merged commit d9783ad into elastic:main Sep 18, 2025
34 checks passed
Development

Successfully merging this pull request may close these issues.

Weird TSDB NPE