@martijnvg martijnvg commented Mar 31, 2025

This PR makes the following changes:

  • The numDocsWithField field moves from SortedNumericEntry to NumericEntry, which makes this statistic always available.
  • The jump table is now stored after the values in ES87TSDBDocValuesConsumer#writeField(...). Previously it was stored before the values. This will later allow us to iterate over the SortedNumericDocValues only once, which matters when merging, where iterating is expensive because a merge sort is executed on the fly.

This change enables all the optimizations listed in #125403.

Note that most of the change is test code.
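
To make the ordering concrete, here is a minimal, self-contained sketch of a single-pass field writer. It is not the actual ES87TSDBDocValuesConsumer code: the class SinglePassFieldWriterSketch, the FieldMeta holder, and the use of a java.util.BitSet as a stand-in for the docs-with-field structure and its jump table are all assumptions made for illustration. The point it tries to show is only that, when the docs-with-field data is written after the values, both it and numDocsWithField can be derived from the same pass that writes the values.

```java
import java.io.DataOutputStream;
import java.io.IOException;
import java.util.BitSet;

// Hypothetical, simplified model; not the real ES87TSDBDocValuesConsumer.
final class SinglePassFieldWriterSketch {

    /** Illustrative per-field metadata; field names are assumptions. */
    static final class FieldMeta {
        final int numDocsWithField;
        final long valuesOffset;
        final long valuesLength;
        final long docsWithFieldOffset;
        final long docsWithFieldLength;

        FieldMeta(int numDocsWithField, long valuesOffset, long valuesLength,
                  long docsWithFieldOffset, long docsWithFieldLength) {
            this.numDocsWithField = numDocsWithField;
            this.valuesOffset = valuesOffset;
            this.valuesLength = valuesLength;
            this.docsWithFieldOffset = docsWithFieldOffset;
            this.docsWithFieldLength = docsWithFieldLength;
        }
    }

    /**
     * Writes all values in one pass while recording which docs have a value,
     * then appends the docs-with-field data after the values.
     * docIds[i] is the document that owns values[i]; a doc may repeat.
     */
    static FieldMeta writeField(int[] docIds, long[] values, DataOutputStream data) throws IOException {
        if (docIds.length != values.length) {
            throw new IllegalArgumentException("docIds and values must have the same length");
        }
        BitSet docsWithField = new BitSet();
        long valuesOffset = data.size();
        for (int i = 0; i < docIds.length; i++) {
            docsWithField.set(docIds[i]); // collected during the only pass over the values
            data.writeLong(values[i]);
        }
        long valuesLength = data.size() - valuesOffset;

        // Stand-in for the docs-with-field structure, written after the values.
        long docsWithFieldOffset = data.size();
        byte[] bits = docsWithField.toByteArray();
        data.write(bits, 0, bits.length);
        long docsWithFieldLength = data.size() - docsWithFieldOffset;

        return new FieldMeta(docsWithField.cardinality(), valuesOffset, valuesLength,
                docsWithFieldOffset, docsWithFieldLength);
    }
}
```

In this sketch the reader would locate both regions through the offsets in FieldMeta, so their physical order in the data stream is free to change, as long as the metadata records where each one starts.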

@martijnvg martijnvg marked this pull request as ready for review March 31, 2025 14:04
@elasticsearchmachine
Collaborator

Pinging @elastic/es-storage-engine (Team:StorageEngine)

entry.jumpTableEntryCount = meta.readShort();
entry.denseRankPower = meta.readByte();
private static void readNumeric(IndexInput meta, NumericEntry entry, int version) throws IOException {
if (version < ES87TSDBDocValuesFormat.VERSION_TWO) {
Member

Since these branches are far away, should we fork a new consumer and producer instead?

Member Author

done: 378ee29
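
For readers unfamiliar with the pattern under discussion, the version check in the quoted diff context can be pictured roughly as below. This is a sketch under stated assumptions: the thread does not show the full readNumeric method, so the field layout, the VERSION_ONE constant, and the class VersionedMetaReaderSketch are hypothetical, and java.io.DataInputStream stands in for Lucene's IndexInput. It only illustrates why version branches scattered through a reader can become hard to follow, which is the motivation for forking a separate consumer and producer.

```java
import java.io.DataInputStream;
import java.io.IOException;

// Hypothetical layout and constants; not the actual ES87TSDBDocValuesProducer code.
final class VersionedMetaReaderSketch {

    static final int VERSION_ONE = 1; // assumed name for the older metadata layout
    static final int VERSION_TWO = 2; // layout in which NumericEntry carries numDocsWithField

    static final class NumericEntry {
        long numValues;
        int numDocsWithField = -1; // -1 means "not stored at this level in this version"
    }

    /** Reads a numeric entry, branching on the metadata version. */
    static NumericEntry readNumeric(DataInputStream meta, int version) throws IOException {
        NumericEntry entry = new NumericEntry();
        entry.numValues = meta.readLong();
        if (version >= VERSION_TWO) {
            // Newer layout: the statistic is always present on the numeric entry.
            entry.numDocsWithField = meta.readInt();
        }
        return entry;
    }
}
```

With a forked producer per format version, each reader would contain only one of these branches.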

@martijnvg martijnvg added the test-full-bwc label Apr 1, 2025
Member

@dnhatn dnhatn left a comment

Can we move ES87TSDBDocValuesConsumer to tests only and throw UOE in ES87TSDBDocValuesFormat#fieldsConsumer?

LGTM. Thanks Martijn!
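
The suggestion above (keep the old format readable but refuse to write with it) could look roughly like the following. This is only a sketch: the thread does not say whether the suggestion was adopted, and the types here (ReadOnlyFormatSketch, Consumer, Producer, LegacyFormat) are hypothetical stand-ins rather than Lucene or Elasticsearch classes.

```java
// Hypothetical stand-in types; not the real DocValuesFormat API.
final class ReadOnlyFormatSketch {

    interface Consumer {}

    interface Producer {
        String describe();
    }

    /** A format that can still be read but no longer written. */
    static final class LegacyFormat {

        Consumer fieldsConsumer() {
            // New segments must be written with the newer format.
            throw new UnsupportedOperationException("writing with the legacy format is not supported");
        }

        Producer fieldsProducer() {
            return () -> "legacy producer";
        }
    }
}
```

Under this arrangement the old consumer would only need to exist in test code, where it can produce segments for backwards-compatibility tests.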

@elasticsearchmachine elasticsearchmachine added the serverless-linked label Apr 2, 2025
@martijnvg martijnvg merged commit 52d6839 into elastic:main Apr 2, 2025
17 checks passed
martijnvg added a commit to martijnvg/elasticsearch that referenced this pull request Apr 4, 2025
Backporting elastic#125933 to 8.x branch.

martijnvg added a commit that referenced this pull request Apr 4, 2025
andreidan pushed a commit to andreidan/elasticsearch that referenced this pull request Apr 9, 2025

Labels

>non-issue, serverless-linked, :StorageEngine/Codec, Team:StorageEngine, test-full-bwc, v8.19.0, v9.1.0
