@dnhatn
Member

@dnhatn dnhatn commented Aug 16, 2025

When a keyword field is the primary sort field, we store the starting document of each ordinal instead of blocks of ordinals. By default, this encoding is not enabled when the average number of documents per ordinal is less than 512, because storing block values may be more efficient and safer in that case. Reading a large range of documents, a common pattern in ES|QL, can be more efficient with this approach.
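
To make the encoding concrete, here is a minimal sketch of the idea rather than the code in this PR. The class and method names are hypothetical, plain `long[]` arrays stand in for the on-disk structures, and it assumes every document has a value and that ordinals are dense (0, 1, 2, ...) and non-decreasing in doc ID order, as they are when the field is the primary sort field.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical illustration only; none of these names come from the PR.
final class OrdinalRangeSketch {

    // Records the first doc ID of each ordinal, followed by maxDoc as a sentinel.
    // Assumes ordPerDoc is non-decreasing and uses dense ordinals starting at 0.
    static long[] encodeStartDocs(int[] ordPerDoc) {
        List<Long> startDocs = new ArrayList<>();
        int lastOrd = -1;
        for (int doc = 0; doc < ordPerDoc.length; doc++) {
            if (ordPerDoc[doc] != lastOrd) {
                lastOrd = ordPerDoc[doc];
                startDocs.add((long) doc); // always <= maxDoc - 1
            }
        }
        startDocs.add((long) ordPerDoc.length); // sentinel: maxDoc
        long[] result = new long[startDocs.size()];
        for (int i = 0; i < result.length; i++) {
            result[i] = startDocs.get(i);
        }
        return result;
    }

    // Finds the ordinal of a doc: the largest index i with startDocs[i] <= doc.
    static int ordForDoc(long[] startDocs, int doc) {
        int lo = 0;
        int hi = startDocs.length - 2; // last real ordinal; the final entry is the sentinel
        while (lo < hi) {
            int mid = (lo + hi + 1) >>> 1;
            if (startDocs[mid] <= doc) {
                lo = mid;
            } else {
                hi = mid - 1;
            }
        }
        return lo;
    }
}
```

With this layout, the ordinal for a whole range of documents can be resolved from two neighbouring entries instead of decoding one value per document, which is where the benefit for large range reads would come from.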

Member

@martijnvg martijnvg left a comment

I took a first look, and this is a great idea to speed up reading the _tsid (and other doc values fields that are the primary sort field) without having to enable a doc value skipper.

() -> new ES819TSDBDocValuesFormat(random().nextInt(Integer.MIN_VALUE, 2), random().nextInt(1, 32), random().nextBoolean())
);
assertTrue(ex.getMessage().contains("skipIndexIntervalSize must be > 1"));
}
Member

I think in TsdbDocValueBwcTests we should also add a test method that covers going from ES819TSDBDocValuesFormat without ordinal range encoding to ES819TSDBDocValuesFormat with ordinal range encoding, for bwc purposes?

Member Author

I added a test in 6bdfd69

@martijnvg
Member

I ran the tsdb track with this change and am seeing the following results:

| Metric | Task | Baseline | Contender | Diff | Unit | Diff % |
|---|---|---|---|---|---|---|
| 50th percentile latency | esql-memory-pagefault-rate-by-container-hour | 4307.26 | 3646.82 | -660.443 | ms | -15.33% |
| 50th percentile latency | esql-memory-pagefault-rate-by-container-day | 3585.47 | 3154.48 | -430.99 | ms | -12.02% |
| 50th percentile latency | esql-memory-pagefault-rate-day | 3139.99 | 2847.56 | -292.424 | ms | -9.31% |
| 50th percentile latency | esql-memory-pagefault-rate-hour | 3739.36 | 3480.29 | -259.076 | ms | -6.93% |
| 50th percentile latency | esql-memory-usage-by-container-day | 1350.7 | 1394.87 | 44.1711 | ms | +3.27% |
| 50th percentile latency | esql-memory-usage-by-container-hour | 1637.53 | 1675.15 | 37.6272 | ms | +2.30% |
| 50th percentile latency | esql-memory-usage-day | 698.187 | 739.367 | 41.1804 | ms | +5.90% |
| 50th percentile latency | esql-memory-usage-hour | 984.767 | 996.926 | 12.159 | ms | +1.23% |

The p50 latency of the rate-based queries improved nicely. I think the p50 differences for the other queries are within the noise range.

Other details:

| Metric | Task | Baseline | Contender | Diff | Unit | Diff % |
|---|---|---|---|---|---|---|
| Median Throughput | index | 63901.2 | 64452.7 | 551.492 | docs/s | +0.86% |
| Store size | | 4.04692 | 4.05151 | 0.00459 | GB | +0.11% |
| tsdb _tsid doc values | | 9.84666 | 3.2097 | -6.63696 | MB | -67.40% |

Indexing throughput is similar, while _tsid doc values disk usage got much smaller (although compared to the total store size this is negligible).

Full rally compare:
comparison-2.txt

Member

@martijnvg martijnvg left a comment

Thanks Nhat! LGTM

@martijnvg martijnvg added the test-full-bwc Trigger full BWC version matrix tests label Aug 19, 2025
startDocs.add(doc);
}
}
startDocs.add(maxDoc);
Contributor

Is it possible that we've already inserted maxDoc in the loop above?

Member Author

That should not happen, because every doc added to startDocs in the loop is at most maxDoc - 1.

private long rangeEndExclusive = -1;

SortedOrdinalReader(long maxOrd, DirectMonotonicReader startDocs) {
this.maxOrd = Math.toIntExact(maxOrd);
Contributor

Nit: toIntExact not needed?

Member Author

Thanks, I removed it in 8dc9641.
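
For readers following this thread, the sketch below shows one way a reader over the start-doc table could cache the current range so that consecutive lookups within the same ordinal skip the search. It is an illustration under assumptions, not the PR's SortedOrdinalReader: the class name, the `searches` counter, and the plain `long[]` (standing in for a DirectMonotonicReader) are all hypothetical.

```java
// Hypothetical illustration only; not the SortedOrdinalReader from the PR.
final class CachedRangeReaderSketch {
    private final long[] startDocs;       // startDocs[ord] = first doc of ord; last entry = maxDoc
    private final int maxOrd;             // number of ordinals
    private int currentOrd = -1;
    private long rangeStart = 0;
    private long rangeEndExclusive = -1;  // empty range forces a search on the first read
    int searches = 0;                     // counts binary searches, for illustration

    CachedRangeReaderSketch(long[] startDocs) {
        this.startDocs = startDocs;
        this.maxOrd = startDocs.length - 1;
    }

    // Returns the ordinal of doc, reusing the cached range when doc falls inside it.
    int readOrd(int doc) {
        if (doc < rangeStart || doc >= rangeEndExclusive) {
            searches++;
            int lo = 0;
            int hi = maxOrd - 1;
            // Find the largest ordinal whose start doc is <= doc.
            while (lo < hi) {
                int mid = (lo + hi + 1) >>> 1;
                if (startDocs[mid] <= doc) {
                    lo = mid;
                } else {
                    hi = mid - 1;
                }
            }
            currentOrd = lo;
            rangeStart = startDocs[lo];
            rangeEndExclusive = startDocs[lo + 1];
        }
        return currentOrd;
    }
}
```

When documents are read in increasing order, the search runs only when a lookup crosses into a new ordinal, so a long run of documents sharing one ordinal costs a single search.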

Contributor

@kkrik-es kkrik-es left a comment

Thanks Nhat!

@dnhatn dnhatn added :StorageEngine/TSDB You know, for Metrics >enhancement labels Aug 19, 2025
@dnhatn dnhatn marked this pull request as ready for review August 19, 2025 05:54
@elasticsearchmachine
Collaborator

Pinging @elastic/es-storage-engine (Team:StorageEngine)

@elasticsearchmachine
Collaborator

Hi @dnhatn, I've created a changelog YAML for you.

@dnhatn
Member Author

dnhatn commented Aug 20, 2025

@martijnvg @kkrik-es Thank you so much for the discussion and review!

@dnhatn dnhatn merged commit 3d48dd5 into elastic:main Aug 20, 2025
30 checks passed
@dnhatn dnhatn deleted the sorted-dv-codec branch August 20, 2025 03:59
Labels

>enhancement :StorageEngine/TSDB You know, for Metrics Team:StorageEngine test-full-bwc Trigger full BWC version matrix tests v9.2.0

4 participants