Write prefix partition for tsid in tsdb codec #144617
server/src/main/java/org/elasticsearch/index/codec/tsdb/es819/ES819TSDBDocValuesProducer.java
kkrik-es
left a comment
It'd be nice if @martijnvg can also take a look.
++ we should wait for a review from Martijn!
Pinging @elastic/es-storage-engine (Team:StorageEngine)
tsdb-metricsgen-270m - 256 partitions
@kkrik-es I increased the number of partitions from 256 to 1024 (from 16 bits to 18 bits). This adds a little more overhead but gives a much greater benefit at query time. Can you take another look?
tsdb-metricsgen-270m - 1024 partitions
Looks good, let's also update the description accordingly.
static final int VERSION_BINARY_DV_COMPRESSION = 1;
static final int VERSION_NUMERIC_LARGE_BLOCKS = 2;
static final int VERSION_CURRENT = VERSION_NUMERIC_LARGE_BLOCKS;
static final int VERSION_PREFIX_PARTITIONS = 3;
Version 3 is kind of in ES819Version3TSDBDocValuesFormat, maybe use value 4 for the prefix partitioning? And maybe add that version 3 is part of that new format.
In hindsight ES819Version3TSDBDocValuesFormat wasn't needed (incorrect serverless upgrading made me think we had to introduce this), but it also doesn't have its own codec versioning scheme. This isn't ideal, but hopefully we can start with a new versioning scheme when ES94TSDBDocValuesFormat is introduced.
@kkrik-es @martijnvg Thank you for the reviews!
Move prefix partition read/write logic into abstract base classes:
- Add PrefixPartitionedEntry, PartitionedDocValues on BaseSortedDocValues
- Update readSorted/readSortedSet for version-gated partition metadata
- Update doAddSortedField/addTermsDict for PrefixedPartitionsWriter
- Add writePrefixPartitions to TSDBDocValuesFormatConfig
- Make PrefixedPartitionsReader/Writer public for cross-package access
- Fix test type references (ES819BinaryDocValues -> TSDBBinaryDocValues)
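For readers unfamiliar with version-gated metadata, the idea behind the readSorted/readSortedSet change can be sketched roughly as below. This is only an illustration under assumptions: the class name, the method names, the on-disk layout (a count followed by start doc IDs), and the version value 4 are all hypothetical and are not taken from the PR's actual code.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

public class VersionGatedPartitionsSketch {
    // Hypothetical version number, for illustration only; the real value
    // is still being discussed in the review above.
    public static final int VERSION_WITH_PARTITIONS = 4;

    /** Writes partition start docs only if the format version supports them. */
    public static void writeStarts(DataOutputStream out, int version, int[] starts) throws IOException {
        if (version < VERSION_WITH_PARTITIONS) {
            return; // older format: no partition metadata is written
        }
        out.writeInt(starts.length);
        for (int start : starts) {
            out.writeInt(start);
        }
    }

    /** Reads partition start docs; older versions yield an empty array. */
    public static int[] readStarts(DataInputStream in, int version) throws IOException {
        if (version < VERSION_WITH_PARTITIONS) {
            return new int[0]; // nothing was written, so nothing is read
        }
        int count = in.readInt();
        int[] starts = new int[count];
        for (int i = 0; i < count; i++) {
            starts[i] = in.readInt();
        }
        return starts;
    }

    /** Writes and re-reads the metadata in memory with the same version. */
    public static int[] roundTrip(int version, int[] starts) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (DataOutputStream out = new DataOutputStream(bytes)) {
                writeStarts(out, version, starts);
            }
            try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(bytes.toByteArray()))) {
                return readStarts(in, version);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        int[] starts = {0, 2, 3};
        System.out.println("new format: " + roundTrip(VERSION_WITH_PARTITIONS, starts).length + " partitions");
        System.out.println("old format: " + roundTrip(VERSION_WITH_PARTITIONS - 1, starts).length + " partitions");
    }
}
```

The point of the gate is that segments written before the new version contain no partition bytes, so the reader must skip that section instead of misinterpreting whatever follows.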
Follow-up to #143955, which introduced a single-byte metric prefix in the tsid layout.
This PR writes prefix partition metadata for the `_tsid` field. The `_tsid` field is grouped by its first 2 bytes - the metric prefix byte (byte-0) plus one random byte (byte-1) - yielding up to 256 partitions per metric. The partition metadata records the starting document for each prefix group, allowing the query engine to slice data so that each slice contains only time-series sharing the same prefix.

This enables ESQL to partition work across slices without splitting any individual time-series - a requirement for aggregations like rate. This should reduce memory usage and improve performance compared to time-interval partitioning, which requires multiple queries over fragmented data.
The compute engine is not wired up yet, so no query improvements are expected from this PR alone. The change may cause a small regression in indexing throughput and a small increase in storage overhead; both are expected to be trivial.
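The grouping described above can be illustrated with a minimal sketch. It assumes documents are already ordered by tsid and uses the 2-byte prefix from this description; `PrefixPartitionSketch`, `Partition`, `prefixOf`, and `computePartitions` are hypothetical names for illustration and are not the PR's PrefixedPartitionsWriter.

```java
import java.util.ArrayList;
import java.util.List;

public class PrefixPartitionSketch {
    /** One partition: a 2-byte prefix and the first doc ID whose tsid carries it. */
    public record Partition(int prefix, int startDoc) {}

    /** Packs the first two bytes of a tsid into an int in [0, 65535]. Assumes tsid.length >= 2. */
    public static int prefixOf(byte[] tsid) {
        return ((tsid[0] & 0xFF) << 8) | (tsid[1] & 0xFF);
    }

    /**
     * Scans tsids in doc order (assumed sorted by tsid) and records the
     * first doc of each run of equal 2-byte prefixes.
     */
    public static List<Partition> computePartitions(byte[][] tsidsInDocOrder) {
        List<Partition> partitions = new ArrayList<>();
        int previous = -1; // no real prefix is negative
        for (int doc = 0; doc < tsidsInDocOrder.length; doc++) {
            int prefix = prefixOf(tsidsInDocOrder[doc]);
            if (prefix != previous) {
                partitions.add(new Partition(prefix, doc));
                previous = prefix;
            }
        }
        return partitions;
    }

    public static void main(String[] args) {
        // Made-up tsids: byte-0 is the metric prefix, byte-1 the random byte.
        byte[][] tsids = {
            {0x01, 0x10, 0x7A},
            {0x01, 0x10, 0x7B},
            {0x01, 0x22, 0x01},
            {0x02, 0x05, 0x33},
            {0x02, 0x05, 0x34},
        };
        for (Partition p : computePartitions(tsids)) {
            System.out.printf("prefix=0x%04X startDoc=%d%n", p.prefix(), p.startDoc());
        }
    }
}
```

Because each partition starts where a new prefix begins, a slice covering docs from one start up to the next start contains only whole time-series, which is the property the rate aggregation needs.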
Relates #143955