Skip to content

Conversation

dnhatn
Copy link
Member

@dnhatn dnhatn commented Oct 8, 2025

With this change, we pack the dimension into a single value before the second aggregation in time-series queries and unpack it afterward. This avoids generating permutations for multi-valued dimensions in the second aggregation, which is not desirable.

For example, the query

TS k8s | STATS max(rate(request)) BY host, tbucket(1minute)

is rewritten as:

TS k8s
 | STATS rate=rate(request), host=VALUES(host) BY _tsid, tbucket=TBUCKET(1minute)
 | EVAL packed_host=PACK_DIMENSION(host)
 | STATS sum(rate) BY packed_host, tbucket
 | EVAL host=UNPACK_DIMENSION(packed_host)
 | KEEP rate, host, tbucket

There is some overhead with packing and unpacking values, but we tried to isolate this behavior to time-series queries with dimension fields only. That is why I chose this approach.

@dnhatn dnhatn force-pushed the pack-dimensions branch 3 times, most recently from 49bf9c8 to 25b5f7e Compare October 10, 2025 03:51
$if(BytesRef)$
public BytesRef getBytesRef(int position, BytesRef ignore) {
public BytesRef getBytesRef(int position, BytesRef scratch) {
scratch.bytes = value.bytes;
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will open a separate PR for this fix.

assertThat(Expressions.attribute(rate.field()).name(), equalTo("network.total_bytes_in"));
LastOverTime lastSum = as(Alias.unwrap(aggsByTsid.aggregates().get(1)), LastOverTime.class);
assertThat(Expressions.attribute(lastSum.field()).name(), equalTo("network.cost"));
Values clusterValues = as(Alias.unwrap(aggsByTsid.aggregates().get(3)), Values.class);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will update these tests.

return Math.max(INITIAL_SIZE_IN_BYTES, positionCount);
}

static BytesRefBlock packBytesValues(DriverContext driverContext, BytesRefBlock raw) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is where we encode multi-valued block into a single value.

}
}

static BytesRefBlock unpackBytesValues(DriverContext driverContext, BytesRefBlock encoded) {
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And decode here.

@dnhatn
Copy link
Member Author

dnhatn commented Oct 10, 2025

I will also verify this change with the competitive benchmark.

7.203958127639015 | prod | [eu, us] | 2024-05-10T00:10:00.000Z
6.34494062999877 | staging | us | 2024-05-10T00:10:00.000Z
5.700488689624205 | prod | [eu, us] | 2024-05-10T00:20:00.000Z
5.4539153439153445 | prod | [eu, us] | 2024-05-10T00:00:00.000Z
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We return arrays for grouping dimensions.

firstPassAggs.add(newFinalGroup);
var valuesAgg = new Alias(g.source(), g.name(), new Values(g.source(), g));
firstPassAggs.add(valuesAgg);
if (g.isDimension()) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What happens with labels, i.e. grouping attributes that are neither dimensions nor metrics? I think multi-values were not allowed in dimensions for a long time, so this may be the case for some older configs.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, so we want to pack not only dimensions but also labels. If that's the case, we might need a different approach.

Copy link
Contributor

@kkrik-es kkrik-es left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, thanks Nhat. Let's check the numbers on competitive benchmark, in case there are big surprises.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants