Open
Labels
feature request (New feature or request)
Description
A compound aggregation is an aggregation that depends on other aggregations. For example, `MEAN` depends on `SUM` and `COUNT_VALID`. As such, when computing a compound aggregation, we first need to compute the aggregations it depends on. However, computing the intermediate results for these dependencies typically involves unnecessary work, which can add up to significant overhead when the number of aggregations is large.
For example:
- For computing `MIN`/`MAX` of strings, we first compute `ARG_MIN`/`ARG_MAX`, producing a gather map to gather the input. However, these `ARG_MIN`/`ARG_MAX` aggregations launch kernels to compute a null mask and null count for the gather map, neither of which is used.
- Similarly, for computing `M2`, we first compute `SUM` and `SUM_OF_SQUARED`. These aggregations also launch kernels to compute a null mask and null count for the intermediate sums, which are likewise unused.
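To make the `M2` case concrete, here is a minimal single-group sketch in plain Python. It assumes the standard identity M2 = Σx² − (Σx)²/n and a valid-row count alongside the two sums; the exact formula and kernels used by the library are not stated in this issue, and the function name is hypothetical. The point is that only the sums and the count feed the final result, so a null mask or null count attached to the intermediate sums would never be read.

```python
def m2_from_intermediates(values):
    """Compute M2 (sum of squared deviations from the mean) for one group.

    Nulls are modeled as None. Only SUM, SUM_OF_SQUARED and the count of
    valid rows are consumed; no null mask of the intermediates is needed.
    """
    valid = [v for v in values if v is not None]
    count_valid = len(valid)
    if count_valid == 0:
        return None  # all-null group: the result itself is null
    total = sum(valid)                      # intermediate SUM
    total_sq = sum(v * v for v in valid)    # intermediate SUM_OF_SQUARED
    return total_sq - total * total / count_valid


group = [1.0, None, 2.0, 3.0, 4.0]
m2 = m2_from_intermediates(group)  # 30 - 100/4 = 5.0
```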
We can do better by skipping the null mask and null count computation when it is not needed. We can easily identify whether an aggregation was requested by the user or is only needed as an intermediate result for computing other compound aggregations, and then compute its null mask and null count only when it was requested by the user.
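The proposal could be sketched roughly as follows. This is an illustration in Python, not the library's implementation: `DEPENDENCIES`, `PlannedAgg`, and `plan` are hypothetical names, and the dependency table only mirrors the `MEAN` and `M2` examples given in this issue.

```python
from dataclasses import dataclass

# Hypothetical dependency table, limited to the examples in this issue.
DEPENDENCIES = {
    "MEAN": ["SUM", "COUNT_VALID"],
    "M2": ["SUM", "SUM_OF_SQUARED"],
}


@dataclass
class PlannedAgg:
    kind: str
    user_requested: bool

    @property
    def needs_null_info(self):
        # Null mask / null count are computed only for user-visible results.
        return self.user_requested


def plan(requested):
    """Expand requested aggregations into the full set to compute,
    tagging each as user-requested or intermediate-only."""
    planned = {}

    def visit(kind, from_user):
        entry = planned.get(kind)
        if entry is None:
            planned[kind] = PlannedAgg(kind, from_user)
        elif from_user:
            # An intermediate that the user also asked for becomes visible.
            entry.user_requested = True
        for dep in DEPENDENCIES.get(kind, []):
            visit(dep, False)

    for kind in requested:
        visit(kind, True)
    return planned


p = plan(["MEAN", "SUM"])
# SUM was requested directly, so it keeps its null mask and null count;
# COUNT_VALID exists only to serve MEAN, so that work can be skipped.
```

With this tagging, an aggregation that is both requested by the user and used as a dependency still gets its null information, so user-visible results are unchanged.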