Merged
@@ -105,7 +105,53 @@
*
* <h3>Creating aggregators for your function</h3>
* <p>
* Aggregators contain the core logic of your aggregation. That is, how to combine values, what to store, how to process data, etc.
* Aggregators contain the core logic of how to combine values, what to store, how to process data, etc.
* Currently, we rely on code generation (per aggregation, per type) to implement this functionality.
* This approach was chosen for performance reasons (namely, to avoid virtual method calls and boxing of primitive types).
* As a result, we cannot rely on interface implementations or generics.
* </p>
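To illustrate the performance motivation above, here is a minimal sketch contrasting a generic, interface-based accumulation with a type-specialized one. `BoxingSketch` and its methods are hypothetical names chosen for this illustration; they are not part of the code base or of the generated code.

```java
import java.util.function.BinaryOperator;

// Hypothetical illustration only; not taken from the Elasticsearch sources.
final class BoxingSketch {

    // Generic path: every value is boxed into a Double and each step goes
    // through an interface call on BinaryOperator.
    static Double sumGeneric(double[] values, BinaryOperator<Double> combine) {
        Double acc = 0.0;
        for (double v : values) {
            acc = combine.apply(acc, v); // boxes v, then unboxes/boxes the result
        }
        return acc;
    }

    // Specialized path: primitives only and a direct static call, which is the
    // kind of code that per-type generation makes possible.
    static double sumSpecialized(double[] values) {
        double acc = 0.0;
        for (double v : values) {
            acc = combineDouble(acc, v);
        }
        return acc;
    }

    static double combineDouble(double current, double v) {
        return current + v;
    }
}
```

Both methods compute the same result; the difference is only in how much allocation and indirection the inner loop incurs.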
* <p>
* To implement the aggregation logic, create a class (typically named "${FunctionName}${Type}Aggregator").
* Annotate it with {@link org.elasticsearch.compute.ann.Aggregator} and {@link org.elasticsearch.compute.ann.GroupingAggregator}.
* The first annotation is responsible for aggregating an entire data set, while the second is responsible for grouping within buckets.
* </p>
* <p>
* Before you start implementing it, please note that:
* <ul>
* <li>All methods must be public static</li>
* <li>
* The {@code combine}, {@code combineStates}, {@code combineIntermediate} and {@code evaluateFinal} methods (see below) can be omitted
* and generated automatically when the input type I and the mutable accumulator states SS and GS are all primitive (DOUBLE, INT).
* </li>
* <li>TBD explain {@code IntermediateState}</li>
* <li>TBD explain special internal state `seen`</li>
Review comment (Contributor):

With the warnings feature, there's also the "failed" state. Identical to seen. Never used in main though, only here: https://github.com/elastic/elasticsearch/pull/116170/files#diff-8a408014887a6dc87eed1f71346536fc77636245d5411714b6ba2cf265812538R18

Maybe worth mentioning, with the warnExceptions attribute

* </ul>
* </p>
* <p>
* Aggregation expects:
* <ul>
* <li>type SS (a mutable state used to accumulate the result of the aggregation) to be public, not an inner class, and to implement {@link org.elasticsearch.compute.aggregation.AggregatorState}</li>
* <li>type I (input to your aggregation function), usually primitive types and {@link org.apache.lucene.util.BytesRef}</li>
* <li>{@code SS init()} or {@code SS initSingle()} returns an empty, initialized aggregation state</li>
* <li>{@code void combine(SS state, I input)} or {@code SS combine(SS state, I input)} adds an input entry to the aggregation state</li>
* <li>{@code void combineIntermediate(SS state, intermediate states)} adds a serialized aggregation state to the current aggregation state (used to combine results across different nodes)</li>
* <li>{@code Block evaluateFinal(SS state, BigArrays? DriverContext?)} converts the inner state of the aggregation to the result column</li>
* </ul>
* </p>
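The list above can be sketched as a small, self-contained class. This is only an illustration under stated assumptions: `SumLongState` stands in for a type implementing `AggregatorState`, the annotations are omitted, the intermediate state is assumed to be a single `long`, and `evaluateFinal` returns a `long[]` instead of a `Block`. All of these names and shapes are hypothetical, and the classes are package-private only so that the sketch compiles as a single file (the documentation requires the real state type to be public).

```java
// Hypothetical stand-in for a state type implementing AggregatorState.
final class SumLongState {
    long sum;
}

// Hypothetical aggregator following the "${FunctionName}${Type}Aggregator" naming.
// In real code it would carry the @Aggregator / @GroupingAggregator annotations.
final class SumLongAggregatorSketch {

    // SS init(): returns an empty, initialized state.
    public static SumLongState init() {
        return new SumLongState();
    }

    // void combine(SS state, I input): folds one input value into the state.
    public static void combine(SumLongState state, long input) {
        state.sum += input;
    }

    // void combineIntermediate(SS state, intermediate states): merges a partial
    // result, assumed here to be a single long produced elsewhere.
    public static void combineIntermediate(SumLongState state, long partialSum) {
        state.sum += partialSum;
    }

    // evaluateFinal: converts the state to the result column. A one-element
    // long[] is used here in place of a Block.
    public static long[] evaluateFinal(SumLongState state) {
        return new long[] { state.sum };
    }
}
```

A caller would create one state with `init()`, feed local values through `combine`, merge partial results from elsewhere through `combineIntermediate`, and finally call `evaluateFinal`.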
* <p>
* Grouping aggregation expects:
* <ul>
* <li>type GS (a mutable state used to accumulate the result of the grouping aggregation) to be public, not an inner class, and to implement {@link org.elasticsearch.compute.aggregation.GroupingAggregatorState}</li>
* <li>type I (input to your aggregation function), usually primitive types and {@link org.apache.lucene.util.BytesRef}</li>
Review comment (Contributor):

I think you mean:

Suggested change
- * <li>type I (input to your aggregation function), usually primitive types and {@link org.apache.lucene.util.BytesRef}</li>
+ * <li>type T (input to your aggregation function), usually primitive types and {@link org.apache.lucene.util.BytesRef}</li>

From the following comments.

* <li>{@code GS init()} or {@code GS initGrouping()} returns an empty, initialized grouping aggregation state</li>
* <li>{@code void combine(GS state, int groupId, T input)} adds an input entry to the corresponding group (bucket) of the grouping aggregation state</li>
* <li>{@code void combineStates(GS targetState, int targetGroupId, GS otherState, int otherGroupId)} merges another grouped aggregation state into the first one</li>
* <li>{@code void combineIntermediate(GS current, int groupId, intermediate states)} adds a serialized aggregation state to the current grouped aggregation state (used to combine results across different nodes)</li>
* <li>{@code Block evaluateFinal(GS state, IntVectorSelected, BigArrays? DriverContext?)} converts the inner state of the grouping aggregation to the result column</li>
* </ul>
* </p>
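The grouping variant can be sketched in the same spirit. Again, this is an assumption-laden illustration rather than the actual API: `SumLongGroupingState` stands in for a type implementing `GroupingAggregatorState`, the selected groups are passed as a plain `int[]`, the intermediate state is assumed to be a single `long` per group, and the result is a `long[]` instead of a `Block`. All names here are hypothetical.

```java
import java.util.Arrays;

// Hypothetical stand-in for a state type implementing GroupingAggregatorState.
// It keeps one running sum per group id and grows on demand.
final class SumLongGroupingState {
    private long[] sums = new long[4];

    void add(int groupId, long value) {
        if (groupId >= sums.length) {
            sums = Arrays.copyOf(sums, Math.max(groupId + 1, sums.length * 2));
        }
        sums[groupId] += value;
    }

    // Groups that never received a value read as 0.
    long get(int groupId) {
        return groupId < sums.length ? sums[groupId] : 0L;
    }
}

// Hypothetical grouping aggregator; annotations omitted.
final class SumLongGroupingAggregatorSketch {

    // GS init(): returns an empty, initialized grouping state.
    public static SumLongGroupingState init() {
        return new SumLongGroupingState();
    }

    // Adds one input value to the bucket identified by groupId.
    public static void combine(SumLongGroupingState state, int groupId, long input) {
        state.add(groupId, input);
    }

    // Merges one bucket of another state into one bucket of the target state.
    public static void combineStates(
        SumLongGroupingState target, int targetGroupId,
        SumLongGroupingState other, int otherGroupId
    ) {
        target.add(targetGroupId, other.get(otherGroupId));
    }

    // Merges a partial per-group result, assumed here to be a single long.
    public static void combineIntermediate(SumLongGroupingState current, int groupId, long partialSum) {
        current.add(groupId, partialSum);
    }

    // Produces one output value per selected group, in the order given.
    public static long[] evaluateFinal(SumLongGroupingState state, int[] selected) {
        long[] result = new long[selected.length];
        for (int i = 0; i < selected.length; i++) {
            result[i] = state.get(selected[i]);
        }
        return result;
    }
}
```

Compared with the non-grouping sketch, every mutating method takes a group id, and `evaluateFinal` emits one value per selected group rather than a single value.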
* <p>
*
* </p>
* <ol>
* <li>