Commit b3a89d7

Document aggregation code generation
1 parent 1378b59 commit b3a89d7

File tree

1 file changed

+47
-1
lines changed
  • x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/expression/function/aggregate

x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/expression/function/aggregate/package-info.java

Lines changed: 47 additions & 1 deletion
@@ -105,7 +105,53 @@
105105
*
106106
* <h3>Creating aggregators for your function</h3>
107107
* <p>
108-
* Aggregators contain the core logic of your aggregation. That is, how to combine values, what to store, how to process data, etc.
108+
* Aggregators contain the core logic of how to combine values, what to store, how to process data, etc.
109+
* Currently, we rely on code generation (per aggregation per type) in order to implement such functionality.
110+
* This approach was picked for performance reasons (namely to avoid virtual method calls and boxing types).
111+
* As a result, we cannot rely on interface implementations or generics.
112+
* </p>
113+
* <p>
114+
* To implement the aggregation logic, create a class (typically named "${FunctionName}${Type}Aggregator").
115+
* Annotate it with {@link org.elasticsearch.compute.ann.Aggregator} and {@link org.elasticsearch.compute.ann.GroupingAggregator}.
116+
* The first one is responsible for aggregating an entire data set, while the second one is responsible for aggregating within groups (buckets).
117+
* </p>
118+
* <p>
119+
* Before you start implementing it, please note that:
120+
* <ul>
121+
* <li>All methods must be public static</li>
122+
* <li>
123+
* The combine, combineStates, combineIntermediate, and evaluateFinal methods (see below) can be omitted, in which case they are
124+
* generated automatically, when the input type I and the mutable accumulator states SS and GS are all primitive (DOUBLE, INT).
125+
* </li>
126+
* <li>TBD explain {@code IntermediateState}</li>
127+
* <li>TBD explain special internal state {@code seen}</li>
128+
* </ul>
129+
* </p>
130+
* <p>
131+
* Aggregation expects:
132+
* <ul>
133+
* <li>type SS (a mutable state used to accumulate the result of the aggregation) to be public, not an inner class, and to implement {@link org.elasticsearch.compute.aggregation.AggregatorState}</li>
134+
* <li>type I (the input to your aggregation function) to usually be a primitive type or {@link org.apache.lucene.util.BytesRef}</li>
135+
* <li>{@code SS init()} or {@code SS initSingle()} to return an empty, initialized aggregation state</li>
136+
* <li>{@code void combine(SS state, I input)} or {@code SS combine(SS state, I input)} to add an input entry to the aggregation state</li>
137+
* <li>{@code void combineIntermediate(SS state, intermediate states)} to add a serialized aggregation state to the current aggregation state (used to combine results across different nodes)</li>
138+
* <li>{@code Block evaluateFinal(SS state, BigArrays? DriverContext?)} to convert the inner state of the aggregation to the result column</li>
139+
* </ul>
140+
* </p>
141+
* <p>
142+
* Grouping aggregation expects:
143+
* <ul>
144+
* <li>type GS (a mutable state used to accumulate the result of the grouping aggregation) to be public, not an inner class, and to implement {@link org.elasticsearch.compute.aggregation.GroupingAggregatorState}</li>
145+
* <li>type I (the input to your aggregation function) to usually be a primitive type or {@link org.apache.lucene.util.BytesRef}</li>
146+
* <li>{@code GS init()} or {@code GS initGrouping()} to return an empty, initialized grouping aggregation state</li>
147+
* <li>{@code void combine(GS state, int groupId, I input)} to add an input entry to the corresponding group (bucket) of the grouping aggregation state</li>
148+
* <li>{@code void combineStates(GS targetState, int targetGroupId, GS otherState, int otherGroupId)} to merge the given group of the other grouping aggregation state into the given group of the target state</li>
149+
* <li>{@code void combineIntermediate(GS current, int groupId, intermediate states)} to add a serialized aggregation state to the current grouping aggregation state (used to combine results across different nodes)</li>
150+
* <li>{@code Block evaluateFinal(GS state, IntVector selected, BigArrays? DriverContext?)} to convert the inner state of the grouping aggregation to the result column</li>
151+
* </ul>
152+
* </p>
153+
* <p>
154+
*
109155
* </p>
110156
* <ol>
111157
* <li>

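Note on the method contract documented above: the shape of the static methods may be easier to follow with a concrete, dependency-free sketch. The example below is not the generated code and does not use the real compute-engine types. The names `MaxLongState`, `MaxLongGroupingState`, `MaxLongAggregatorSketch`, and the `hasValue` flag are hypothetical, the `@Aggregator` and `@GroupingAggregator` annotations are omitted, `Block` and `IntVector` are replaced by `long` and `int[]`, and the form of the intermediate state (a value plus a flag) is an assumption, since the Javadoc leaves it as TBD.

```java
import java.util.Arrays;

// Stand-in for type SS. A real state would be a public top-level class
// implementing AggregatorState; it is package-private here only so the
// sketch fits in one file.
final class MaxLongState {
    long max = Long.MIN_VALUE;
    boolean hasValue = false; // hypothetical flag: true once any input was combined
}

// Stand-in for type GS: one slot per group id, grown on demand.
final class MaxLongGroupingState {
    long[] max = new long[0];
    boolean[] hasValue = new boolean[0];

    void ensureCapacity(int groupId) {
        if (groupId >= max.length) {
            int oldLength = max.length;
            int newLength = Math.max(groupId + 1, oldLength * 2);
            max = Arrays.copyOf(max, newLength);
            hasValue = Arrays.copyOf(hasValue, newLength);
            Arrays.fill(max, oldLength, newLength, Long.MIN_VALUE);
        }
    }
}

// All methods are public static, mirroring the contract in the Javadoc.
final class MaxLongAggregatorSketch {
    private MaxLongAggregatorSketch() {}

    public static MaxLongState initSingle() {
        return new MaxLongState();
    }

    public static void combine(MaxLongState state, long input) {
        state.max = Math.max(state.max, input);
        state.hasValue = true;
    }

    // Assumed intermediate form: the partial max and whether it holds a value.
    public static void combineIntermediate(MaxLongState state, long max, boolean hasValue) {
        if (hasValue) {
            combine(state, max);
        }
    }

    // The real method would build a Block; a plain long keeps the sketch runnable.
    public static long evaluateFinal(MaxLongState state) {
        return state.max;
    }

    public static MaxLongGroupingState initGrouping() {
        return new MaxLongGroupingState();
    }

    public static void combine(MaxLongGroupingState state, int groupId, long input) {
        state.ensureCapacity(groupId);
        state.max[groupId] = Math.max(state.max[groupId], input);
        state.hasValue[groupId] = true;
    }

    public static void combineStates(
        MaxLongGroupingState target, int targetGroupId,
        MaxLongGroupingState other, int otherGroupId
    ) {
        if (otherGroupId < other.hasValue.length && other.hasValue[otherGroupId]) {
            combine(target, targetGroupId, other.max[otherGroupId]);
        }
    }

    // "selected" lists the group ids to emit, in output order.
    // Groups without any value yield Long.MIN_VALUE in this sketch.
    public static long[] evaluateFinal(MaxLongGroupingState state, int[] selected) {
        long[] result = new long[selected.length];
        for (int i = 0; i < selected.length; i++) {
            int g = selected[i];
            boolean present = g < state.hasValue.length && state.hasValue[g];
            result[i] = present ? state.max[g] : Long.MIN_VALUE;
        }
        return result;
    }

    public static void main(String[] args) {
        MaxLongState single = initSingle();
        combine(single, 3L);
        combine(single, 7L);
        combineIntermediate(single, 5L, true);
        System.out.println(evaluateFinal(single)); // prints 7

        MaxLongGroupingState grouped = initGrouping();
        combine(grouped, 0, 3L);
        combine(grouped, 2, 4L);
        combine(grouped, 0, 9L);
        System.out.println(Arrays.toString(evaluateFinal(grouped, new int[] { 0, 2 }))); // prints [9, 4]
    }
}
```

The sketch keeps the state classes free of logic and puts every operation in static methods, which matches the stated motivation of avoiding virtual calls and boxing; how the real generator wires these methods together is not shown here.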