Commit e9087c5

update docs
1 parent f0e3938 commit e9087c5

2 files changed: +39 -55 lines changed
x-pack/plugin/esql/compute/ann/src/main/java/org/elasticsearch/compute/ann/Aggregator.java

Lines changed: 2 additions & 10 deletions
@@ -37,18 +37,10 @@
  * are ever collected.
  * </p>
  * <p>
- * The generation code will also look for a method called {@code combineValueCount}
- * which is called once per received block with a count of values. NOTE: We may
- * not need this after we convert AVG into a composite operation.
- * </p>
- * <p>
  * The generation code also looks for the optional methods {@code combineIntermediate}
  * and {@code evaluateFinal} which are used to combine intermediate states and
- * produce the final output. If the first is missing then the generated code will
- * call the {@code combine} method to combine intermediate states. If the second
- * is missing the generated code will make a block containing the primitive from
- * the state. If either of those don't have sensible interpretations then the code
- * generation code will throw an error, aborting the compilation.
+ * produce the final output. Please note, those are auto-generated when aggregating
+ * primitive types such as boolean, int, long, float, double.
  * </p>
  */
 @Target(ElementType.TYPE)
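
To make the method contract in this Javadoc easier to picture, here is a minimal, dependency-free sketch of the shape a non-grouping aggregator might take. It is not the Elasticsearch implementation. The class names `MaxLongStateSketch` and `MaxLongAggregatorSketch` are hypothetical, the state class does not implement the real `AggregatorState`, and `evaluateFinal` returns a boxed `Long` instead of a `Block` and takes no `DriverContext`. The `seen` flag and the `(max, seen)` intermediate representation are also assumptions made for illustration.

```java
// Hypothetical stand-in for a state class. A real one would implement
// org.elasticsearch.compute.aggregation.AggregatorState and be public.
final class MaxLongStateSketch {
    long max = Long.MIN_VALUE;
    boolean seen = false; // assumption: tracks whether any value was collected
}

// Hypothetical aggregator: every method is public static, as the docs require.
final class MaxLongAggregatorSketch {
    private MaxLongAggregatorSketch() {}

    // Returns an empty, initialized aggregation state.
    public static MaxLongStateSketch initSingle() {
        return new MaxLongStateSketch();
    }

    // Adds one input value to the aggregation state.
    public static void combine(MaxLongStateSketch state, long input) {
        state.max = Math.max(state.max, input);
        state.seen = true;
    }

    // Merges a serialized partial result, modeled here as (max, seen),
    // into the current state, e.g. a result received from another node.
    public static void combineIntermediate(MaxLongStateSketch state, long max, boolean seen) {
        if (seen) {
            combine(state, max);
        }
    }

    // Simplified: the real method builds a Block. Here null means "no values seen".
    public static Long evaluateFinal(MaxLongStateSketch state) {
        return state.seen ? state.max : null;
    }

    public static void main(String[] args) {
        MaxLongStateSketch local = initSingle();
        combine(local, 3L);
        combine(local, 7L);

        MaxLongStateSketch remote = initSingle();
        combine(remote, 5L);

        combineIntermediate(local, remote.max, remote.seen);
        System.out.println(evaluateFinal(local)); // prints 7
    }
}
```

Per the updated Javadoc, `combineIntermediate` and `evaluateFinal` are auto-generated for primitive types such as `long`, so an aggregator like this one would not normally hand-write them. They are spelled out here only to show what each method is responsible for.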

x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/expression/function/aggregate/package-info.java

Lines changed: 37 additions & 45 deletions
@@ -112,43 +112,66 @@
  * </p>
  * <p>
  * In order to implement aggregation logic create your class (typically named "${FunctionName}${Type}Aggregator").
+ * It must be placed in `org.elasticsearch.compute.aggregation` in order to be picked up by code generation.
  * Annotate it with {@link org.elasticsearch.compute.ann.Aggregator} and {@link org.elasticsearch.compute.ann.GroupingAggregator}
  * The first one is responsible for an entire data set aggregation, while the second one is responsible for grouping within buckets.
  * </p>
  * <h4>Before you start implementing it, please note that:</h4>
  * <ul>
  * <li>All methods must be public static</li>
  * <li>
- * init, initSingle, initGrouping could declare optional BigArrays, DriverContext arguments that are going to be injected automatically.
+ * init/initSingle/initGrouping could have optional BigArrays, DriverContext arguments that are going to be injected automatically.
  * It is also possible to declare any number of arbitrary arguments that must be provided via generated Supplier.
  * </li>
  * <li>
  * combine, combineStates, combineIntermediate, evaluateFinal methods (see below) could be omitted and generated automatically
  * when both input type I and mutable accumulator state SS and GS are primitive (DOUBLE, INT).
  * </li>
  * <li>
+ * Code generation expects at least one IntermediateState field that is going to be used to keep
+ * the serialized state of the aggregation (eg AggregatorState and GroupingAggregatorState).
+ * It must be defined even if you rely on autogenerated implementation for the primitive types.
  * </li>
- * <li>TBD explain {@code IntermediateState}</li>
- * <li>TBD explain special internal state `seen`</li>
  * </ul>
  * <h4>Aggregation expects:</h4>
  * <ul>
- * <li>type SS (a mutable state used to accumulate result of the aggregation) to be public, not inner and implements {@link org.elasticsearch.compute.aggregation.AggregatorState}</li>
+ * <li>
+ * type SS (a mutable state used to accumulate result of the aggregation) to be public, not inner and implements
+ * {@link org.elasticsearch.compute.aggregation.AggregatorState}
+ * </li>
  * <li>type I (input to your aggregation function), usually primitive types and {@link org.apache.lucene.util.BytesRef}</li>
  * <li>{@code SS init()} or {@code SS initSingle()} returns empty initialized aggregation state</li>
  * <li>{@code void combine(SS state, I input)} or {@code SS combine(SS state, I input)} adds input entry to the aggregation state</li>
- * <li>{@code void combineIntermediate(SS state, intermediate states)} adds serialized aggregation state to the current aggregation state (used to combine results across different nodes)</li>
+ * <li>
+ * {@code void combineIntermediate(SS state, intermediate states)} adds serialized aggregation state
+ * to the current aggregation state (used to combine results across different nodes)
+ * </li>
  * <li>{@code Block evaluateFinal(SS state, DriverContext)} converts the inner state of the aggregation to the result column</li>
  * </ul>
  * <h4>Grouping aggregation expects:</h4>
  * <ul>
- * <li>type GS (a mutable state used to accumulate result of the grouping aggregation) to be public, not inner and implements {@link org.elasticsearch.compute.aggregation.GroupingAggregatorState}</li>
+ * <li>
+ * type GS (a mutable state used to accumulate result of the grouping aggregation) to be public, not inner and implements
+ * {@link org.elasticsearch.compute.aggregation.GroupingAggregatorState}
+ * </li>
  * <li>type I (input to your aggregation function), usually primitive types and {@link org.apache.lucene.util.BytesRef}</li>
  * <li>{@code GS init()} or {@code GS initGrouping()} returns empty initialized grouping aggregation state</li>
- * <li>{@code void combine(GS state, int groupId, T input)} adds input entry to the corresponding group (bucket) of the grouping aggregation state</li>
- * <li>{@code void combineStates(GS targetState, int targetGroupId, GS otherState, int otherGroupId)} merges other grouped aggregation state into the first one</li>
- * <li>{@code void combineIntermediate(GS current, int groupId, intermediate states)} adds serialized aggregation state to the current grouped aggregation state (used to combine results across different nodes)</li>
- * <li>{@code Block evaluateFinal(GS state, IntVectorSelected, DriverContext)} converts the inner state of the grouping aggregation to the result column</li>
+ * <li>
+ * {@code void combine(GS state, int groupId, I input)} adds input entry to the corresponding group (bucket)
+ * of the grouping aggregation state
+ * </li>
+ * <li>
+ * {@code void combineStates(GS targetState, int targetGroupId, GS otherState, int otherGroupId)} merges other grouped
+ * aggregation state into the first one
+ * </li>
+ * <li>
+ * {@code void combineIntermediate(GS current, int groupId, intermediate states)} adds serialized aggregation state
+ * to the current grouped aggregation state (used to combine results across different nodes)
+ * </li>
+ * <li>
+ * {@code Block evaluateFinal(GS state, IntVectorSelected, DriverContext)} converts the inner state
+ * of the grouping aggregation to the result column
+ * </li>
  * </ul>
  * <ol>
  * <li>
@@ -160,31 +183,8 @@
  * </p>
  * </li>
  * <li>
- * The methods in the aggregator will define how it will work:
- * <ul>
- * <li>
- * Adding the `type init()` method will autogenerate the code to manage the state, using your returned value
- * as the initial value for each group.
- * </li>
- * <li>
- * Adding the `type initSingle()` or `type initGrouping()` methods will use the state object you return there instead.
- * <p>
- * You will also have to provide `evaluateIntermediate()` and `evaluateFinal()` methods this way.
- * </p>
- * </li>
- * </ul>
- * Depending on the way you use, adapt your `combine*()` methods to receive one or other type as their first parameters.
- * </li>
- * <li>
- * If it's also a {@link org.elasticsearch.compute.ann.GroupingAggregator}, you should provide the same methods as commented before:
- * <ul>
- * <li>
- * Add an `initGrouping()`, unless you're using the `init()` method
- * </li>
- * <li>
- * Add all the other methods, with the state parameter of the type of your `initGrouping()`.
- * </li>
- * </ul>
+ * Implement (or create an empty) methods according to the above list.
+ * Also check {@link org.elasticsearch.compute.ann.Aggregator} JavaDoc as it contains generated method usage.
  * </li>
  * <li>
  * Make a test for your aggregator.
@@ -195,16 +195,8 @@
  * </p>
  * </li>
  * <li>
- * Check the Javadoc of the {@link org.elasticsearch.compute.ann.Aggregator}
- * and {@link org.elasticsearch.compute.ann.GroupingAggregator} annotations.
- * Add/Modify them on your aggregator.
- * </li>
- * <li>
- * The {@link org.elasticsearch.compute.ann.Aggregator} JavaDoc explains the static methods you should add.
- * </li>
- * <li>
- * After implementing the required methods (Even if they have a dummy implementation),
- * run the CsvTests to generate some extra required classes.
+ * Code generation is triggered when running the tests.
+ * Run the CsvTests to generate the code. Generated code should include:
  * <p>
  * One of them will be the {@code AggregatorFunctionSupplier} for your aggregator.
  * Find it by its name ({@code <Aggregation-name><Type>AggregatorFunctionSupplier}),
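
The "Grouping aggregation expects" list in the package-info.java diff above can be illustrated with a small, dependency-free sketch. It is only an approximation of the shape, not the real code. `SumLongGroupingStateSketch` and `SumLongGroupingAggregatorSketch` are hypothetical names, the state class does not implement the real `GroupingAggregatorState`, the intermediate state is assumed to be a single partial sum, and `evaluateFinal` takes a plain `int[]` of selected group ids and returns a `long[]` rather than using the real selected-groups vector, `DriverContext`, and `Block` types.

```java
import java.util.Arrays;

// Hypothetical per-group state: one running sum per group id (bucket).
final class SumLongGroupingStateSketch {
    private long[] sums = new long[4];

    void add(int groupId, long value) {
        if (groupId >= sums.length) {
            // Grow the backing array so the group id fits.
            sums = Arrays.copyOf(sums, Math.max(groupId + 1, sums.length * 2));
        }
        sums[groupId] += value;
    }

    long get(int groupId) {
        // Groups that never received a value report 0 in this sketch.
        return groupId < sums.length ? sums[groupId] : 0L;
    }
}

// Hypothetical grouping aggregator: all methods are public static.
final class SumLongGroupingAggregatorSketch {
    private SumLongGroupingAggregatorSketch() {}

    // Returns an empty, initialized grouping aggregation state.
    public static SumLongGroupingStateSketch initGrouping() {
        return new SumLongGroupingStateSketch();
    }

    // Adds one input value to the bucket identified by groupId.
    public static void combine(SumLongGroupingStateSketch state, int groupId, long input) {
        state.add(groupId, input);
    }

    // Merges one bucket of another grouped state into a bucket of the target state.
    public static void combineStates(
        SumLongGroupingStateSketch targetState, int targetGroupId,
        SumLongGroupingStateSketch otherState, int otherGroupId
    ) {
        targetState.add(targetGroupId, otherState.get(otherGroupId));
    }

    // Merges a serialized partial result (assumed here to be one partial sum).
    public static void combineIntermediate(SumLongGroupingStateSketch current, int groupId, long partialSum) {
        current.add(groupId, partialSum);
    }

    // Simplified: the real method builds a Block for the selected groups.
    public static long[] evaluateFinal(SumLongGroupingStateSketch state, int[] selected) {
        long[] result = new long[selected.length];
        for (int i = 0; i < selected.length; i++) {
            result[i] = state.get(selected[i]);
        }
        return result;
    }

    public static void main(String[] args) {
        SumLongGroupingStateSketch state = initGrouping();
        combine(state, 0, 1L);
        combine(state, 1, 10L);
        combine(state, 0, 2L);

        SumLongGroupingStateSketch other = initGrouping();
        combine(other, 5, 100L);
        combineStates(state, 1, other, 5);

        System.out.println(Arrays.toString(evaluateFinal(state, new int[] { 0, 1 }))); // prints [3, 110]
    }
}
```

The target and source group ids in `combineStates` are separate parameters because the same logical bucket can be numbered differently in two states, which is why the sketch merges group 5 of `other` into group 1 of `state`.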
