Skip to content

Conversation

@nik9000
Copy link
Member

@nik9000 nik9000 commented Aug 4, 2025

Adds support for multi-argument grouped aggs. Previously we only had support for aggs with a value and a timestamp. Now we can support any number of fields.

This change is mostly mechanical and mostly in generated code. The only production code changes are:

  1. Rename things like block and vector to fooBlock so names are unique with multi-argument fields. foo comes from the name of the argument to combine.
  2. Support for passing Blocks of longs in FirstOverTime and LastOverTime. This will allow it to function outside of TSDB.
  3. Flip the order of the arguments in the combine methods for FirstOverTime and LastOverTime so they line up with how the ESQL code orders the arguments. This shouldn't change the runtime at all.
  4. Rename the offset variable we use for reading from vectors in non-grouped code so it's the same name as we use in grouped code. This just keeps the code gen a little more consistent.

Relates to #108385

Adds support for multi-argument grouped aggs. Previously we only had
support for aggs with a value and a timestamp. Now we can support any
number of fields.

This change is mostly mechanical and mostly in generated code. The only
production code changes are:
1. Rename things like `block` and `vector` to `fooBlock` so names are
   unique with multi-argument fields. `foo` comes from the name of the
   argument to `combine`.
2. Support for passing `Block`s of `long`s in `FirstOverTime` and
   `LastOverTime`. This will allow it to function outside of TSDB.
3. Flip the order of the arguments in the `combine` methods for
   `FirstOverTime` and `LastOverTime` so they line up with how the ESQL
   code orders the arguments. This shouldn't change the runtime at all.
4. Rename the offset variable we use for reading from vectors in
   non-grouped code so it's the same name as we use in grouped code.
   This just keeps the code gen a little more consistent.
@nik9000 nik9000 requested review from dnhatn and idegtiarenko August 4, 2025 23:29
@elasticsearchmachine elasticsearchmachine added the Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) label Aug 4, 2025
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-analytical-engine (Team:Analytics)

Copy link
Contributor

@ivancea ivancea left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!
Reviewing implementers feels like reviewing regex though, it would be nice to have a draft with a real use case (Maybe for future PRs)

private TypeSpec addInput(Consumer<MethodSpec.Builder> addBlock) {
TypeSpec.Builder builder = TypeSpec.anonymousClassBuilder("");
builder.addSuperinterface(GROUPING_AGGREGATOR_FUNCTION_ADD_INPUT);
private TypeSpec addInput(boolean valuesAreVector) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: valuesIsVector, as we're talking about a single block?
Same for other cases

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's all of the inputs - usually one, but for FIRST/LAST it'll be two.

@nik9000
Copy link
Member Author

nik9000 commented Aug 5, 2025

Reviewing implementers feels like reviewing regex though, it would be nice to have a draft with a real use case (Maybe for future PRs)

Yeah, I get that.

The ValuesOverTime ones show the big changes. What would help more? Pointing them out? Copying bits into the commit message?

Copy link
Member

@dnhatn dnhatn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks great. Thanks Nik!

@nik9000
Copy link
Member Author

nik9000 commented Aug 5, 2025

Thanks folks!

I will have to think about a test that "attacks" nested loops before we start using this much.

@nik9000 nik9000 merged commit 311a722 into elastic:main Aug 5, 2025
34 checks passed
szybia added a commit to szybia/elasticsearch that referenced this pull request Aug 6, 2025
…cking

* upstream/main: (24 commits)
  Revert "[Fleet] add privileges to `kibana_system` to read integrations data (elastic#132400)" (elastic#132499)
  ESQL: Rename evaluators for FIRST and LAST (elastic#132466)
  Add inference fields to semantic text docs (elastic#132471)
  ESQL: Allow FIRST and LAST as method name (elastic#132469)
  ESQL: Add javadoc for PushDownAndCombineFilters (elastic#132484)
  Misc cleanups in Coordinator (elastic#132452)
  [DiskBBQ] Write the maximum posting list size to avoid resizing the docId array (elastic#132447)
  Improve exception handling for JsonXContentParser (elastic#123439)
  Clarify quantization on semantic_text BBQ dense vector default (elastic#132470)
  Fix test infra NPE in doEnsureClusterStateConsistency (elastic#131859)
  Stabilize CancellableTasksIT#testRemoveBanParentsOnDisconnect (elastic#131858)
  Move ClusterApplierService assertion after logging exception (elastic#132446)
  ESQL: Support for multi-argument aggs (elastic#132424)
  Update wolfi (versioned) (elastic#132457)
  ESQL: Fix Function javadoc (elastic#132399)
  [ML] Inference API disable partial search results (elastic#132362)
  Unmute testTermsQuery tests (elastic#132409)
  Fix index lookup when field-caps returns empty mapping (elastic#132138)
  CompressorFactory.compressor (elastic#132448)
  ESQL add formatting to plans in javadoc (elastic#132421)
  ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

:Analytics/ES|QL AKA ESQL >refactoring Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) v9.2.0

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants