Skip to content

Conversation

@costin
Copy link
Member

@costin costin commented Mar 12, 2025

Avoid creating outputSet between nodes that passthrough their input

Relates #124395

Avoid creating outputSet between nodes that passthrough their input

Relates elastic#124395
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-analytical-engine (Team:Analytics)

@elasticsearchmachine elasticsearchmachine added the Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) label Mar 12, 2025
@elasticsearchmachine
Copy link
Collaborator

Hi @costin, I've created a changelog YAML for you.

@costin
Copy link
Member Author

costin commented Mar 12, 2025

The initial PR (making AttributeMap/Set immutable) become larger and ended up in chasing subtle bugs with planning - I've extracted this small change as it has a direct impact on queries with many fields.

Copy link
Contributor

@luigidellaquila luigidellaquila left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @costin, I like the idea and I think it makes a lot of sense.
It only needs a little change to UnaryPlan, see comment below.

Comment on lines 46 to 50
@Override
public AttributeSet outputSet() {
return child().outputSet();
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm afraid this is incorrect: now all the subplans that override output() but not outputSet() (eg. Eval) return a wrong output set, based on the child rather than on the actual output

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

++, this is dangerous, we normally don't have to override outputSet but rely it to be constructed from output().

A IMHO better way that remains safe:

    public AttributeSet outputSet() {
        if (lazyOutputSet == null) {
            List<Attribute> output = output();
            lazyOutputSet = (output == child.output?) child.outputSet() : new AttributeSet(output);
        }
        return lazyOutputSet;

Copy link
Contributor

@alex-spies alex-spies left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this change itself makes sense. I'd like it a little better if the improvements were tracked by benches - I think this would strongly benefit from a micro benchmark (or a nightly!) that demonstrates the effect of this improvement on execution times of our optimizer(s).

Comment on lines 46 to 50
@Override
public AttributeSet outputSet() {
return child().outputSet();
}

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

++, this is dangerous, we normally don't have to override outputSet but rely it to be constructed from output().

A IMHO better way that remains safe:

    public AttributeSet outputSet() {
        if (lazyOutputSet == null) {
            List<Attribute> output = output();
            lazyOutputSet = (output == child.output?) child.outputSet() : new AttributeSet(output);
        }
        return lazyOutputSet;


@Override
public AttributeSet outputSet() {
return child().outputSet();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

With a better default implementation of this in UnaryPlan, this might be obsolete. Same goes for the corresponding overrides in the other classes.

Copy link
Contributor

@luigidellaquila luigidellaquila left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Love it!
LGTM, thanks @costin

Copy link
Contributor

@alex-spies alex-spies left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

Comment on lines 68 to 69
return child().output();
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Out of scope, but I noticed this is redundant.

@costin costin added the auto-backport Automatically create backport pull requests when merged label Mar 20, 2025
@costin costin enabled auto-merge (squash) March 20, 2025 00:48
@costin costin disabled auto-merge March 20, 2025 02:43
@costin costin merged commit e186b15 into elastic:main Mar 20, 2025
15 of 17 checks passed
@costin costin deleted the esql/reuse-outputset branch March 20, 2025 02:44
@elasticsearchmachine
Copy link
Collaborator

💚 Backport successful

Status Branch Result
8.18
8.x
9.0

costin added a commit to costin/elasticsearch that referenced this pull request Mar 20, 2025
…24611)

Avoid creating outputSet between nodes that passthrough their input

Relates elastic#124395
costin added a commit to costin/elasticsearch that referenced this pull request Mar 20, 2025
…24611)

Avoid creating outputSet between nodes that passthrough their input

Relates elastic#124395
elasticsearchmachine pushed a commit that referenced this pull request Mar 20, 2025
…125275)

Avoid creating outputSet between nodes that passthrough their input

Relates #124395
elasticsearchmachine pushed a commit that referenced this pull request Mar 20, 2025
…125273)

Avoid creating outputSet between nodes that passthrough their input

Relates #124395
costin added a commit that referenced this pull request Mar 20, 2025
…125274)

Avoid creating outputSet between nodes that passthrough their input

Relates #124395
smalyshev pushed a commit to smalyshev/elasticsearch that referenced this pull request Mar 21, 2025
…24611)

Avoid creating outputSet between nodes that passthrough their input

Relates elastic#124395
omricohenn pushed a commit to omricohenn/elasticsearch that referenced this pull request Mar 28, 2025
…24611)

Avoid creating outputSet between nodes that passthrough their input

Relates elastic#124395
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

:Analytics/ES|QL AKA ESQL auto-backport Automatically create backport pull requests when merged >enhancement Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) v8.18.1 v8.19.0 v9.0.1 v9.1.0

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants