Conversation

@ivancea
Contributor

@ivancea ivancea commented Feb 13, 2025

Why and what?

First part of #122588

Some functions don't serialize their Source. As a result, they sometimes emit incorrect warnings, with a -1:-1 location and without the correct source text.

This PR aims to integrate ser/deserialization before executing some (randomly chosen) functions, and to use a non-empty Source to build them.

Note: This is different from the SerializationTests. Here we check that the functionality is identical whether or not the function was serialized.
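
The idea described above can be sketched with toy types. This is only an illustration under stated assumptions: RoundTripSketch, its Source and ToyFunction records, the warning format, and the byte layout are all hypothetical and are not the ESQL classes or wire format.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Random;

// Hypothetical stand-ins for illustration; these are NOT the real ESQL classes.
final class RoundTripSketch {

    /** Toy source: position and text of the expression within the query. */
    record Source(int line, int column, String text) {
        static final Source EMPTY = new Source(-1, -1, "");
    }

    /** Toy function whose warning mentions its Source. */
    record ToyFunction(Source source, String argument) {
        String warning() {
            return "Line " + source.line() + ":" + source.column()
                + ": evaluation of [" + source.text() + "] failed";
        }
    }

    /** Writes the Source together with the rest of the node. */
    static byte[] serialize(ToyFunction fn) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(fn.source().line());
            out.writeInt(fn.source().column());
            out.writeUTF(fn.source().text());
            out.writeUTF(fn.argument());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Reads the fields back in the order they were written. */
    static ToyFunction deserialize(byte[] data) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(data))) {
            int line = in.readInt();
            int column = in.readInt();
            String text = in.readUTF();
            return new ToyFunction(new Source(line, column, text), in.readUTF());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    /** Simulates the bug: the copy is rebuilt without the original Source. */
    static ToyFunction roundTripDroppingSource(ToyFunction fn) {
        return new ToyFunction(Source.EMPTY, deserialize(serialize(fn)).argument());
    }

    /** Randomly routes the function through a round trip before it is evaluated. */
    static ToyFunction maybeRoundTrip(ToyFunction fn, Random random) {
        return random.nextBoolean() ? deserialize(serialize(fn)) : fn;
    }
}
```

In this sketch, a function built with Source(1, 1, "source") produces the same warning whether or not maybeRoundTrip serialized it, while roundTripDroppingSource yields the -1:-1 location and empty source text that the description mentions.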

To-Do

  • Make the source "realistic", using the field names, so that fields can also be deserialized. This may be more complicated, as fields currently use a synthetic source. May be discarded or moved to another PR.
  • From a quick investigation, some functions don't serialize the source but do emit warnings. Tracked in "ESQL: Fix functions emitting warnings with no source" (#122821).

@ivancea ivancea added >test Issues or PRs that are addressing/adding tests Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) :Analytics/ES|QL AKA ESQL v9.1.0 labels Feb 13, 2025
Expression newRight = newChildren.get(1);

- return left.equals(newLeft) && right.equals(newRight) ? this : replaceChildren(newLeft, newRight);
+ return replaceChildren(newLeft, newRight);
Contributor Author

I'm not sure what to think about this. I may try to do... something around it, as the test code currently needs replaceChildren() to actually replace the children even if only the Source changed.
This is the only ESQL expression I found doing this, so I need to check whether it was done for a specific reason.

Contributor Author

I can do expression.replaceChildren(dummies).replaceChildren(new). A bit weird, but it is what it is!

Contributor Author

Finally replaced with that, and restored the original code here
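
The short-circuit discussed in this thread, and the double-replace workaround, can be sketched with toy classes. Leaf, Binary, and forceReplace are hypothetical names, and the sketch assumes that node equality ignores the source text, which is what the remark about "only the Source changed" suggests.

```java
import java.util.List;

// Hypothetical toy classes; NOT the real ESQL expression tree.
final class ReplaceChildrenSketch {

    /** Toy leaf: equality looks at the name only and ignores the source text (an assumption). */
    static final class Leaf {
        final String sourceText;
        final String name;

        Leaf(String sourceText, String name) {
            this.sourceText = sourceText;
            this.name = name;
        }

        @Override
        public boolean equals(Object o) {
            return o instanceof Leaf other && name.equals(other.name);
        }

        @Override
        public int hashCode() {
            return name.hashCode();
        }
    }

    /** Toy binary node with the short-circuiting replaceChildren from the diff. */
    static final class Binary {
        final Leaf left;
        final Leaf right;

        Binary(Leaf left, Leaf right) {
            this.left = left;
            this.right = right;
        }

        Binary replaceChildren(List<Leaf> newChildren) {
            Leaf newLeft = newChildren.get(0);
            Leaf newRight = newChildren.get(1);
            // Returns this when the new children are "equal", even if their source text differs.
            return left.equals(newLeft) && right.equals(newRight) ? this : new Binary(newLeft, newRight);
        }
    }

    /** Workaround: swap in dummies first so the second call cannot short-circuit. */
    static Binary forceReplace(Binary node, List<Leaf> newChildren) {
        List<Leaf> dummies = List.of(new Leaf("", "dummy-left"), new Leaf("", "dummy-right"));
        return node.replaceChildren(dummies).replaceChildren(newChildren);
    }
}
```

Calling replaceChildren directly with children that differ only in source text returns the original node, so the new sources are lost. Going through the dummies forces a rebuild, at the cost of one throwaway node. The workaround relies on the dummies being unequal to both the old and the new children.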

}

/**
* @deprecated Sources created by this can't be correctly deserialized. For use in tests only.
Contributor Author

Not sure if deprecated is the way, but this is used only in tests, and can't be serialized. So I preferred to make it explicit that this is dangerous

@ivancea ivancea marked this pull request as ready for review February 13, 2025 17:18
@elasticsearchmachine
Collaborator

Pinging @elastic/es-analytical-engine (Team:Analytics)

Member

@nik9000 nik9000 left a comment

I think it's a good idea.

}

private Expression randomSerializeDeserialize(Expression expression) {
if (false && randomBoolean()) {
Member

The false && is maybe a leftover?

*/
protected final Expression buildFieldExpression(TestCaseSupplier.TestCase testCase) {
- return build(testCase.getSource(), testCase.getDataAsFields());
+ return randomSerializeDeserialize(build(testCase.getSource(), testCase.getDataAsFields()));
Member

I see! So you are asserting that everything works the same after a round trip of serialization/deserialization. That sounds nice.

implements
Supplier<TestCaseSupplier.TestCase> {

public static final Source TEST_SOURCE = new Source(new Location(1, 0), "source");
Member

Do you think it'd be better to randomize this in the places we call it?

Member

Actually probably not - we want it to be consistent in the error message.

Contributor Author

I would like to autogenerate it, so we can also deserialize the fields with a real source, with real names, and remove synthetic Sources from tests if possible.
There are some minor problems I have to handle first to do that, so I'll try in another PR.

new TestCaseSupplier.TypedData(null, DataType.KEYWORD, "order").forceLiteral()
),
"third argument of [] cannot be null, received [order]"
"third argument of [source] cannot be null, received [order]"
Member

I wonder if source should be TOP or something. source feels sort of surprising here even though it's easy to make it the test.

Member

You could call functionName() to get the function I think.

Contributor Author

It's currently a static Source. Making it dynamic is possible, but requires also changing the Configuration.
As commented in other thread here, I'll try that later, but it requires some extra refactor, and I feel like it's better in a separate PR.
I thought about changing that "source" to something easier to locate, like "$source$" or even "function()". But at this point, I'm not sure it adds much.
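
For context on why the expected message in this thread changed from [] to [source], here is a minimal sketch. It assumes the bracketed part of the message is filled from the text of the function's Source. SourceMessageSketch and nullArgumentMessage are hypothetical names, not the actual ESQL code.

```java
// Hypothetical illustration; NOT the real ESQL message-building code.
final class SourceMessageSketch {

    /** Toy source carrying only the query text of the expression. */
    record Source(String text) {}

    /** Builds a message in the shape asserted by the test case in this thread. */
    static String nullArgumentMessage(Source source, String ordinal, String argumentName) {
        return String.format(
            "%s argument of [%s] cannot be null, received [%s]",
            ordinal,
            source.text(),
            argumentName
        );
    }
}
```

With an empty source text the brackets stay empty, which matches the old expected message. With the static test source whose text is "source", the literal word appears in the message, which is what prompted the naming discussion above.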

Supplier<TestCaseSupplier.TestCase> {

public static final Source TEST_SOURCE = new Source(new Location(1, 0), "source");
public static final Configuration TEST_CONFIGURATION = EsqlTestUtils.configuration(TEST_SOURCE.text());
Member

Warning - these are very slow to randomly generate. What you are doing here is fine, but a while back I had some tests that generated a random configuration in a loop, and something like 1000 of them took some time. Anyway, you shouldn't have a problem with this one.

Contributor Author

At least with what we have here, I didn't see a notable difference in times. I'll reconsider if I try to generate a realistic source, as the config query would have to change per test. Something to check if I open that other PR.


@Override
protected Expression serializeDeserializeExpression(Expression expression) {
// TODO: This aggregation doesn't serialize the Source, and must be fixed.
Contributor

It would be really nice to open an issue about it and reference it here from the todo.
This way we could track its progress and know if todos are resolved or accidentally left after the feature was implemented.

Contributor Author

Sure! That's the issue I was tracking actually, so I should have had one to begin with :hehe:

Configuration differentConfig;
do {
differentConfig = randomConfiguration();
} while (config.equals(differentConfig));
Contributor

NIT: This could be simplified with Configuration differentConfig = randomValueOtherThan(config, () -> randomConfiguration());
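
A minimal sketch of what such a helper does, to show that it is equivalent to the do/while loop in the diff. The randomValueOtherThan below is a local, hypothetical re-implementation written for this example and not the test framework's own code; pickOtherThan is likewise an illustrative name.

```java
import java.util.Objects;
import java.util.Random;
import java.util.function.Supplier;

// Hypothetical re-implementation for illustration only.
final class RandomOtherThanSketch {

    /** Keeps drawing from the supplier until the value differs from input. */
    static <T> T randomValueOtherThan(T input, Supplier<T> randomSupplier) {
        T candidate;
        do {
            candidate = randomSupplier.get();
        } while (Objects.equals(input, candidate));
        return candidate;
    }

    /** Usage example: picks a value in [0, 5) that is different from current. */
    static int pickOtherThan(int current, long seed) {
        Random random = new Random(seed);
        return randomValueOtherThan(current, () -> random.nextInt(5));
    }
}
```

The one-line call replaces the declaration, the loop, and the equality check, and it keeps the retry condition in a single place.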

@ivancea ivancea requested a review from idegtiarenko February 14, 2025 12:46
@ivancea ivancea added auto-backport Automatically create backport pull requests when merged v8.18.0 v9.0.1 labels Feb 14, 2025
@ivancea ivancea merged commit a1b16d9 into elastic:main Feb 18, 2025
17 checks passed
@ivancea ivancea deleted the esql-source-functions-warnings branch February 18, 2025 10:47
@elasticsearchmachine
Collaborator

💚 Backport successful

Branches: 8.18, 9.0

ivancea added a commit to ivancea/elasticsearch that referenced this pull request Feb 18, 2025
@ivancea ivancea removed the v8.18.0 label Feb 18, 2025
ivancea added a commit to ivancea/elasticsearch that referenced this pull request Feb 18, 2025
elasticsearchmachine pushed a commit that referenced this pull request Feb 18, 2025
ivancea added a commit that referenced this pull request Feb 18, 2025
Labels

:Analytics/ES|QL AKA ESQL auto-backport Automatically create backport pull requests when merged Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) >test Issues or PRs that are addressing/adding tests v8.19.0 v9.0.1 v9.1.0

4 participants