ESQL: Speed up CASE for some parameters #112295

nik9000 · 2024-08-28T17:15:31Z

This speeds up the CASE function when it has two or three arguments and both of the arguments are constants or fields. This works because CASE is lazy so it can avoid warnings in cases like

CASE(foo != 0, 2 / foo, 1)

And, in the case where the function is very slow, it can avoid the computations.

But if the lhs and rhs of the CASE are constant then there isn't any work to avoid.

The performance improvment is pretty substantial:

 (operation)  Before   Error   After    Error  Units
 case_1_lazy  97.422 ± 1.048  101.571 ± 0.737  ns/op
case_1_eager  79.312 ± 1.190    4.601 ± 0.049  ns/op

The top line is a CASE that has to be lazy - it shouldn't change. The 4 nanos change here is noise. The eager version improves by about 94%.

This speeds up the `CASE` function when it has two or three arguments and both of the arguments are constants or fields. This works because `CASE` is lazy so it can avoid warnings in cases like ``` CASE(foo != 0, 2 / foo, 1) ``` And, in the case where the function is *very* slow, it can avoid the computations. But if the lhs and rhs of the `CASE` are constant then there isn't any work to avoid. The performance improvment is pretty substantial: ``` (operation) Before Error After Error Units case_1_lazy 97.422 ± 1.048 101.571 ± 0.737 ns/op case_1_eager 79.312 ± 1.190 4.601 ± 0.049 ns/op ``` The top line is a `CASE` that has to be lazy - it shouldn't change. The 4 nanos change here is noise. The eager version improves by about 94%.

elasticsearchmachine · 2024-08-28T17:15:56Z

Hi @nik9000, I've created a changelog YAML for you.

…zoom

nik9000 · 2024-08-28T17:16:43Z

x-pack/plugin/esql/compute/src/main/java/org/elasticsearch/compute/operator/EvalOperator.java

+             */
+            default boolean safeToEvalInLazy() {
+                return false;
+            }


I suppose this is something that we could derive from the Expression - but it felt simpler to put it here. And it's just a boolean at this point - though maybe it should be a cost estimate at some point.

In case of constants, the unvisited branches can be removed and the case simplified.

That works if the expression is constant, but not if the values are constant. We still have to evaluate in that case. And when we do we can do it the fast way with something like this.

nik9000 · 2024-08-30T16:29:24Z

I'm waiting until I merge #112401 to pick this one back up.

not-napoleon · 2024-09-04T14:28:42Z

.../src/main/java/org/elasticsearch/xpack/esql/expression/function/scalar/conditional/Case.java

+                    && conditionsFactories.get(0).value.safeToEvalInLazy()
+                    && elseValueFactory.safeToEvalInLazy()) {
+
+                    return new EagerEvaluator(context, conditionsFactories.get(0).apply(context), elseValueFactory.get(context));


I can't tell if the logic is wrong here or if the naming is just confusing. To me, this reads like "if it is safe to evaluate in lazy, return an eager evaluator". I would normally say "eager evaluation" was the opposite of "lazy evaluation".

Bad name, I think. It's "if this is safe to eagerly evaluate when we expect a lazy"

I'll rename this now that the CASE bugs are fixed.

elasticsearchmachine · 2024-09-09T19:25:10Z

Pinging @elastic/es-analytical-engine (Team:Analytics)

costin · 2024-09-10T03:14:17Z

x-pack/plugin/esql/compute/src/main/java/org/elasticsearch/compute/operator/EvalOperator.java

+             * that evaluate quickly and can not produce warnings may override this to
+             * {@code true} to get a significant speed-up in {@code CASE}-like operations.
+             */
+            default boolean eagerEvalSafeInLazy() {


another alternative is to be extract this as a marking interface that gets implemented by certain factories.

I think that's a lot less readable.

luigidellaquila

LGTM
The method names are a bit cryptic indeed, but the javadoc helps a lot

luigidellaquila · 2024-09-24T08:14:58Z

.../src/main/java/org/elasticsearch/xpack/esql/expression/function/scalar/conditional/Case.java

+                if (lhsOrRhs.hadMultivaluedFields()) {
+                    condition.registerMultivalue();
+                }
+                for (int p = 0; p < lhs.getPositionCount(); p++) {


We could leverage BooleanVector.isConstant() here, eg.

BooleanVector mask = lhsOrRhs.mask(); if (mask.isConstant()) { builder.copyFrom(mask.getBoolean(0) ? lhs : rhs, 0, lhs.getPositionCount()); } else { for (int p = 0; p < lhs.getPositionCount(); p++) { ...

Or even return lhs/rhs...? Not sure it's safe for memory accounting

I believe it is safe, but tests can figure it out. I'll try it.

This speeds up the `CASE` function when it has two or three arguments and both of the arguments are constants or fields. This works because `CASE` is lazy so it can avoid warnings in cases like ``` CASE(foo != 0, 2 / foo, 1) ``` And, in the case where the function is *very* slow, it can avoid the computations. But if the lhs and rhs of the `CASE` are constant then there isn't any work to avoid. The performance improvment is pretty substantial: ``` (operation) Before Error After Error Units case_1_lazy 97.422 ± 1.048 101.571 ± 0.737 ns/op case_1_eager 79.312 ± 1.190 4.601 ± 0.049 ns/op ``` The top line is a `CASE` that has to be lazy - it shouldn't change. The 4 nanos change here is noise. The eager version improves by about 94%.

elasticsearchmachine · 2024-09-24T16:56:06Z

💚 Backport successful

Status	Branch	Result
✅	8.x

This speeds up the `CASE` function when it has two or three arguments and both of the arguments are constants or fields. This works because `CASE` is lazy so it can avoid warnings in cases like ``` CASE(foo != 0, 2 / foo, 1) ``` And, in the case where the function is *very* slow, it can avoid the computations. But if the lhs and rhs of the `CASE` are constant then there isn't any work to avoid. The performance improvment is pretty substantial: ``` (operation) Before Error After Error Units case_1_lazy 97.422 ± 1.048 101.571 ± 0.737 ns/op case_1_eager 79.312 ± 1.190 4.601 ± 0.049 ns/op ``` The top line is a `CASE` that has to be lazy - it shouldn't change. The 4 nanos change here is noise. The eager version improves by about 94%.

``` before after (operation) Score Error Score Error Units coalesce_2_noop 75.949 ± 3.961 -> 0.010 ± 0.001 ns/op 99.9% coalesce_2_eager 99.299 ± 6.959 -> 4.292 ± 0.227 ns/op 95.7% coalesce_2_lazy 113.118 ± 5.747 -> 26.746 ± 0.954 ns/op 76.4% ``` We tend to advise folks that "COALESCE is faster than CASE", but, as of 8.16.0/elastic#112295 that wasn't the true. I was working with someone a few days ago to port a scripted_metric aggregation to ESQL and we saw COALESCE taking ~60% of the time. That won't do. The trouble is that CASE and COALESCE have to be *lazy*, meaning that operations like: ``` COALESCE(a, 1 / b) ``` should never emit a warning if `a` is not `null`, even if `b` is `0`. In 8.16/elastic#112295 CASE grew an optimization where it could operate non-lazily if it was flagged as "safe". This brings a similar optimization to COALESCE, see it above as "case_2_eager", a 95.7% improvement. It also brings and arguably more important optimization - entire-block execution for COALESCE. The schort version is that, if the first parameter of COALESCE returns no nulls we can return it without doing anything lazily. There are a few more cases, but the upshot is that COALESCE is pretyt much *free* in cases where long strings of results are `null` or not `null`. That's the `coalesce_2_noop` line. Finally, when there mixed null and non-null values we were using a single builder with some fairly inefficient paths. This specializes them per type and skips some slow null-checking where possible. That's the `coalesce_2_lazy` result, a more modest 76.4%.

``` before after (operation) Score Error Score Error Units coalesce_2_noop 75.949 ± 3.961 -> 0.010 ± 0.001 ns/op 99.9% coalesce_2_eager 99.299 ± 6.959 -> 4.292 ± 0.227 ns/op 95.7% coalesce_2_lazy 113.118 ± 5.747 -> 26.746 ± 0.954 ns/op 76.4% ``` We tend to advise folks that "COALESCE is faster than CASE", but, as of 8.16.0/#112295 that wasn't the true. I was working with someone a few days ago to port a scripted_metric aggregation to ESQL and we saw COALESCE taking ~60% of the time. That won't do. The trouble is that CASE and COALESCE have to be *lazy*, meaning that operations like: ``` COALESCE(a, 1 / b) ``` should never emit a warning if `a` is not `null`, even if `b` is `0`. In 8.16/#112295 CASE grew an optimization where it could operate non-lazily if it was flagged as "safe". This brings a similar optimization to COALESCE, see it above as "case_2_eager", a 95.7% improvement. It also brings and arguably more important optimization - entire-block execution for COALESCE. The schort version is that, if the first parameter of COALESCE returns no nulls we can return it without doing anything lazily. There are a few more cases, but the upshot is that COALESCE is pretty much *free* in cases where long strings of results are `null` or not `null`. That's the `coalesce_2_noop` line. Finally, when there mixed null and non-null values we were using a single builder with some fairly inefficient paths. This specializes them per type and skips some slow null-checking where possible. That's the `coalesce_2_lazy` result, a more modest 76.4%. NOTE: These %s of improvements on COALESCE itself, or COALESCE with some load-overhead operators like `+`. If COALESCE isn't taking a *ton* time in your query don't get particularly excited about this. It's fun though. Closes #119953

``` before after (operation) Score Error Score Error Units coalesce_2_noop 75.949 ± 3.961 -> 0.010 ± 0.001 ns/op 99.9% coalesce_2_eager 99.299 ± 6.959 -> 4.292 ± 0.227 ns/op 95.7% coalesce_2_lazy 113.118 ± 5.747 -> 26.746 ± 0.954 ns/op 76.4% ``` We tend to advise folks that "COALESCE is faster than CASE", but, as of 8.16.0/elastic#112295 that wasn't the true. I was working with someone a few days ago to port a scripted_metric aggregation to ESQL and we saw COALESCE taking ~60% of the time. That won't do. The trouble is that CASE and COALESCE have to be *lazy*, meaning that operations like: ``` COALESCE(a, 1 / b) ``` should never emit a warning if `a` is not `null`, even if `b` is `0`. In 8.16/elastic#112295 CASE grew an optimization where it could operate non-lazily if it was flagged as "safe". This brings a similar optimization to COALESCE, see it above as "case_2_eager", a 95.7% improvement. It also brings and arguably more important optimization - entire-block execution for COALESCE. The schort version is that, if the first parameter of COALESCE returns no nulls we can return it without doing anything lazily. There are a few more cases, but the upshot is that COALESCE is pretty much *free* in cases where long strings of results are `null` or not `null`. That's the `coalesce_2_noop` line. Finally, when there mixed null and non-null values we were using a single builder with some fairly inefficient paths. This specializes them per type and skips some slow null-checking where possible. That's the `coalesce_2_lazy` result, a more modest 76.4%. NOTE: These %s of improvements on COALESCE itself, or COALESCE with some load-overhead operators like `+`. If COALESCE isn't taking a *ton* time in your query don't get particularly excited about this. It's fun though. Closes elastic#119953

``` before after (operation) Score Error Score Error Units coalesce_2_noop 75.949 ± 3.961 -> 0.010 ± 0.001 ns/op 99.9% coalesce_2_eager 99.299 ± 6.959 -> 4.292 ± 0.227 ns/op 95.7% coalesce_2_lazy 113.118 ± 5.747 -> 26.746 ± 0.954 ns/op 76.4% ``` We tend to advise folks that "COALESCE is faster than CASE", but, as of 8.16.0/#112295 that wasn't the true. I was working with someone a few days ago to port a scripted_metric aggregation to ESQL and we saw COALESCE taking ~60% of the time. That won't do. The trouble is that CASE and COALESCE have to be *lazy*, meaning that operations like: ``` COALESCE(a, 1 / b) ``` should never emit a warning if `a` is not `null`, even if `b` is `0`. In 8.16/#112295 CASE grew an optimization where it could operate non-lazily if it was flagged as "safe". This brings a similar optimization to COALESCE, see it above as "case_2_eager", a 95.7% improvement. It also brings and arguably more important optimization - entire-block execution for COALESCE. The schort version is that, if the first parameter of COALESCE returns no nulls we can return it without doing anything lazily. There are a few more cases, but the upshot is that COALESCE is pretty much *free* in cases where long strings of results are `null` or not `null`. That's the `coalesce_2_noop` line. Finally, when there mixed null and non-null values we were using a single builder with some fairly inefficient paths. This specializes them per type and skips some slow null-checking where possible. That's the `coalesce_2_lazy` result, a more modest 76.4%. NOTE: These %s of improvements on COALESCE itself, or COALESCE with some load-overhead operators like `+`. If COALESCE isn't taking a *ton* time in your query don't get particularly excited about this. It's fun though. Closes #119953

nik9000 added >enhancement :Analytics/ES|QL AKA ESQL v8.16.0 labels Aug 28, 2024

nik9000 requested review from luigidellaquila, not-napoleon, ChrisHegarty and alex-spies August 28, 2024 17:15

nik9000 added 3 commits August 28, 2024 13:15

Update docs/changelog/112295.yaml

394b9d7

Fit px

3979049

Merge remote-tracking branch 'nik9000/esql_case_zoom' into esql_case_…

5fd2f66

…zoom

nik9000 commented Aug 28, 2024

View reviewed changes

nik9000 added 3 commits August 29, 2024 10:08

Merge branch 'main' into esql_case_zoom

0e1c8ac

Merge branch 'main' into esql_case_zoom

19a8247

WIP

f3abaef

not-napoleon reviewed Sep 4, 2024

View reviewed changes

nik9000 added 2 commits September 9, 2024 13:07

Merge branch 'main' into esql_case_zoom

a4af922

Merge

3117287

nik9000 marked this pull request as ready for review September 9, 2024 19:24

nik9000 requested a review from a team as a code owner September 9, 2024 19:24

elasticsearchmachine added the Team:Analytics Meta label for analytical engine team (ESQL/Aggs/Geo) label Sep 9, 2024

nik9000 added 3 commits September 9, 2024 15:33

Merge branch 'main' into esql_case_zoom

deb5845

Rename

3c45129

Format

9b1f6c4

costin reviewed Sep 10, 2024

View reviewed changes

mark-vieira added v9.0.0 and removed v8.16.0 labels Sep 11, 2024

Merge branch 'main' into esql_case_zoom

b37d645

luigidellaquila approved these changes Sep 24, 2024

View reviewed changes

nik9000 added 3 commits September 24, 2024 09:21

Merge branch 'main' into esql_case_zoom

e6c0863

ESQL: Constant branch

85c757e

Merge branch 'main' into esql_case_zoom

15ea99c

nik9000 added auto-backport-and-merge v8.16.0 labels Sep 24, 2024

nik9000 merged commit 5c91edd into elastic:main Sep 24, 2024
15 checks passed

nik9000 mentioned this pull request Sep 24, 2024

[8.x] ESQL: Speed up CASE for some parameters (#112295) #113487

Merged

nik9000 mentioned this pull request Jan 14, 2025

Speed up COALESCE significantly #120139

Merged

ESQL: Speed up CASE for some parameters #112295

ESQL: Speed up CASE for some parameters #112295

Uh oh!

Conversation

nik9000 commented Aug 28, 2024

Uh oh!

elasticsearchmachine commented Aug 28, 2024

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

nik9000 commented Aug 30, 2024

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

elasticsearchmachine commented Sep 9, 2024

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

luigidellaquila left a comment

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Choose a reason for hiding this comment

Uh oh!

Uh oh!

elasticsearchmachine commented Sep 24, 2024

💚 Backport successful

Uh oh!

Uh oh!