Esql mv_contains function #133636

mjmbischoff · 2025-08-27T10:35:18Z

This is a fixup for #133099, which was reverted from main because ESQL: Track memory in evaluators (#133392) was merged to main at the same time, causing compile errors.

…entation improvements. Fix issue with byteref being empty, which caused fold to fail.

…ompatibility test environment. - not sure how to test it as, I feel like the version should be main on main / dev. Doing the dance for now.

…lifecycle metadata Documentation rewording. Co-authored-by: Liam Thompson <[email protected]>

- Fixing tests by removing the logic that returned null if all parameters are null. The standard generator had to be circumvented; a separate follow-up PR should make it smarter so this is no longer needed.
- Overrode part of the test methods to avoid the null expectation.

…TAINS`

Co-authored-by: Iván Cea Fontenla <[email protected]>

elasticsearchmachine · 2025-08-27T10:37:54Z

Pinging @elastic/es-analytical-engine (Team:Analytics)

elasticsearchmachine · 2025-08-27T10:37:54Z

Hi @mjmbischoff, I've created a changelog YAML for you.

ivancea

After a second look at the evaluators, I found what looks like a bug. I also added a comment on how we can test that case, to better catch bugs like this.

...main/java/org/elasticsearch/xpack/esql/expression/function/scalar/multivalue/MvContains.java

ivancea · 2025-08-27T12:56:45Z

x-pack/plugin/esql/qa/testFixtures/src/main/resources/string.csv-spec

+ROW a = "a", b = ["a", "b", "c"], n = null
+| EVAL aa = mv_contains(a, a), 
+       bb = mv_contains(b, b), 
+       ab = mv_contains(a, b), 
+       ba = mv_contains(b,a), 
+       na = mv_contains(n, a), 
+       an = mv_contains(a, n), 
+       nn = mv_contains(n,n)


To reproduce the null problem from the other comment, this won't be enough, as it always works on a single row.

A possible way to generate the data for that could be:

ROW a = [1, 2, 3], n = null
| MV_EXPAND a // Now we have multiple rows
| EVAL a = CASE(a == 2, null, a) // And we add a null to the non-null column
| EVAL an = mv_contains(a, n), na = mv_contains(n, a)

If it works the way I think it does, and this actually uses the MvContainsNullEvaluator, this should end up with an error.

I would really love a ROWS source command that can take multiple ROWs, for both examples and test cases.

I'm unresolving this, as the test for mixed nulls (mixed within the same column) wasn't added yet 👀
The ROWS or similar syntax would be handy, but it would be a breaking change, and it's not planned. Usually the test indices have enough data to test with anyway, or a new index can be added (not a quick change; I would avoid that here).
For custom index tests, we would then make a YAML test (Example). But we rarely do that for functions, as CSV tests are usually enough.

Addressed in 66ba735. Let me know if I missed any combinations.

Looks good, thanks for adding it! You could add the same CASE(...) for the subset too (blocks with some nulls inside, for subsets). Since the custom evaluator was removed, this isn't as important. But it could be a nice test to have, especially if we autogenerate the evaluator at a later stage.
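For readers following the mixed-null discussion, here is a minimal Java sketch of why a single-row test cannot catch this class of bug: null-ness has to be decided per row, not once per column. The class `MixedNullRows` and its column model (a list of `int[]` rows, with `null` for a null row) are hypothetical and are not the Elasticsearch Block API. The sketch also assumes that a null side is treated as an empty set, so the result is never null; that is an assumption of this sketch, not something confirmed in this thread.

```java
import java.util.Arrays;
import java.util.List;

/** Hypothetical model: each row holds a multivalue (int[]) or null. Not the Elasticsearch Block API. */
final class MixedNullRows {

    /** Assumption: a null side is treated as an empty set, so the result is never null. */
    static boolean containsAll(int[] superset, int[] subset) {
        if (subset == null || subset.length == 0) {
            return true; // the empty set is a subset of everything
        }
        if (superset == null) {
            return false; // a non-empty subset cannot fit in an empty superset
        }
        for (int wanted : subset) {
            boolean found = false;
            for (int candidate : superset) {
                if (candidate == wanted) {
                    found = true;
                    break;
                }
            }
            if (!found) {
                return false;
            }
        }
        return true;
    }

    /** Evaluates row by row; null-ness is checked for each row, not once for the whole column. */
    static boolean[] evaluate(List<int[]> supersetColumn, List<int[]> subsetColumn) {
        boolean[] result = new boolean[supersetColumn.size()];
        for (int row = 0; row < result.length; row++) {
            result[row] = containsAll(supersetColumn.get(row), subsetColumn.get(row));
        }
        return result;
    }

    public static void main(String[] args) {
        // Mirrors the suggested query: rows 1, null, 3 against an all-null column.
        List<int[]> a = Arrays.asList(new int[] {1}, null, new int[] {3});
        List<int[]> n = Arrays.asList(null, null, null);
        System.out.println(Arrays.toString(evaluate(a, n))); // an = mv_contains(a, n)
        System.out.println(Arrays.toString(evaluate(n, a))); // na = mv_contains(n, a)
    }
}
```

An evaluator that inspected only the first row to decide whether a column is null would behave correctly on a one-row `ROW` test and still mishandle the middle row here.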

...main/java/org/elasticsearch/xpack/esql/expression/function/scalar/multivalue/MvContains.java

docs/changelog/133099.yaml

Refactor null handling in `MvContains` evaluators and add `MvContainsNullSupersetEvaluator` for better type-specific evaluation logic.

...main/java/org/elasticsearch/xpack/esql/expression/function/scalar/multivalue/MvContains.java

…position.

…torFactory` and adding global constant boolean evaluators for improved reusability.

…ed records for `ConstantNull`, `ConstantTrue`, and `ConstantFalse`, adding memory usage tracking.

x-pack/plugin/esql/compute/src/main/java/org/elasticsearch/compute/operator/EvalOperator.java

...main/java/org/elasticsearch/xpack/esql/expression/function/scalar/multivalue/MvContains.java

ivancea · 2025-08-29T11:14:05Z

...main/java/org/elasticsearch/xpack/esql/expression/function/scalar/multivalue/MvContains.java

+        final var valueCount = subset.getValueCount(position);
+        final var startIndex = subset.getFirstValueIndex(position);
+        for (int valueIndex = startIndex; valueIndex < startIndex + valueCount; valueIndex++) {
+            var value = valueExtractor.extractValue(subset, valueIndex);


Hmm, I didn't see this before, but I think boxing could be a performance problem.

We usually have an evaluator/vector/block per type for 2 reasons:

- Avoid itable accesses
- Avoid primitive boxing

How much extra time this will take, I don't know. MV_CONTAINS already has more overhead than other scalar functions, so maybe it isn't that important, but I'm not sure about that.

There are 2 optimizations that would be ideal here:

- Having a method per type
- Having a specialization for sorted values (see Block#mvOrdering())

The second one can be made later. The first one too, I guess?

It's in preview, so I think it's fine either way. If it's merged as-is, I would create an issue to improve it later.

Btw, to do this per type, whether now or later, StringTemplates (Example) could be used to avoid repeated code.
These functions could be extracted into their own static classes (or the same could be done for the full evaluators, with the functions directly inside).
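To illustrate the boxing concern, here is a rough Java sketch contrasting a generic extractor that returns `Object` with a per-type loop over a primitive array. The class `BoxingSketch`, both method names, and the flat "values plus start index and count" layout are hypothetical simplifications for this example; they are not the actual `MvContains` code or the Block API.

```java
import java.util.function.IntFunction;

/** Hypothetical flat layout: all values in one array, addressed by a start index and a count. */
final class BoxingSketch {

    /** Generic path: the extractor returns Object, so every int is boxed to an Integer. */
    static boolean containsBoxed(IntFunction<Object> extractValue, int start, int count, Object wanted) {
        for (int i = start; i < start + count; i++) {
            if (extractValue.apply(i).equals(wanted)) {
                return true;
            }
        }
        return false;
    }

    /** Per-type path: compares primitives directly, with no boxing and no interface call. */
    static boolean containsInt(int[] values, int start, int count, int wanted) {
        for (int i = start; i < start + count; i++) {
            if (values[i] == wanted) {
                return true;
            }
        }
        return false;
    }

    public static void main(String[] args) {
        int[] values = {10, 20, 30, 40};
        // The position covers values[1] and values[2], i.e. {20, 30}.
        System.out.println(containsBoxed(i -> values[i], 1, 2, 30));
        System.out.println(containsInt(values, 1, 2, 30));
    }
}
```

Both methods return the same answer; the difference is only in the work done per value. Whether that difference matters for MV_CONTAINS would need a microbenchmark, which this sketch does not attempt.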

Yes, I should create a follow-up issue for this. I also want to follow up on improving the Implementers, so that the evaluators here can be autogenerated instead of handwritten. Leaving this unresolved until after merge so I don't forget to open an issue.

ivancea · 2025-08-29T11:20:27Z

...java/org/elasticsearch/xpack/esql/expression/function/scalar/multivalue/MvContainsTests.java

+        }));
+    }
+
+    // Adjusted from static method anyNullIsNull in {@code AbstractScalarFunctionTestCase#}


Can you add a line describing exactly what changed here? That way it's easier to remove this override later, if/when we extend the original to handle this case.

Added in 23d92f0, but I want to leave this unresolved as well to follow up; I need to look into making the test class more adaptive.

ivancea · 2025-08-29T11:26:23Z


...java/org/elasticsearch/xpack/esql/expression/function/scalar/multivalue/MvContainsTests.java

… visibility of methods / constructors.

ivancea

Thanks for the changes!
As a summary, the things to work on later that I remember are:

- Multivalue evaluator implementer with nulls
- Performance improvements: avoiding boxing and using inherent sorting when available (probably to be solved before removing the preview label? Some microbenchmarks would be interesting here too)
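As an illustration of the "inherent sorting" idea, the sketch below shows one way a sorted specialization could work: when both multivalues are known to be sorted ascending, a single merge-style pass replaces the nested loop. The class `SortedSubsetSketch` and its method are hypothetical; in Elasticsearch the caller would presumably first check the ordering reported by `Block#mvOrdering()`, which this sketch simply takes as a precondition.

```java
/** Hypothetical sorted specialization; not taken from the Elasticsearch code base. */
final class SortedSubsetSketch {

    /**
     * Returns true if every value of subset occurs in superset.
     * Precondition (assumed, not checked): both arrays are sorted ascending.
     * Duplicates are allowed and follow set semantics.
     */
    static boolean containsAllSorted(int[] superset, int[] subset) {
        int s = 0; // cursor into superset; it only ever moves forward
        for (int wanted : subset) {
            // Skip superset values that are too small to match.
            while (s < superset.length && superset[s] < wanted) {
                s++;
            }
            if (s == superset.length || superset[s] != wanted) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(containsAllSorted(new int[] {1, 2, 3, 5}, new int[] {2, 5}));
        System.out.println(containsAllSorted(new int[] {1, 2, 3}, new int[] {2, 4}));
    }
}
```

The pass visits each value at most once, so the cost is linear in the combined length of the two multivalues, compared with the product of their lengths for the nested loop.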

elasticsearchmachine · 2025-08-29T17:02:58Z

💔 Backport failed

The backport operation could not be completed due to the following error:

There are no branches to backport to. Aborting.

You can use sqren/backport to manually backport by running backport --upstream elastic/elasticsearch --pr 133636

mjmbischoff and others added 30 commits August 19, 2025 01:46

Add mv_contains_all function

69cfa82

[CI] Auto commit changes from spotless

e250247

Enhance mv_contains_all function with additional examples and docum…

f02c400

…entation improvements. Fix issue with byteref being empty, which caused fold to fail.

[CI] Auto commit changes from spotless

4d0d5bc

Add documentation for mv_contains_all function

dd44baa

Fixing documentation for mv_contains_all function

fa3a339

Regenerated mv_contains_all documentation

05d64b6

Update docs/changelog/133099.yaml

1a00c73

Update docs/changelog/133099.yaml

d6f5e71

Merge branch 'main' into esql-MV_CONTAINS_ALL

8134f60

Update docs/changelog/133099.yaml

b4c31b9

Readding skipped markers as else the test gets run in the backwards c…

1a1b88d

…ompatibility test environment. - not sure how to test it as, I feel like the version should be main on main / dev. Doing the dance for now.

Merge branch 'main' into esql-MV_CONTAINS_ALL

88bd226

Merge branch 'main' into esql-MV_CONTAINS_ALL

bf837d9

Update mv_contains_all docs to reflect the correct version and add …

23544d4

…lifecycle metadata Documentation rewording. Co-authored-by: Liam Thompson <[email protected]>

[CI] Auto commit changes from spotless

a1e1ba3

Merge branch 'main' into esql-MV_CONTAINS_ALL

fdfb149

Processing review, changing null handling

cd7625e

[CI] Auto commit changes from spotless

979aac9

Slightly more efficient null handling and fix null test logic

254fc59

[CI] Auto commit changes from spotless

b0e942a

Merge branch 'main' into esql-MV_CONTAINS_ALL

fb7d7fd

[CI] Auto commit changes from spotless

fbe8a2b

Update documentation to reflect MV_CONTAINS_ALL renaming to `MV_CON…

f71848f

…TAINS`

Fix MvContains javadoc

5584a93

Add preview marker to documentation link

bfa03d8

Co-authored-by: Iván Cea Fontenla <[email protected]>

I think checkstyle needs to be updated :D

b9a222b

Merge branch 'main' into esql-MV_CONTAINS_ALL

51733d1

Merge branch 'main' into esql-MV_CONTAINS_ALL

8228c24

ivancea requested changes Aug 27, 2025

mjmbischoff and others added 3 commits August 28, 2025 04:32

removing yaml from other PR

f4d3020

WIP still has test failures.

1f7f430

Refactor null handling in `MvContains` evaluators and add `MvContainsNullSupersetEvaluator` for better type-specific evaluation logic.

[CI] Auto commit changes from spotless

2393634

ivancea reviewed Aug 28, 2025

mjmbischoff and others added 9 commits August 28, 2025 17:00

It seems like the isNull operates on the row position, not the value …

b3da482

…position.

[CI] Auto commit changes from spotless

dcc1310

Merge branch 'main' into esql-MV_CONTAINS_ALL

2aa3eec

Refactor MvContains null handling by reusing existing `IsNullEvalua…

aee4e35

…torFactory` and adding global constant boolean evaluators for improved reusability.

Refactor MvContains null handling by reusing existing `IsNullEvalua…

87ec7ab

…torFactory` and adding global constant boolean evaluators for improved reusability.

[CI] Auto commit changes from spotless

23b2aad

Refactor constant evaluators in EvalOperator by introducing dedicat…

4aaac35

…ed records for `ConstantNull`, `ConstantTrue`, and `ConstantFalse`, adding memory usage tracking.

[CI] Auto commit changes from spotless

d676cef

Merge branch 'main' into esql-MV_CONTAINS_ALL

ebcbfc7

mjmbischoff requested a review from ivancea August 29, 2025 08:28

ivancea reviewed Aug 29, 2025

mjmbischoff and others added 6 commits August 29, 2025 15:11

Extending test case to cover multirow null

66ba735

Remove unused process method from MvContains class.

7492fad

Documenting changes done to copied test method due to inflexibility /…

23d92f0

… visibility of methods / constructors.

Always true.

d33e333

Merge branch 'main' into esql-MV_CONTAINS_ALL

3d7619b

[CI] Auto commit changes from spotless

a31922b

ivancea approved these changes Aug 29, 2025

mjmbischoff merged commit 97abc87 into elastic:main Aug 29, 2025
33 checks passed

elasticsearchmachine added the backport pending label Aug 29, 2025

This was referenced Sep 30, 2025

Updating EvaluatorImplementer to add generated evaluators for MvContains #135723

Merged

MV_CONTAINS avoid autoboxing #135991

Merged
