@mjmbischoff
Contributor

Adds the CONTAINS ES|QL function, which checks whether a substring is contained within a given string and returns a boolean. This is equivalent to using LOCATE and checking for != 0.
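
The equivalence stated in the description can be sketched in plain Java. This is only an illustration, not the Elasticsearch implementation: the class `ContainsVsLocate` and its methods are hypothetical stand-ins, and the `locate` helper assumes LOCATE's documented convention of 1-based positions with 0 meaning "not found".

```java
// Illustrative sketch only: plain Java stand-ins, not the Elasticsearch code.
final class ContainsVsLocate {
    // Assumed LOCATE convention: 1-based position of substr in str, 0 when absent.
    static int locate(String str, String substr) {
        // indexOf returns -1 when absent, so adding 1 yields 0 in that case.
        return str.indexOf(substr) + 1;
    }

    // Stand-in for the proposed CONTAINS: true when substr occurs anywhere in str.
    static boolean contains(String str, String substr) {
        return str.contains(substr);
    }

    // The claim from the description: CONTAINS(a, b) == (LOCATE(a, b) != 0).
    static boolean equivalent(String str, String substr) {
        return contains(str, substr) == (locate(str, substr) != 0);
    }
}
```

For any pair of non-null inputs, `ContainsVsLocate.equivalent(str, substr)` holds in this sketch, which is why the boolean form can replace the `!= 0` comparison.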

@elasticsearchmachine elasticsearchmachine added the v9.2.0 and external-contributor labels Aug 16, 2025
@mjmbischoff mjmbischoff self-assigned this Aug 16, 2025
@mjmbischoff mjmbischoff added the >feature, Team:ES|QL, and auto-backport labels Aug 16, 2025
@github-actions
Contributor

github-actions bot commented Aug 16, 2025

@mjmbischoff mjmbischoff marked this pull request as ready for review August 16, 2025 12:22
@mjmbischoff mjmbischoff requested a review from nik9000 August 16, 2025 12:23
@mjmbischoff mjmbischoff added the :Search Relevance/ES|QL Search functionality in ES|QL label Aug 16, 2025
@elasticsearchmachine elasticsearchmachine added the Team:Search Relevance Meta label for the Search Relevance team in Elasticsearch label Aug 16, 2025
@elasticsearchmachine
Collaborator

Pinging @elastic/es-search-relevance (Team:Search Relevance)

@elasticsearchmachine
Collaborator

Hi @mjmbischoff, I've created a changelog YAML for you.

@mjmbischoff mjmbischoff enabled auto-merge (squash) August 16, 2025 13:36
@mjmbischoff
Contributor Author

If merged, can rewrite the following:

FROM main_data_index
| LOOKUP JOIN my_lookup_index ON destination.ip
| WHERE LOCATE(CONCAT("#", MV_CONCAT(destination.denied_ports::STRING, "#"), "#"), CONCAT("#", destination.port::STRING, "#")) != 0
| KEEP message, destination.*

into

FROM main_data_index
| LOOKUP JOIN my_lookup_index ON destination.ip
| WHERE CONTAINS(CONCAT("#", MV_CONCAT(destination.denied_ports::STRING, "#"), "#"), CONCAT("#", destination.port::STRING, "#"))
| KEEP message, destination.*

But a nicer way to check multivalued fields for a value is still needed: #120782
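
The queries above test membership by joining the multivalued field with "#", wrapping both sides in "#", and then doing a substring check. A rough Java sketch of why the delimiters matter follows; the class `DelimitedMembership` and its method names are hypothetical and only mirror the string manipulation, not ES|QL's evaluation of multivalued fields.

```java
import java.util.List;
import java.util.stream.Collectors;

// Illustrative sketch only: mirrors the "#"-wrapping trick from the query above.
final class DelimitedMembership {
    // Builds "#22#8080#" from [22, 8080], analogous to
    // CONCAT("#", MV_CONCAT(ports::STRING, "#"), "#").
    static String wrap(List<Integer> values) {
        return values.stream()
            .map(String::valueOf)
            .collect(Collectors.joining("#", "#", "#"));
    }

    // Looks for "#port#", so a value only matches a whole list entry.
    static boolean isDenied(List<Integer> deniedPorts, int port) {
        return wrap(deniedPorts).contains("#" + port + "#");
    }

    // For contrast: without wrapping, 80 falsely matches inside 8080.
    static boolean naiveIsDenied(List<Integer> deniedPorts, int port) {
        String joined = deniedPorts.stream()
            .map(String::valueOf)
            .collect(Collectors.joining(","));
        return joined.contains(String.valueOf(port));
    }
}
```

With denied ports `[22, 8080]`, `isDenied` rejects port 80 while `naiveIsDenied` wrongly accepts it, which is the false positive the "#" wrapping avoids.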

@mjmbischoff
Contributor Author

#98545 (comment) should probably be updated.

mjmbischoff and others added 4 commits August 19, 2025 17:10
…ssumed it was a marker for documentation. But it's used to ignore the test cases. Re-enabling the tests and fixing them.
…ompatibility test environment. - not sure how to test it as, I feel like the version should be main on main / dev. Doing the dance for now.
@nik9000
Member

nik9000 commented Aug 22, 2025

If merged, can rewrite the following:

terrible code
into
terrible code

I'm so sorry we haven't written a better way yet.

Member

@nik9000 nik9000 left a comment

Could you drop the null checks in the eval method please? They are confusing because those things can't be null because of the @Evaluator annotation. Makes a reader have to go and look up stuff in the generated code.

}
String utf8ToString = str.utf8ToString();
return utf8ToString.contains(substr.utf8ToString());
}
Member

I think it's probably faster to do it by hand without converting to a String first. But this is fine for now.

Contributor Author

I think so as well, but having been bitten by character encoding subtleties before.. 😅

Member

Right. ESQL deals in utf-8 naturally like Elasticsearch and Lucene mostly do. It's probably fine to do the contains on raw strings because utf-8 is like that. But it's worth making super sure before you do it. So it can wait.
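
A minimal sketch of the byte-level idea discussed here, under stated assumptions: it works on whole `byte[]` arrays rather than on the `BytesRef` offset/length view the real evaluator receives, and the class `Utf8Contains` is hypothetical. It relies on UTF-8 being self-synchronizing: lead bytes and continuation bytes occupy disjoint ranges, so a valid UTF-8 needle cannot match starting in the middle of a character of a valid UTF-8 haystack.

```java
import java.nio.charset.StandardCharsets;

// Illustrative sketch only: a naive byte-wise substring scan over UTF-8 data.
final class Utf8Contains {
    // O(n*m) scan over raw bytes; no String is allocated.
    static boolean contains(byte[] haystack, byte[] needle) {
        if (needle.length == 0) {
            return true; // same answer as Java's String.contains("")
        }
        for (int start = 0; start + needle.length <= haystack.length; start++) {
            int i = 0;
            while (i < needle.length && haystack[start + i] == needle[i]) {
                i++;
            }
            if (i == needle.length) {
                return true; // every needle byte matched at this offset
            }
        }
        return false;
    }

    // Convenience wrapper for comparing against String.contains.
    static boolean contains(String haystack, String needle) {
        return contains(
            haystack.getBytes(StandardCharsets.UTF_8),
            needle.getBytes(StandardCharsets.UTF_8)
        );
    }
}
```

Comparing this against `String.contains` on multibyte input is one way to make "super sure" before swapping implementations; a faster search algorithm could replace the naive loop without changing the result.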

@nik9000
Member

nik9000 commented Aug 22, 2025

Looks great - I asked for two things that should happen before merging (the capability and the null check) and one thing you can do, or it can wait for a while (replace utf8ToString with a loop checking).

@mjmbischoff
Contributor Author

While I think we need a 'passAlongNulls'/'dontHandleNulls' option for the generator, here I'm perfectly fine with the handling, as all string functions seem to return null if any of the arguments are null. (For some it's undocumented - should probably be fixed)

@nik9000
Member

nik9000 commented Aug 24, 2025

Here I'm perfectly fine with the handling, as all string functions seem to return null if any of the arguments are null.

95% of our functions have "any null input returns null" semantics which is very SQL. Very in the spirit of "null means unknown" which SQL likes. In Elasticsearch null often means "empty list" more than "unknown". Which is one of the really sneaky things about all this.

But in this case I think "any null input returns null" is natural and good.
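
The "any null input returns null" semantics can be sketched as follows. This is a hypothetical plain-Java helper (`NullPropagation.containsOrNull`), not the generated evaluator; in the real code the null handling is done by the generated class rather than written by hand.

```java
// Illustrative sketch only: SQL-style null propagation for a two-argument function.
final class NullPropagation {
    // Returns null ("unknown") if either argument is null, otherwise the boolean answer.
    static Boolean containsOrNull(String str, String substr) {
        if (str == null || substr == null) {
            return null;
        }
        return str.contains(substr);
    }
}
```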

@nik9000
Member

nik9000 commented Aug 24, 2025

For some it's undocumented - should probably be fixed

If you have a list or want to add the docs, that'd be awesome.

@mjmbischoff
Contributor Author

For some it's undocumented - should probably be fixed

If you have a list or want to add the docs, that'd be awesome.

I've now added it to my (long) TODO list. 😅

@mjmbischoff mjmbischoff merged commit b7aaf31 into elastic:main Aug 24, 2025
34 checks passed
@elasticsearchmachine
Collaborator

💔 Backport failed

The backport operation could not be completed due to the following error:

There are no branches to backport to. Aborting.

You can use sqren/backport to manually backport by running backport --upstream elastic/elasticsearch --pr 133016

nik9000 added a commit to nik9000/elasticsearch that referenced this pull request Aug 26, 2025
When we compile the code for `CONTAINS` we generate an evaluator java
class and commit that, as is our ancient custom. But because elastic#133016
didn't see elastic#133392, we committed out of date code. That's fine because
we regenerate the code on every compile. But it's annoying because every
clone is out of date. This updates the generated file.

You may be asking "why do you commit the generated code if you just
generate it at compile time?" That's a good question! It's a grand
tradition, one that we will probably one day leave behind. But let's
celebrate it today by committing more code.
dnhatn pushed a commit that referenced this pull request Aug 27, 2025
When we compile the code for `CONTAINS` we generate an evaluator java
class and commit that, as is our ancient custom. But because #133016
didn't see #133392, we committed out of date code. That's fine because
we regenerate the code on every compile. But it's annoying because every
clone is out of date. This updates the generated file.

You may be asking "why do you commit the generated code if you just
generate it at compile time?" That's a good question! It's a grand
tradition, one that we will probably one day leave behind. But let's
celebrate it today by committing more code.

Labels

auto-backport (Automatically create backport pull requests when merged)
backport pending
external-contributor (Pull request authored by a developer outside the Elasticsearch team)
>feature
:Search Relevance/ES|QL (Search functionality in ES|QL)
Team:ES|QL
Team:Search Relevance (Meta label for the Search Relevance team in Elasticsearch)
v9.2.0

3 participants