diff --git a/docs/reference/esql/functions/description/match.asciidoc b/docs/reference/esql/functions/description/match.asciidoc index 2375fcc3b4521..c4cb0d46af55a 100644 --- a/docs/reference/esql/functions/description/match.asciidoc +++ b/docs/reference/esql/functions/description/match.asciidoc @@ -2,4 +2,4 @@ *Description* -Use `MATCH` to perform a <> on the specified field. Using `MATCH` is equivalent to using the `match` query in the Elasticsearch Query DSL. Match can be used on fields from the text family like <> and <>, as well as other field types like keyword, boolean, dates, and numeric types. Match can use <> to specify additional options for the match query. All <> are supported. For a simplified syntax, you can use the <> `:` operator instead of `MATCH`. `MATCH` returns true if the provided query matches the row. +Use `MATCH` to perform a <> on the specified field. Using `MATCH` is equivalent to using the `match` query in the Elasticsearch Query DSL. Match can be used on fields from the text family like <> and <>, as well as other field types like keyword, boolean, dates, and numeric types. When Match is used on a <> field, it will perform a semantic query on the field. Match can use <> to specify additional options for the match query. All <> are supported. For a simplified syntax, you can use the <> `:` operator instead of `MATCH`. `MATCH` returns true if the provided query matches the row. diff --git a/docs/reference/esql/functions/kibana/definition/match.json b/docs/reference/esql/functions/kibana/definition/match.json index 245d05d9308b8..439ac6d30111f 100644 --- a/docs/reference/esql/functions/kibana/definition/match.json +++ b/docs/reference/esql/functions/kibana/definition/match.json @@ -2,7 +2,7 @@ "comment" : "This is generated by ESQL's AbstractFunctionTestCase. Do no edit it. 
See ../README.md for how to regenerate it.", "type" : "eval", "name" : "match", - "description" : "Use `MATCH` to perform a <> on the specified field.\nUsing `MATCH` is equivalent to using the `match` query in the Elasticsearch Query DSL.\n\nMatch can be used on fields from the text family like <> and <>,\nas well as other field types like keyword, boolean, dates, and numeric types.\n\nMatch can use <> to specify additional options for the match query.\nAll <> are supported.\n\nFor a simplified syntax, you can use the <> `:` operator instead of `MATCH`.\n\n`MATCH` returns true if the provided query matches the row.", + "description" : "Use `MATCH` to perform a <> on the specified field.\nUsing `MATCH` is equivalent to using the `match` query in the Elasticsearch Query DSL.\n\nMatch can be used on fields from the text family like <> and <>,\nas well as other field types like keyword, boolean, dates, and numeric types.\nWhen Match is used on a <> field, it will perform a semantic query on the field.\n\nMatch can use <> to specify additional options for the match query.\nAll <> are supported.\n\nFor a simplified syntax, you can use the <> `:` operator instead of `MATCH`.\n\n`MATCH` returns true if the provided query matches the row.", "signatures" : [ { "params" : [ diff --git a/docs/reference/esql/functions/kibana/docs/match.md b/docs/reference/esql/functions/kibana/docs/match.md index a91a28cdeb8af..29d22f65e8e3b 100644 --- a/docs/reference/esql/functions/kibana/docs/match.md +++ b/docs/reference/esql/functions/kibana/docs/match.md @@ -8,6 +8,7 @@ Using `MATCH` is equivalent to using the `match` query in the Elasticsearch Quer Match can be used on fields from the text family like <> and <>, as well as other field types like keyword, boolean, dates, and numeric types. +When Match is used on a <> field, it will perform a semantic query on the field. Match can use <> to specify additional options for the match query. All <> are supported. 
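Reviewer note: the `MATCH` behavior described above can be sketched as a request body for the ES|QL query endpoint (`POST /_query`). This is a minimal illustration under assumptions, not part of the change: the index name `books`, the field name `content` (assumed to be mapped as `semantic_text`), and the helper `build_esql_match` are hypothetical, and the escaping covers only backslashes and double quotes. The query text looks the same for a lexical field and a `semantic_text` field; per the description, the field mapping decides which kind of query runs.

```python
def build_esql_match(index: str, field: str, text: str, limit: int = 10) -> dict:
    """Build a request body for the ES|QL query endpoint (POST /_query).

    All names passed in are illustrative; nothing here contacts a cluster.
    """
    # Escape backslashes first, then double quotes, so the text can sit
    # inside a double-quoted ES|QL string literal (assumed sufficient here).
    escaped = text.replace("\\", "\\\\").replace('"', '\\"')
    query = f'FROM {index} | WHERE MATCH({field}, "{escaped}") | LIMIT {limit}'
    return {"query": query}


# Hypothetical usage: `content` is assumed to be a semantic_text field.
body = build_esql_match("books", "content", "sea voyage")
```

The resulting dictionary can be serialized to JSON and sent with any HTTP client; the simplified `:` operator form mentioned in the description would replace the `MATCH(...)` call in the query string.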
diff --git a/docs/reference/mapping/types/semantic-text.asciidoc b/docs/reference/mapping/types/semantic-text.asciidoc index f23c624f140cd..8d802c9b257c1 100644 --- a/docs/reference/mapping/types/semantic-text.asciidoc +++ b/docs/reference/mapping/types/semantic-text.asciidoc @@ -18,6 +18,7 @@ If you don’t specify an inference endpoint, the `inference_id` field defaults Using `semantic_text`, you won't need to specify how to generate embeddings for your data, or how to index it. The {infer} endpoint automatically determines the embedding generation, indexing, and query to use. +Newly created indices with `semantic_text` fields using dense embeddings will be <> to `bbq_hnsw` automatically. If you use the preconfigured `.elser-2-elasticsearch` endpoint, you can set up `semantic_text` with the following API request: @@ -225,7 +226,8 @@ In these cases - when you use `sparse_vector` or `dense_vector` field types inst For indices containing `semantic_text` fields, updates that use scripts have the following behavior: * Are supported through the https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-update[Update API]. -* Are not supported through the https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-bulk-1[Bulk API] and will fail. Even if the script targets non-`semantic_text` fields, the update will fail when the index contains a `semantic_text` field. +* Are not supported through the https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-bulk-1[Bulk API] and will fail. +Even if the script targets non-`semantic_text` fields, the update will fail when the index contains a `semantic_text` field. 
[discrete] [[copy-to-support]] diff --git a/docs/reference/query-dsl/match-query.asciidoc b/docs/reference/query-dsl/match-query.asciidoc index 67300ffca2d26..8500c109a271b 100644 --- a/docs/reference/query-dsl/match-query.asciidoc +++ b/docs/reference/query-dsl/match-query.asciidoc @@ -1,19 +1,18 @@ [[query-dsl-match-query]] === Match query + ++++ Match ++++ -Returns documents that match a provided text, number, date or boolean value. The -provided text is analyzed before matching. - -The `match` query is the standard query for performing a full-text search, -including options for fuzzy matching. +Returns documents that match a provided text, number, date or boolean value. +The provided text is analyzed before matching. -`Match` will also work against <> fields, -however when performing `match` queries against `semantic_text` fields options -that specifically target lexical search such as `fuzziness` or `analyzer` will be ignored. +The `match` query is the standard query for performing a full-text search, including options for fuzzy matching. +`Match` will also work against <> fields. +As `semantic_text` does not support lexical text search, `match` queries against `semantic_text` fields will automatically perform the correct semantic search. +Because of this, options that specifically target lexical search such as `fuzziness` or `analyzer` will be ignored. [[match-query-ex-request]] ==== Example request @@ -32,77 +31,80 @@ GET /_search } -------------------------------------------------- - [[match-top-level-params]] ==== Top-level parameters for `match` ``:: (Required, object) Field you wish to search. - [[match-field-params]] ==== Parameters for `` + `query`:: + -- (Required) Text, number, boolean value or date you wish to find in the provided ``. -The `match` query <> any provided text before performing a -search. This means the `match` query can search <> fields for -analyzed tokens rather than an exact term. 
+The `match` query <> any provided text before performing a search. +This means the `match` query can search <> fields for analyzed tokens rather than an exact term. -- `analyzer`:: (Optional, string) <> used to convert the text in the `query` -value into tokens. Defaults to the <> mapped for the ``. If no analyzer is mapped, the index's -default analyzer is used. +value into tokens. +Defaults to the <> mapped for the ``. +If no analyzer is mapped, the index's default analyzer is used. `auto_generate_synonyms_phrase_query`:: + -- (Optional, Boolean) If `true`, <> -queries are automatically created for multi-term synonyms. Defaults to `true`. +queries are automatically created for multi-term synonyms. +Defaults to `true`. -See <> for an -example. +See <> for an example. -- `boost`:: + -- (Optional, float) Floating point number used to decrease or increase the -<> of the query. Defaults to `1.0`. +<> of the query. +Defaults to `1.0`. -Boost values are relative to the default value of `1.0`. A boost value between -`0` and `1.0` decreases the relevance score. A value greater than `1.0` +Boost values are relative to the default value of `1.0`. +A boost value between +`0` and `1.0` decreases the relevance score. +A value greater than `1.0` increases the relevance score. -- `fuzziness`:: -(Optional, string) Maximum edit distance allowed for matching. See <> -for valid values and more information. See <> +(Optional, string) Maximum edit distance allowed for matching. +See <> +for valid values and more information. +See <> for an example. `max_expansions`:: -(Optional, integer) Maximum number of terms to which the query will -expand. Defaults to `50`. +(Optional, integer) Maximum number of terms to which the query will expand. +Defaults to `50`. `prefix_length`:: -(Optional, integer) Number of beginning characters left unchanged for fuzzy -matching. Defaults to `0`. +(Optional, integer) Number of beginning characters left unchanged for fuzzy matching. +Defaults to `0`. 
`fuzzy_transpositions`:: -(Optional, Boolean) If `true`, edits for fuzzy matching include -transpositions of two adjacent characters (ab → ba). Defaults to `true`. +(Optional, Boolean) If `true`, edits for fuzzy matching include transpositions of two adjacent characters (ab → ba). +Defaults to `true`. `fuzzy_rewrite`:: + -- -(Optional, string) Method used to rewrite the query. See the -<> for valid values and more -information. +(Optional, string) Method used to rewrite the query. +See the +<> for valid values and more information. If the `fuzziness` parameter is not `0`, the `match` query uses a `fuzzy_rewrite` method of `top_terms_blended_freqs_${max_expansions}` by default. @@ -110,7 +112,8 @@ method of `top_terms_blended_freqs_${max_expansions}` by default. `lenient`:: (Optional, Boolean) If `true`, format-based errors, such as providing a text -`query` value for a <> field, are ignored. Defaults to `false`. +`query` value for a <> field, are ignored. +Defaults to `false`. `operator`:: + @@ -130,8 +133,8 @@ AND of AND Hungary`. `minimum_should_match`:: + -- -(Optional, string) Minimum number of clauses that must match for a document to -be returned. See the <> for valid values and more information. -- @@ -139,7 +142,8 @@ parameter>> for valid values and more information. + -- (Optional, string) Indicates whether no documents are returned if the `analyzer` -removes all tokens, such as when using a `stop` filter. Valid values are: +removes all tokens, such as when using a `stop` filter. +Valid values are: `none` (Default):: No documents are returned if the `analyzer` removes all tokens. @@ -151,7 +155,6 @@ query. See <> for an example. -- - [[match-query-notes]] ==== Notes @@ -159,7 +162,8 @@ See <> for an example. ===== Short request example You can simplify the match query syntax by combining the `` and `query` -parameters. For example: +parameters. 
+For example: [source,console] ---- @@ -176,11 +180,11 @@ GET /_search [[query-dsl-match-query-boolean]] ===== How the match query works -The `match` query is of type `boolean`. It means that the text -provided is analyzed and the analysis process constructs a boolean query -from the provided text. The `operator` parameter can be set to `or` or `and` -to control the boolean clauses (defaults to `or`). The minimum number of -optional `should` clauses to match can be set using the +The `match` query is of type `boolean`. +It means that the text provided is analyzed and the analysis process constructs a boolean query from the provided text. +The `operator` parameter can be set to `or` or `and` +to control the boolean clauses (defaults to `or`). +The minimum number of optional `should` clauses to match can be set using the <> parameter. @@ -201,13 +205,11 @@ GET /_search } -------------------------------------------------- -The `analyzer` can be set to control which analyzer will perform the -analysis process on the text. It defaults to the field explicit mapping -definition, or the default search analyzer. +The `analyzer` can be set to control which analyzer will perform the analysis process on the text. +It defaults to the field explicit mapping definition, or the default search analyzer. -The `lenient` parameter can be set to `true` to ignore exceptions caused by -data-type mismatches, such as trying to query a numeric field with a text -query string. Defaults to `false`. +The `lenient` parameter can be set to `true` to ignore exceptions caused by data-type mismatches, such as trying to query a numeric field with a text query string. +Defaults to `false`. [[query-dsl-match-query-fuzziness]] ===== Fuzziness in the match query @@ -218,17 +220,12 @@ See <> for allowed settings. The `prefix_length` and `max_expansions` can be set in this case to control the fuzzy process. 
If the fuzzy option is set the query will use `top_terms_blended_freqs_${max_expansions}` -as its <> the `fuzzy_rewrite` parameter allows to control how the query will get -rewritten. +as its <> the `fuzzy_rewrite` parameter allows to control how the query will get rewritten. -Fuzzy transpositions (`ab` -> `ba`) are allowed by default but can be disabled -by setting `fuzzy_transpositions` to `false`. +Fuzzy transpositions (`ab` -> `ba`) are allowed by default but can be disabled by setting `fuzzy_transpositions` to `false`. -NOTE: Fuzzy matching is not applied to terms with synonyms or in cases where the -analysis process produces multiple tokens at the same position. Under the hood -these terms are expanded to a special synonym query that blends term frequencies, -which does not support fuzzy expansion. +NOTE: Fuzzy matching is not applied to terms with synonyms or in cases where the analysis process produces multiple tokens at the same position. +Under the hood these terms are expanded to a special synonym query that blends term frequencies, which does not support fuzzy expansion. [source,console] -------------------------------------------------- @@ -247,9 +244,9 @@ GET /_search [[query-dsl-match-query-zero]] ===== Zero terms query -If the analyzer used removes all tokens in a query like a `stop` filter -does, the default behavior is to match no documents at all. In order to -change that the `zero_terms_query` option can be used, which accepts + +If the analyzer used removes all tokens in a query like a `stop` filter does, the default behavior is to match no documents at all. +In order to change that the `zero_terms_query` option can be used, which accepts `none` (default) and `all` which corresponds to a `match_all` query. [source,console] @@ -271,8 +268,8 @@ GET /_search [[query-dsl-match-query-synonyms]] ===== Synonyms -The `match` query supports multi-terms synonym expansion with the <> token filter. 
When this filter is used, the parser creates a phrase query for each multi-terms synonyms. +The `match` query supports multi-term synonym expansion with the <> token filter. +When this filter is used, the parser creates a phrase query for each multi-term synonym. For example, the following synonym: `"ny, new york"` would produce: `(ny OR ("new york"))` diff --git a/x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/expression/function/fulltext/Match.java b/x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/expression/function/fulltext/Match.java index f30e1b054d0a8..0082d34c6ce3c 100644 --- a/x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/expression/function/fulltext/Match.java +++ b/x-pack/plugin/esql/src/main/java/org/elasticsearch/xpack/esql/expression/function/fulltext/Match.java @@ -148,6 +148,7 @@ public class Match extends FullTextFunction implements OptionalArgument, PostAna Match can be used on fields from the text family like <> and <>, as well as other field types like keyword, boolean, dates, and numeric types. + When Match is used on a <> field, it will perform a semantic query on the field. Match can use <> to specify additional options for the match query. All <> are supported.
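Reviewer note: the Query DSL side of this change can be sketched in the same spirit. The snippet below is an illustration under assumptions, not the documented API surface: the field names `message` and `inference_field` and both helper functions are hypothetical. The set of lexical-only options contains only the two examples the text names (`fuzziness` and `analyzer`); the documentation says "such as", so the real list may be longer.

```python
# Options the documentation cites as examples of lexical-only settings.
# The text does not give a complete list, so this set is an assumption.
LEXICAL_ONLY_EXAMPLES = {"fuzziness", "analyzer"}


def match_body(field: str, query, **options) -> dict:
    """Build a search request body containing a single `match` query."""
    return {"query": {"match": {field: {"query": query, **options}}}}


def cited_as_ignored(options: dict) -> list:
    """Return the option names the docs cite as ignored on semantic_text fields."""
    return sorted(name for name in options if name in LEXICAL_ONLY_EXAMPLES)


# Hypothetical lexical field: both options take effect.
lexical = match_body("message", "this is a test", operator="and", fuzziness="AUTO")

# Hypothetical semantic_text field: the request is still accepted, but the
# docs state that lexical options such as `fuzziness` are ignored.
semantic_options = {"fuzziness": "AUTO"}
semantic = match_body("inference_field", "this is a test", **semantic_options)
ignored = cited_as_ignored(semantic_options)
```

Checking options on the client side like this is only a convenience for spotting settings that will have no effect; per the documentation, Elasticsearch itself ignores them rather than rejecting the request.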