Commit 535ad91

Refine ESQL limitations (full-text, TEXT fields, unassigned indexes) (#116098)
* Refine ESQL limitations (full-text, TEXT fields, unassigned indexes)

This PR refactors a section of the ES|QL Limitations page to:

* Refactor both full-text and text-behaves-as-keyword sections to better reflect the new behaviour (the old text implies that no full-text search of any kind exists anywhere, which immediately contradicts the statements directly above it).
* Update text-behaves-as-keyword to include my recent work on making all functions return KEYWORD instead of TEXT or SEMANTIC_TEXT
* Add a section on multi-index querying to cover two limitations (union types and unassigned indexes).

* Fix full-text-search examples
1 parent 6d4e11d commit 535ad91

docs/reference/esql/esql-limitations.asciidoc

Lines changed: 72 additions & 14 deletions
Original file line number | Diff line number | Diff line change
@@ -112,31 +112,36 @@ Otherwise, the query will fail with a validation error.
112112
Another limitation is that any <<esql-where>> command containing a full-text search function
113113
cannot also use disjunctions (`OR`).
114114

115-
Because of <<esql-limitations-text-fields,the way {esql} treats `text` values>>,
116-
queries on `text` fields are like queries on `keyword` fields: they are
117-
case-sensitive and need to match the full string.
115+
For example, this query is valid:
118116

119-
For example, after indexing a field of type `text` with the value `Elasticsearch
120-
query language`, the following `WHERE` clause does not match because the `LIKE`
121-
operator is case-sensitive:
122117
[source,esql]
123118
----
124-
| WHERE field LIKE "elasticsearch query language"
119+
FROM books
120+
| WHERE MATCH(author, "Faulkner") AND MATCH(author, "Tolkien")
125121
----
126122

127-
The following `WHERE` clause does not match either, because the `LIKE` operator
128-
tries to match the whole string:
123+
But this query will fail due to the <<esql-stats-by, STATS>> command:
124+
129125
[source,esql]
130126
----
131-
| WHERE field LIKE "Elasticsearch"
127+
FROM books
128+
| STATS AVG(price) BY author
129+
| WHERE MATCH(author, "Faulkner")
132130
----
133131

134-
As a workaround, use wildcards and regular expressions. For example:
132+
And this query will fail due to the disjunction:
133+
135134
[source,esql]
136135
----
137-
| WHERE field RLIKE "[Ee]lasticsearch.*"
136+
FROM books
137+
| WHERE MATCH(author, "Faulkner") OR author LIKE "Hemingway"
138138
----
139139

140+
Note that, because of <<esql-limitations-text-fields,the way {esql} treats `text` values>>,
141+
any queries on `text` fields that do not explicitly use the full-text functions,
142+
<<esql-match>> or <<esql-qstr>>, will behave as if the fields are actually `keyword` fields:
143+
they are case-sensitive and need to match the full string.
144+
140145
[discrete]
141146
[[esql-limitations-text-fields]]
142147
=== `text` fields behave like `keyword` fields
@@ -149,15 +154,68 @@ that. If it's not possible to retrieve a `keyword` subfield, {esql} will get the
149154
string from a document's `_source`. If the `_source` cannot be retrieved, for
150155
example when using synthetic source, `null` is returned.
151156

157+
Once a `text` field is retrieved, if the query touches it in any way, for example passing
158+
it into a function, the type will be converted to `keyword`. In fact, functions that operate on both
159+
`text` and `keyword` fields will perform as if the `text` field was a `keyword` field all along.
160+
161+
For example, the following query will return a column `greatest` of type `keyword` no matter
162+
whether any or all of `field1`, `field2`, and `field3` are of type `text`:
163+
[source,esql]
164+
----
165+
FROM index
166+
| EVAL greatest = GREATEST(field1, field2, field3)
167+
----
168+
152169
Note that {esql}'s retrieval of `keyword` subfields may have unexpected
153-
consequences. An {esql} query on a `text` field is case-sensitive. Furthermore,
154-
a subfield may have been mapped with a <<normalizer,normalizer>>, which can
170+
consequences. Other than when explicitly using the full-text functions, <<esql-match>> and <<esql-qstr>>,
171+
any {esql} query on a `text` field is case-sensitive.
172+
173+
For example, after indexing a field of type `text` with the value `Elasticsearch
174+
query language`, the following `WHERE` clause does not match because the `LIKE`
175+
operator is case-sensitive:
176+
[source,esql]
177+
----
178+
| WHERE field LIKE "elasticsearch query language"
179+
----
180+
181+
The following `WHERE` clause does not match either, because the `LIKE` operator
182+
tries to match the whole string:
183+
[source,esql]
184+
----
185+
| WHERE field LIKE "Elasticsearch"
186+
----
187+
188+
As a workaround, use wildcards and regular expressions. For example:
189+
[source,esql]
190+
----
191+
| WHERE field RLIKE "[Ee]lasticsearch.*"
192+
----
193+
194+
Furthermore, a subfield may have been mapped with a <<normalizer,normalizer>>, which can
155195
transform the original string. Or it may have been mapped with <<ignore-above>>,
156196
which can truncate the string. None of these mapping operations are applied to
157197
an {esql} query, which may lead to false positives or negatives.
158198

159199
To avoid these issues, a best practice is to be explicit about the field that
160200
you query, and query `keyword` sub-fields instead of `text` fields.
201+
Or consider using one of the <<esql-search-functions,full-text search>> functions.
202+
203+
[discrete]
204+
[[esql-multi-index-limitations]]
205+
=== Using {esql} to query multiple indices
206+
207+
As discussed in more detail in <<esql-multi-index>>, {esql} can execute a single query across multiple indices,
208+
data streams, or aliases. However, there are some limitations to be aware of:
209+
210+
* All underlying indexes and shards must be active. Using admin commands or UI,
211+
it is possible to pause an index or shard, for example by disabling a frozen tier instance,
212+
but then any {esql} query that includes that index or shard will fail, even if the query uses
213+
<<esql-where>> to filter out the results from the paused index.
214+
If you see an error of type `search_phase_execution_exception`,
215+
with the message `Search rejected due to missing shards`, you likely have an index or shard in `UNASSIGNED` state.
216+
* The same field must have the same type across all indexes. If the same field is mapped to different types
217+
it is still possible to query the indexes,
218+
but the field must be <<esql-multi-index-union-types,explicitly converted to a single type>>.
161219

162220
[discrete]
163221
[[esql-tsdb]]
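
As a hedged illustration of the explicit conversion mentioned in the multi-index limitations above: the sketch below is not part of this commit, and it assumes a hypothetical `events_*` index pattern in which `client_ip` is mapped as `ip` in some indices and as `keyword` in others. Applying a conversion function such as `TO_STRING` directly to the field should give the resulting column a single type:

[source,esql]
----
// Hypothetical index pattern and field name, for illustration only.
FROM events_*
| EVAL client_ip = TO_STRING(client_ip)
| KEEP client_ip
----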
