Commit a66860e

Merge branch 'main' into pkar/resolve-index-force-reconn
2 parents 8753e8e + 3d754d2 commit a66860e

46 files changed, +1196 -551 lines changed

benchmarks/src/main/java/org/elasticsearch/benchmark/_nightly/esql/QueryPlanningBenchmark.java

Lines changed: 2 additions & 1 deletion
@@ -27,6 +27,7 @@
 import org.elasticsearch.xpack.esql.index.EsIndex;
 import org.elasticsearch.xpack.esql.index.IndexResolution;
 import org.elasticsearch.xpack.esql.inference.InferenceResolution;
+import org.elasticsearch.xpack.esql.inference.InferenceSettings;
 import org.elasticsearch.xpack.esql.optimizer.LogicalOptimizerContext;
 import org.elasticsearch.xpack.esql.optimizer.LogicalPlanOptimizer;
 import org.elasticsearch.xpack.esql.parser.EsqlParser;
@@ -126,7 +127,7 @@ public void setup() {
     }
 
     private LogicalPlan plan(EsqlParser parser, Analyzer analyzer, LogicalPlanOptimizer optimizer, String query) {
-        var parsed = parser.parseQuery(query, new QueryParams(), telemetry);
+        var parsed = parser.parseQuery(query, new QueryParams(), telemetry, new InferenceSettings(Settings.EMPTY));
         var analyzed = analyzer.analyze(parsed);
         var optimized = optimizer.optimize(analyzed);
         return optimized;

docs/changelog/139074.yaml

Lines changed: 5 additions & 0 deletions
@@ -0,0 +1,5 @@
+pr: 139074
+summary: "[ESQL][Inference] Introduce usage limits for COMPLETION and RERANK"
+area: ES|QL
+type: enhancement
+issues: []

docs/reference/query-languages/esql/_snippets/commands/layout/completion.md

Lines changed: 33 additions & 1 deletion
@@ -6,9 +6,38 @@ stack: preview 9.1.0
 
 The `COMPLETION` command allows you to send prompts and context to a Large Language Model (LLM) directly within your ES|QL queries, to perform text generation tasks.
 
-:::{important}
+:::::{important}
 **Every row processed by the COMPLETION command generates a separate API call to the LLM endpoint.**
 
+::::{tab-set}
+
+:::{tab-item} 9.3.0+
+
+Starting in version 9.3.0, `COMPLETION` automatically limits processing to **100 rows by default** to prevent accidental high consumption and costs. This limit is applied before the `COMPLETION` command executes.
+
+If you need to process more rows, you can adjust the limit using the cluster setting:
+```
+PUT _cluster/settings
+{
+  "persistent": {
+    "esql.command.completion.limit": 500
+  }
+}
+```
+
+You can also disable the command entirely if needed:
+```
+PUT _cluster/settings
+{
+  "persistent": {
+    "esql.command.completion.enabled": false
+  }
+}
+```
+:::
+
+:::{tab-item} 9.1.x - 9.2.x
+
 Be careful to test with small datasets first before running on production data or in automated workflows, to avoid unexpected costs.
 
 Best practices:
@@ -19,6 +48,9 @@ Best practices:
 4. **Monitor usage**: Track your LLM API consumption and costs.
 :::
 
+::::
+:::::
+
 **Syntax**
 
 ::::{tab-set}
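
Note on the `completion.md` change above: the 9.3.0+ tab says the row limit is applied before `COMPLETION` runs, so the number of LLM calls is bounded by the limit, not by the size of the input. The sketch below only illustrates that arithmetic; it is not the Elasticsearch implementation. The class `CompletionLimitSketch`, its method, and the stand-in LLM function are hypothetical. Only the setting names and the default of 100 come from the docs in this diff. Rejecting the query with an exception when the command is disabled is an assumption.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Function;

// Hypothetical illustration only; not Elasticsearch code.
final class CompletionLimitSketch {
    // Default documented for esql.command.completion.limit in 9.3.0+.
    static final int DEFAULT_ROW_LIMIT = 100;

    /**
     * Truncates the input to rowLimit rows, then makes one call per remaining row.
     * Throwing when the command is disabled is an assumption about the behavior
     * of esql.command.completion.enabled=false.
     */
    static List<String> complete(List<String> prompts, boolean enabled, int rowLimit, Function<String, String> llm) {
        if (!enabled) {
            throw new IllegalStateException("COMPLETION is disabled (esql.command.completion.enabled=false)");
        }
        // The limit is applied first, so rows beyond it never reach the LLM.
        int n = Math.max(0, Math.min(prompts.size(), rowLimit));
        List<String> out = new ArrayList<>(n);
        for (String prompt : prompts.subList(0, n)) {
            out.add(llm.apply(prompt)); // one call per row
        }
        return out;
    }

    public static void main(String[] args) {
        List<String> prompts = new ArrayList<>();
        for (int i = 0; i < 250; i++) {
            prompts.add("prompt " + i);
        }
        int[] calls = {0};
        List<String> results = complete(prompts, true, DEFAULT_ROW_LIMIT, p -> {
            calls[0]++;
            return "answer to " + p;
        });
        System.out.println(results.size() + " rows, " + calls[0] + " LLM calls");
    }
}
```

With 250 input rows and the default limit, the stand-in LLM function is invoked 100 times. Raising the limit to 500, as in the documented `PUT _cluster/settings` example, would let all 250 rows through.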

docs/reference/query-languages/esql/_snippets/commands/layout/rerank.md

Lines changed: 47 additions & 0 deletions
@@ -7,6 +7,53 @@ stack: preview 9.2.0
 The `RERANK` command uses an inference model to compute a new relevance score
 for an initial set of documents, directly within your ES|QL queries.
 
+:::::{important}
+**RERANK processes each row through an inference model, which impacts performance and costs.**
+
+::::{tab-set}
+
+:::{tab-item} 9.3.0+
+
+Starting in version 9.3.0, `RERANK` automatically limits processing to **1000 rows by default** to prevent accidental high consumption. This limit is applied before the `RERANK` command executes.
+
+If you need to process more rows, you can adjust the limit using the cluster setting:
+```
+PUT _cluster/settings
+{
+  "persistent": {
+    "esql.command.rerank.limit": 5000
+  }
+}
+```
+
+You can also disable the command entirely if needed:
+```
+PUT _cluster/settings
+{
+  "persistent": {
+    "esql.command.rerank.enabled": false
+  }
+}
+```
+:::
+
+:::{tab-item} 9.2.x
+
+No automatic row limit is applied. **You should always use `LIMIT` before or after `RERANK` to control the number of documents processed**, to avoid accidentally reranking large datasets which can result in high latency and increased costs.
+
+For example:
+```esql
+FROM books
+| WHERE title:"search query"
+| SORT _score DESC
+| LIMIT 100 // Limit to top 100 results before reranking
+| RERANK "search query" ON title WITH { "inference_id" : "my_rerank_endpoint" }
+```
+:::
+
+::::
+:::::
+
 **Syntax**
 
 ```esql

docs/reference/query-languages/esql/_snippets/functions/examples/chunk.md

Lines changed: 9 additions & 27 deletions
Generated file; diff not rendered by default.

docs/reference/query-languages/esql/_snippets/lists/string-functions.md

Lines changed: 1 addition & 0 deletions
@@ -1,5 +1,6 @@
 * [`BIT_LENGTH`](../../functions-operators/string-functions.md#esql-bit_length)
 * [`BYTE_LENGTH`](../../functions-operators/string-functions.md#esql-byte_length)
+* [`CHUNK`](../../functions-operators/string-functions.md#esql-chunk)
 * [`CONCAT`](../../functions-operators/string-functions.md#esql-concat)
 * [`CONTAINS`](../../functions-operators/string-functions.md#esql-contains)
 * [`ENDS_WITH`](../../functions-operators/string-functions.md#esql-ends_with)

docs/reference/query-languages/esql/kibana/definition/functions/chunk.json

Lines changed: 1 addition & 2 deletions
Generated file; diff not rendered by default.

docs/reference/query-languages/esql/kibana/docs/functions/chunk.md

Lines changed: 2 additions & 2 deletions
Generated file; diff not rendered by default.

muted-tests.yml

Lines changed: 9 additions & 0 deletions
@@ -475,6 +475,15 @@ tests:
 - class: org.elasticsearch.xpack.esql.ccq.MultiClusterSpecIT
   method: test {csv-spec:spatial.ConvertFromStringParseError}
   issue: https://github.com/elastic/elasticsearch/issues/139213
+- class: org.elasticsearch.test.rest.yaml.RcsCcsCommonYamlTestSuiteIT
+  method: test {p0=search.vectors/180_update_dense_vector_type/Test update flat --> bbq_flat --> bbq_hnsw}
+  issue: https://github.com/elastic/elasticsearch/issues/139253
+- class: org.elasticsearch.test.rest.yaml.CcsCommonYamlTestSuiteIT
+  method: test {p0=search.vectors/41_knn_search_half_byte_quantized_bfloat16/Knn search with mip}
+  issue: https://github.com/elastic/elasticsearch/issues/139254
+- class: org.elasticsearch.xpack.security.authc.saml.SamlServiceProviderMetadataIT
+  method: testAuthenticationWhenMetadataIsUnreliable
+  issue: https://github.com/elastic/elasticsearch/issues/139067
 
 # Examples:
 #

x-pack/plugin/esql-core/src/main/java/org/elasticsearch/xpack/esql/core/type/DataType.java

Lines changed: 5 additions & 4 deletions
@@ -370,7 +370,7 @@ public enum DataType implements Writeable {
         builder().esType("exponential_histogram")
             .estimatedSize(16 * 160)// guess 160 buckets (OTEL default for positive values only histograms) with 16 bytes per bucket
             .docValues()
-            .underConstruction(DataTypesTransportVersions.RESOLVE_FIELDS_RESPONSE_USED_TV)
+            .underConstruction(DataTypesTransportVersions.TEXT_SIMILARITY_RANK_DOC_EXPLAIN_CHUNKS_VERSION)
     ),
 
     /*
@@ -1043,10 +1043,11 @@ public static class DataTypesTransportVersions {
         );
 
         /**
-         * First transport version after the PR that introduced the exponential histogram data type.
+         * First transport version after the PR that introduced the exponential histogram data type which was NOT also backported to 9.2.
+         * (Exp. histogram was added as SNAPSHOT-only to 9.3.)
          */
-        public static final TransportVersion RESOLVE_FIELDS_RESPONSE_USED_TV = TransportVersion.fromName(
-            "esql_resolve_fields_response_used"
+        public static final TransportVersion TEXT_SIMILARITY_RANK_DOC_EXPLAIN_CHUNKS_VERSION = TransportVersion.fromName(
+            "text_similarity_rank_docs_explain_chunks"
         );
     }
 }
