docs/reference/query-languages/esql/_snippets/commands/layout/completion.md
Lines changed: 32 additions & 0 deletions
@@ -9,6 +9,35 @@ The `COMPLETION` command allows you to send prompts and context to a Large Langu
:::{important}
**Every row processed by the COMPLETION command generates a separate API call to the LLM endpoint.**

::::{tab-set}

:::{tab-item} 9.3.0+

Starting in version 9.3.0, `COMPLETION` automatically limits processing to **100 rows by default** to prevent accidental high consumption and costs. This limit is applied before the `COMPLETION` command executes.

If you need to process more rows, you can adjust the limit using the cluster setting:
```
PUT _cluster/settings
{
  "persistent": {
    "esql.command.completion.limit": 500
  }
}
```
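
As a rough illustration of how the row limit bounds LLM usage, the sketch below models the number of API calls for a single query. It assumes only what the text states: one call per processed row, with the limit applied before `COMPLETION` runs. The function name and its parameters are hypothetical and are not part of Elasticsearch.

```python
def estimated_llm_calls(matching_rows: int, completion_limit: int = 100) -> int:
    """Upper bound on LLM API calls for one query, at one call per processed row."""
    if matching_rows < 0 or completion_limit < 0:
        raise ValueError("row counts and limits must be non-negative")
    # Modeling assumption: the limit truncates the input before COMPLETION runs,
    # so at most `completion_limit` rows ever reach the LLM endpoint.
    return min(matching_rows, completion_limit)


print(estimated_llm_calls(25_000))       # default limit: prints 100
print(estimated_llm_calls(25_000, 500))  # limit raised to 500: prints 500
print(estimated_llm_calls(40))           # fewer rows than the limit: prints 40
```

Multiplying the result by a per-call price gives a quick cost ceiling to check before raising the limit.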

You can also disable the command entirely if needed:
```
PUT _cluster/settings
{
  "persistent": {
    "esql.command.completion.enabled": false
  }
}
```
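
For automated workflows, the same settings can be applied from a script. The following is a minimal sketch using only the Python standard library. The helper name `build_settings_request` and the `http://localhost:9200` URL are assumptions for illustration, and authentication is omitted. The request is built but not sent.

```python
import json
import urllib.request


def build_settings_request(base_url: str, setting: str, value) -> urllib.request.Request:
    """Build (but do not send) a PUT _cluster/settings request for one persistent setting."""
    body = json.dumps({"persistent": {setting: value}}).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url.rstrip('/')}/_cluster/settings",
        data=body,
        method="PUT",
        headers={"Content-Type": "application/json"},
    )


# Hypothetical local cluster URL; a secured cluster would also need credentials.
req = build_settings_request("http://localhost:9200", "esql.command.completion.limit", 500)
print(req.get_method(), req.full_url)
print(req.data.decode("utf-8"))
# To actually send it: urllib.request.urlopen(req)
```

Passing `"esql.command.completion.enabled"` with `False` builds the request that disables the command.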
:::

:::{tab-item} 9.1.x - 9.2.x

Be careful to test with small datasets first before running on production data or in automated workflows, to avoid unexpected costs.

Best practices:
@@ -19,6 +48,9 @@ Best practices:
4. **Monitor usage**: Track your LLM API consumption and costs.
docs/reference/query-languages/esql/_snippets/commands/layout/rerank.md
Lines changed: 43 additions & 0 deletions
@@ -7,6 +7,49 @@ stack: preview 9.2.0
The `RERANK` command uses an inference model to compute a new relevance score
for an initial set of documents, directly within your ES|QL queries.

::::{tab-set}

:::{tab-item} 9.3.0+

Starting in version 9.3.0, `RERANK` automatically limits processing to **1000 rows by default** to prevent accidental high consumption. This limit is applied before the `RERANK` command executes.

If you need to process more rows, you can adjust the limit using the cluster setting:
```
PUT _cluster/settings
{
  "persistent": {
    "esql.command.rerank.limit": 5000
  }
}
```

You can also disable the command entirely if needed:
```
PUT _cluster/settings
{
  "persistent": {
    "esql.command.rerank.enabled": false
  }
}
```
:::

:::{tab-item} 9.2.x

No automatic row limit is applied. **You should always use `LIMIT` before or after `RERANK` to control the number of documents processed**, to avoid accidentally reranking large datasets which can result in high latency and increased costs.

For example:
```esql
FROM books METADATA _score
| WHERE title:"search query"
| SORT _score DESC
| LIMIT 100 // Limit to top 100 results before reranking
| RERANK "search query" ON title WITH { "inference_id" : "my_rerank_endpoint" }
```