Update doc to mention implicit limit in 9.3.0
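
For readers on 9.3.0+ who would rather not rely on the implicit limit, an explicit `LIMIT` placed before `COMPLETION` keeps the row count, and so the number of LLM calls, visible in the query itself. A minimal sketch, assuming a hypothetical `books` index with a `title` field and a hypothetical inference endpoint named `my_completion_endpoint` (neither name comes from this change):

```esql
FROM books
| SORT title
| LIMIT 5  // explicit cap: at most 5 rows reach COMPLETION
| EVAL prompt = CONCAT("Summarize this book title in one sentence: ", title)  // build one prompt per row
| COMPLETION summary = prompt WITH { "inference_id" : "my_completion_endpoint" }  // hypothetical endpoint
| KEEP title, summary
```

Because each row that reaches `COMPLETION` generates a separate API call, this query should issue at most five calls, provided the cluster-level limit is not set lower.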

afoucret · afoucret · commit 3fe9f17d871d · 2025-12-08T17:19:47.000+01:00
diff --git a/docs/reference/query-languages/esql/_snippets/commands/layout/completion.md b/docs/reference/query-languages/esql/_snippets/commands/layout/completion.md
@@ -9,6 +9,35 @@ The `COMPLETION` command allows you to send prompts and context to a Large Langu
 :::{important}
 **Every row processed by the COMPLETION command generates a separate API call to the LLM endpoint.**
 
+::::{tab-set}
+
+:::{tab-item} 9.3.0+
+
+Starting in version 9.3.0, `COMPLETION` processes at most **100 rows by default** to prevent accidentally high LLM usage and costs. This limit is applied before the `COMPLETION` command executes.
+
+If you need to process more rows, raise the limit with the `esql.command.completion.limit` cluster setting:
+```console
+PUT _cluster/settings
+{
+  "persistent": {
+    "esql.command.completion.limit": 500
+  }
+}
+```
+
+To disable the command entirely, set `esql.command.completion.enabled` to `false`:
+```console
+PUT _cluster/settings
+{
+  "persistent": {
+    "esql.command.completion.enabled": false
+  }
+}
+```
+:::
+
+:::{tab-item} 9.1.x - 9.2.x
+
 Be careful to test with small datasets first before running on production data or in automated workflows, to avoid unexpected costs.
 
 Best practices:
@@ -19,6 +48,9 @@ Best practices:
 4. **Monitor usage**: Track your LLM API consumption and costs.
 :::
 
+::::
+:::
+
 **Syntax**
 
 ::::{tab-set}
diff --git a/docs/reference/query-languages/esql/_snippets/commands/layout/rerank.md b/docs/reference/query-languages/esql/_snippets/commands/layout/rerank.md
@@ -7,6 +7,49 @@ stack: preview 9.2.0
 The `RERANK` command uses an inference model to compute a new relevance score
 for an initial set of documents, directly within your ES|QL queries.
 
+::::{tab-set}
+
+:::{tab-item} 9.3.0+
+
+Starting in version 9.3.0, `RERANK` processes at most **1000 rows by default** to prevent accidentally high inference usage. This limit is applied before the `RERANK` command executes.
+
+If you need to process more rows, raise the limit with the `esql.command.rerank.limit` cluster setting:
+```console
+PUT _cluster/settings
+{
+  "persistent": {
+    "esql.command.rerank.limit": 5000
+  }
+}
+```
+
+To disable the command entirely, set `esql.command.rerank.enabled` to `false`:
+```console
+PUT _cluster/settings
+{
+  "persistent": {
+    "esql.command.rerank.enabled": false
+  }
+}
+```
+:::
+
+:::{tab-item} 9.2.x
+
+No automatic row limit is applied. **Always use `LIMIT` before or after `RERANK` to control the number of documents processed.** Otherwise you may accidentally rerank large datasets, which can result in high latency and increased costs.
+
+For example:
+```esql
+FROM books METADATA _score
+| WHERE title:"search query"
+| SORT _score DESC
+| LIMIT 100  // Limit to top 100 results before reranking
+| RERANK "search query" ON title WITH { "inference_id" : "my_rerank_endpoint" }
+```
+:::
+
+::::
+
 **Syntax**
 
 ```esql