From 15e495efaebdaf02a021646a16d90c2faa8f45d4 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Istv=C3=A1n=20Zolt=C3=A1n=20Szab=C3=B3?=
Date: Wed, 23 Jul 2025 12:10:32 +0200
Subject: [PATCH 01/15] Adds ELSER on EIS conceptual docs.

---
 explore-analyze/elastic-inference.md               |  8 ++-
 explore-analyze/elastic-inference/eis.md           | 18 +++++++
 .../elastic-inference/inference-api.md             | 53 ++++++++++---------
 explore-analyze/toc.yml                            |  1 +
 4 files changed, 53 insertions(+), 27 deletions(-)
 create mode 100644 explore-analyze/elastic-inference/eis.md

diff --git a/explore-analyze/elastic-inference.md b/explore-analyze/elastic-inference.md
index 9b19f2a4cd..851fb14c5c 100644
--- a/explore-analyze/elastic-inference.md
+++ b/explore-analyze/elastic-inference.md
@@ -7,7 +7,13 @@ navigation_title: Elastic Inference

 # Elastic {{infer-cap}}

-There are several ways to perform {{infer}} in the {{stack}}. This page provides a brief overview of the different methods:
+## Overview
+{{infer-cap}} is the process of using an LLM or a {{ml}} trained model to make predictions or operations - such as text embedding, completion, or reranking - on your data.
+You can use {{infer}} during ingest time (for example, to create embeddings from textual data you ingest) or search time (to perform [semantic search](/solutions/search/semantic-search.md)).
+There are several ways to perform {{infer}} in the {{stack}}:
+
+* [Using the Elastic {{infer-cap}} Service](/elastic-inference/eis.md)
+* [Using `semantic_text` if you want to perform semantic search](/solutions/search/semantic-search/semantic-search-semantic-text.md)
 * [Using the {{infer}} API](elastic-inference/inference-api.md)
 * [Trained models deployed in your cluster](machine-learning/nlp/ml-nlp-overview.md)

diff --git a/explore-analyze/elastic-inference/eis.md b/explore-analyze/elastic-inference/eis.md
new file mode 100644
index 0000000000..28886427aa
--- /dev/null
+++ b/explore-analyze/elastic-inference/eis.md
@@ -0,0 +1,18 @@
+---
+navigation_title: Elastic Inference Service (EIS)
+applies_to:
+  stack: ga 9.0
+  serverless: ga
+---
+
+# Elastic {{infer-cap}} Service [elastic-inference-service-eis]
+
+The Elastic {{infer-cap}} Service (EIS) enables you to leverage AI-powered search as a service without deploying a model in your cluster.
+With EIS, you don't need to manage the infrastructure and resources required for {{ml}} {{infer}} by adding, configuring, and scaling {{ml}} nodes.
+Instead, you can use {{ml}} models for ingest, search and chat independently of your {{es}} infrastructure.
+
+## AI features powered by EIS [ai-features-powered-by-eis]
+
+* Your Elastic deployment or project comes with a default [`Elastic Managed LLM` connector](https://www.elastic.co/docs/reference/kibana/connectors-kibana/elastic-managed-llm). This connector is used in the AI Assistant, Attack Discovery, Automatic Import and Search Playground.
+
+* applies_to`stack ga 9.1` You can use [ELSER](explore-analyze/machine-learning/nlp/ml-nlp-elser.md) to perform semantic search as a service (ELSER on EIS).
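+
+For example, once ELSER on EIS is available to your deployment, you can generate sparse embeddings through the {{infer}} API without any {{ml}} nodes. The following is a minimal sketch; it assumes the default `.elser-2-elastic` endpoint listed on the {{infer}} integrations page is available to you:
+
+```console
+POST _inference/sparse_embedding/.elser-2-elastic
+{
+  "input": "This request is served by EIS, not by ML nodes in your cluster."
+}
+```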
diff --git a/explore-analyze/elastic-inference/inference-api.md b/explore-analyze/elastic-inference/inference-api.md
index 0f8ba78092..0c00f3483b 100644
--- a/explore-analyze/elastic-inference/inference-api.md
+++ b/explore-analyze/elastic-inference/inference-api.md
@@ -9,18 +9,28 @@ products:
   - id: kibana
 ---

-# Inference integrations
+# {{infer-cap}} integrations

-{{es}} provides a machine learning [inference API](https://www.elastic.co/docs/api/doc/elasticsearch/v8/operation/operation-inference-get-1) to create and manage inference endpoints that integrate with services such as Elasticsearch (for built-in NLP models like [ELSER](/explore-analyze/machine-learning/nlp/ml-nlp-elser.md) and [E5](/explore-analyze/machine-learning/nlp/ml-nlp-e5.md)), as well as popular third-party services like Amazon Bedrock, Anthropic, Azure AI Studio, Cohere, Google AI, Mistral, OpenAI, Hugging Face, and more.
+{{es}} provides a machine learning [{{infer}} API](https://www.elastic.co/docs/api/doc/elasticsearch/v9/group/endpoint-inference) to create and manage {{infer}} endpoints that integrate with services such as {{es}} (for built-in NLP models like [ELSER](/explore-analyze/machine-learning/nlp/ml-nlp-elser.md) and [E5](/explore-analyze/machine-learning/nlp/ml-nlp-e5.md)), as well as popular third-party services like Amazon Bedrock, Anthropic, Azure AI Studio, Cohere, Google AI, Mistral, OpenAI, Hugging Face, and more.

-You can create a new inference endpoint:
+You can use the default {{infer}} endpoints your deployment contains or create a new {{infer}} endpoint:

-- using the [Create an inference endpoint API](https://www.elastic.co/docs/api/doc/elasticsearch/v8/operation/operation-inference-put-1)
+- using the [Create an inference endpoint API](https://www.elastic.co/docs/api/doc/elasticsearch/v9/operation/operation-inference-put)
 - through the [Inference endpoints UI](#add-inference-endpoints).

-## Inference endpoints UI [inference-endpoints]
+## Default {{infer}} endpoints [default-endpoints]
+
+Your {{es}} deployment contains preconfigured {{infer}} endpoints which makes them easier to use when defining `semantic_text` fields or using {{infer}} processors. The following list contains the default {{infer}} endpoints listed by `inference_id`:
+
+- applies_to`stack preview 9.1` `.elser-2-elastic`: uses the [ELSER](explore-analyze/machine-learning/nlp/ml-nlp-elser.md) trained model as an Elastic {{infer-cap}} Service for `sparse_embedding` tasks (recommended for English language text). The `model_id` is `.elser_model_2`.
+- `.elser-2-elasticsearch`: uses the [ELSER](explore-analyze/machine-learning/nlp/ml-nlp-elser.md) built-in trained model for `sparse_embedding` tasks (recommended for English language text). The `model_id` is `.elser_model_2_linux-x86_64`.
+- `.multilingual-e5-small-elasticsearch`: uses the [E5](../../explore-analyze/machine-learning/nlp/ml-nlp-e5.md) built-in trained model for `text_embedding` tasks (recommended for non-English language texts). The `model_id` is `.e5_model_2_linux-x86_64`.
+
+Use the `inference_id` of the endpoint in a [`semantic_text`](elasticsearch://reference/elasticsearch/mapping-reference/semantic-text.md) field definition or when creating an [{{infer}} processor](elasticsearch://reference/enrich-processor/inference-processor.md). The API call will automatically download and deploy the model which might take a couple of minutes. Default {{infer}} endpoints have adaptive allocations enabled. For these models, the minimum number of allocations is `0`. If there is no {{infer}} activity that uses the endpoint, the number of allocations will scale down to `0` automatically after 15 minutes.
+
+## {{infer-cap}} endpoints UI [inference-endpoints]

-The **Inference endpoints** page provides an interface for managing inference endpoints.
+The **{{infer-cap}} endpoints** page provides an interface for managing {{infer}} endpoints.

 :::{image} /explore-analyze/images/kibana-inference-endpoints-ui.png
 :alt: Inference endpoints UI
@@ -29,31 +39,31 @@ The **Inference endpoints** page provides an interface for managing inference en

 Available actions:

-* Add new endpoint
-* View endpoint details
-* Copy the inference endpoint ID
-* Delete endpoints
+- Add new endpoint
+- View endpoint details
+- Copy the inference endpoint ID
+- Delete endpoints

-## Add new inference endpoint [add-inference-endpoints]
+## Add new {{infer}} endpoint [add-inference-endpoints]

-To add a new interference endpoint using the UI:
+To add a new {{infer}} endpoint using the UI:

 1. Select the **Add endpoint** button.
 1. Select a service from the drop down menu.
 1. Provide the required configuration details.
 1. Select **Save** to create the endpoint.

-If your inference endpoint uses a model deployed in Elastic’s infrastructure, such as ELSER, E5, or a model uploaded through Eland, you can configure [adaptive allocations](#adaptive-allocations) to dynamically adjust resource usage based on the current demand.
+If your {{infer}} endpoint uses a model deployed in Elastic’s infrastructure, such as ELSER, E5, or a model uploaded through Eland, you can configure [adaptive allocations](#adaptive-allocations) to dynamically adjust resource usage based on the current demand.

 ## Adaptive allocations [adaptive-allocations]

-Adaptive allocations allow inference services to dynamically adjust the number of model allocations based on the current load.
-This feature is only supported for models deployed in Elastic’s infrastructure, such as ELSER, E5, or models uploaded through Eland. It is not available for third-party services (for example, Alibaba Cloud, Cohere, or OpenAI), because those models are hosted externally and not deployed within your Elasticsearch cluster.
+Adaptive allocations allow {{infer}} services to dynamically adjust the number of model allocations based on the current load.
+This feature is only supported for models deployed in Elastic’s infrastructure, such as ELSER, E5, or models uploaded through Eland. It is not available for models used through the Elastic {{infer-cap}} Service (EIS) and third-party services (for example, Alibaba Cloud, Cohere, or OpenAI), because those models are not deployed within your Elasticsearch cluster.

 When adaptive allocations are enabled:

-* The number of allocations scales up automatically when the load increases.
-* Allocations scale down to a minimum of 0 when the load decreases, saving resources.
+- The number of allocations scales up automatically when the load increases.
+- Allocations scale down to a minimum of 0 when the load decreases, saving resources.

 ### Allocation scaling behavior
@@ -71,15 +81,6 @@ However, setting the `min_number_of_allocations` to a value greater than `0` kee

 For more information about adaptive allocations and resources, refer to the [trained model autoscaling](/deploy-manage/autoscaling/trained-model-autoscaling.md) documentation.
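+
+For example, a sketch of creating an endpoint with adaptive allocations through the [Create an inference endpoint API](https://www.elastic.co/docs/api/doc/elasticsearch/v9/operation/operation-inference-put). The endpoint name `my-elser-endpoint` and the allocation limits are illustrative placeholders; setting a minimum above `0` keeps that capacity reserved, as described above:
+
+```console
+PUT _inference/sparse_embedding/my-elser-endpoint
+{
+  "service": "elasticsearch",
+  "service_settings": {
+    "model_id": ".elser_model_2_linux-x86_64",
+    "num_threads": 1,
+    "adaptive_allocations": {
+      "enabled": true,
+      "min_number_of_allocations": 1,
+      "max_number_of_allocations": 4
+    }
+  }
+}
+```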
-## Default {{infer}} endpoints [default-enpoints]
-
-Your {{es}} deployment contains preconfigured {{infer}} endpoints which makes them easier to use when defining `semantic_text` fields or using {{infer}} processors. The following list contains the default {{infer}} endpoints listed by `inference_id`:
-
-* `.elser-2-elasticsearch`: uses the [ELSER](../../explore-analyze/machine-learning/nlp/ml-nlp-elser.md) built-in trained model for `sparse_embedding` tasks (recommended for English language tex). The `model_id` is `.elser_model_2_linux-x86_64`.
-* `.multilingual-e5-small-elasticsearch`: uses the [E5](../../explore-analyze/machine-learning/nlp/ml-nlp-e5.md) built-in trained model for `text_embedding` tasks (recommended for non-English language texts). The `model_id` is `.e5_model_2_linux-x86_64`.
-
-Use the `inference_id` of the endpoint in a [`semantic_text`](elasticsearch://reference/elasticsearch/mapping-reference/semantic-text.md) field definition or when creating an [{{infer}} processor](elasticsearch://reference/enrich-processor/inference-processor.md). The API call will automatically download and deploy the model which might take a couple of minutes. Default {{infer}} enpoints have adaptive allocations enabled. For these models, the minimum number of allocations is `0`. If there is no {{infer}} activity that uses the endpoint, the number of allocations will scale down to `0` automatically after 15 minutes.
-
 ## Configuring chunking [infer-chunking-config]

 {{infer-cap}} endpoints have a limit on the amount of text they can process at once, determined by the model's input capacity. Chunking is the process of splitting the input text into pieces that remain within these limits.
diff --git a/explore-analyze/toc.yml b/explore-analyze/toc.yml
index 0e0de372d3..87db28a021 100644
--- a/explore-analyze/toc.yml
+++ b/explore-analyze/toc.yml
@@ -122,6 +122,7 @@ toc:
       - file: transforms/transform-limitations.md
   - file: elastic-inference.md
     children:
+      - file: elastic-inference/eis.md
       - file: elastic-inference/inference-api.md
   - file: machine-learning.md
     children:

From 1208409586b8f5dd6acb0109e0064d04531a368b Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Istv=C3=A1n=20Zolt=C3=A1n=20Szab=C3=B3?=
Date: Wed, 23 Jul 2025 12:36:57 +0200
Subject: [PATCH 02/15] Fixes links.

---
 explore-analyze/elastic-inference.md               | 2 +-
 explore-analyze/elastic-inference/eis.md           | 2 +-
 explore-analyze/elastic-inference/inference-api.md | 4 ++--
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/explore-analyze/elastic-inference.md b/explore-analyze/elastic-inference.md
index 851fb14c5c..7f95f79a64 100644
--- a/explore-analyze/elastic-inference.md
+++ b/explore-analyze/elastic-inference.md
@@ -13,7 +13,7 @@ navigation_title: Elastic Inference
 You can use {{infer}} during ingest time (for example, to create embeddings from textual data you ingest) or search time (to perform [semantic search](/solutions/search/semantic-search.md)).
 There are several ways to perform {{infer}} in the {{stack}}:

-* [Using the Elastic {{infer-cap}} Service](/elastic-inference/eis.md)
+* [Using the Elastic {{infer-cap}} Service](elastic-inference/eis.md)
 * [Using `semantic_text` if you want to perform semantic search](/solutions/search/semantic-search/semantic-search-semantic-text.md)
 * [Using the {{infer}} API](elastic-inference/inference-api.md)
 * [Trained models deployed in your cluster](machine-learning/nlp/ml-nlp-overview.md)

diff --git a/explore-analyze/elastic-inference/eis.md b/explore-analyze/elastic-inference/eis.md
index 28886427aa..aafd783dfb 100644
--- a/explore-analyze/elastic-inference/eis.md
+++ b/explore-analyze/elastic-inference/eis.md
@@ -15,4 +15,4 @@ Instead, you can use {{ml}} models for ingest, search and chat independently of

 * Your Elastic deployment or project comes with a default [`Elastic Managed LLM` connector](https://www.elastic.co/docs/reference/kibana/connectors-kibana/elastic-managed-llm). This connector is used in the AI Assistant, Attack Discovery, Automatic Import and Search Playground.

-* applies_to`stack ga 9.1` You can use [ELSER](explore-analyze/machine-learning/nlp/ml-nlp-elser.md) to perform semantic search as a service (ELSER on EIS).
+* applies_to`stack ga 9.1` You can use [ELSER](/explore-analyze/machine-learning/nlp/ml-nlp-elser.md) to perform semantic search as a service (ELSER on EIS).
diff --git a/explore-analyze/elastic-inference/inference-api.md b/explore-analyze/elastic-inference/inference-api.md
index 0c00f3483b..b35129494f 100644
--- a/explore-analyze/elastic-inference/inference-api.md
+++ b/explore-analyze/elastic-inference/inference-api.md
@@ -22,8 +22,8 @@ You can use the default {{infer}} endpoints your deployment contains or create a

 Your {{es}} deployment contains preconfigured {{infer}} endpoints which makes them easier to use when defining `semantic_text` fields or using {{infer}} processors. The following list contains the default {{infer}} endpoints listed by `inference_id`:

-- applies_to`stack preview 9.1` `.elser-2-elastic`: uses the [ELSER](explore-analyze/machine-learning/nlp/ml-nlp-elser.md) trained model as an Elastic {{infer-cap}} Service for `sparse_embedding` tasks (recommended for English language text). The `model_id` is `.elser_model_2`.
+- applies_to`stack preview 9.1` `.elser-2-elastic`: uses the [ELSER](/explore-analyze/machine-learning/nlp/ml-nlp-elser.md) trained model as an Elastic {{infer-cap}} Service for `sparse_embedding` tasks (recommended for English language text). The `model_id` is `.elser_model_2`.
-- `.elser-2-elasticsearch`: uses the [ELSER](explore-analyze/machine-learning/nlp/ml-nlp-elser.md) built-in trained model for `sparse_embedding` tasks (recommended for English language text). The `model_id` is `.elser_model_2_linux-x86_64`.
+- `.elser-2-elasticsearch`: uses the [ELSER](/explore-analyze/machine-learning/nlp/ml-nlp-elser.md) built-in trained model for `sparse_embedding` tasks (recommended for English language text). The `model_id` is `.elser_model_2_linux-x86_64`.
 - `.multilingual-e5-small-elasticsearch`: uses the [E5](../../explore-analyze/machine-learning/nlp/ml-nlp-e5.md) built-in trained model for `text_embedding` tasks (recommended for non-English language texts). The `model_id` is `.e5_model_2_linux-x86_64`.
 Use the `inference_id` of the endpoint in a [`semantic_text`](elasticsearch://reference/elasticsearch/mapping-reference/semantic-text.md) field definition or when creating an [{{infer}} processor](elasticsearch://reference/enrich-processor/inference-processor.md). The API call will automatically download and deploy the model which might take a couple of minutes. Default {{infer}} endpoints have adaptive allocations enabled. For these models, the minimum number of allocations is `0`. If there is no {{infer}} activity that uses the endpoint, the number of allocations will scale down to `0` automatically after 15 minutes.

From fd3310881b70bfcf16c694e6451f0b6229d2ca95 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Istv=C3=A1n=20Zolt=C3=A1n=20Szab=C3=B3?=
Date: Wed, 23 Jul 2025 12:58:13 +0200
Subject: [PATCH 03/15] adds limitations.

---
 explore-analyze/elastic-inference/eis.md | 32 ++++++++++++++++++-
 .../elastic-inference/inference-api.md   |  2 +-
 2 files changed, 32 insertions(+), 2 deletions(-)

diff --git a/explore-analyze/elastic-inference/eis.md b/explore-analyze/elastic-inference/eis.md
index aafd783dfb..f67e70ad77 100644
--- a/explore-analyze/elastic-inference/eis.md
+++ b/explore-analyze/elastic-inference/eis.md
@@ -15,4 +15,34 @@ Instead, you can use {{ml}} models for ingest, search and chat independently of

 * Your Elastic deployment or project comes with a default [`Elastic Managed LLM` connector](https://www.elastic.co/docs/reference/kibana/connectors-kibana/elastic-managed-llm). This connector is used in the AI Assistant, Attack Discovery, Automatic Import and Search Playground.

-* applies_to`stack ga 9.1` You can use [ELSER](/explore-analyze/machine-learning/nlp/ml-nlp-elser.md) to perform semantic search as a service (ELSER on EIS).
+* {applies_to}`stack preview 9.1` You can use [ELSER](/explore-analyze/machine-learning/nlp/ml-nlp-elser.md) to perform semantic search as a service (ELSER on EIS).
+
+## ELSER via Elastic {{infer-cap}} Service (ELSER on EIS)
+
+{applies_to}`stack preview 9.1`
+{applies_to}`serverless preview`
+
+ELSER on EIS enables you to use the ELSER model without using ML nodes in your infrastructure.
+
+### Limitations
+
+#### Access
+
+This feature is being gradually rolled out to Serverless and Cloud Hosted customers.
+It may not be available to all users at launch.
+
+#### Uptime
+
+There are no uptime guarantees during the Technical Preview.
+While Elastic will address issues promptly, the feature may be unavailable for extended periods.
+
+#### Throughput and latency
+
+{{infer-cap}} throughput via this endpoint is expected to exceed that of {{infer}} operations on an ML node.
+However, throughput and latency are not guaranteed.
+Performance may vary during the Technical Preview.
+
+#### Batch size
+
+Batches are limited to a maximum of 16 documents.
+This is particularly relevant when using the [_bulk API](https://www.elastic.co/docs/api/doc/elasticsearch/v9/operation/operation-bulk) for data ingestion.
diff --git a/explore-analyze/elastic-inference/inference-api.md b/explore-analyze/elastic-inference/inference-api.md
index b35129494f..3709c155d1 100644
--- a/explore-analyze/elastic-inference/inference-api.md
+++ b/explore-analyze/elastic-inference/inference-api.md
@@ -22,7 +22,7 @@ You can use the default {{infer}} endpoints your deployment contains or create a

 Your {{es}} deployment contains preconfigured {{infer}} endpoints which makes them easier to use when defining `semantic_text` fields or using {{infer}} processors. The following list contains the default {{infer}} endpoints listed by `inference_id`:

-- applies_to`stack preview 9.1` `.elser-2-elastic`: uses the [ELSER](/explore-analyze/machine-learning/nlp/ml-nlp-elser.md) trained model as an Elastic {{infer-cap}} Service for `sparse_embedding` tasks (recommended for English language text). The `model_id` is `.elser_model_2`.
+- {applies_to}`stack preview 9.1` `.elser-2-elastic`: uses the [ELSER](/explore-analyze/machine-learning/nlp/ml-nlp-elser.md) trained model as an Elastic {{infer-cap}} Service for `sparse_embedding` tasks (recommended for English language text). The `model_id` is `.elser_model_2`.
 - `.elser-2-elasticsearch`: uses the [ELSER](/explore-analyze/machine-learning/nlp/ml-nlp-elser.md) built-in trained model for `sparse_embedding` tasks (recommended for English language text). The `model_id` is `.elser_model_2_linux-x86_64`.
 - `.multilingual-e5-small-elasticsearch`: uses the [E5](../../explore-analyze/machine-learning/nlp/ml-nlp-e5.md) built-in trained model for `text_embedding` tasks (recommended for non-English language texts). The `model_id` is `.e5_model_2_linux-x86_64`.

From e486c9fe1b3b4d6f68e88d3e8df84e273f0462e9 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Istv=C3=A1n=20Zolt=C3=A1n=20Szab=C3=B3?=
Date: Wed, 23 Jul 2025 13:01:53 +0200
Subject: [PATCH 04/15] Updates tags.

---
 explore-analyze/elastic-inference/eis.md           | 6 +++---
 explore-analyze/elastic-inference/inference-api.md | 2 +-
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/explore-analyze/elastic-inference/eis.md b/explore-analyze/elastic-inference/eis.md
index f67e70ad77..28178123e3 100644
--- a/explore-analyze/elastic-inference/eis.md
+++ b/explore-analyze/elastic-inference/eis.md
@@ -15,12 +15,12 @@ Instead, you can use {{ml}} models for ingest, search and chat independently of

 * Your Elastic deployment or project comes with a default [`Elastic Managed LLM` connector](https://www.elastic.co/docs/reference/kibana/connectors-kibana/elastic-managed-llm). This connector is used in the AI Assistant, Attack Discovery, Automatic Import and Search Playground.

-* {applies_to}`stack preview 9.1` You can use [ELSER](/explore-analyze/machine-learning/nlp/ml-nlp-elser.md) to perform semantic search as a service (ELSER on EIS).
+* {applies_to}`stack: preview 9.1` You can use [ELSER](/explore-analyze/machine-learning/nlp/ml-nlp-elser.md) to perform semantic search as a service (ELSER on EIS).

 ## ELSER via Elastic {{infer-cap}} Service (ELSER on EIS)

-{applies_to}`stack preview 9.1`
-{applies_to}`serverless preview`
+{applies_to}`stack: preview 9.1`
+{applies_to}`serverless: preview`

 ELSER on EIS enables you to use the ELSER model without using ML nodes in your infrastructure.

diff --git a/explore-analyze/elastic-inference/inference-api.md b/explore-analyze/elastic-inference/inference-api.md
index 3709c155d1..f2aec33516 100644
--- a/explore-analyze/elastic-inference/inference-api.md
+++ b/explore-analyze/elastic-inference/inference-api.md
@@ -22,7 +22,7 @@ You can use the default {{infer}} endpoints your deployment contains or create a

 Your {{es}} deployment contains preconfigured {{infer}} endpoints which makes them easier to use when defining `semantic_text` fields or using {{infer}} processors. The following list contains the default {{infer}} endpoints listed by `inference_id`:

-- {applies_to}`stack preview 9.1` `.elser-2-elastic`: uses the [ELSER](/explore-analyze/machine-learning/nlp/ml-nlp-elser.md) trained model as an Elastic {{infer-cap}} Service for `sparse_embedding` tasks (recommended for English language text). The `model_id` is `.elser_model_2`.
+- {applies_to}`stack: preview 9.1` `.elser-2-elastic`: uses the [ELSER](/explore-analyze/machine-learning/nlp/ml-nlp-elser.md) trained model as an Elastic {{infer-cap}} Service for `sparse_embedding` tasks (recommended for English language text). The `model_id` is `.elser_model_2`.
 - `.elser-2-elasticsearch`: uses the [ELSER](/explore-analyze/machine-learning/nlp/ml-nlp-elser.md) built-in trained model for `sparse_embedding` tasks (recommended for English language text). The `model_id` is `.elser_model_2_linux-x86_64`.
 - `.multilingual-e5-small-elasticsearch`: uses the [E5](../../explore-analyze/machine-learning/nlp/ml-nlp-e5.md) built-in trained model for `text_embedding` tasks (recommended for non-English language texts). The `model_id` is `.e5_model_2_linux-x86_64`.

From e7c45ef00b6a7fb4bae3cd32acdb2dd3e93d6f38 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Istv=C3=A1n=20Zolt=C3=A1n=20Szab=C3=B3?=
Date: Wed, 23 Jul 2025 15:25:32 +0200
Subject: [PATCH 05/15] More edits.

---
 explore-analyze/elastic-inference/eis.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/explore-analyze/elastic-inference/eis.md b/explore-analyze/elastic-inference/eis.md
index 28178123e3..1a6e5631e7 100644
--- a/explore-analyze/elastic-inference/eis.md
+++ b/explore-analyze/elastic-inference/eis.md
@@ -22,7 +22,7 @@ Instead, you can use {{ml}} models for ingest, search and chat independently of

 {applies_to}`stack: preview 9.1`
 {applies_to}`serverless: preview`

-ELSER on EIS enables you to use the ELSER model without using ML nodes in your infrastructure.
+ELSER on EIS enables you to use the ELSER model without using ML nodes in your infrastructure, which simplifies the semantic search and hybrid search experience.

From 35cf819a94b272d5710ac5150fcfa95ef97d8fd2 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Istv=C3=A1n=20Zolt=C3=A1n=20Szab=C3=B3?=
Date: Thu, 24 Jul 2025 12:27:25 +0200
Subject: [PATCH 06/15] Adds more details to EIS docs.

---
 explore-analyze/elastic-inference/eis.md | 9 +++++++++
 1 file changed, 9 insertions(+)

diff --git a/explore-analyze/elastic-inference/eis.md b/explore-analyze/elastic-inference/eis.md
index 1a6e5631e7..001af627a5 100644
--- a/explore-analyze/elastic-inference/eis.md
+++ b/explore-analyze/elastic-inference/eis.md
@@ -17,6 +17,15 @@ Instead, you can use {{ml}} models for ingest, search and chat independently of

 * {applies_to}`stack: preview 9.1` You can use [ELSER](/explore-analyze/machine-learning/nlp/ml-nlp-elser.md) to perform semantic search as a service (ELSER on EIS).

+## Region and hosting [eis-regions]
+
+The EIS requests are currently proxied to AWS Bedrock in AWS US regions, beginning with `us-east-1`.
+The request routing does not restrict the location of your deployments.
+
+For more details on AWS regions, refer to the [AWS Global Infrastructure](https://aws.amazon.com/about-aws/global-infrastructure/regions_az/).
+
+
 ## ELSER via Elastic {{infer-cap}} Service (ELSER on EIS)

 {applies_to}`stack: preview 9.1`
 {applies_to}`serverless: preview`

From d78de498c5cc0d17a2db9afb4ed1091b13509457 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Istv=C3=A1n=20Zolt=C3=A1n=20Szab=C3=B3?=
Date: Thu, 24 Jul 2025 12:35:37 +0200
Subject: [PATCH 07/15] Adds google form link.

---
 explore-analyze/elastic-inference/eis.md | 8 +++++---
 1 file changed, 5 insertions(+), 3 deletions(-)

diff --git a/explore-analyze/elastic-inference/eis.md b/explore-analyze/elastic-inference/eis.md
index 001af627a5..a69fbd8b4c 100644
--- a/explore-analyze/elastic-inference/eis.md
+++ b/explore-analyze/elastic-inference/eis.md
@@ -24,15 +24,17 @@ The request routing does not restrict the location of your deployments.

 ## ELSER via Elastic {{infer-cap}} Service (ELSER on EIS)

-{applies_to}`stack: preview 9.1`
+{applies_to}`stack: preview 9.1, serverless: preview`
 {applies_to}`serverless: preview`

 ELSER on EIS enables you to use the ELSER model without using ML nodes in your infrastructure, which simplifies the semantic search and hybrid search experience.

+### Private preview access
+
+Private preview access is available by submitting the form provided [here](https://docs.google.com/forms/d/e/1FAIpQLSfp2rLsayhw6pLVQYYp4KM6BFtaaljplWdYowJfflpOICgViA/viewform).
+
 ### Limitations

From 35f174601544b6c76aadc49187132de34009be07 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Istv=C3=A1n=20Zolt=C3=A1n=20Szab=C3=B3?=
Date: Thu, 24 Jul 2025 12:39:29 +0200
Subject: [PATCH 08/15] More edits.

---
 explore-analyze/elastic-inference/eis.md | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/explore-analyze/elastic-inference/eis.md b/explore-analyze/elastic-inference/eis.md
index a69fbd8b4c..81d26a66cd 100644
--- a/explore-analyze/elastic-inference/eis.md
+++ b/explore-analyze/elastic-inference/eis.md
@@ -9,13 +9,13 @@ applies_to:

 The Elastic {{infer-cap}} Service (EIS) enables you to leverage AI-powered search as a service without deploying a model in your cluster.
 With EIS, you don't need to manage the infrastructure and resources required for {{ml}} {{infer}} by adding, configuring, and scaling {{ml}} nodes.
-Instead, you can use {{ml}} models for ingest, search and chat independently of your {{es}} infrastructure.
+Instead, you can use {{ml}} models for ingest, search, and chat independently of your {{es}} infrastructure.

 ## AI features powered by EIS [ai-features-powered-by-eis]

 * Your Elastic deployment or project comes with a default [`Elastic Managed LLM` connector](https://www.elastic.co/docs/reference/kibana/connectors-kibana/elastic-managed-llm). This connector is used in the AI Assistant, Attack Discovery, Automatic Import and Search Playground.

-* {applies_to}`stack: preview 9.1` You can use [ELSER](/explore-analyze/machine-learning/nlp/ml-nlp-elser.md) to perform semantic search as a service (ELSER on EIS).
+* {applies_to}`stack: preview 9.1` {applies_to}`serverless: preview` You can use [ELSER](/explore-analyze/machine-learning/nlp/ml-nlp-elser.md) to perform semantic search as a service (ELSER on EIS).
 ## Region and hosting [eis-regions]

 The EIS requests are currently proxied to AWS Bedrock in AWS US regions, beginning with `us-east-1`.
 The request routing does not restrict the location of your deployments.

 ## ELSER via Elastic {{infer-cap}} Service (ELSER on EIS)

-{applies_to}`stack: preview 9.1, serverless: preview`
-{applies_to}`serverless: preview`
+{applies_to}`stack: preview 9.1` {applies_to}`serverless: preview`

From 0914720a10b41c82d10643f213052b7f1bcd836e Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Istv=C3=A1n=20Zolt=C3=A1n=20Szab=C3=B3?=
Date: Thu, 24 Jul 2025 12:46:00 +0200
Subject: [PATCH 09/15] Fine-tunes content.

---
 explore-analyze/elastic-inference/eis.md           | 2 --
 explore-analyze/elastic-inference/inference-api.md | 2 +-
 2 files changed, 1 insertion(+), 3 deletions(-)

diff --git a/explore-analyze/elastic-inference/eis.md b/explore-analyze/elastic-inference/eis.md
index 81d26a66cd..91ee559312 100644
--- a/explore-analyze/elastic-inference/eis.md
+++ b/explore-analyze/elastic-inference/eis.md
@@ -22,8 +22,6 @@ Instead, you can use {{ml}} models for ingest, search, and chat independently of
 The EIS requests are currently proxied to AWS Bedrock in AWS US regions, beginning with `us-east-1`.
 The request routing does not restrict the location of your deployments.

-For more details on AWS regions, refer to the [AWS Global Infrastructure](https://aws.amazon.com/about-aws/global-infrastructure/regions_az/).
-
 ## ELSER via Elastic {{infer-cap}} Service (ELSER on EIS)

 {applies_to}`stack: preview 9.1` {applies_to}`serverless: preview`

diff --git a/explore-analyze/elastic-inference/inference-api.md b/explore-analyze/elastic-inference/inference-api.md
index f2aec33516..e523eaf6f2 100644
--- a/explore-analyze/elastic-inference/inference-api.md
+++ b/explore-analyze/elastic-inference/inference-api.md
@@ -22,7 +22,7 @@ You can use the default {{infer}} endpoints your deployment contains or create a

 Your {{es}} deployment contains preconfigured {{infer}} endpoints which makes them easier to use when defining `semantic_text` fields or using {{infer}} processors. The following list contains the default {{infer}} endpoints listed by `inference_id`:

-- {applies_to}`stack: preview 9.1` `.elser-2-elastic`: uses the [ELSER](/explore-analyze/machine-learning/nlp/ml-nlp-elser.md) trained model as an Elastic {{infer-cap}} Service for `sparse_embedding` tasks (recommended for English language text). The `model_id` is `.elser_model_2`.
+- {applies_to}`stack: preview 9.1` {applies_to}`serverless: preview` `.elser-2-elastic`: uses the [ELSER](/explore-analyze/machine-learning/nlp/ml-nlp-elser.md) trained model as an Elastic {{infer-cap}} Service for `sparse_embedding` tasks (recommended for English language text). The `model_id` is `.elser_model_2`.
 - `.elser-2-elasticsearch`: uses the [ELSER](/explore-analyze/machine-learning/nlp/ml-nlp-elser.md) built-in trained model for `sparse_embedding` tasks (recommended for English language text). The `model_id` is `.elser_model_2_linux-x86_64`.
 - `.multilingual-e5-small-elasticsearch`: uses the [E5](../../explore-analyze/machine-learning/nlp/ml-nlp-e5.md) built-in trained model for `text_embedding` tasks (recommended for non-English language texts). The `model_id` is `.e5_model_2_linux-x86_64`.
From cc0f8df5a310c5db4847d73266f135d13e8576f8 Mon Sep 17 00:00:00 2001 From: =?UTF-8?q?Istv=C3=A1n=20Zolt=C3=A1n=20Szab=C3=B3?= Date: Mon, 28 Jul 2025 11:42:27 +0200 Subject: [PATCH 10/15] Addresses feedback. --- explore-analyze/elastic-inference.md | 2 +- explore-analyze/elastic-inference/eis.md | 4 +++- explore-analyze/elastic-inference/inference-api.md | 13 ++++++++++++- 3 files changed, 16 insertions(+), 3 deletions(-) diff --git a/explore-analyze/elastic-inference.md b/explore-analyze/elastic-inference.md index 7f95f79a64..db6893977a 100644 --- a/explore-analyze/elastic-inference.md +++ b/explore-analyze/elastic-inference.md @@ -9,7 +9,7 @@ navigation_title: Elastic Inference ## Overview -{{infer-cap}} is a process of using an LLM or a {{ml}} trained model to make predictions or operations - such as text embedding, completion, or reranking - on your data. +{{infer-cap}} is a process of using a {{ml}} trained model to make predictions or operations - such as text embedding, or reranking - on your data. You can use {{infer}} during ingest time (for example, to create embeddings from textual data you ingest) or search time (to perform [semantic search](/solutions/search/semantic-search.md)). There are several ways to perform {{infer}} in the {{stack}}: diff --git a/explore-analyze/elastic-inference/eis.md b/explore-analyze/elastic-inference/eis.md index 91ee559312..4aae269d10 100644 --- a/explore-analyze/elastic-inference/eis.md +++ b/explore-analyze/elastic-inference/eis.md @@ -19,9 +19,11 @@ Instead, you can use {{ml}} models for ingest, search, and chat independently of ## Region and hosting [eis-regions] -The EIS requests are currently proxying to AWS Bedrock in AWS US regions, beginning with `us-east-1`. +Requests through the Elastic Managed LLM are currently proxying to AWS Bedrock in AWS US regions, beginning with `us-east-1`. The request routing does not restrict the location of your deployments. +ELSER requests are managed by Elastic own EIS infrastructure. + ## ELSER via Elastic {{infer-cap}} Service (ELSER on EIS) {applies_to}`stack: preview 9.1` {applies_to}`serverless: preview` diff --git a/explore-analyze/elastic-inference/inference-api.md b/explore-analyze/elastic-inference/inference-api.md index e523eaf6f2..b41a05dffa 100644 --- a/explore-analyze/elastic-inference/inference-api.md +++ b/explore-analyze/elastic-inference/inference-api.md @@ -20,9 +20,20 @@ You can use the default {{infer}} endpoints your deployment contains or create a ## Default {{infer}} endpoints [default-enpoints] -Your {{es}} deployment contains preconfigured {{infer}} endpoints which makes them easier to use when defining `semantic_text` fields or using {{infer}} processors. The following list contains the default {{infer}} endpoints listed by `inference_id`: +Your {{es}} deployment contains preconfigured {{infer}} endpoints, which makes them easier to use when defining `semantic_text` fields or using {{infer}} processors. These endpoints come in two forms: + +- **Elastic Inference Service (EIS) endpoints**, which provide {{infer}} as a managed service and do not consume resources from your own nodes. + +- **ML node-based endpoints**, which run on your dedicated {{ml}} nodes. + +The following section lists the default {{infer}} endpoints, identified by their `inference_id`, grouped by whether they are EIS- or ML node–based. 
+### Default endpoints for Elastic {{infer-cap}} Service (EIS)
+
 - {applies_to}`stack: preview 9.1` {applies_to}`serverless: preview` `.elser-2-elastic`: uses the [ELSER](/explore-analyze/machine-learning/nlp/ml-nlp-elser.md) trained model as an Elastic {{infer-cap}} Service for `sparse_embedding` tasks (recommended for English language text). The `model_id` is `.elser_model_2`.
+
+### Default endpoints used on ML nodes
+
 - `.elser-2-elasticsearch`: uses the [ELSER](/explore-analyze/machine-learning/nlp/ml-nlp-elser.md) built-in trained model for `sparse_embedding` tasks (recommended for English language text). The `model_id` is `.elser_model_2_linux-x86_64`.
 - `.multilingual-e5-small-elasticsearch`: uses the [E5](../../explore-analyze/machine-learning/nlp/ml-nlp-e5.md) built-in trained model for `text_embedding` tasks (recommended for non-English language texts). The `model_id` is `.e5_model_2_linux-x86_64`.

From 30be44d0b15e25658ceda67662e55a2b389e20d5 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Istv=C3=A1n=20Zolt=C3=A1n=20Szab=C3=B3?=
Date: Mon, 28 Jul 2025 14:48:09 +0200
Subject: [PATCH 11/15] Addresses feedback.

---
 explore-analyze/elastic-inference.md     | 17 +++++++++++------
 explore-analyze/elastic-inference/eis.md |  4 ++--
 2 files changed, 13 insertions(+), 8 deletions(-)

diff --git a/explore-analyze/elastic-inference.md b/explore-analyze/elastic-inference.md
index db6893977a..6fbe1978c2 100644
--- a/explore-analyze/elastic-inference.md
+++ b/explore-analyze/elastic-inference.md
@@ -10,10 +10,15 @@ navigation_title: Elastic Inference

 ## Overview
 {{infer-cap}} is the process of using a {{ml}} trained model to make predictions or operations - such as text embedding or reranking - on your data.
-You can use {{infer}} during ingest time (for example, to create embeddings from textual data you ingest) or search time (to perform [semantic search](/solutions/search/semantic-search.md)).
-There are several ways to perform {{infer}} in the {{stack}}:
+You can use {{infer}} during ingest time (for example, to create embeddings from textual data you ingest) or search time (to perform [semantic search](/solutions/search/semantic-search.md) based on the embeddings created previously).
+There are several ways to perform {{infer}} in the {{stack}}, depending on the underlying {{infer}} infrastructure and the interface you use:

-* [Using the Elastic {{infer-cap}} Service](elastic-inference/eis.md)
-* [Using `semantic_text` if you want to perform semantic search](/solutions/search/semantic-search/semantic-search-semantic-text.md)
-* [Using the {{infer}} API](elastic-inference/inference-api.md)
-* [Trained models deployed in your cluster](machine-learning/nlp/ml-nlp-overview.md)
+- **{{infer-cap}} infrastructure:**
+
+  - [Elastic {{infer-cap}} Service](elastic-inference/eis.md): a managed service that runs {{infer}} outside your cluster resources.
+  - [Trained models deployed in your cluster](machine-learning/nlp/ml-nlp-overview.md): models run on your own {{ml}} nodes.
+
+- **Access methods:**
+
+  - [The `semantic_text` workflow](/solutions/search/semantic-search/semantic-search-semantic-text.md): a simplified method that uses the {{infer}} API behind the scenes to enable semantic search, as sketched below.
+  - [The {{infer}} API](elastic-inference/inference-api.md): a general-purpose API that enables you to run {{infer}} using EIS, your own models, or third-party services.
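+
+For example, the `semantic_text` workflow reduces the ingest-time setup to a single mapping choice. A minimal sketch (the index and field names are placeholders; `inference_id` can be omitted to fall back to the default endpoint):
+
+```console
+PUT my-semantic-index
+{
+  "mappings": {
+    "properties": {
+      "content": {
+        "type": "semantic_text",
+        "inference_id": ".elser-2-elasticsearch"
+      }
+    }
+  }
+}
+```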
diff --git a/explore-analyze/elastic-inference/eis.md b/explore-analyze/elastic-inference/eis.md
index 4aae269d10..1ad2791941 100644
--- a/explore-analyze/elastic-inference/eis.md
+++ b/explore-analyze/elastic-inference/eis.md
@@ -19,10 +19,10 @@ Instead, you can use {{ml}} models for ingest, search, and chat independently of

 ## Region and hosting [eis-regions]

-Requests through the Elastic Managed LLM are currently proxied to AWS Bedrock in AWS US regions, beginning with `us-east-1`.
+Requests through the `Elastic Managed LLM` are currently proxied to AWS Bedrock in AWS US regions, beginning with `us-east-1`.
 The request routing does not restrict the location of your deployments.

-ELSER requests are managed by Elastic own EIS infrastructure.
+ELSER requests are managed by Elastic's own EIS infrastructure.

From 147b18a5dd9c32b7283c6f408bc29b41702784fc Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Istv=C3=A1n=20Zolt=C3=A1n=20Szab=C3=B3?=
Date: Mon, 28 Jul 2025 14:48:35 +0200
Subject: [PATCH 12/15] Update explore-analyze/elastic-inference/eis.md

Co-authored-by: Max Jakob
---
 explore-analyze/elastic-inference/eis.md | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/explore-analyze/elastic-inference/eis.md b/explore-analyze/elastic-inference/eis.md
index 1ad2791941..4843950b63 100644
--- a/explore-analyze/elastic-inference/eis.md
+++ b/explore-analyze/elastic-inference/eis.md
@@ -36,6 +36,8 @@ Private preview access is available by submitting the form provided [here](https:

 ### Limitations

+While we do encourage experimentation, we do not recommend implementing production use cases on top of this feature while it is in Technical Preview.
+
 #### Access

From c06268d75e96f3402b1b2067b324aa5e7a410c2f Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Istv=C3=A1n=20Zolt=C3=A1n=20Szab=C3=B3?=
Date: Mon, 28 Jul 2025 14:55:21 +0200
Subject: [PATCH 13/15] Adds note.

---
 explore-analyze/machine-learning/nlp/ml-nlp-elser.md | 5 ++++-
 1 file changed, 4 insertions(+), 1 deletion(-)

diff --git a/explore-analyze/machine-learning/nlp/ml-nlp-elser.md b/explore-analyze/machine-learning/nlp/ml-nlp-elser.md
index c4898ca846..c940762052 100644
--- a/explore-analyze/machine-learning/nlp/ml-nlp-elser.md
+++ b/explore-analyze/machine-learning/nlp/ml-nlp-elser.md
@@ -32,7 +32,10 @@ This approach provides a more understandable search experience compared to vecto
 To use ELSER, you must have the [appropriate subscription](https://www.elastic.co/subscriptions) level for semantic search or the trial period activated.

 ::::{note}
-The minimum dedicated ML node size for deploying and using the ELSER model is 4 GB in {{ech}} if [deployment autoscaling](../../../deploy-manage/autoscaling.md) is turned off. Turning on autoscaling is recommended because it allows your deployment to dynamically adjust resources based on demand. Better performance can be achieved by using more allocations or more threads per allocation, which requires bigger ML nodes. Autoscaling provides bigger nodes when required. If autoscaling is turned off, you must provide suitably sized nodes yourself.
+
+- You can use the ELSER model through the [Elastic {{infer-cap}} Service (EIS)](/explore-analyze/elastic-inference/eis.md). If you use ELSER on EIS, you don't need to manage the infrastructure and resources required by the ELSER model, because it doesn't use the resources of your nodes.
+- The minimum dedicated ML node size for deploying and using the ELSER model is 4 GB in {{ech}} if [deployment autoscaling](../../../deploy-manage/autoscaling.md) is turned off. Turning on autoscaling is recommended because it allows your deployment to dynamically adjust resources based on demand. Better performance can be achieved by using more allocations or more threads per allocation, which requires bigger ML nodes. Autoscaling provides bigger nodes when required. If autoscaling is turned off, you must provide suitably sized nodes yourself.
 ::::

 Enabling trained model autoscaling for your ELSER deployment is recommended. Refer to [*Trained model autoscaling*](../../../deploy-manage/autoscaling/trained-model-autoscaling.md) to learn more.

From c33cccaa9336dc05f65bad6f21a6c473c5109784 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Istv=C3=A1n=20Zolt=C3=A1n=20Szab=C3=B3?=
Date: Mon, 28 Jul 2025 16:49:03 +0200
Subject: [PATCH 14/15] Update explore-analyze/elastic-inference/eis.md

---
 explore-analyze/elastic-inference/eis.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/explore-analyze/elastic-inference/eis.md b/explore-analyze/elastic-inference/eis.md
index 4843950b63..6d8916ba53 100644
--- a/explore-analyze/elastic-inference/eis.md
+++ b/explore-analyze/elastic-inference/eis.md
@@ -24,7 +24,7 @@ The request routing does not restrict the location of your deployments.

 ELSER requests are managed by Elastic's own EIS infrastructure.

-## ELSER via Elastic {{infer-cap}} Service (ELSER on EIS)
+## ELSER via Elastic {{infer-cap}} Service (ELSER on EIS) [elser-on-eis]

From 742f2ee28a5454254a1c353f1241ca972a1f9e33 Mon Sep 17 00:00:00 2001
From: =?UTF-8?q?Istv=C3=A1n=20Zolt=C3=A1n=20Szab=C3=B3?=
Date: Tue, 29 Jul 2025 10:47:59 +0200
Subject: [PATCH 15/15] Relocates applies to tags.

---
 explore-analyze/elastic-inference/eis.md           | 7 +++++--
 explore-analyze/elastic-inference/inference-api.md | 2 +-
 2 files changed, 6 insertions(+), 3 deletions(-)

diff --git a/explore-analyze/elastic-inference/eis.md b/explore-analyze/elastic-inference/eis.md
index 4843950b63..dacb526c48 100644
--- a/explore-analyze/elastic-inference/eis.md
+++ b/explore-analyze/elastic-inference/eis.md
@@ -15,7 +15,7 @@ Instead, you can use {{ml}} models for ingest, search, and chat independently of

 * Your Elastic deployment or project comes with a default [`Elastic Managed LLM` connector](https://www.elastic.co/docs/reference/kibana/connectors-kibana/elastic-managed-llm). This connector is used in the AI Assistant, Attack Discovery, Automatic Import and Search Playground.

-* {applies_to}`stack: preview 9.1` {applies_to}`serverless: preview` You can use [ELSER](/explore-analyze/machine-learning/nlp/ml-nlp-elser.md) to perform semantic search as a service (ELSER on EIS).
+* You can use [ELSER](/explore-analyze/machine-learning/nlp/ml-nlp-elser.md) to perform semantic search as a service (ELSER on EIS). {applies_to}`stack: preview 9.1` {applies_to}`serverless: preview`

 ## Region and hosting [eis-regions]

@@ -26,8 +26,10 @@ ELSER requests are managed by Elastic's own EIS infrastructure.

 ## ELSER via Elastic {{infer-cap}} Service (ELSER on EIS)

-{applies_to}`stack: preview 9.1` {applies_to}`serverless: preview`
+```{applies_to}
+stack: preview 9.1
+serverless: preview
+```

 ELSER on EIS enables you to use the ELSER model without using ML nodes in your infrastructure, which simplifies the semantic search and hybrid search experience.
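+
+As an illustration, searching a `semantic_text` field that is backed by ELSER on EIS looks the same as any other semantic search; only the field's `inference_id` differs. A sketch with placeholder index and field names:
+
+```console
+GET my-semantic-index/_search
+{
+  "query": {
+    "semantic": {
+      "field": "content",
+      "query": "How do I run ELSER without ML nodes?"
+    }
+  }
+}
+```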
diff --git a/explore-analyze/elastic-inference/inference-api.md b/explore-analyze/elastic-inference/inference-api.md
index b41a05dffa..0ae5bffa0b 100644
--- a/explore-analyze/elastic-inference/inference-api.md
+++ b/explore-analyze/elastic-inference/inference-api.md
@@ -30,7 +30,7 @@ The following section lists the default {{infer}} endpoints, identified by their

 ### Default endpoints for Elastic {{infer-cap}} Service (EIS)

-- {applies_to}`stack: preview 9.1` {applies_to}`serverless: preview` `.elser-2-elastic`: uses the [ELSER](/explore-analyze/machine-learning/nlp/ml-nlp-elser.md) trained model as an Elastic {{infer-cap}} Service for `sparse_embedding` tasks (recommended for English language text). The `model_id` is `.elser_model_2`.
+- `.elser-2-elastic`: uses the [ELSER](/explore-analyze/machine-learning/nlp/ml-nlp-elser.md) trained model as an Elastic {{infer-cap}} Service for `sparse_embedding` tasks (recommended for English language text). The `model_id` is `.elser_model_2`. {applies_to}`stack: preview 9.1` {applies_to}`serverless: preview`

 ### Default endpoints used on ML nodes