explore-analyze/elastic-inference/eis.md
applies_to:
navigation_title: Elastic Inference Service (EIS)
---

# Elastic {{infer-cap}} Service [elastic-inference-service-eis]

The Elastic {{infer-cap}} Service (EIS) enables you to leverage AI-powered search as a service without deploying a model in your cluster.
With EIS, you don't need to manage the infrastructure and resources required for large language models (LLMs) by adding, configuring, and scaling {{ml}} nodes.
Instead, you can use {{ml}} models in high-throughput, low-latency scenarios independently of your {{es}} infrastructure.

Currently, you can perform chat completion tasks through EIS using the {{infer}} API.

% TO DO: Link to the EIS inference endpoint reference docs when it's added to the OpenAPI spec. (Coming soon) %

## Default EIS endpoints [default-eis-inference-endpoints]

Your {{es}} deployment includes a preconfigured EIS endpoint, making it easier to use chat completion via the {{infer}} API:

* `.rainbow-sprinkles-elastic`: uses Anthropic's Claude 3.5 Sonnet model for chat completion {{infer}} tasks.

::::{note}

* The model appears as `Elastic LLM` in the AI Assistant, Attack Discovery UI, preconfigured connectors list, and the Search Playground.
* When tuning prompts sent to `.rainbow-sprinkles-elastic`, optimize them for Claude 3.5 Sonnet.

::::

% TO DO: Link to the AI assistant documentation in the different solutions and possibly connector docs. %

## Regions [eis-regions]

EIS is currently available on AWS in the following regions:

* `us-east-1`
* `us-west-2`

For more details on AWS regions, refer to the [AWS Global Infrastructure](https://aws.amazon.com/about-aws/global-infrastructure/regions_az/) and the [supported cross-region {{infer}} profiles](https://docs.aws.amazon.com/bedrock/latest/userguide/inference-profiles-support.html) documentation.

## LLM hosts [llm-hosts]

The LLM used with EIS is hosted by [Amazon Bedrock](https://aws.amazon.com/bedrock/).

## Examples [eis-examples]

The following example demonstrates how to perform a `chat_completion` task through EIS using the `.rainbow-sprinkles-elastic` default {{infer}} endpoint.

```json
POST /_inference/chat_completion/.rainbow-sprinkles-elastic/_stream
{
    "messages": [
        {
            "role": "user",
            "content": "Say yes if it works."
        }
    ],
    "temperature": 0.7,
    "max_completion_tokens": 300
}
```
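
Requests can also carry multi-turn context. The following is a minimal sketch that replays a short conversation through the same endpoint; the message contents are illustrative, and only the `messages` array with `role` and `content` fields is taken from the example above:

```json
POST /_inference/chat_completion/.rainbow-sprinkles-elastic/_stream
{
    "messages": [
        {
            "role": "user",
            "content": "What is Elastic?"
        },
        {
            "role": "assistant",
            "content": "Elastic is the company behind Elasticsearch."
        },
        {
            "role": "user",
            "content": "Summarize that in three words."
        }
    ]
}
```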

Creates an {{infer}} endpoint to perform an {{infer}} task with the `elastic` service.


## {{api-request-title}} [infer-service-elastic-api-request]

`PUT /_inference/<task_type>/<inference_id>`
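
For instance, creating a `chat_completion` endpoint under the hypothetical ID `my-eis-chat` would use the request line below (the full request body is shown in the examples at the end of this page):

```json
PUT /_inference/chat_completion/my-eis-chat
```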


## {{api-path-parms-title}} [infer-service-elastic-api-path-params]

`<inference_id>`
* `chat_completion`,
* `sparse_embedding`.


::::{note}
The `chat_completion` task type only supports streaming and only through the `_stream` API.

For more information on how to use the `chat_completion` task type, refer to the [chat completion documentation](chat-completion-inference-api.md).

::::
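
For instance, once a `chat_completion` endpoint exists (such as `chat-completion-endpoint`, created in the example at the end of this page), a minimal streaming request looks like this sketch; the message content is illustrative:

```json
POST /_inference/chat_completion/chat-completion-endpoint/_stream
{
    "messages": [
        {
            "role": "user",
            "content": "Hello"
        }
    ]
}
```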



## {{api-request-body-title}} [infer-service-elastic-api-request-body]

`max_chunk_size`
`service_settings`
: (Required, object) Settings used to install the {{infer}} model.


`model_id`
: (Required, string) The name of the model to use for the {{infer}} task.

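Within the request body, these settings nest under `service_settings`; a minimal sketch, using the chat completion model ID from the examples below:

```json
"service_settings": {
    "model_id": "rainbow-sprinkles"
}
```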



## Elastic {{infer-cap}} Service example [inference-example-elastic]

The following example shows how to create an {{infer}} endpoint called `elser-model-eis` to perform a `sparse_embedding` task type.
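
A minimal sketch of that request, assuming the ELSER model is selected with the `model_id` value `elser` (the model identifier is an assumption; only the endpoint name and task type come from the sentence above):

```json
PUT /_inference/sparse_embedding/elser-model-eis
{
    "service": "elastic",
    "service_settings": {
        "model_id": "elser"
    }
}
```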

The following example shows how to create an {{infer}} endpoint called `chat-completion-endpoint` to perform a `chat_completion` task type:

```json
PUT /_inference/chat_completion/chat-completion-endpoint
{
    "service": "elastic",
    "service_settings": {
        "model_id": "rainbow-sprinkles"
    }
}
```