diff --git a/deploy-manage/cloud-organization/billing/elastic-observability-billing-dimensions.md b/deploy-manage/cloud-organization/billing/elastic-observability-billing-dimensions.md
index 8188db28d0..343fbbe5d7 100644
--- a/deploy-manage/cloud-organization/billing/elastic-observability-billing-dimensions.md
+++ b/deploy-manage/cloud-organization/billing/elastic-observability-billing-dimensions.md
@@ -22,11 +22,4 @@ Data volumes for ingest and retention are based on the fully enriched normalized
 
 [Synthetic monitoring](../../../solutions/observability/apps/synthetic-monitoring.md) is an optional add-on to Observability Serverless projects that allows you to periodically check the status of your services and applications. In addition to the core ingest and retention dimensions, there is a charge to execute synthetic monitors on our testing infrastructure. Browser (journey) based tests are charged per-test-run, and ping (lightweight) tests have an all-you-can-use model per location used.
 
-## Elastic Inference Service [EIS-billing]
-[Elastic Inference Service (EIS)](../../../explore-analyze/elastic-inference/eis.md) enables you to leverage AI-powered search as a service without deploying a model in your serverless project. EIS is configured as a default LLM for use with the Observability AI Assistant (for all observability projects).
-
-:::{note}
-Use of the Observability AI Assistant uses EIS tokens and incurs related token-based add-on billing for your serverless project.
-:::
-
 Refer to [Serverless billing dimensions](serverless-project-billing-dimensions.md) and the [{{ecloud}} pricing table](https://cloud.elastic.co/cloud-pricing-table?productType=serverless&project=observability) for more details about {{obs-serverless}} billing dimensions and rates.
diff --git a/explore-analyze/elastic-inference.md b/explore-analyze/elastic-inference.md
index debda95960..9b19f2a4cd 100644
--- a/explore-analyze/elastic-inference.md
+++ b/explore-analyze/elastic-inference.md
@@ -9,6 +9,5 @@ navigation_title: Elastic Inference
 
 There are several ways to perform {{infer}} in the {{stack}}. This page provides a brief overview of the different methods:
 
-* [Using EIS (Elastic Inference Service)](elastic-inference/eis.md)
 * [Using the {{infer}} API](elastic-inference/inference-api.md)
 * [Trained models deployed in your cluster](machine-learning/nlp/ml-nlp-overview.md)
diff --git a/explore-analyze/elastic-inference/eis.md b/explore-analyze/elastic-inference/eis.md
deleted file mode 100644
index 9a28237447..0000000000
--- a/explore-analyze/elastic-inference/eis.md
+++ /dev/null
@@ -1,10 +0,0 @@
----
-applies_to:
-  stack: ga
-  serverless: ga
-navigation_title: Elastic Inference Service (EIS)
----
-
-# Elastic {{infer-cap}} Service
-
-This is the documentation of the Elastic Inference Service.
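With the EIS overview pages removed, `explore-analyze/elastic-inference.md` still points readers to the {{infer}} API with models deployed in the cluster. The following is a minimal sketch of that remaining path, assuming the `elasticsearch` service and the built-in ELSER model; the endpoint name and allocation settings are illustrative only:

```console
PUT _inference/sparse_embedding/my-elser-endpoint
{
  "service": "elasticsearch",
  "service_settings": {
    "adaptive_allocations": {
      "enabled": true,
      "min_number_of_allocations": 1
    },
    "num_threads": 1,
    "model_id": ".elser_model_2"
  }
}
```

The request shape mirrors the `elastic` service examples removed below, with the service and model swapped for a locally deployed one.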
diff --git a/explore-analyze/elastic-inference/inference-api/elastic-inference-service-eis.md b/explore-analyze/elastic-inference/inference-api/elastic-inference-service-eis.md
deleted file mode 100644
index 3c30ebb6d7..0000000000
--- a/explore-analyze/elastic-inference/inference-api/elastic-inference-service-eis.md
+++ /dev/null
@@ -1,107 +0,0 @@
----
-navigation_title: "Elastic Inference Service"
-mapped_pages:
-  - https://www.elastic.co/guide/en/elasticsearch/reference/master/infer-service-elastic.html
-applies_to:
-  stack:
-  serverless:
----
-
-# Elastic Inference Service (EIS) [infer-service-elastic]
-
-:::{tip} Inference API reference
-Refer to the [{{infer-cap}} APIs](https://www.elastic.co/docs/api/doc/elasticsearch/group/endpoint-inference) for further information.
-:::
-
-Creates an {{infer}} endpoint to perform an {{infer}} task with the `elastic` service.
-
-
-## {{api-request-title}} [infer-service-elastic-api-request]
-
-`PUT /_inference/<task_type>/<inference_id>`
-
-
-## {{api-path-parms-title}} [infer-service-elastic-api-path-params]
-
-`<inference_id>`
-: (Required, string) The unique identifier of the {{infer}} endpoint.
-
-`<task_type>`
-: (Required, string) The type of the {{infer}} task that the model will perform.
-
-    Available task types:
-
-    * `chat_completion`,
-    * `sparse_embedding`.
-
-::::{note}
-The `chat_completion` task type only supports streaming and only through the `_stream` API.
-
-For more information on how to use the `chat_completion` task type, refer to the [chat completion documentation](https://www.elastic.co/docs/api/doc/elasticsearch/operation/operation-inference-stream-inference).
-::::
-
-
-## {{api-request-body-title}} [infer-service-elastic-api-request-body]
-
-`max_chunk_size`
-: (Optional, integer) Specifies the maximum size of a chunk in words. Defaults to `250`. This value cannot be higher than `300` or lower than `20` (for `sentence` strategy) or `10` (for `word` strategy).
-
-`overlap`
-: (Optional, integer) Only for `word` chunking strategy. Specifies the number of overlapping words for chunks. Defaults to `100`. This value cannot be higher than half of `max_chunk_size`.
-
-`sentence_overlap`
-: (Optional, integer) Only for `sentence` chunking strategy. Specifies the number of overlapping sentences for chunks. It can be either `1` or `0`. Defaults to `1`.
-
-`strategy`
-: (Optional, string) Specifies the chunking strategy. It can be either `sentence` or `word`.
-
-`service`
-: (Required, string) The type of service supported for the specified task type. In this case, `elastic`.
-
-`service_settings`
-: (Required, object) Settings used to install the {{infer}} model.
-
-`model_id`
-: (Required, string) The name of the model to use for the {{infer}} task.
-
-`rate_limit`
-: (Optional, object) By default, the `elastic` service sets the number of requests allowed per minute to `1000` in case of `sparse_embedding` and `240` in case of `chat_completion`. This helps to minimize the number of rate limit errors returned. To modify this, set the `requests_per_minute` setting of this object in your service settings:
-
-    ```text
-    "rate_limit": {
-        "requests_per_minute": <number_of_requests>
-    }
-    ```
-
-
-## Elastic {{infer-cap}} Service example [inference-example-elastic]
-
-The following example shows how to create an {{infer}} endpoint called `elser-model-eis` to perform a `sparse_embedding` task type.
-
-```console
-PUT _inference/sparse_embedding/elser-model-eis
-{
-    "service": "elastic",
-    "service_settings": {
-        "model_id": "elser"
-    }
-}
-```
-
-The following example shows how to create an {{infer}} endpoint called `chat-completion-endpoint` to perform a `chat_completion` task type.
-
-```console
-PUT /_inference/chat_completion/chat-completion-endpoint
-{
-    "service": "elastic",
-    "service_settings": {
-        "model_id": "model-1"
-    }
-}
-```
-
diff --git a/explore-analyze/toc.yml b/explore-analyze/toc.yml
index 5685008ace..ba04dba75b 100644
--- a/explore-analyze/toc.yml
+++ b/explore-analyze/toc.yml
@@ -118,10 +118,7 @@ toc:
       - file: transforms/transform-limitations.md
   - file: elastic-inference.md
     children:
-      - file: elastic-inference/eis.md
       - file: elastic-inference/inference-api.md
-        children:
-          - file: elastic-inference/inference-api/elastic-inference-service-eis.md
   - file: machine-learning.md
     children:
       - file: machine-learning/setting-up-machine-learning.md
diff --git a/solutions/observability/observability-ai-assistant.md b/solutions/observability/observability-ai-assistant.md
index 5c3b794d4c..4afeb6df99 100644
--- a/solutions/observability/observability-ai-assistant.md
+++ b/solutions/observability/observability-ai-assistant.md
@@ -16,7 +16,7 @@ You can [interact with the AI Assistant](#obs-ai-interact) in two ways:
 * **Contextual insights**: Embedded assistance throughout Elastic UIs that explains errors and messages with suggested remediation steps.
 * **Chat interface**: A conversational experience where you can ask questions and receive answers about your data. The assistant uses function calling to request, analyze, and visualize information based on your needs.
 
-By default, AI Assistant uses a [preconfigured LLM](#preconfigured-llm-ai-assistant) connector that works out of the box. You can also connect to third-party LLM providers.
+The AI Assistant integrates with your large language model (LLM) provider through our supported {{stack}} connectors.
 
 ## Use cases
 
@@ -28,11 +28,6 @@ The {{obs-ai-assistant}} helps you:
 * **Build and execute queries**: Build Elasticsearch queries from natural language, convert Query DSL to ES|QL syntax, and execute queries directly from the chat interface
 * **Visualize data**: Create time-series charts and distribution graphs from your Elasticsearch data
 
-## Preconfigured LLM [preconfigured-llm-ai-assistant]
-
-:::{include} ../_snippets/elastic-llm.md
-:::
-
 ## Requirements [obs-ai-requirements]
 
 The AI assistant requires the following:
@@ -45,7 +40,7 @@
 
 - To run {{obs-ai-assistant}} on a self-hosted Elastic stack, you need an [appropriate license](https://www.elastic.co/subscriptions).
 
-- If not using the [default preconfigured LLM](#preconfigured-llm-ai-assistant), you need an account with a third-party generative AI provider that preferably supports function calling. If your provider does not support function calling, you can configure AI Assistant settings under **Stack Management** to simulate function calling, but this might affect performance.
+- An account with a third-party generative AI provider that preferably supports function calling. If your AI provider does not support function calling, you can configure AI Assistant settings under **Stack Management** to simulate function calling, but this might affect performance.
 
 - The free tier offered by third-party generative AI provider may not be sufficient for the proper functioning of the AI assistant.
 In most cases, a paid subscription to one of the supported providers is required.
@@ -76,10 +71,6 @@ It's important to understand how your data is handled when using the AI Assistan
 
 ## Set up the AI Assistant [obs-ai-set-up]
 
-:::{note}
-If you use [the preconfigured LLM](#preconfigured-llm-ai-assistant) connector, you can skip this step. Your LLM connector is ready to use.
-:::
-
 The AI Assistant connects to one of these supported LLM providers:
 
 | Provider | Configuration | Authentication |
diff --git a/solutions/search/rag/playground.md b/solutions/search/rag/playground.md
index 9f5c13161c..bec09040f2 100644
--- a/solutions/search/rag/playground.md
+++ b/solutions/search/rag/playground.md
@@ -59,11 +59,6 @@ Here’s a simpified overview of how Playground works:
 
 * User can also **Download the code** to integrate into application
 
-## Elastic LLM [preconfigured-llm-playground]
-
-:::{include} ../../_snippets/elastic-llm.md
-:::
-
 ## Availability and prerequisites [playground-availability-prerequisites]
 
 For Elastic Cloud and self-managed deployments Playground is available in the **Search** space in {{kib}}, under **Content** > **Playground**.
@@ -77,7 +72,7 @@ To use Playground, you’ll need the following:
 
     * See [ingest data](playground.md#playground-getting-started-ingest) if you’d like to ingest sample data.
 
-3. If not using the default preconfigured LLM connector, you will need an account with a supported LLM provider:
+3. An account with a **supported LLM provider**. Playground supports the following:
 
     * **Amazon Bedrock**
@@ -119,11 +114,6 @@ You can also use locally hosted LLMs that are compatible with the OpenAI SDK. On
 
 ### Connect to LLM provider [playground-getting-started-connect]
 
-:::{note}
-If you use [the preconfigured LLM](#preconfigured-llm-playground) connector, you can skip this step. Your LLM connector is ready to use.
-
-:::
-
 To get started with Playground, you need to create a [connector](../../../deploy-manage/manage-connectors.md) for your LLM provider. You can also connect to [locally hosted LLMs](playground.md#playground-local-llms) which are compatible with the OpenAI API, by using the OpenAI connector.
 
 To connect to an LLM provider, follow these steps on the Playground landing page:
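The EIS reference page deleted above notes that the `chat_completion` task type is invoked only through the `_stream` API. A minimal sketch of such a call, assuming an endpoint named `chat-completion-endpoint` (as in the deleted example) and the standard `messages` request shape:

```console
POST _inference/chat_completion/chat-completion-endpoint/_stream
{
  "messages": [
    {
      "role": "user",
      "content": "Summarize the most common errors in the last hour of logs."
    }
  ]
}
```

The response is returned as a stream of server-sent events rather than a single JSON body.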
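Both the {{obs-ai-assistant}} and Playground pages above now direct users to create a {{stack}} connector for their LLM provider. Besides the UI flow, a connector can also be created through the Kibana actions API. The sketch below assumes the OpenAI connector type (`.gen-ai`), illustrative values for the name, URL, model, and API key, and a Dev Tools console that supports the `kbn:` prefix (otherwise use any HTTP client with a `kbn-xsrf` header); the exact config fields may differ between versions:

```console
POST kbn:/api/actions/connector
{
  "name": "openai-chat",
  "connector_type_id": ".gen-ai",
  "config": {
    "apiProvider": "OpenAI",
    "apiUrl": "https://api.openai.com/v1/chat/completions",
    "defaultModel": "gpt-4o"
  },
  "secrets": {
    "apiKey": "<your-api-key>"
  }
}
```

For a locally hosted, OpenAI-compatible model, the same connector type can point `apiUrl` at the local server instead.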