explore-analyze/elastic-inference/eis.md
11 additions & 16 deletions
@@ -7,15 +7,15 @@ applies_to:
# Elastic {{infer-cap}} Service [elastic-inference-service-eis]
- The Elastic {{infer-cap}} Service (EIS) enables you to leverage AI-powered search as a service without deploying a model in your cluster.
+ The Elastic {{infer-cap}} Service (EIS) enables you to leverage AI-powered search as a service without deploying a model in your environment.
With EIS, you don't need to manage the infrastructure and resources required for {{ml}} {{infer}} by adding, configuring, and scaling {{ml}} nodes.
Instead, you can use {{ml}} models for ingest, search, and chat independently of your {{es}} infrastructure.
## AI features powered by EIS [ai-features-powered-by-eis]
* Your Elastic deployment or project comes with a default [`Elastic Managed LLM` connector](https://www.elastic.co/docs/reference/kibana/connectors-kibana/elastic-managed-llm). This connector is used in the AI Assistant, Attack Discovery, Automatic Import and Search Playground.
- * You can use [ELSER](/explore-analyze/machine-learning/nlp/ml-nlp-elser.md) to perform semantic search as a service (ELSER on EIS). {applies_to}`stack: preview 9.1` {applies_to}`serverless: preview`
+ * You can use [ELSER](/explore-analyze/machine-learning/nlp/ml-nlp-elser.md) to perform semantic search as a service (ELSER on EIS). {applies_to}`stack: preview 9.1, ga 9.2` {applies_to}`serverless: ga`
## Region and hosting [eis-regions]
@@ -27,25 +27,20 @@ ELSER requests are managed by Elastic's own EIS infrastructure.
## ELSER via Elastic {{infer-cap}} Service (ELSER on EIS) [elser-on-eis]
```{applies_to}
- stack: preview 9.1
- serverless: preview
+ stack: preview 9.1, ga 9.2
+ serverless: ga
```
- ELSER on EIS enables you to use the ELSER model on GPUs, without having to manage your own ML nodes. We expect better performance for throughput and latency than ML nodes, and will continue to benchmark, remove limitations and address concerns as we move towards General Availability.
+ ELSER on EIS enables you to use the ELSER model on GPUs, without having to manage your own ML nodes. We expect better throughput and lower latency than with ML nodes, and will continue to benchmark, remove limitations, and address concerns.
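The GPU-backed endpoint described in the added line above is reached through the {{es}} `_inference` API. A minimal request-building sketch in Python, assuming an endpoint ID of `.elser-2-elastic` (that ID and the URL shape are assumptions for illustration; check the inference endpoints actually available in your deployment):

```python
import json

# Sketch only: build a sparse-embedding inference request for an
# ELSER-on-EIS endpoint. The default ID ".elser-2-elastic" is an
# assumption, not a documented constant; verify it in your deployment.
def build_elser_request(es_url, text, endpoint_id=".elser-2-elastic"):
    """Return (url, body) for a POST to the _inference API."""
    url = f"{es_url}/_inference/sparse_embedding/{endpoint_id}"
    body = json.dumps({"input": text})
    return url, body
```

Send the returned body with your usual HTTP client and an API key; the response carries the sparse token-weight expansion that powers semantic search.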
- ### Limitations
- While we do encourage experimentation, we do not recommend implementing production use cases on top of this feature while it is in Technical Preview.
+ ### Pricing
- #### Access
+ ELSER on EIS usage is billed separately from your other Elastic deployment resources.
+ For details about request-based pricing and billing dimensions, refer to the [ELSER on GPU item on the pricing page](https://www.elastic.co/pricing/serverless-search).
- This feature is being gradually rolled out to Serverless and Cloud Hosted customers.
- It may not be available to all users at launch.
- #### Uptime
+ ### Limitations
- There are no uptime guarantees during the Technical Preview.
- While Elastic will address issues promptly, the feature may be unavailable for extended periods.
+ Elastic is continuously working to remove these constraints and further improve performance and scalability.
#### Throughput and latency
@@ -58,6 +53,6 @@ Performance may vary during the Technical Preview.
Batches are limited to a maximum of 16 documents.
This is particularly relevant when using the [_bulk API](https://www.elastic.co/docs/api/doc/elasticsearch/v9/operation/operation-bulk) for data ingestion.
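The 16-document batch limit above can be enforced client-side by splitting payloads before calling the bulk API. A minimal sketch in Python (the index name `my-index` is a placeholder, and the HTTP call itself is omitted; only the batching and newline-delimited body format come from this section and the bulk API docs):

```python
import json

def chunk_for_bulk(docs, batch_size=16):
    """Yield successive batches no larger than batch_size,
    the per-request document limit noted above."""
    for i in range(0, len(docs), batch_size):
        yield docs[i:i + batch_size]

def to_bulk_body(batch, index="my-index"):
    """Build the newline-delimited _bulk request body for one batch."""
    lines = []
    for doc in batch:
        lines.append(json.dumps({"index": {"_index": index}}))
        lines.append(json.dumps(doc))
    # The _bulk API requires a trailing newline after the last line.
    return "\n".join(lines) + "\n"
```

Each yielded batch becomes one `_bulk` request, keeping every request within the 16-document limit.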
- #### Rate Limits
+ #### Rate Limits
Rate limit for search and ingest is currently at 2000 requests per minute.
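To stay under the 2000-requests-per-minute limit, a client can pace its own requests instead of relying on server-side rejections. A minimal sliding-window sketch (the pacing strategy is illustrative only, not a feature of any Elastic client; only the limit value comes from this section):

```python
import time
from collections import deque

class MinuteRateLimiter:
    """Client-side pacer for a fixed requests-per-minute budget."""

    def __init__(self, max_per_minute=2000):
        self.max_per_minute = max_per_minute
        self.sent = deque()  # monotonic timestamps from the last 60 s

    def acquire(self, now=None):
        """Wait until a request slot is free, then record the send time."""
        now = time.monotonic() if now is None else now
        # Drop timestamps that have left the sliding 60-second window.
        while self.sent and now - self.sent[0] >= 60:
            self.sent.popleft()
        if len(self.sent) >= self.max_per_minute:
            # Sleep until the oldest in-window request expires, then retry.
            time.sleep(60 - (now - self.sent[0]))
            return self.acquire()
        self.sent.append(now)
```

Call `acquire()` before each search or ingest request; it blocks only when the budget for the current minute is exhausted.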