Commit 41e3e84

authored
Update limitations and improve wording
Rate limits documented separately
1 parent f15b11c commit 41e3e84

File tree

1 file changed

+4
-5
lines changed
  • explore-analyze/elastic-inference

explore-analyze/elastic-inference/eis.md

Lines changed: 4 additions & 5 deletions
@@ -31,7 +31,7 @@ stack: preview 9.1
 serverless: preview
 ```

-ELSER on EIS enables you to use the ELSER model without using ML nodes in your infrastructure and with that, it simplifies the semantic search and hybrid search experience.
+ELSER on EIS enables you to use the ELSER model on GPUs, without having to manage your own ML nodes. We expect better performance for throughput and latency than ML nodes, and will continue to benchmark, remove limitations and address concerns as we move towards General Availability.

 ### Private preview access

@@ -40,10 +40,6 @@ Private preview access is available by submitting the form provided [here](https
 ### Limitations

 While we do encourage experimentation, we do not recommend implementing production use cases on top of this feature while it is in Technical Preview.
-The known limitations include
-- Maximum batch size is 16 for ingest requests
-- Rate limit for search and ingest at 2000 tokens per minute
-- We do not support autoscaling at this point, so many parallel requests will result in performance degradations. Autoscaling coming soon.

 #### Access

@@ -65,3 +61,6 @@ Performance may vary during the Technical Preview.

 Batches are limited to a maximum of 16 documents.
 This is particularly relevant when using the [_bulk API](https://www.elastic.co/docs/api/doc/elasticsearch/v9/operation/operation-bulk) for data ingestion.
+
+#### Rate Limits
+Rate limit for search and ingest is currently at 2000 tokens per minute
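
Reviewer note: the batch limit kept by this diff (a maximum of 16 documents per batch) can be respected on the client side by splitting documents before ingestion. The following is a minimal sketch, not part of the commit; `chunked` and `MAX_BATCH_SIZE` are hypothetical names introduced here, and how each batch is then sent (for example through the _bulk API) is deliberately left out.

```python
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")

# Assumption: mirrors the "maximum of 16 documents" limit stated in the docs.
MAX_BATCH_SIZE = 16


def chunked(items: Iterable[T], size: int = MAX_BATCH_SIZE) -> Iterator[List[T]]:
    """Yield successive lists containing at most `size` items each."""
    if size < 1:
        raise ValueError("size must be at least 1")
    batch: List[T] = []
    for item in items:
        batch.append(item)
        if len(batch) == size:
            # Batch is full: hand it to the caller and start a new one.
            yield batch
            batch = []
    if batch:
        # Emit the final, possibly smaller, batch.
        yield batch
```

Each yielded list could then be submitted as one ingest request; staying under the token-based rate limit would need separate handling, since it depends on how tokens are counted.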
