Conversation

maxjakob
Contributor

The Elastic Inference Service (EIS) is going GA in 9.2, so we remove the limitations that applied during tech preview.

We keep the batch size limitation, as it still exists.

@maxjakob maxjakob requested a review from seanhandley October 16, 2025 13:18
@maxjakob maxjakob marked this pull request as ready for review October 16, 2025 13:19
@maxjakob maxjakob requested review from a team as code owners October 16, 2025 13:19
@github-actions

github-actions bot commented Oct 16, 2025

🔍 Preview links for changed docs

@florent-leborgne
Contributor

florent-leborgne commented Oct 16, 2025

Hey @maxjakob Thanks for these updates. Can you confirm a couple of things:

  • With your current edits, we're still leaving some "preview" information. Should we update this to indicate that the feature becomes GA in 9.2? If so, it's done like this:
## ELSER via Elastic {{infer-cap}} Service (ELSER on EIS) [elser-on-eis]
```{applies_to}
stack: preview 9.1, ga 9.2
serverless: ga
```
  • Are these limitations still valid for users who are still on 9.1 and aren't upgrading to 9.2? You also mention in the PR description that the batch size limitation is still in effect in 9.2, but the changes in the file remove it, so should we rather keep it?

If my assumptions are correct, then we can instead update this doc to use tabs, like this:

### Limitations

::::{tab-set}

:::{tab-item} Serverless and 9.2
- Batch size
   Batches are limited to a maximum of 16 documents.
   This is particularly relevant when using the [_bulk API](https://www.elastic.co/docs/api/doc/elasticsearch/v9/operation/operation-bulk) for data ingestion.
:::

:::{tab-item} 9.1
This feature is in technical preview in this version. While we do encourage experimentation, we do not recommend implementing production use cases on top of this feature while it is in Technical Preview.

- Uptime
   There are no uptime guarantees during the Technical Preview.
   While Elastic will address issues promptly, the feature may be unavailable for extended periods.

- Throughput and latency
   {{infer-cap}} throughput via this endpoint is expected to exceed that of {{infer}} operations on an ML node.
   However, throughput and latency are not guaranteed.
   Performance may vary during the Technical Preview.

- Batch size
   Batches are limited to a maximum of 16 documents.
   This is particularly relevant when using the [_bulk API](https://www.elastic.co/docs/api/doc/elasticsearch/v9/operation/operation-bulk) for data ingestion.

- Rate limits
   The rate limit for search and ingest is currently 500 requests per minute. This allows you to ingest approximately 8000 documents per minute at 16 documents per request.
:::

::::
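
For illustration (my addition, not part of the suggested doc change): the batch size and rate limits above imply a simple upper bound on ingest throughput. A minimal Python sketch, assuming a plain list of documents and no real Elasticsearch client, just the arithmetic from the limitations text:

```python
# Sketch only: shows how the 16-document batch limit and the
# 500 requests/minute rate limit combine into a throughput ceiling.

def batch_documents(docs, batch_size=16):
    """Split docs into _bulk-sized batches of at most batch_size items."""
    return [docs[i:i + batch_size] for i in range(0, len(docs), batch_size)]

def max_docs_per_minute(rate_limit_rpm=500, batch_size=16):
    """Upper bound on ingested documents per minute under the rate limit."""
    return rate_limit_rpm * batch_size

if __name__ == "__main__":
    batches = batch_documents([{"id": i} for i in range(40)])
    print(len(batches))           # → 3 (batches of 16, 16, and 8 documents)
    print(max_docs_per_minute())  # → 8000, matching the docs text
```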

@leemthompo
Contributor

FYI Istvan had a PR lined up that I think achieves the same goal here: #3014

Might be good to reconcile which one to maintain and merge next week and which one to close :)

@maxjakob
Contributor Author

Should we update this to indicate that this is becoming GA in 9.2?

Yes, thanks for showing me how to tag this!

Are these [batch size] limitations still valid for users that are still on 9.1 and aren't upgrading to 9.2?

The limitation is in the EIS service itself, so it does not depend on any Elasticsearch version. We will remove or change this restriction in EIS at some point, at which point it will change for all Elasticsearch clusters, regardless of version.

batch size limitation is still in effect [...] but the changes in the file are removing it

This was an oversight on my part.

Thanks @leemthompo, I will close this PR in favor of #3014.

@maxjakob maxjakob closed this Oct 16, 2025
@maxjakob maxjakob deleted the eis-remove-limitations branch October 16, 2025 14:17
@seanhandley seanhandley restored the eis-remove-limitations branch October 20, 2025 20:15
@seanhandley seanhandley reopened this Oct 20, 2025