
Conversation

shubhaat
Contributor

Explain Ingest, Search, and ML VCU consumption better. I see confusion on:

Are search VCUs directly related to the number of searches? Are ingest VCUs directly related to the number of ingest operations?

I have added info breaking down what drives search, ingest, and ML VCU consumption.

@shubhaat shubhaat requested review from ppf2 and petegaleotti August 15, 2025 20:41
@shubhaat shubhaat requested a review from a team as a code owner August 15, 2025 20:41

github-actions bot commented Aug 15, 2025

🔍 Preview links for changed docs

@shubhaat shubhaat requested a review from kilfoyle August 18, 2025 23:27
Contributor

@kilfoyle kilfoyle left a comment


LGTM! 🚀
Thanks @shubhaat

@shubhaat shubhaat enabled auto-merge (squash) August 22, 2025 02:59
@shubhaat shubhaat merged commit b69b60b into main Aug 22, 2025
6 of 7 checks passed
@shubhaat shubhaat deleted the shubhaat-patch-1 branch August 22, 2025 03:02
* **Indexing:** The VCUs used to index incoming documents. Indexing VCUs account for the compute resources consumed during ingestion, based on the ingestion rate and the amount of data ingested at any given time. Transforms and ingest pipelines also contribute to ingest VCU consumption.
* **Search:** The VCUs used to return search results at the latency and queries per second (QPS) you require. Search VCUs are not charged per search request; they reflect the compute resources that scale up and down based on the amount of searchable data, search load (QPS), and performance requirements (latency and availability).
* **Machine learning:** The VCUs used to perform inference, NLP tasks, and other ML activities. ML VCUs depend on the models deployed and the number of ML operations, such as inference during search and ingest. ML VCUs are typically consumed when generating embeddings during ingestion, and during semantic search or reranking.
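To illustrate how these dimensions might combine into a bill, the sketch below totals a hypothetical monthly cost from VCU-hours consumed per dimension. The rates and usage figures are invented for the example and are not Elastic's actual pricing; the point is only that cost scales with compute consumed over time, not with per-request counts.

```python
# Hypothetical example: estimating monthly cost from VCU-hours.
# The rates and usage numbers below are invented for illustration;
# they are NOT Elastic's actual prices.
RATE_PER_VCU_HOUR = {
    "indexing": 0.10,          # assumed $ per ingest VCU-hour
    "search": 0.12,            # assumed $ per search VCU-hour
    "machine_learning": 0.15,  # assumed $ per ML VCU-hour
}

def monthly_cost(vcu_hours: dict[str, float]) -> float:
    """Sum cost across dimensions; usage scales with load, not request count."""
    return sum(RATE_PER_VCU_HOUR[dim] * hours for dim, hours in vcu_hours.items())

# Assumed usage: ingestion is bursty, search runs continuously,
# ML runs only while generating embeddings and reranking.
usage = {"indexing": 400.0, "search": 1200.0, "machine_learning": 250.0}
print(f"${monthly_cost(usage):.2f}")  # prints "$221.50"
```

Note that a search-heavy workload dominates the total here even though no per-search fee exists: the search VCUs stay provisioned to meet the assumed QPS and latency targets.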
Contributor


@shubhaat Catching up on GH notifications :D I wonder if we want to mention that ML VCUs only apply when using models deployed on ML nodes, i.e. with Elastic Managed LLMs there are no ML VCU costs, only token charges.
