Commit a76c53e

[ML] Explain serverless pricing
Add a blurb about how we calculate VCUs for ML:
- Trained Models are mostly based on vCPU consumed: 1 allocation * 1 thread = 1 vCPU = 8 VCUs
- Jobs are mostly based on memory consumed: 1 GB = 1 VCU
1 parent 676e020 commit a76c53e

File tree

1 file changed: +3 -0 lines changed


serverless/pages/ml-nlp-auto-scale.asciidoc

Lines changed: 3 additions & 0 deletions
@@ -95,6 +95,9 @@ The used resources for trained model deployments depend on three factors:
 * the use case you optimize the model deployment for (ingest or search)
 * whether model autoscaling is enabled with adaptive allocations/resources to have dynamic resources, or disabled for static resources
 
+VCUs for ML are based on the amount of vCPU and memory consumed. For ML, `1` VCU equals `0.125` of vCPU and `1GB` of memory, where vCPUs are measured as allocations multiplied by threads, and where memory is the amount consumed by trained models or ML jobs.
+
+As a math formula, `VCUs = 8 * allocations * threads`, or `1` VCU for every `1GB` of memory consumed, whichever is greater.
 
 The following tables show you the number of allocations, threads, and VCUs available on Serverless when adaptive resources are enabled or disabled.
 
 [discrete]
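The formula added in this diff can be sketched as a small helper. This is an illustrative calculation only, assuming the rule as stated (1 vCPU = 8 VCUs, 1 GB of memory = 1 VCU, take the greater); the function name `ml_vcus` is not part of any Elastic API:

```python
def ml_vcus(allocations: int, threads: int, memory_gb: float) -> float:
    """Estimate ML VCUs as the greater of the vCPU-based and
    memory-based figures, per the rule in the diff above."""
    vcpu_based = 8 * allocations * threads  # 1 allocation * 1 thread = 1 vCPU = 8 VCUs
    memory_based = memory_gb                # 1 GB of memory consumed = 1 VCU
    return max(vcpu_based, memory_based)

# A trained model deployment with 2 allocations * 1 thread using 4 GB:
print(ml_vcus(2, 1, 4.0))   # 16 — vCPU-based (16) exceeds memory-based (4)
# A memory-heavy ML job: 1 allocation * 1 thread using 12 GB:
print(ml_vcus(1, 1, 12.0))  # 12.0 — memory-based (12) exceeds vCPU-based (8)
```

This matches the commit message: trained model deployments are usually dominated by the vCPU term, while jobs are usually dominated by the memory term.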
