Skip to content

Commit 3bff231

Browse files
sydneemayersaahill
andauthored
Update articles/ai-services/openai/how-to/spillover-traffic-management.md
Co-authored-by: aahill <[email protected]>
1 parent a327f38 commit 3bff231

File tree

1 file changed

+1
-1
lines changed

1 file changed

+1
-1
lines changed

articles/ai-services/openai/how-to/spillover-traffic-management.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -21,7 +21,7 @@ Spillover manages traffic fluctuations on provisioned deployments by routing ove
2121
- The data processing level of your standard deployment must match your provisioned deployment (e.g. global provisioned deployment must be used with a global standard spillover deployment).
2222

2323
## When to enable spillover on provisioned deployments
24-
To maximize the utilization of your provisioned deployment, it is recommended to enable spillover for all global and data zone provisioned deployments. With spillover, bursts or fluctuations in traffic can be automatically managed by the service. This capability reduces the risk of experience disruptions when a provisioned deployment is fully utilized. Alternatively, spillover is configurable per-request to provide flexibility across different scenarios and workloads.
24+
To maximize the utilization of your provisioned deployment, it is recommended to enable spillover for all global and data zone provisioned deployments. With spillover, bursts or fluctuations in traffic can be automatically managed by the service. This capability reduces the risk of experiencing disruptions when a provisioned deployment is fully utilized. Alternatively, spillover is configurable per-request to provide flexibility across different scenarios and workloads.
2525

2626
## When does spillover come into effect?
2727
When spillover is enabled for a deployment or configured for a given inference request, spillover is initiated when a non-200 response code is received for a given inference request. When a request results in a non-200 response code, the Azure OpenAI Service automatically sends the request from your provisioned deployment to your standard deployment to be processed. Even if a subset of requests is routed to the standard deployment, the service prioritizes sending requests to the provisioned deployment before sending any overage requests to the standard deployment.

0 commit comments

Comments
 (0)