Commit 61cc33c

update
1 parent fb9bc4a commit 61cc33c

4 files changed

+5
-5
lines changed

articles/ai-services/openai/how-to/business-continuity-disaster-recovery.md

Lines changed: 5 additions & 5 deletions
@@ -24,13 +24,13 @@ By default, the Azure OpenAI service provides a [default SLA](https://www.micros
2424
## Standard Deployments
2525

2626
1. For Standard Deployments default to Data Zone deployment (US/EU options).
27-
- If you can use Global Standard deployments, you should. Data Zone deployments are the next best option for organizations requiring data processing to happen entirely within a geographic boundary.
27+
- If you can use Global Standard deployments, you should. Data Zone deployments are the next best option for organizations requiring data processing to happen entirely within a geographic boundary.
2828
1. You should deploy two Azure OpenAI Service resources in the Azure Subscription. One resource should be deployed in your preferred region and the other should be deployed in your secondary/failover region. The Azure OpenAI service allocates quota at the subscription + region level, so they can live in the same subscription with no impact on quota.
2929
1. You should have one deployment for each model you plan to use deployed to the Azure OpenAI Service resource in your preferred Azure region and you should duplicate these model deployments in the secondary/failover region. Allocate the full quota available in your Standard deployment to each of these endpoints. This provides the highest throughput rate when compared to splitting quota across multiple deployments.
3030
1. Select the deployment region based on your network topology. You can deploy an Azure OpenAI Service resource to any supported region and then create a Private Endpoint for that resource in your preferred region.
3131
- Once within the Azure OpenAI Service boundary, the Azure OpenAI Service optimizes routing and processing across available compute in the data zone.
3232
- Using data zones is more efficient and simpler than self-managed load balancing across multiple regional deployments.
33-
1. If there's a regional outage where the deployment is in an unusable state, you can use the other deployment in the secondary/passive region within the same subscription.
33+
1. If there's a regional outage where the deployment is in an unusable state, you can use the other deployment in the secondary/passive region within the same subscription.
3434
- Because both the primary and secondary deployments are Zone deployments, they draw from the same Zone capacity pool which draws from all available regions in the Zone. The secondary deployment is protecting against the primary Azure OpenAI endpoint being unreachable.
3535
- Use a Generative AI Gateway that supports load balancing and the circuit breaker pattern, such as API Management, in front of the Azure OpenAI Service endpoints so that disruption to consuming applications is minimized during a regional outage.
3636
- If the quota within a given subscription is exhausted, a new subscription can be deployed in the same manner as above and its endpoint deployed behind the Generative AI Gateway.
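
The failover behavior described above (a primary endpoint, a secondary endpoint, and a gateway that applies a circuit breaker) can be sketched in a few lines of Python. This is a minimal illustration, not how API Management implements it: `CircuitBreaker`, `FailoverRouter`, the thresholds, and the `send` callables are all hypothetical names chosen for the example, and a real gateway would typically be configured through policy rather than code.

```python
import time
from typing import Callable, List, Optional, Tuple


class CircuitBreaker:
    """Opens after `max_failures` consecutive failures, then allows a
    trial request once `reset_after` seconds have passed (half-open)."""

    def __init__(self, max_failures: int = 3, reset_after: float = 30.0,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self.max_failures = max_failures
        self.reset_after = reset_after
        self.clock = clock
        self.failures = 0
        self.opened_at: Optional[float] = None

    def allows_request(self) -> bool:
        if self.opened_at is None:
            return True  # circuit closed
        # Half-open: permit a trial request after the cool-down.
        return self.clock() - self.opened_at >= self.reset_after

    def record_success(self) -> None:
        self.failures = 0
        self.opened_at = None

    def record_failure(self) -> None:
        self.failures += 1
        if self.failures >= self.max_failures:
            self.opened_at = self.clock()  # (re)open the circuit


class FailoverRouter:
    """Tries backends in order (primary first), skipping open circuits."""

    def __init__(self, backends: List[Tuple[str, Callable[[str], str]]],
                 breaker_factory: Callable[[], CircuitBreaker] = CircuitBreaker) -> None:
        self.backends = [(name, send, breaker_factory()) for name, send in backends]

    def send(self, request: str) -> Tuple[str, str]:
        last_error: Optional[Exception] = None
        for name, send, breaker in self.backends:
            if not breaker.allows_request():
                continue
            try:
                response = send(request)
            except ConnectionError as exc:  # endpoint unreachable
                breaker.record_failure()
                last_error = exc
                continue
            breaker.record_success()
            return name, response
        raise RuntimeError("no backend available") from last_error
```

Once the primary's circuit opens, requests go straight to the secondary without waiting on a failing endpoint, which is the property that keeps disruption to consuming applications small during a regional outage.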
@@ -52,11 +52,11 @@ By default, the Azure OpenAI service provides a [default SLA](https://www.micros
5252
1. The workload and enterprise PTU pool deployments should protect against regional failures. You could do this by placing the workload PTU pool in Region A and the enterprise PTU pool in Region B.
5353
1. This deployment should failover first to the Enterprise PTU Pool and then to the Standard deployment. This implies that when utilization of the workload PTU deployment exceeds 100%, requests would still be serviced by PTU endpoints, enabling a higher latency SLA for that application.
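
The failover order in this step (workload PTU pool, then enterprise PTU pool, then Standard) can be expressed as a priority list that spills to the next tier when the current one is saturated. The sketch below assumes a saturated deployment signals this with an HTTP 429 response; `route_with_spillover` and the tier callables are hypothetical stand-ins for real gateway routing, not part of any Azure SDK.

```python
from typing import Callable, Sequence, Tuple

THROTTLED = 429  # assumed status returned when a deployment has no spare capacity

# A tier is a (name, call) pair; call(request) returns (status_code, body).
Tier = Tuple[str, Callable[[str], Tuple[int, str]]]


def route_with_spillover(request: str, tiers: Sequence[Tier]) -> Tuple[str, int, str]:
    """Try each tier in priority order, spilling to the next one on 429."""
    if not tiers:
        raise ValueError("at least one tier is required")
    for name, call in tiers:
        status, body = call(request)
        if status != THROTTLED:
            return name, status, body
    # Every tier was throttled: surface the last 429 to the caller.
    return name, status, body
```

Listing the tiers as `[workload_ptu, enterprise_ptu, standard]` means a request reaches pay-as-you-go capacity only after both PTU pools have reported that they are saturated.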
5454

55-
{bcdr_diagram_one}
55+
:::image type="content" source="../how-to/media/disaster-recovery/disaster-recovery-diagram.jpg" alt-text="Disaster recovery architectural diagram" lightbox="../how-to/media/disaster-recovery/disaster-recovery-diagram.jpg":::
5656

5757
The additional benefit of this architecture is that it allows you to stack Standard deployments with Provisioned Deployments so that you can dial in your preferred level of performance and resiliency. This allows you to use PTU for your baseline demand across workloads and leverage pay-as-you-go for spikes in traffic.
5858

59-
{bcdr_diagram_two}
59+
:::image type="content" source="../how-to/media/disaster-recovery/recovery.jpg" alt-text="Failover architectural diagram" lightbox="../how-to/media/disaster-recovery/recovery.jpg":::
6060

6161
## Supporting Infrastructure
6262

@@ -69,7 +69,7 @@ Organizations consuming the service through the Microsoft public backbone should
6969
1. The Generative AI Gateway should be deployed in a manner that ensures it's available in the event of an Azure regional outage. If using APIM (Azure API Management), this can be done by deploying separate APIM instances in multiple regions or using the [multi-region gateway feature of APIM](/azure/api-management/api-management-howto-deploy-multi-region).
7070
1. A public global server load balancer should be used to load balance across the multiple Generative AI Gateway instances in either an active/active or active/passive manner. [Azure FrontDoor](/azure/traffic-manager/traffic-manager-routing-methods) can be used to fulfill this role depending on the organization’s requirements.
7171

72-
{bcdr_diagram_three}
72+
:::image type="content" source="../how-to/media/disaster-recovery/scaling.jpg" alt-text="Provisioned scaling diagram" lightbox="../how-to/media/disaster-recovery/scaling.jpg":::
7373

7474
### Designing for consumption through the private networking
7575
