You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Copy file name to clipboardExpand all lines: articles/ai-services/openai/how-to/business-continuity-disaster-recovery.md
+7-7Lines changed: 7 additions & 7 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -59,22 +59,22 @@ The additional benefit of this architecture is that it allows you to stack Stand
59
59
60
60
## Supporting Infrastructure
61
61
62
-
The infrastructure that supports the Azure OpenAI architecture needs to be considered in designs. The infrastructure components involved in the architecture vary depending on if the applications consume the Azure OpenAI service over the Internet or over a private network. The architecture discussed in this article assumes the organization has implemented a [Generative AI Gateway](https://learn.microsoft.com/en-us/ai/playbook/technology-guidance/generative-ai/dev-starters/genai-gateway/). Organizations with a mature Azure footprint and hybrid connectivity should consume the service through a private network while organizations without hybrid connectivity, or with applications in another cloud such as GCP or AWS, will consume the service through the Microsoft public backbone.
62
+
The infrastructure that supports the Azure OpenAI architecture needs to be considered in designs. The infrastructure components involved in the architecture vary depending on if the applications consume the Azure OpenAI service over the Internet or over a private network. The architecture discussed in this article assumes the organization has implemented a [Generative AI Gateway](/ai/playbook/technology-guidance/generative-ai/dev-starters/genai-gateway/). Organizations with a mature Azure footprint and hybrid connectivity should consume the service through a private network while organizations without hybrid connectivity, or with applications in another cloud such as GCP or AWS, will consume the service through the Microsoft public backbone.
63
63
64
64
### Designing for consumption through the Microsoft public backbone
65
65
66
66
Organizations consuming the service through the Microsoft public backbone should consider the following design elements:
67
67
68
-
1. The Generative AI Gateway should be deployed in manner that ensures it's available in the event of an Azure regional outage. If using APIM (Azure API Management), this can be done by deploying separate APIM instances in multiple regions or using the [multi-region gateway feature of APIM](https://learn.microsoft.com/en-us/azure/api-management/api-management-howto-deploy-multi-region).
69
-
1. A public global server load balancer should be used to load balance across the multiple Generative AI Gateway instances in either an active/active or active/passive manner. [Azure FrontDoor](https://learn.microsoft.com/en-us/azure/architecture/web-apps/app-service/architectures/multi-region) or [Azure Traffic Manager](https://learn.microsoft.com/en-us/azure/traffic-manager/traffic-manager-routing-methods) can be used to fulfill this role depending on the organization’s requirements.
68
+
1. The Generative AI Gateway should be deployed in manner that ensures it's available in the event of an Azure regional outage. If using APIM (Azure API Management), this can be done by deploying separate APIM instances in multiple regions or using the [multi-region gateway feature of APIM](/azure/api-management/api-management-howto-deploy-multi-region).
69
+
1. A public global server load balancer should be used to load balance across the multiple Generative AI Gateway instances in either an active/active or active/passive manner. [Azure FrontDoor](/azure/traffic-manager/traffic-manager-routing-methods) can be used to fulfill this role depending on the organization’s requirements.
70
70
71
71
{bcdr_diagram_three}
72
72
73
73
### Designing for consumption through the private networking
74
74
75
75
Organizations consuming the service through a private network should consider the following design elements:
76
76
77
-
1. Hybrid connectivity should be deployed in a way that it protects against the failure of an Azure region. The underlining components supporting hybrid connectivity consist of the organization’s on-premises network infrastructure and Microsoft ExpressRoute (https://learn.microsoft.com/en-us/azure/expressroute/designing-for-high-availability-with-expressroute) or VPN (https://learn.microsoft.com/en-us/azure/vpn-gateway/vpn-gateway-highlyavailable).
78
-
1. The Generative AI Gateway should be deployed in manner that ensures it's available in the event of an Azure regional outage. If using APIM (Azure API Management), this can be done by deploying separate APIM instances in multiple regions or using the multi-region gateway feature of APIM (https://learn.microsoft.com/en-us/azure/api-management/api-management-howto-deploy-multi-region).
79
-
1. Azure Private Link Private Endpoints should be deployed for each Azure OpenAI Service instance in each Azure region. For Azure Private DNS, a split-brain DNS approach can be used if all application access to the Azure OpenAI Service is done through the Generative AI Gateway to provide for additional protection against a regional failure. If not, Private DNS records will need to be manually modified in the event of a loss of an Azure region (https://learn.microsoft.com/en-us/azure/azure-resource-manager/management/overview#what-location-should-i-use-for-my-resource-group),
80
-
1. A private global server load balancer should be used to load balance across the multiple Generative AI Gateway instances in either an active/active or active/passive manner. Azure doesn't have a native service for global server load balancer for workloads that require private DNS resolution (https://github.com/adstuart/azure-crossregion-private-lb). In lieu of a global server load balancer, organizations can achieve an active/passive pattern through toggling the DNS record for the Generative AI Gateway.
77
+
1. Hybrid connectivity should be deployed in a way that it protects against the failure of an Azure region. The underlining components supporting hybrid connectivity consist of the organization’s on-premises network infrastructure and Microsoft ExpressRoute (/azure/expressroute/designing-for-high-availability-with-expressroute) or VPN (/azure/vpn-gateway/vpn-gateway-highlyavailable).
78
+
1. The Generative AI Gateway should be deployed in manner that ensures it's available in the event of an Azure regional outage. If using APIM (Azure API Management), this can be done by deploying separate APIM instances in multiple regions or using the multi-region gateway feature of APIM (/azure/api-management/api-management-howto-deploy-multi-region).
79
+
1. Azure Private Link Private Endpoints should be deployed for each Azure OpenAI Service instance in each Azure region. For Azure Private DNS, a split-brain DNS approach can be used if all application access to the Azure OpenAI Service is done through the Generative AI Gateway to provide for additional protection against a regional failure. If not, Private DNS records will need to be manually modified in the event of a loss of an Azure region (/azure/azure-resource-manager/management/overview#what-location-should-i-use-for-my-resource-group),
80
+
1. A private global server load balancer should be used to load balance across the multiple Generative AI Gateway instances in either an active/active or active/passive manner. Azure doesn't have a native service for global server load balancer for workloads that require private DNS resolution. For additional background on this topic you can refer to this unofficial guide: https://github.com/adstuart/azure-crossregion-private-lb. In lieu of a global server load balancer, organizations can achieve an active/passive pattern through toggling the DNS record for the Generative AI Gateway.
0 commit comments