Commit e4ca2ec

Merge pull request #264719 from MicrosoftDocs/main
1/30/2024 PM Publish
2 parents 4712346 + bbdef26 commit e4ca2ec

142 files changed

+2215
-1254
lines changed

.openpublishing.redirection.json

Lines changed: 5 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -14407,6 +14407,11 @@
1440714407
"source_path_from_root": "/articles/automation/dsc-linux-powershell.md",
1440814408
"redirect_url": "/azure/automation/automation-dsc-overview"
1440914409
},
14410+
{
14411+
"source_path_from_root": "/articles/aks/operator-best-practices-multi-region.md",
14412+
"redirect_url": "/azure/aks/ha-dr-overview",
14413+
"redirect_document_id": false
14414+
},
1441014415
{
1441114416
"source_path_from_root": "/articles/virtual-machines/extensions/dsc-linux.md",
1441214417
"redirect_url": "/azure/virtual-machines/extensions/dsc-overview"

articles/active-directory-b2c/add-password-reset-policy.md

Lines changed: 4 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -39,6 +39,10 @@ The default name of the **Change email** button in *selfAsserted.html* is **chan
3939

4040
[!INCLUDE [active-directory-b2c-customization-prerequisites](../../includes/active-directory-b2c-customization-prerequisites.md)]
4141

42+
43+
- B2C users need to have an authentication method specified for self-service password reset (SSPR). Select the B2C user, and then in the left menu under **Manage**, select **Authentication methods** and ensure that **Authentication contact info** is set. B2C users created through a sign-up flow have this set by default. For users created through the Azure portal or the Graph API, you need to set **Authentication contact info** for SSPR to work.
44+
45+
4246
## Self-service password reset (recommended)
4347

4448
The new password reset experience is now part of the sign-up or sign-in policy. When the user selects the **Forgot your password?** link, they are immediately sent to the Forgot Password experience. Your application no longer needs to handle the [AADB2C90118 error code](#password-reset-policy-legacy), and you don't need a separate policy for password reset.

articles/ai-services/openai/concepts/use-your-data.md

Lines changed: 23 additions & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -29,7 +29,7 @@ One of the key features of Azure OpenAI on your data is its ability to retrieve
2929
To get started, [connect your data source](../use-your-data-quickstart.md) using Azure OpenAI Studio and start asking questions and chatting on your data.
3030

3131
> [!NOTE]
32-
> To get started, you need to already have been approved for [Azure OpenAI access](../overview.md#how-do-i-get-access-to-azure-openai) and have an [Azure OpenAI Service resource](../how-to/create-resource.md) with either the gpt-35-turbo or the gpt-4 models deployed.
32+
> To get started, you need to already have been approved for [Azure OpenAI access](../overview.md#how-do-i-get-access-to-azure-openai) and have an [Azure OpenAI Service resource](../how-to/create-resource.md) deployed in a [supported region](#azure-openai-on-your-data-regional-availability) with either the gpt-35-turbo or the gpt-4 models.
3333
3434
## Data formats and file types
3535

@@ -570,6 +570,28 @@ class TokenEstimator(object):
570570
token_output = TokenEstimator.estimate_tokens(input_text)
571571
```
572572

573+
## Azure OpenAI on your data regional availability
574+
575+
You can use Azure OpenAI on your data with an Azure OpenAI resource in the following regions:
576+
* Australia East
577+
* Brazil South
578+
* Canada East
579+
* East US
580+
* East US 2
581+
* France Central
582+
* Japan East
583+
* North Central US
584+
* Norway East
585+
* South Central US
586+
* South India
587+
* Sweden Central
588+
* Switzerland North
589+
* UK South
590+
* West Europe
591+
* West US
592+
593+
If your Azure OpenAI resource is in another region, you won't be able to use Azure OpenAI on your data.
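
As a rough illustration of the regional requirement above, the following Python sketch checks a resource's region against the list in this section before you attempt to connect a data source. The names `SUPPORTED_REGIONS`, `normalize`, and `is_supported_region` are hypothetical helpers and not part of any Azure SDK. The region set reflects only the list in this article, which might change over time.

```python
# Regions copied from the list in this section (assumed current only as of this article).
SUPPORTED_REGIONS = frozenset({
    "Australia East", "Brazil South", "Canada East", "East US", "East US 2",
    "France Central", "Japan East", "North Central US", "Norway East",
    "South Central US", "South India", "Sweden Central", "Switzerland North",
    "UK South", "West Europe", "West US",
})


def normalize(region: str) -> str:
    """Lowercase and drop whitespace so 'East US 2' and 'eastus2' compare equal."""
    return "".join(region.lower().split())


# Precompute the normalized names once for fast membership checks.
_NORMALIZED = frozenset(normalize(r) for r in SUPPORTED_REGIONS)


def is_supported_region(region: str) -> bool:
    """Return True if the region appears in the list from this section."""
    return normalize(region) in _NORMALIZED
```

Normalizing the name lets the check accept either the display name (`East US 2`) or the short form (`eastus2`).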
594+
573595
## Next steps
574596
* [Get started using your data with Azure OpenAI](../use-your-data-quickstart.md)
575597

articles/ai-services/openai/use-your-data-quickstart.md

Lines changed: 1 addition & 1 deletion
Original file line numberDiff line numberDiff line change
@@ -51,7 +51,7 @@ In this quickstart you can use your own data with Azure OpenAI models. Using Azu
5151

5252
Azure OpenAI requires registration and is currently only available to approved enterprise customers and partners. [See Limited access to Azure OpenAI Service](/legal/cognitive-services/openai/limited-access?context=/azure/ai-services/openai/context/context) for more information. You can apply for access to Azure OpenAI by completing the form at <a href="https://aka.ms/oai/access" target="_blank">https://aka.ms/oai/access</a>. Open an issue on this repo to contact us if you have an issue.
5353

54-
- An Azure OpenAI resource with a chat model deployed (for example, GPT-3 or GPT-4). For more information about model deployment, see the [resource deployment guide](./how-to/create-resource.md).
54+
- An Azure OpenAI resource in a [supported region](./concepts/use-your-data.md#azure-openai-on-your-data-regional-availability) with a chat model deployed (for example, GPT-3 or GPT-4). For more information about model deployment, see the [resource deployment guide](./how-to/create-resource.md).
5555

5656
- Your chat model can use version `gpt-35-turbo (0301)`, `gpt-35-turbo-16k`, `gpt-4`, and `gpt-4-32k`. You can view or change your model version in [Azure OpenAI Studio](./how-to/working-with-models.md#model-updates).
5757

articles/aks/TOC.yml

Lines changed: 14 additions & 2 deletions
Original file line numberDiff line numberDiff line change
@@ -158,6 +158,20 @@
158158
href: best-practices.md
159159
- name: Baseline architecture for an AKS cluster
160160
href: /azure/architecture/reference-architectures/containers/aks/secure-baseline-aks?toc=/azure/aks/toc.json&bc=/azure/aks/breadcrumb/toc.json
161+
- name: High availability disaster recovery
162+
items:
163+
- name: Overview
164+
href: ha-dr-overview.md
165+
required: yes
166+
limit: 1
167+
- name: Solutions
168+
items:
169+
- name: Active-active
170+
href: active-active-solution.md
171+
- name: Active-passive
172+
href: active-passive-solution.md
173+
- name: Passive-cold
174+
href: passive-cold-solution.md
161175
- name: Security
162176
items:
163177
- name: Authentication and authorization
@@ -180,8 +194,6 @@
180194
href: operator-best-practices-network.md
181195
- name: Storage
182196
href: operator-best-practices-storage.md
183-
- name: Business continuity (BC) and disaster recovery (DR)
184-
href: operator-best-practices-multi-region.md
185197
- name: Performance and scaling
186198
items:
187199
- name: For small to medium workloads

articles/aks/active-active-solution.md

Lines changed: 84 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,84 @@
1+
---
2+
title: Recommended active-active high availability solution overview for Azure Kubernetes Service (AKS)
3+
description: Learn about the recommended active-active high availability solution overview for Azure Kubernetes Service (AKS).
4+
author: schaffererin
5+
ms.author: schaffererin
6+
ms.service: azure-kubernetes-service
7+
ms.topic: concept-article
8+
ms.date: 01/30/2024
9+
---
10+
11+
# Recommended active-active high availability solution overview for Azure Kubernetes Service (AKS)
12+
13+
When you create an application in Azure Kubernetes Service (AKS) and choose an Azure region during resource creation, it's a single-region app. If a disaster causes the region to become unavailable, your application also becomes unavailable. If you create an identical deployment in a secondary Azure region, your application becomes less susceptible to a single-region disaster, which helps maintain business continuity, and any data replication across the regions lets you recover your last application state.
14+
15+
While there are multiple patterns that can provide recoverability for an AKS solution, this guide outlines the recommended active-active high availability solution for AKS. Within this solution, we deploy two independent and identical AKS clusters into two paired Azure regions with both clusters actively serving traffic.
16+
17+
> [!NOTE]
18+
> The following use case can be considered standard practice within AKS. It has been reviewed internally and vetted in conjunction with our Microsoft partners.
19+
20+
## Active-active high availability solution overview
21+
22+
This solution relies on two identical AKS clusters configured to actively serve traffic. You place a global traffic manager, such as [Azure Front Door](../frontdoor/front-door-overview.md), in front of the two clusters to distribute traffic across them. You must consistently configure the clusters to host an instance of all applications required for the solution to function.
23+
24+
Availability zones are another way to ensure high availability and fault tolerance for your AKS cluster within the same region. Availability zones allow you to distribute your cluster nodes across multiple isolated locations within an Azure region. This way, if one zone goes down due to a power outage, hardware failure, or network issue, your cluster can continue to run and serve your applications. Availability zones also improve the performance and scalability of your cluster by reducing the latency and contention among nodes. To set up availability zones for your AKS cluster, you need to specify the zone numbers when creating or updating your node pools. For more information, see [What are Azure availability zones?](../reliability/availability-zones-overview.md)
25+
26+
> [!NOTE]
27+
> Many regions support availability zones. Consider using regions with availability zones to provide more resiliency and availability for your workloads. For more information, see [Recover from a region-wide service disruption](/azure/architecture/resiliency/recovery-loss-azure-region).
28+
29+
## Scenarios and configurations
30+
31+
This solution is best implemented when hosting stateless applications and/or with other technologies also deployed across both regions, such as horizontal scaling. In scenarios where the hosted application relies on resources, such as databases, that are active in only one region, we recommend instead implementing an [active-passive solution](./active-passive-solution.md) for potential cost savings. Keep in mind that active-passive has more downtime than active-active.
32+
33+
## Components
34+
35+
The active-active high availability solution uses many Azure services. This section covers only the components unique to this multi-cluster architecture. For more information on the remaining components, see the [AKS baseline architecture](/azure/architecture/reference-architectures/containers/aks/baseline-aks?toc=%2Fazure%2Faks%2Ftoc.json&bc=%2Fazure%2Faks%2Fbreadcrumb%2Ftoc.json).
36+
37+
**Multiple clusters and regions**: You deploy multiple AKS clusters, each in a separate Azure region. During normal operations, your Azure Front Door configuration routes network traffic between all regions. If one region becomes unavailable, traffic routes to the available region with the fastest load time for the user.
38+
39+
**Hub-spoke network per region**: A regional hub-spoke network pair is deployed for each regional AKS instance. [Azure Firewall Manager](../firewall-manager/overview.md) policies manage the firewall policies across all regions.
40+
41+
**Regional key store**: You provision [Azure Key Vault](../key-vault/general/overview.md) in each region to store sensitive values and keys specific to the AKS instance and to support services found in that region.
42+
43+
**Azure Front Door**: [Azure Front Door](../frontdoor/front-door-overview.md) load balances and routes traffic to a regional [Azure Application Gateway](../application-gateway/overview.md) instance, which sits in front of each AKS cluster. Azure Front Door allows for *layer seven* global routing.
44+
45+
**Log Analytics**: Regional [Log Analytics](../azure-monitor/logs/log-analytics-overview.md) instances store regional networking metrics and diagnostic logs. A shared instance stores metrics and diagnostic logs for all AKS instances.
46+
47+
**Container Registry**: The container images for the workload are stored in a managed container registry. With this solution, a single [Azure Container Registry](../container-registry/container-registry-intro.md) instance is used for all AKS clusters in the solution. Geo-replication for Azure Container Registry enables you to replicate images to the selected Azure regions and provides continued access to images even if a region experiences an outage.
48+
49+
## Failover process
50+
51+
If a service or service component becomes unavailable in one region, traffic should be routed to a region where that service is available. A multi-region architecture includes many different failure points. In this section, we cover the potential failure points.
52+
53+
### Application Pods (Regional)
54+
55+
A Kubernetes deployment object creates multiple replicas of a pod through a *ReplicaSet*. If one replica is unavailable, traffic is routed among the remaining replicas. The Kubernetes *ReplicaSet* attempts to keep the specified number of replicas up and running. If one instance goes down, a new instance is created to replace it. [Liveness probes](../container-instances/container-instances-liveness-probe.md) can check the state of the application or process running in the pod. If the application is unresponsive, the liveness probe fails and Kubernetes restarts the affected container.
56+
57+
For more information, see [Kubernetes ReplicaSet](https://kubernetes.io/docs/concepts/workloads/controllers/replicaset/).
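
To make the replica behavior described above more concrete, the following Python sketch models a single reconcile pass in a simplified way. It's a toy model under stated assumptions, not the Kubernetes controller implementation: the `reconcile` function, the `(name, phase)` tuples, and the `pod-<n>` naming are all hypothetical.

```python
def reconcile(desired: int, pods: list[tuple[str, str]], next_id: int):
    """Run one simplified reconcile pass.

    pods is a list of (name, phase) tuples. Pods that aren't "Running" are
    dropped, and new pods are created until the desired count is reached.
    Returns the new pod list and the next unused pod id.
    """
    # Keep only the pods that are still serving.
    alive = [p for p in pods if p[1] == "Running"]

    # Create replacements until the desired replica count is met.
    while len(alive) < desired:
        alive.append((f"pod-{next_id}", "Running"))
        next_id += 1

    # Scale down if there are more running pods than desired.
    return alive[:desired], next_id


# Example: one of three replicas has failed.
current = [("pod-0", "Running"), ("pod-1", "Failed"), ("pod-2", "Running")]
pods_after, next_free_id = reconcile(3, current, next_id=3)
```

In this example, the failed `pod-1` is dropped and a replacement named `pod-3` is added, so the replica count returns to three.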
58+
59+
### Application Pods (Global)
60+
61+
When an entire region becomes unavailable, the pods in the cluster are no longer available to serve requests. In this case, the Azure Front Door instance routes all traffic to the remaining healthy regions. The Kubernetes clusters and pods in these regions continue to serve requests. To compensate for increased traffic and requests to the remaining cluster, keep in mind the following guidance:
62+
63+
- Make sure network and compute resources are right sized to absorb any sudden increase in traffic due to region failover. For example, when using Azure Container Network Interface (CNI), make sure you have a subnet that can support all pod IPs with a spiked traffic load.
64+
- Use the [Horizontal Pod Autoscaler](./concepts-scale.md#horizontal-pod-autoscaler) to increase the pod replica count to compensate for the increased regional demand.
65+
- Use the AKS [Cluster Autoscaler](./cluster-autoscaler.md) to increase the Kubernetes instance node counts to compensate for the increased regional demand.
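
The subnet sizing guidance above can be estimated with a short calculation. The following Python sketch assumes a networking mode in which every node and every pod takes an IP address from the cluster subnet, one extra node for upgrade surge, and five addresses reserved by Azure in each subnet. The function names and the `failover_factor` parameter are hypothetical and only illustrate the arithmetic. Check your own network configuration before relying on the numbers.

```python
import ipaddress
import math

# Azure reserves five addresses in every subnet.
AZURE_RESERVED_IPS = 5


def required_ips(node_count: int, max_pods_per_node: int, surge_nodes: int = 1) -> int:
    """IPs needed when each node and each pod takes an address from the subnet."""
    total_nodes = node_count + surge_nodes
    # One address per node plus one per pod slot on that node.
    return total_nodes * (max_pods_per_node + 1)


def subnet_fits_failover(cidr: str, node_count: int, max_pods_per_node: int,
                         failover_factor: float = 2.0) -> bool:
    """Check whether the subnet still fits the cluster after a failover spike.

    failover_factor is an assumed multiplier for the node count when this
    region has to absorb traffic from a failed region.
    """
    usable = ipaddress.ip_network(cidr).num_addresses - AZURE_RESERVED_IPS
    peak_nodes = math.ceil(node_count * failover_factor)
    return usable >= required_ips(peak_nodes, max_pods_per_node)
```

For example, with 10 nodes and 30 pods per node, doubling to 20 nodes plus one surge node needs 21 × 31 = 651 addresses. A `/22` subnet (1,019 usable addresses) fits that load, while a `/24` subnet (251 usable addresses) doesn't.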
66+
67+
### Kubernetes node pools (Regional)
68+
69+
Occasionally, a localized failure can affect compute resources, such as power becoming unavailable in a single rack of Azure servers. To protect your AKS nodes from becoming a single point of failure within a region, use [Azure Availability Zones](./availability-zones.md). Availability zones ensure that AKS nodes in each availability zone are physically separated from those defined in other availability zones.
70+
71+
### Kubernetes node pools (Global)
72+
73+
In a complete regional failure, Azure Front Door routes traffic to the remaining healthy regions. Again, make sure to compensate for increased traffic and requests to the remaining cluster.
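
The routing behavior described in this article (send users to the healthy region with the fastest response) can be pictured with the following Python sketch. It's a simplified mental model only and doesn't represent how Azure Front Door is implemented. The `pick_region` function, the region names, and the latency values are hypothetical.

```python
def pick_region(latency_ms: dict[str, float], healthy: dict[str, bool]) -> str:
    """Return the healthy region with the lowest measured latency."""
    # Drop regions that are marked unhealthy or have no health status.
    candidates = {r: ms for r, ms in latency_ms.items() if healthy.get(r, False)}
    if not candidates:
        raise RuntimeError("No healthy regions are available to serve traffic.")
    return min(candidates, key=candidates.get)


# Hypothetical latencies measured from one user's location.
latencies = {"eastus": 20.0, "westeurope": 90.0}

# Normal operation: both regions are healthy, so the closest region is chosen.
normal = pick_region(latencies, {"eastus": True, "westeurope": True})

# Regional failure: eastus is down, so all traffic goes to the remaining region.
failover = pick_region(latencies, {"eastus": False, "westeurope": True})
```

Because the surviving region receives every request after a failover, its capacity has to be sized for the combined load.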
74+
75+
## Failover testing strategy
76+
77+
While there are no mechanisms currently available within AKS to take down an entire region of deployment for testing purposes, [Azure Chaos Studio](../chaos-studio/chaos-studio-overview.md) offers the ability to create a chaos experiment on your cluster.
78+
79+
## Next steps
80+
81+
If you're considering a different solution, see the following articles:
82+
83+
- [Active passive disaster recovery solution overview for Azure Kubernetes Service (AKS)](./active-passive-solution.md)
84+
- [Passive cold solution overview for Azure Kubernetes Service (AKS)](./passive-cold-solution.md)
