Commit be63e1c

Merge pull request #279967 from EldertGrootenboer/geo-replication-docs-updates
Geo replication docs updates
2 parents 41a5e2a + b4c2c99 commit be63e1c

File tree

2 files changed

+68
-12
lines changed

articles/service-bus-messaging/monitor-service-bus-reference.md

Lines changed: 8 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -78,11 +78,19 @@ The following two types of errors are classified as **user errors**:
7878
|Memory size usage per namespace| No | Memory Usage | Percent | The percentage memory usage of the namespace. | Replica |
7979

8080
### Error metrics
81+
8182
| Metric Name | Exportable via diagnostic settings | Unit | Aggregation type | Description | Dimensions |
| ------------------- | ----------------- | --- | --- | --- | --- |
|Server Errors| No | Count | Total | The number of requests not processed because of an error in the Service Bus service over a specified period. | Entity name<br/><br/>Operation Result |
|User Errors | No | Count | Total | The number of requests not processed because of user errors over a specified period. | Entity name<br/><br/>Operation Result|
8586

87+
### Geo-Replication metrics
88+
89+
| Metric Name | Exportable via diagnostic settings | Unit | Aggregation type | Description | Dimensions |
| ------------------- | ----------------- | --- | --- | --- | --- |
|Replication Lag Duration| No | Seconds | Max | The offset in seconds between the latest action on the primary and the secondary regions. | |
|Replication Lag Count | No | Count | Max | The offset in number of operations between the latest action on the primary and the secondary regions. | |
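
Because both replication lag metrics are aggregated as a maximum, a monitoring job might treat a sustained run of high values as a signal worth acting on. The following is a minimal sketch of that check, assuming the samples have already been retrieved from your monitoring system. The function name, the threshold, and the number of consecutive samples are illustrative assumptions, not part of any Azure API.

```python
def lag_breached(max_lag_samples, threshold_seconds, consecutive=3):
    """Return True when the most recent `consecutive` samples all exceed the threshold.

    max_lag_samples: Replication Lag Duration values (seconds), oldest first.
    threshold_seconds and consecutive are hypothetical tuning knobs.
    """
    if consecutive <= 0:
        raise ValueError("consecutive must be a positive integer")
    recent = list(max_lag_samples)[-consecutive:]
    # Require a full window so that a single spike doesn't trigger the signal.
    return len(recent) == consecutive and all(s > threshold_seconds for s in recent)


if __name__ == "__main__":
    # Example values only: three consecutive samples above a 30-second threshold.
    print(lag_breached([1, 40, 50, 60], threshold_seconds=30))
```

Requiring several consecutive samples is a design choice that trades reaction time for fewer false alarms; tune both values to your workload.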
93+
8694
## Metric dimensions
8795

8896
Azure Service Bus supports the following dimensions for metrics in Azure Monitor. Adding dimensions to your metrics is optional. If you don't add dimensions, metrics are specified at the namespace level.

articles/service-bus-messaging/service-bus-geo-replication.md

Lines changed: 60 additions & 12 deletions
Original file line numberDiff line numberDiff line change
@@ -48,20 +48,20 @@ This feature allows promoting any secondary region to primary, at any time. Prom
4848
The Geo-replication feature can be used to implement different scenarios, as described here.
4949

5050
### Disaster recovery
51-
Data and metadata are continuously synchronized between the primary and secondary regions. If a region lags or is unavailable, it is possible to promote a secondary region as the primary. This promotion allows for the uninterrupted operation of workloads in the newly promoted region. Such a promotion may be necessitated by degradation of Service Bus or other services within your workload, particularly if you aim to run the various components together. Depending on the severity and impacted services, the promotion could either be planned or forced. In case of planned promotion in-flight messages are replicated before finalizing the promotion, while with forced promotion this is immediately executed.
51+
Data and metadata are continuously synchronized between the primary and secondary regions. If a region lags or is unavailable, it's possible to promote a secondary region as the primary. This promotion allows for the uninterrupted operation of workloads in the newly promoted region. Such a promotion may be necessitated by degradation of Service Bus or other services within your workload, particularly if you aim to run the various components together. Depending on the severity and impacted services, the promotion could either be planned or forced. In case of planned promotion in-flight messages are replicated before finalizing the promotion, while with forced promotion this is immediately executed.
5252

5353
### Region migration
5454
There are times when you want to migrate your Service Bus workloads to run in a different region, for example, when Azure adds a new region that is geographically closer to your location, users, or other services. Alternatively, you might want to migrate when the region where most of your workloads run changes. The Geo-Replication feature provides a good solution in these cases. You set up Geo-Replication on your existing namespace with the desired new region as the secondary region and wait for the synchronization to complete. At this point, you start a planned promotion, allowing any in-flight messages to be replicated. Once the promotion is completed, you can optionally remove the old region, which is now the secondary region, and continue running your workloads in the desired region.
5555

5656
## Basic concepts
5757

58-
The Geo-Replication feature implements metadata and data replication in a primary-secondary replication model. At a given time there’s a single primary region, which is serving both producers and consumers. The secondaries act as hot stand-by regions, meaning that it is not possible to interact with these secondary regions. However, they run in the same configuration as the primary region, allowing for fast promotion, and meaning that your workloads can immediately continue running after promotion has been completed. The Geo-Replication feature is available for the [Premium tier](service-bus-premium-messaging.md).
58+
The Geo-Replication feature implements metadata and data replication in a primary-secondary replication model. At a given time there’s a single primary region, which is serving both producers and consumers. The secondaries act as hot stand-by regions, meaning that it isn't possible to interact with these secondary regions. However, they run in the same configuration as the primary region, allowing for fast promotion, and meaning that your workloads can immediately continue running after promotion has been completed. The Geo-Replication feature is available for the [Premium tier](service-bus-premium-messaging.md).
5959

6060
Some of the key aspects of Geo-Replication feature are:
6161
- Service Bus services perform fully managed replication of metadata, message data, and message state and property changes across regions adhering to the replication consistency configured at the namespace.
6262
- Single namespace hostname; Upon successful configuration of a Geo-Replication enabled namespace, users can use the namespace hostname in their client application. The hostname behaves agnostic of the configured primary and secondary regions, and always points to the primary region.
6363
- When a customer initiates a promotion, the hostname points to the region selected to be the new primary region. The old primary becomes a secondary region.
64-
- It is not possible to read or write on the secondary regions.
64+
- It isn't possible to read or write on the secondary regions.
6565
- Synchronous and asynchronous replication modes, further described [here](#replication-modes).
6666
- Customer-managed promotion from primary to secondary region, providing full ownership and visibility for outage resolution. Metrics are available, which can help to automate the promotion from customer side.
6767
- Secondary regions can be added or removed at the customer's discretion.
@@ -102,7 +102,7 @@ As such, it doesn’t have the absolute guarantee that all regions have the data
102102
The replication mode can be changed after configuring Geo-Replication. You can go from synchronous to asynchronous or from asynchronous to synchronous. If you go from asynchronous to synchronous, your secondary will be configured as synchronous after lag reaches zero. If you're running with a continual lag for whatever reason, then you may need to pause your publishers in order for lag to reach zero and your mode to be able to switch to synchronous. The reasons to have synchronous replication enabled, instead of asynchronous replication, are tied to the importance of the data, specific business needs, or compliance reasons, rather than availability of your application.
103103

104104
> [!NOTE]
105-
> In case a secondary region lags or becomes unavailable, the application will no longer be able to replicate to this region and will start throttling once the replication lag is reached. To continue using the namespace in the primary location, the afflicted secondary region can be removed. If no more secondary regions are configured, the namespace will continue without Geo-Replication enabled. It is possible to add additional secondary regions at any time.
105+
> In case a secondary region lags or becomes unavailable, the application will no longer be able to replicate to this region and will start throttling once the replication lag is reached. To continue using the namespace in the primary location, the afflicted secondary region can be removed. If no more secondary regions are configured, the namespace will continue without Geo-Replication enabled. It's possible to add additional secondary regions at any time.
106106
107107
## Secondary region selection
108108

@@ -120,7 +120,9 @@ The Geo-Replication feature enables customers to configure a secondary region to
120120

121121
## Setup
122122

123-
The following section is an overview to set up the Geo-Replication feature on a new namespace.
123+
### Using Azure portal
124+
125+
The following section is an overview to set up the Geo-Replication feature on a new namespace through the Azure portal.
124126
> [!NOTE]
125127
> This experience might change during public preview. We'll update this document accordingly.
126128
@@ -130,6 +132,42 @@ The following section is an overview to set up the Geo-Replication feature on a
130132
1. Either check the **Synchronous replication** checkbox, or specify a value for the **Async Replication - Max Replication lag** value in seconds.
131133
:::image type="content" source="./media/service-bus-geo-replication/create-namespace-with-geo-replication.png" alt-text="Screenshot showing the Create Namespace experience with Geo-Replication enabled.":::
132134

135+
### Using Bicep template
136+
137+
To create a namespace with the Geo-Replication feature enabled, add the *geoDataReplication* properties section.
138+
139+
```bicep
param serviceBusName string
param primaryLocation string
param secondaryLocation string
param maxReplicationLagInSeconds int

resource sb 'Microsoft.ServiceBus/namespaces@2023-01-01-preview' = {
  name: serviceBusName
  location: primaryLocation
  // Geo-Replication is available for the Premium tier.
  sku: {
    name: 'Premium'
    tier: 'Premium'
    capacity: 1
  }
  properties: {
    geoDataReplication: {
      // Maximum replication lag, in seconds.
      maxReplicationLagDurationInSeconds: maxReplicationLagInSeconds
      // One primary region and one secondary region.
      locations: [
        {
          locationName: primaryLocation
          roleType: 'Primary'
        }
        {
          locationName: secondaryLocation
          roleType: 'Secondary'
        }
      ]
    }
  }
}
```
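
If you generate deployment input from a script, the same *geoDataReplication* section can be assembled as a plain dictionary. The following is a minimal sketch: the property names are copied from the template, while the function name, the validation rules, and the example region names and lag value are illustrative assumptions.

```python
import json


def build_geo_data_replication(primary, secondaries, max_lag_seconds):
    """Assemble a geoDataReplication section with the template's property names.

    The validation below is an assumption made for this sketch, not a
    documented service rule.
    """
    if not secondaries:
        raise ValueError("at least one secondary location is required")
    if primary in secondaries:
        raise ValueError("the primary location cannot also be a secondary")
    locations = [{"locationName": primary, "roleType": "Primary"}]
    locations += [
        {"locationName": name, "roleType": "Secondary"} for name in secondaries
    ]
    return {
        "maxReplicationLagDurationInSeconds": max_lag_seconds,
        "locations": locations,
    }


if __name__ == "__main__":
    # Example values only; replace with your own regions and lag.
    section = build_geo_data_replication("westeurope", ["northeurope"], 300)
    print(json.dumps({"geoDataReplication": section}, indent=2))
```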
170+
133171
## Management
134172

135173
Once you create a namespace with the Geo-Replication feature enabled, you can manage the feature from the **Replication (preview)** blade.
@@ -146,15 +184,11 @@ To remove a secondary region, click on the **...**-ellipsis next to the region,
146184

147185
### Promotion flow
148186

149-
A promotion is triggered manually by the customer (either explicitly through a command, or through client owned business logic that triggers the command) and never by Azure. It gives the customer full ownership and visibility for outage resolution on Azure's backbone. In the portal, click on the **Promote** icon, and follow the instructions in the pop-up blade to promote the region.
150-
151-
When choosing **Planned** promotion, the service waits to catch up the replication lag before initiating the promotion. On the other hand, when choosing **Forced** promotion, the service immediately initiates the promotion. The namespace will be placed in read-only mode from the time that a promotion is requested, until the time that the promotion has completed. It is possible to do a forced promotion at any time after a planned promotion has been initiated. This puts the user in control to expedite the promotion, when a planned failover takes longer than desired.
187+
A promotion is triggered manually by the customer (either explicitly through a command, or through client owned business logic that triggers the command) and never by Azure. It gives the customer full ownership and visibility for outage resolution on Azure's backbone. When choosing **Planned** promotion, the service waits to catch up the replication lag before initiating the promotion. On the other hand, when choosing **Forced** promotion, the service immediately initiates the promotion. The namespace will be placed in read-only mode from the time that a promotion is requested, until the time that the promotion has completed. It is possible to do a forced promotion at any time after a planned promotion has been initiated. This puts the user in control to expedite the promotion, when a planned failover takes longer than desired.
152188

153189
> [!IMPORTANT]
154190
> When using **Forced** promotion, any data that has not been replicated may be lost.
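
The option to escalate a planned promotion to a forced one can be expressed as a small decision function. The following is a minimal sketch, assuming your own tooling tracks whether the planned promotion has completed and how long it has been running. The names `Action` and `next_action` and the `max_wait_seconds` budget are hypothetical, not part of any Azure API.

```python
from enum import Enum


class Action(Enum):
    DONE = "done"    # the planned promotion has completed
    WAIT = "wait"    # keep waiting for the replication lag to be caught up
    FORCE = "force"  # escalate to a forced promotion (unreplicated data may be lost)


def next_action(planned_completed, elapsed_seconds, max_wait_seconds):
    """Decide what to do after a planned promotion has been initiated."""
    if planned_completed:
        return Action.DONE
    if elapsed_seconds >= max_wait_seconds:
        # The planned promotion is taking longer than the caller is willing to wait.
        return Action.FORCE
    return Action.WAIT


if __name__ == "__main__":
    # Example values only: 45 seconds elapsed against a 120-second budget.
    print(next_action(False, elapsed_seconds=45, max_wait_seconds=120))
```

Choose the wait budget with the data-loss warning above in mind: a shorter budget restores write availability sooner but increases the chance of forcing a promotion while data is still unreplicated.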
155191
156-
:::image type="content" source="./media/service-bus-geo-replication/promote-secondary-region.png" alt-text="Screenshot showing the flow to promote secondary region." lightbox="./media/service-bus-geo-replication/promote-secondary-region.png":::
157-
158192
After the promotion is initiated:
159193

160194
1. The hostname is updated to point to the secondary region, which can take up to a few minutes.
@@ -163,11 +197,25 @@ After the promotion is initiated:
163197
> ping *your-namespace-fully-qualified-name*
164198
165199
1. Clients automatically reconnect to the secondary region.
166-
:::image type="content" source="./media/service-bus-geo-replication/promotion-flow.png" alt-text="Screenshot of the portal showing the flow of promotion from primary to secondary region." lightbox="./media/service-bus-geo-replication/promotion-flow.png":::
167200

201+
:::image type="content" source="./media/service-bus-geo-replication/promotion-flow.png" alt-text="Screenshot of the portal showing the flow of promotion from primary to secondary region." lightbox="./media/service-bus-geo-replication/promotion-flow.png":::
168202

169203
You can automate promotion either with monitoring systems, or with custom-built monitoring solutions. However, such automation takes extra planning and work, which is out of the scope of this article.
170204

205+
### Using Azure portal
206+
207+
In the portal, click on the **Promote** icon, and follow the instructions in the pop-up blade to promote the region.
208+
209+
:::image type="content" source="./media/service-bus-geo-replication/promote-secondary-region.png" alt-text="Screenshot showing the flow to promote secondary region." lightbox="./media/service-bus-geo-replication/promote-secondary-region.png":::
210+
211+
### Using Azure CLI
212+
213+
Execute the Azure CLI command to initiate the promotion. The **Force** property is optional, and defaults to **false**.
214+
215+
```azurecli
az rest --method post --url "https://management.azure.com/subscriptions/<subscriptionId>/resourceGroups/<resourceGroup>/providers/Microsoft.ServiceBus/namespaces/<namespaceName>/failover?api-version=2023-01-01-preview" --body "{'properties': {'PrimaryLocation': '<newPrimaryLocation>', 'api-version':'2023-01-01-preview', 'Force':'false'}}"
```
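
When the promotion is triggered from a script, it can help to build the request URL and body in one place. The following is a minimal sketch that mirrors the URL and body fields of the command above, including the string value for **Force**. The function name and parameters are illustrative assumptions, and the sketch only builds the request; it doesn't send it.

```python
import json

API_VERSION = "2023-01-01-preview"


def build_failover_request(subscription_id, resource_group, namespace_name,
                           new_primary_location, force=False):
    """Return (url, body) for the failover call shown in the CLI command."""
    url = (
        "https://management.azure.com"
        f"/subscriptions/{subscription_id}"
        f"/resourceGroups/{resource_group}"
        f"/providers/Microsoft.ServiceBus/namespaces/{namespace_name}"
        f"/failover?api-version={API_VERSION}"
    )
    body = {
        "properties": {
            "PrimaryLocation": new_primary_location,
            "api-version": API_VERSION,
            # Mirrors the command: Force is passed as the string 'true' or 'false'.
            "Force": "true" if force else "false",
        }
    }
    return url, json.dumps(body)


if __name__ == "__main__":
    # Placeholder values only; substitute your own identifiers.
    url, body = build_failover_request("<subscriptionId>", "<resourceGroup>",
                                       "<namespaceName>", "<newPrimaryLocation>")
    print(url)
    print(body)
```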
218+
171219
### Monitoring data replication
172220
Users can monitor the progress of the replication job by monitoring the replication lag metric in Log Analytics.
173221
- Enable Metrics logs in your Service Bus namespace as described at [Monitor Azure Service Bus](/azure/service-bus-messaging/monitor-service-bus).
@@ -199,7 +247,7 @@ Note the following considerations to keep in mind with this release:
199247
- Promoting a complex distributed infrastructure should be [rehearsed](/azure/architecture/reliability/disaster-recovery#disaster-recovery-plan) at least once.
200248

201249
## Pricing
202-
The Premium tier for Service Bus is priced per [Messaging Unit](service-bus-premium-messaging.md#how-many-messaging-units-are-needed). With the Geo-Replication feature, secondary regions run on the same number of MUs as the primary region, and the pricing is calculated over the total number of MUs. Additionally, there is a charge based on the published bandwidth times the number of secondary regions. During the early public preview, this charge is waived.
250+
The Premium tier for Service Bus is priced per [Messaging Unit](service-bus-premium-messaging.md#how-many-messaging-units-are-needed). With the Geo-Replication feature, secondary regions run on the same number of MUs as the primary region, and the pricing is calculated over the total number of MUs. Additionally, there's a charge based on the published bandwidth times the number of secondary regions. During the early public preview, this charge is waived.
203251

204252
## Next steps
205253
