Commit 18f9a68

author
Jill Grant
authored
Merge pull request #292924 from flang-msft/fxl----Enterprise]-Geo--replication-healthy-metric-documentation--30414607ado--
fxl----Enterprise-Geo--replication-healthy-metric-documentation--ado-30414607
2 parents 3f26b07 + 15d5c3f commit 18f9a68

2 files changed

+104
-20
lines changed

articles/azure-cache-for-redis/cache-how-to-active-geo-replication.md

Lines changed: 38 additions & 3 deletions
@@ -3,7 +3,7 @@ title: Configure active geo-replication for Enterprise Azure Cache for Redis ins
33
description: Learn how to replicate your Azure Cache for Redis Enterprise instances across Azure regions.
44
ms.custom: devx-track-azurecli, ignite-2024
55
ms.topic: conceptual
6-
ms.date: 11/11/2024
6+
ms.date: 01/10/2025
77
---
88

99
# Configure active geo-replication for Enterprise Azure Cache for Redis instances
@@ -62,7 +62,7 @@ To remove a cache instance from an active geo-replication group, you just delete
6262

6363
In case one of the caches in your replication group is unavailable due to region outage, you can forcefully remove the unavailable cache from the replication group. After you apply **Force-unlink** to a cache, you can't sync any data that is written to that cache back to the replication group after force-unlinking.
6464

65-
You should remove the unavailable cache because the remaining caches in the replication group start storing the metadata that hasn’t been shared to the unavailable cache. When this happens, the available caches in your replication group might run out of memory.
65+
You should remove the unavailable cache because the remaining caches in the replication group start storing the metadata that wasn't shared to the unavailable cache. When this happens, the available caches in your replication group might run out of memory.
6666

6767
1. Go to Azure portal and select one of the caches in the replication group that is still available.
6868

@@ -154,7 +154,7 @@ Let's say you want to scale up each instance in this geo-replication group to an
154154

155155
At this point, the `Redis01` and `Redis02` instances can only scale up to an Enterprise E20 instance. All other scaling operations are blocked.
156156
>[!NOTE]
157-
> The `Redis00` instance is not blocked from scaling further at this point. But it will be blocked once either `Redis01` or `Redis02` is scaled to be an Enterprise E20.
157+
> The `Redis00` instance isn't blocked from scaling further at this point. But it's blocked once either `Redis01` or `Redis02` is scaled to be an Enterprise E20.
158158
>
159159
160160
Once each instance is scaled to the same tier and size, all scaling locks are removed:
@@ -169,6 +169,41 @@ Due to the potential for inadvertent data loss, you can't use the `FLUSHALL` and
169169

170170
:::image type="content" source="media/cache-how-to-active-geo-replication/cache-active-flush.png" alt-text="Screenshot showing Active geo-replication selected in the Resource menu and the Flush cache feature has a red box around it.":::
171171

172+
## Geo-replication metric
173+
174+
The _Geo Replication Healthy_ metric in the Enterprise tier of Azure Cache for Redis helps monitor the health of geo-replicated clusters. You use this metric to monitor the sync status among geo-replicas.
175+
176+
To monitor the _Geo Replication Healthy_ metric in the Azure portal:
177+
178+
1. Open the Azure portal and select your Azure Cache for Redis instance.
179+
180+
1. On the Resource menu, select **Metrics** under the **Monitoring** section.
181+
182+
1. Select **Add Metric** and select the **Geo Replication Healthy** metric.
183+
184+
1. If needed, apply filters for specific geo-replicas.
185+
186+
1. You can configure an alert to notify you if the **Geo Replication Healthy** metric emits an unhealthy value (0) continuously for over 60 minutes.
187+
188+
1. Select **New Alert Rule**.
189+
190+
1. Define the condition so that the alert triggers when the metric value is 0 for at least 60 minutes, which is the recommended window.
191+
192+
1. Add action groups for notifications, for example: email, SMS, and others.
193+
194+
1. Save the alert.
195+
196+
1. For more information on how to set up alerts for your Redis Enterprise cache, see the alert section in [Monitor Redis Caches](/azure/azure-cache-for-redis/monitor-cache?tabs=enterprise-enterprise-flash).
197+
198+
> [!IMPORTANT]
199+
> This metric might temporarily show as unhealthy due to routine operations like maintenance events or scaling, initiated either by Azure or the customer. To avoid false alarms, we recommend an observation window of 60 minutes. Generate an alert only if the metric stays unhealthy for the entire window, because a sustained unhealthy value might indicate a problem that requires intervention.
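
The 60-minute observation window can be illustrated with a minimal Python sketch. It isn't how Azure Monitor evaluates alert rules. The `should_alert` function, the sample data, and the five-minute sampling interval are all hypothetical, and the sketch assumes that any nonzero value means healthy.

```python
from datetime import datetime, timedelta


def should_alert(samples, window=timedelta(minutes=60)):
    """Return True if the metric stayed at 0 continuously for at least `window`.

    `samples` is a list of (timestamp, value) pairs sorted by time.
    Assumption: 0 means unhealthy and any other value means healthy.
    """
    unhealthy_since = None
    for timestamp, value in samples:
        if value == 0:
            if unhealthy_since is None:
                unhealthy_since = timestamp  # start of the current unhealthy streak
            if timestamp - unhealthy_since >= window:
                return True
        else:
            unhealthy_since = None  # a healthy sample resets the streak
    return False


# Hypothetical samples taken every five minutes.
start = datetime(2025, 1, 10, 12, 0)
# A 15-minute dip, such as a maintenance event, followed by recovery.
blip = [(start + timedelta(minutes=5 * i), 0 if i < 4 else 1) for i in range(13)]
# A sustained outage: unhealthy for the full 60 minutes.
outage = [(start + timedelta(minutes=5 * i), 0) for i in range(13)]

print(should_alert(blip))    # False
print(should_alert(outage))  # True
```

A short dip never reaches the window, so only the sustained outage produces an alert.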
200+
201+
## Common client-side issues that can cause sync issues among geo-replicas
202+
203+
- Use of custom hash tags - Using custom hash tags in Redis can lead to uneven distribution of data across shards, which might cause performance issues and synchronization problems in geo-replicas. Avoid custom hash tags unless the database needs to perform multi-key operations.
204+
205+
- Large key size - Large keys can create synchronization issues among geo-replicas. To maintain smooth performance and reliable replication, we recommend keeping key sizes under 500 MB when using geo-replication. If an individual key's size approaches 2 GB, the cache faces geo-replication health issues.
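
To see why hash tags can skew data distribution, here's a minimal Python sketch of the key-to-slot rule from the open-source Redis Cluster specification (CRC16 of the key, modulo 16384, hashing only the hash tag when one is present). It's an illustration only: the sharding internals of an Enterprise cache might differ, and the helper names and key patterns are hypothetical.

```python
import binascii

SLOT_COUNT = 16384  # hash slot count in open-source Redis Cluster


def hashed_part(key: bytes) -> bytes:
    """Return the bytes that are hashed: the hash tag if present, else the whole key."""
    start = key.find(b"{")
    if start != -1:
        end = key.find(b"}", start + 1)
        if end > start + 1:  # a closing brace exists and the tag isn't empty
            return key[start + 1:end]
    return key


def key_slot(key: str) -> int:
    """Map a key to a slot by using CRC16 (XMODEM variant) modulo the slot count."""
    return binascii.crc_hqx(hashed_part(key.encode()), 0) % SLOT_COUNT


# Hypothetical key patterns: 1,000 keys that share a tag versus 1,000 untagged keys.
tagged = {key_slot(f"{{tenant42}}:item:{i}") for i in range(1000)}
untagged = {key_slot(f"item:{i}") for i in range(1000)}

print(len(tagged))    # 1, because every tagged key maps to the same slot
print(len(untagged))  # untagged keys spread across many slots
```

Because every key that shares a tag maps to one slot, a heavily used tag concentrates data and traffic on a single shard.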
206+
172207
### Flush caches using Azure CLI or PowerShell
173208

174209
The Azure CLI and PowerShell can also be used to trigger a flush operation. For more information on using Azure CLI, see [az redisenterprise database flush](/cli/azure/redisenterprise#az-redisenterprise-database-flush). For more information on using PowerShell, see [Invoke-AzRedisEnterpriseCacheDatabaseFlush](/powershell/module/az.redisenterprisecache/invoke-azredisenterprisecachedatabaseflush).
