articles/advisor/advisor-reference-reliability-recommendations.md
71 additions & 9 deletions
@@ -92,7 +92,7 @@ Learn more about [App Service Certificate - ASCDomainVerificationRequired (Domai

 ## Cache

-### Availability may be impacted from high memory fragmentation. Increase fragmentation memory reservation to avoid potential impact.
+### Availability may be impacted from high memory fragmentation. Increase fragmentation memory reservation to avoid potential impact

 Fragmentation and memory pressure can cause availability incidents during a failover or management operations. Increasing reservation of memory for fragmentation helps in reducing the cache failures when running under high memory pressure. Memory for fragmentation can be increased via maxfragmentationmemory-reserved setting available in advanced settings blade.

@@ -247,7 +247,7 @@ Virtual machines in an Availability Set with disks that share either storage acc

 Learn more about [Availability set - ManagedDisksAvSet (Use Managed Disks to improve data reliability)](https://aka.ms/aa_avset_manageddisk_learnmore).

-### Check Point Virtual Machine may lose Network Connectivity.
+### Check Point Virtual Machine may lose Network Connectivity

 We have identified that your Virtual Machine might be running a version of Check Point image that has been known to lose network connectivity in the event of a platform servicing operation. We recommend that you upgrade to a newer version of the image. Contact Check Point for further instructions on how to upgrade your image.

@@ -265,6 +265,26 @@ In order for a session host to deploy and register to Azure Virtual Desktop prop

 Learn more about [Virtual machine - SessionHostNeedsAssistanceForUrlCheck (Access to mandatory URLs missing for your Azure Virtual Desktop environment)](../virtual-desktop/safe-url-list.md).

+### Clusters having node pools using non-recommended B-Series
+
+The cluster has one or more node pools that use a non-recommended burstable VM SKU. With burstable VMs, 100% of the vCPU capacity is not guaranteed. Make sure B-series VMs are not used in a production environment.
+
+Learn more about [Kubernetes service - ClustersUsingBSeriesVMs (Clusters having node pools using non-recommended B-Series)](/azure/virtual-machines/sizes-b-series-burstable).
+
+## MySQL
+
+### Replication - Add a primary key to the table that currently does not have one
+
+Based on our internal monitoring, we have observed significant replication lag on your replica server. This lag is occurring because the replica server is replaying relay logs on a table that lacks a primary key. To ensure that the replica server can effectively synchronize with the primary and keep up with changes, we highly recommend adding primary keys to the tables in the primary server and subsequently recreating the replica server.
+
+Learn more about [Azure Database for MySQL flexible server - MySqlFlexibleServerReplicaMissingPKfb41 (Replication - Add a primary key to the table that currently does not have one)](/azure/mysql/how-to-troubleshoot-replication-latency#no-primary-key-or-unique-key-on-a-table).
+
+### High Availability - Add primary key to the table that currently does not have one
+
+Our internal monitoring system has identified significant replication lag on the High Availability standby server. This lag is primarily caused by the standby server replaying relay logs on a table that lacks a primary key. To address this issue and adhere to best practices, it is recommended to add primary keys to all tables. Once this is done, proceed to disable and then re-enable High Availability to mitigate the problem.
+
+Learn more about [Azure Database for MySQL flexible server - MySqlFlexibleServerHAMissingPKcf38 (High Availability - Add primary key to the table that currently does not have one.)](/azure/mysql/how-to-troubleshoot-replication-latency#no-primary-key-or-unique-key-on-a-table).
+
 ## PostgreSQL

 ### Improve PostgreSQL availability by removing inactive logical replication slots
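For the two MySQL recommendations in the hunk above (add primary keys to tables that lack one), the following is a minimal Python sketch of how the affected tables might be found and fixed. It is not part of the changed file. The query uses the standard `information_schema.tables` and `information_schema.table_constraints` views. The helper `add_pk_statements` and the surrogate column name `id` are hypothetical, and adding an auto-increment surrogate key is only one possible choice; a natural key may suit a given table better.

```python
# Query for MySQL's information_schema: base tables with no PRIMARY KEY constraint.
# The excluded schemas are MySQL's built-in system schemas.
TABLES_WITHOUT_PK_SQL = """
SELECT t.table_schema, t.table_name
FROM information_schema.tables AS t
LEFT JOIN information_schema.table_constraints AS c
  ON c.table_schema = t.table_schema
 AND c.table_name = t.table_name
 AND c.constraint_type = 'PRIMARY KEY'
WHERE t.table_type = 'BASE TABLE'
  AND c.constraint_name IS NULL
  AND t.table_schema NOT IN
      ('mysql', 'sys', 'information_schema', 'performance_schema')
"""


def add_pk_statements(tables, column="id"):
    """Build one ALTER TABLE statement per (schema, table) pair.

    Assumption: a surrogate auto-increment column is acceptable for the
    table. The column name is hypothetical and must not already exist.
    """
    return [
        f"ALTER TABLE `{schema}`.`{table}` "
        f"ADD COLUMN `{column}` BIGINT UNSIGNED NOT NULL AUTO_INCREMENT PRIMARY KEY"
        for schema, table in tables
    ]


if __name__ == "__main__":
    # Rows as they might come back from running TABLES_WITHOUT_PK_SQL.
    for statement in add_pk_statements([("shop", "orders"), ("shop", "audit_log")]):
        print(statement + ";")
```

The sketch only generates statements. Run them on the primary server, then recreate the replica or disable and re-enable High Availability as the recommendation text describes.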
@@ -275,7 +295,7 @@ Learn more about [PostgreSQL server - OrcasPostgreSqlLogicalReplicationSlots (Im

 ### Improve PostgreSQL availability by removing inactive logical replication slots

-Our internal telemetry indicates that your PostgreSQL flexible server may have inactive logical replication slots. THIS NEEDS IMMEDIATE ATTENTION. This can result in degraded server performance and unavailability due to WAL file retention and buildup of snapshot files. To improve performance and availability, we STRONGLY recommend that you IMMEDIATELY either delete the inactive replication slots, or start consuming the changes from these slots so that the slots' Log Sequence Number (LSN) advances and is close to the current LSN of the server.
+Our internal telemetry indicates that your PostgreSQL flexible server may have inactive logical replication slots. THIS NEEDS IMMEDIATE ATTENTION. Inactive logical replication slots can result in degraded server performance and unavailability due to WAL file retention and buildup of snapshot files. To improve performance and availability, we STRONGLY recommend that you IMMEDIATELY either delete the inactive replication slots, or start consuming the changes from these slots so that the slots' Log Sequence Number (LSN) advances and is close to the current LSN of the server.

 Learn more about [Azure Database for PostgreSQL flexible server - OrcasPostgreSqlFlexibleServerLogicalReplicationSlots (Improve PostgreSQL availability by removing inactive logical replication slots)](https://aka.ms/azure_postgresql_flexible_server_logical_decoding).

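The "delete the inactive replication slots" step in the PostgreSQL recommendation above can be sketched as follows. This is an illustration, not text from the changed file. It relies on the `pg_replication_slots` system view and the `pg_drop_replication_slot` function, both standard in PostgreSQL. The helper name `drop_slot_statements` and the sample slot names are hypothetical.

```python
# Lists logical slots that no consumer is currently attached to.
INACTIVE_SLOTS_SQL = """
SELECT slot_name, slot_type, active
FROM pg_replication_slots
WHERE slot_type = 'logical' AND NOT active
"""


def drop_slot_statements(slots):
    """Return a drop statement for each inactive logical slot.

    `slots` is an iterable of (slot_name, slot_type, active) rows, for
    example the result of a query against pg_replication_slots. Active
    slots and physical slots are left alone.
    """
    return [
        # Single quotes in the slot name are doubled to keep the literal valid.
        "SELECT pg_drop_replication_slot('{}');".format(name.replace("'", "''"))
        for name, slot_type, active in slots
        if slot_type == "logical" and not active
    ]


if __name__ == "__main__":
    rows = [
        ("cdc_orders", "logical", False),    # inactive: candidate for removal
        ("cdc_billing", "logical", True),    # still being consumed
        ("standby_1", "physical", False),    # not a logical slot
    ]
    for statement in drop_slot_statements(rows):
        print(statement)
```

Dropping a slot discards the changes it retained, so review the generated statements before running them. If a consumer is expected to resume, restarting it so the slot's LSN advances is the alternative the recommendation names.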
@@ -287,6 +307,36 @@ Some or all of your devices are using outdated SDK and we recommend you upgrade

 Learn more about [IoT hub - UpgradeDeviceClientSdk (Upgrade device client SDK to a supported version for IotHub)](https://aka.ms/iothubsdk).

+### IoT Hub Potential Device Storm Detected
+
+A device storm occurs when two or more devices try to connect to the IoT Hub using the same device ID credentials. When the second device (B) connects, it causes the first one (A) to become disconnected. Then (A) attempts to reconnect, which causes (B) to get disconnected.
+
+Learn more about [IoT hub - IoTHubDeviceStorm (IoT Hub Potential Device Storm Detected)](https://aka.ms/IotHubDeviceStorm).
+
+### Upgrade Device Update for IoT Hub SDK to a supported version
+
+Your Device Update for IoT Hub Instance is using an outdated version of the SDK. We recommend upgrading to the latest version for the latest fixes, performance improvements, and new feature capabilities.
+
+Learn more about [IoT hub - DU_SDK_Advisor_Recommendation (Upgrade Device Update for IoT Hub SDK to a supported version)](/azure/iot-hub-device-update/understand-device-update).
+
+### IoT Hub Quota Exceeded Detected
+
+We have detected that your IoT Hub has exceeded its daily message quota. Consider adding units or increasing the SKU level to prevent this in the future.
+
+Learn more about [IoT hub - IoTHubQuotaExceededAdvisor (IoT Hub Quota Exceeded Detected)](/azure/iot-hub/troubleshoot-error-codes#403002-iothubquotaexceeded).
+
+### Upgrade device client SDK to a supported version for IotHub
+
+Some or all of your devices are using an outdated SDK and we recommend you upgrade to a supported version of SDK. See the details in the recommendation.
+
+Learn more about [IoT hub - UpgradeDeviceClientSdk (Upgrade device client SDK to a supported version for IotHub)](https://aka.ms/iothubsdk).
+
+### Upgrade Edge Device Runtime to a supported version for Iot Hub
+
+Some or all of your Edge devices are using outdated versions and we recommend you upgrade to the latest supported version of the runtime. See the details in the recommendation.
+
+Learn more about [IoT hub - UpgradeEdgeSdk (Upgrade Edge Device Runtime to a supported version for Iot Hub)](https://aka.ms/IOTEdgeSDKCheck).
+
 ## Azure Cosmos DB

 ### Configure Consistent indexing mode on your Azure Cosmos DB container
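The disconnect loop described in the "IoT Hub Potential Device Storm Detected" entry above can be illustrated with a small simulation. This is a toy model written for this note, not the Azure IoT SDK: the `Hub` class, `simulate_storm`, and the device ID `sensor-01` are all hypothetical. It assumes only what the entry states, namely that a new connection with a given device ID displaces the existing one.

```python
class Hub:
    """Toy hub that allows a single live connection per device ID."""

    def __init__(self):
        self.active = {}       # device_id -> name of the connected client
        self.disconnects = 0   # number of forced disconnects so far

    def connect(self, device_id, client):
        """Connect `client`; return the client it displaced, if any."""
        previous = self.active.get(device_id)
        self.active[device_id] = client
        if previous is not None and previous != client:
            self.disconnects += 1
            return previous
        return None


def simulate_storm(rounds):
    """Clients A and B share one device ID and keep reconnecting."""
    hub = Hub()
    hub.connect("sensor-01", "A")
    clients = ["B", "A"]
    for i in range(rounds):
        # Whichever client was just displaced reconnects and displaces the other.
        hub.connect("sensor-01", clients[i % 2])
    return hub.disconnects


if __name__ == "__main__":
    print("forced disconnects with a shared ID:", simulate_storm(10))
```

In this model every reconnect attempt produces one forced disconnect, so the count grows without bound until the two clients stop sharing an identity. Giving each client its own device ID removes the displacement entirely.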
@@ -432,6 +482,12 @@ Learn more about [Machine - Azure Arc - ArcServerAgentVersion (Upgrade to the la

 ## Kubernetes

+### Upgrade to Standard tier for mission-critical and production clusters
+
+This cluster has more than 10 nodes and has not enabled the Standard tier. The Kubernetes Control Plane on the Free tier comes with limited resources and is not intended for production use or any cluster with 10 or more nodes.
+
+Learn more about [Kubernetes service - UseStandardpricingtier (Upgrade to Standard tier for mission-critical and production clusters)](/azure/aks/uptime-sla).
+
 ### Pod Disruption Budgets Recommended

 Pod Disruption Budgets Recommended. Improve service high availability.
@@ -446,7 +502,7 @@ Learn more about [Kubernetes - Azure Arc - Arc-enabled K8s agent version upgrade

 ## Media Services

-### Increase Media Services quotas or limits to ensure continuity of service.
+### Increase Media Services quotas or limits to ensure continuity of service

 Your media account is about to hit its quota limits. Review current usage of Assets, Content Key Policies and Stream Policies for the media account. To avoid any disruption of service, you should request quota limits to be increased for the entities that are closer to hitting quota limit. You can request quota limits to be increased by opening a ticket and adding relevant details to it. Don't create additional Azure Media accounts in an attempt to obtain higher limits.

@@ -551,26 +607,32 @@ Learn more about [Recovery Services vault - Enable CRR (Enable Cross Region Rest

 ## Search

-### You are close to exceeding storage quota of 2GB. Create a Standard search service.
+### You are close to exceeding storage quota of 2GB. Create a Standard search service

 You're close to exceeding storage quota of 2GB. Create a Standard search service. Indexing operations stop working when storage quota is exceeded.

 Learn more about [Service limits in Azure Cognitive Search](/azure/search/search-limits-quotas-capacity).

-### You are close to exceeding storage quota of 50MB. Create a Basic or Standard search service.
+### You are close to exceeding storage quota of 50MB. Create a Basic or Standard search service

 You're close to exceeding storage quota of 50MB. Create a Basic or Standard search service. Indexing operations stop working when storage quota is exceeded.

 Learn more about [Service limits in Azure Cognitive Search](/azure/search/search-limits-quotas-capacity).

-### You are close to exceeding your available storage quota. Add additional partitions if you need more storage.
+### You are close to exceeding your available storage quota. Add additional partitions if you need more storage

 You're close to exceeding your available storage quota. Add additional partitions if you need more storage. After exceeding storage quota, you can still query, but indexing operations no longer work.

 Learn more about [Service limits in Azure Cognitive Search](/azure/search/search-limits-quotas-capacity).

 ## Storage

+### You have ADLS Gen1 Accounts Which Need to be Migrated to ADLS Gen2
+
+As previously announced, Azure Data Lake Storage Gen1 will be retired on February 29, 2024. We highly recommend migrating your data lake to Azure Data Lake Storage Gen2, which offers advanced capabilities specifically designed for big data analytics and is built on top of Azure Blob Storage.
+
+Learn more about [Data lake store account - ADLSGen1_Deprecation (You have ADLS Gen1 Accounts Which Need to be Migrated to ADLS Gen2)](https://azure.microsoft.com/updates/action-required-switch-to-azure-data-lake-storage-gen2-by-29-february-2024/).
+
 ### Enable Soft Delete to protect your blob data

 After enabling Soft Delete, deleted data transitions to a soft deleted state instead of being permanently deleted. When data is overwritten, a soft deleted snapshot is generated to save the state of the overwritten data. You can configure the amount of time soft deleted data is recoverable before it permanently expires.
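The Soft Delete behavior described in the entry above (deletes and overwrites keep a recoverable copy until a configurable retention period ends) can be modeled in a few lines. This is a toy in-memory sketch for illustration, not the Azure Storage API: the class `SoftDeleteContainer`, its methods, and the 7-day retention value are hypothetical.

```python
from datetime import datetime, timedelta


class SoftDeleteContainer:
    """In-memory model of a blob container with soft delete enabled."""

    def __init__(self, retention_days):
        self.retention = timedelta(days=retention_days)
        self.live = {}           # blob name -> current data
        self.soft_deleted = []   # (name, data, expires_at) entries

    def _retain(self, name, now):
        # Keep the current state of the blob until the retention window ends.
        self.soft_deleted.append((name, self.live[name], now + self.retention))

    def put(self, name, data, now):
        # Overwriting an existing blob first saves a soft deleted snapshot.
        if name in self.live:
            self._retain(name, now)
        self.live[name] = data

    def delete(self, name, now):
        # Deleting moves the blob to the soft deleted state.
        self._retain(name, now)
        del self.live[name]

    def recoverable(self, name, now):
        # Only entries whose retention has not yet expired can be restored.
        return [data for n, data, expires in self.soft_deleted
                if n == name and expires > now]


if __name__ == "__main__":
    start = datetime(2024, 1, 1)
    container = SoftDeleteContainer(retention_days=7)
    container.put("report.csv", b"v1", start)
    container.put("report.csv", b"v2", start)   # overwrite keeps v1
    container.delete("report.csv", start)       # delete keeps v2
    print(container.recoverable("report.csv", start + timedelta(days=1)))
    print(container.recoverable("report.csv", start + timedelta(days=8)))
```

The model shows why the retention period matters: once it passes, neither the overwritten nor the deleted state can be recovered.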
@@ -627,13 +689,13 @@ You have deployed your application multiple times over the last week. Deployment

 Learn more about [App service - AppServiceStandardOrHigher (Move your App Service resource to Standard or higher and use deployment slots)](https://aka.ms/ant-staging).

-### Consider scaling out your App Service Plan to optimize user experience and availability.
+### Consider scaling out your App Service Plan to optimize user experience and availability

 Consider scaling out your App Service Plan to at least two instances to avoid cold start delays and service interruptions during routine maintenance.

 Learn more about [App Service plan - AppServiceNumberOfInstances (Consider scaling out your App Service Plan to optimize user experience and availability.)](https://aka.ms/appsvcnuminstances).

-### Consider upgrading the hosting plan of the Static Web App(s) in this subscription to Standard SKU.
+### Consider upgrading the hosting plan of the Static Web App(s) in this subscription to Standard SKU

 The combined bandwidth used by all the Free SKU Static Web Apps in this subscription is exceeding the monthly limit of 100GB. Consider upgrading these apps to Standard SKU to avoid throttling.