learn-pr/azure/intro-to-azure-incident-readiness/includes/3-prepare-for-unexpected.md
8 additions & 8 deletions
@@ -1,4 +1,4 @@
-
To ensure preparedness and minimize the impact of incidents, it's essential to follow the proactive recommendations outlined in this unit. These actions help you understand our incident communication process, locate pertinent information, and configure notifications to receive timely updates. Evaluating the resilience of your applications and implementing recommended measures create more reliable workloads, which reduces the potential impact of an incident. Reviewing and implementing security best practices fortify your environment and mitigate risks.
+
To ensure preparedness and minimize the impact of incidents, it's essential to follow the proactive recommendations outlined in this unit. These actions help you understand our incident communication process, locate pertinent information, and configure notifications to receive timely updates. Evaluating the resilience of your applications and implementing recommended measures create more reliable workloads, which reduce the potential impact of an incident. Reviewing and implementing security best practices fortify your environment and mitigate risks.
## To stay informed, mitigate impact, and protect your investment, we recommend the following five actions:
@@ -28,13 +28,13 @@ Key Features of the Services Issues pane:
- **Links and Downloadable Explanations**: Generate a link for the issue to use in your problem management system. Download PDF or CSV files to share comprehensive explanations with stakeholders who don't have access to the Azure portal. You can request a Post Incident Review (PIR) for any issues that affected your resources, previously known as Root Cause Analyses (RCAs).
-
#### Security Advisories pane
+
#### Security advisories pane
The Security advisories pane focuses on urgent security-related information affecting the health of your subscriptions and tenants. It provides insights into platform vulnerabilities, security incidents, and privacy breaches.
:::image type="content" source="../media/azure-service-health-security-advisories.png" alt-text="Screenshot of Azure Service Health security advisories.":::
-
Key Features of the Security Advisories Pane:
+
Key Features of the Security advisories pane:
- **Real-Time Security Insights**: Gain immediate visibility into Azure security incidents relevant to your subscriptions and tenants.
@@ -111,7 +111,7 @@ By configuring Service Health alerts and action groups effectively, you can ensu
> [!NOTE]
>
-
> Looking for assistance in what to monitor and which alerts you should configure for what? Look no further than the [Azure Monitor Baseline Alerts](https://aka.ms/alz/monitor/repo) solution. It provides comprehensive guidance and code for implementing a baseline of platform alerts and service health alert by using policies and initiatives in Azure environments. It offers options for automated or manual deployment.
+
> Looking for assistance in what to monitor and which alerts you should configure for what? Look no further than the [Azure Monitor Baseline Alerts](https://aka.ms/alz/monitor/repo) solution. It provides comprehensive guidance and code for implementing a baseline of platform alerts and service health alerts by using policies and initiatives in Azure environments. It offers options for automated or manual deployment.
>
> The solution includes predefined policies to automatically create alerts for all service health event types (service issue, planned maintenance, health advisories, & security advisories), action groups, and alert processing rules for various Azure resource types. While the focus is on monitoring Azure Landing Zones (ALZ) architected environments, it also offers guidance for brownfield customers who aren't currently aligned to the ALZ architecture brownfield.
@@ -129,7 +129,7 @@ You can also create [resource health alerts programmatically](/azure/service-hea
#### Scheduled events for virtual machines, avoiding impact
-
[Scheduled events](/azure/virtual-machines/linux/scheduled-events) is another great tool. Both alerts types decribed previously notify people or systems, but scheduled events notify the resources themselves. This approach can give your application time to prepare for virtual machine maintenance or one of our automated service healing events. It provides a signal about an imminent maintenance event, for example, an upcoming reboot, so that your application can know that and then act to limit disruption. Your application might drop itself out of the pool or otherwise degrade gracefully. Scheduled events are available for all Azure Virtual Machine types including PaaS and IaaS on both Windows and Linux.
+
[Scheduled events](/azure/virtual-machines/linux/scheduled-events) is another great tool. Both alerts types described previously notify people or systems, but scheduled events notify the resources themselves. This approach can give your application time to prepare for virtual machine maintenance or one of our automated service healing events. It provides a signal about an imminent maintenance event, for example, an upcoming reboot, so that your application can know that and then act to limit disruption. Your application might drop itself out of the pool or otherwise degrade gracefully. Scheduled events are available for all Azure Virtual Machine types including PaaS and IaaS on both Windows and Linux.
> [!NOTE]
>
@@ -161,19 +161,19 @@ To complement your work with the WAF, consider implementing the following top re
- Use the integrated [Reliability workbook](https://ms.portal.azure.com/#view/Microsoft_Azure_Expert/AdvisorMenuBlade/%7E/workbooks) in the Azure portal under the Azure Advisor page to assess the reliability posture of your applications, identify potential risks, and plan and implement improvements.
-
- Enhance business continuity and disaster recovery (BCDR) by deploying your workloads and resources across multiple regions. Ror optimal cross-region deployment options, see the comprehensive list of [Azure region pairs](/azure/reliability/cross-region-replication-azure#azure-cross-region-replication-pairings-for-all-geographies).
+
- Enhance business continuity and disaster recovery (BCDR) by deploying your workloads and resources across multiple regions. For optimal cross-region deployment options, see the comprehensive list of [Azure region pairs](/azure/reliability/cross-region-replication-azure#azure-cross-region-replication-pairings-for-all-geographies).
- Maximize availability within a region by distributing workload/resource deployments across [Availability Zones](/azure/reliability/availability-zones-overview).
- Consider using isolated virtual machine sizes in Azure for your business-critical workloads that require a high level of isolation. These sizes guarantee that your virtual machine is dedicated to a specific hardware type and operates independently. For more information, see [Virtual machine isolation in Azure](/azure/virtual-machines/isolation).
-
- Consider using [Maintenance Configurations](/azure/virtual-machines/maintenance-configurations#scopes) to have better control and management over updates for your Azure virtual machines. This feature allows you to schedule and manage updates, which ensures minimal disruption to sensitive workloads that can't tolerate downtime during maintenance activities.
+
- Consider using [Maintenance Configurations](/azure/virtual-machines/maintenance-configurations#scopes) to have better control and management over updates for your Azure virtual machines. This feature allows you to schedule and manage updates, which ensure minimal disruption to sensitive workloads that can't tolerate downtime during maintenance activities.
- Enhance redundancy by implementing inter or intra-region redundancy. For guidance, see the example of a [Highly available zone-redundant web application](/azure/architecture/reference-architectures/app-service-web-app/zone-redundant).
- Enhance the resilience of your applications by using [Azure Chaos Studio](https://azure.microsoft.com/products/chaos-studio/). With this tool, you can deliberately introduce controlled faults to your Azure applications. This tool allows you to assess their resilience and observe how they respond to various disruptions such as network latency, storage outages, expiring secrets, and datacenter failures.
-
- Use the [Service Retirement workbook](/azure/advisor/advisor-how-to-plan-migration-workloads-service-retirement) available in the Azure portal under the Azure Advisor page. This integrated tool helps you stay informed about any service retirements that might affect your critical workloads, which enables you to effectively plan and execute necessary migrations.
+
- Use the [Service Retirement workbook](/azure/advisor/advisor-how-to-plan-migration-workloads-service-retirement) available in the Azure portal under the Azure Advisor page. This integrated tool helps you stay informed about any service retirements that might affect your critical workloads, which enable you to effectively plan and execute necessary migrations.
> [!NOTE]
> Customers who have a Premier/Unified Support agreement can use the Customer Success team to strategize and implement a Well-Architected Framework assessment (WAF).
learn-pr/azure/intro-to-azure-incident-readiness/includes/4-what-to-expect.md
3 additions & 3 deletions
@@ -28,14 +28,14 @@ Although we generally don't share speculation or the inner workings of troublesh
Only a few times in the history of Azure have there been technical issues that prevented posting incident updates on `azure.status.microsoft`. In these extraordinary circumstances, we post incident updates by using X at @AzureSupport.
-
Regardless of the issue, customers should feel free to reach out to @AzureSupport for any questions relating to potential issues they're seeing or with support questions. The @AzureSupport team generally responds in less than 5 minutes. We're very proud of that record! During known issues, for example, if there's an outage listed in Service Health, the incident is already being worked on by the right engineers. There's potentially not much that the @AzureSupport team can do to help, beyond directing customers to the official engineering updates of what's happening.
+
Regardless of the issue, customers should feel free to reach out to @AzureSupport for any questions relating to potential issues they're seeing or with support questions. The @AzureSupport team generally responds in less than 5 minutes. We're very proud of that record! During known issues, for example, if there's an outage listed in Service Health, the incident is already being worked on by the right engineers. There's potentially not much that the @AzureSupport team can do, beyond directing customers to the official engineering updates.
4. **If your impact/issues don't match the incident (or if these persist after mitigation) [contact support](https://www.aka.ms/AzurePortalSupportRequest)**.
This message is the most important note for customers to understand about what to do or not to do during an incident. As mentioned previously, during known issues, such as an outage listed in Service Health, the incident is already being worked on by the right engineers.
-
Customers don't need to contact support for updates. They receive regular updates by using Service Health and their Service health alerts. Support engineers don't have access to any more detailed information than what is provided to affected customers. If customers read the updates from engineering but require support to respond to the incident, such as to implement their failover plans, they can and should raise a support ticket.
+
Customers don't need to contact support for updates. They receive regular updates by using Service Health and their Service health alerts. Support engineers don't have access to any more detailed information than what is provided to affected customers. If customers read engineering updates but need support to respond to the incident, such as to implement failover plans, they can and should raise a support ticket.
Similarly, if the symptoms they're noticing doesn't seem to 'line up' with the symptoms being described in the issue updates, it might be unrelated. For example, suppose there's a known issue with Redis Cache in US East, but a customer sees issues with a Redis Cache in US East 2. In such a case, the customer can and should raise a support ticket.
-
Finally, if a service issue is resolved or mitigated but the customer still sees issues with their services, support engineers can help them to understand if there's something special going on with their resources. In such a case, customers can and should raise a support ticket.
+
Finally, if a service issue is resolved or mitigated but the customer still sees issues with their services, support engineers can help find if there's something special going on with their resources. In such a case, customers can and should raise a support ticket.
learn-pr/azure/intro-to-azure-incident-readiness/includes/5-after-an-incident.md
1 addition & 1 deletion
@@ -40,4 +40,4 @@ In order for Microsoft to consider an SLA credit request claim, you must submit
Our billing support teams validate which resources, services, and subscriptions were impacted. They calculate and apply any relevant SLA credits. We use commercially reasonable efforts to process claims during the subsequent month and within 45 days of receipt. If we determine that a service credit is owed to you, we apply the service credit to your applicable monthly service fees.
-
Service credits are your sole and exclusive remedy for any performance or availability issues for any service under the agreement the SLA. Previews and online services or service tiers provided free of charge aren't included or eligible for SLA claims or credits. The service credits awarded in any billing month for a particular service or service resource won't, under any circumstance, exceed your monthly service fees for that service or service resource, as applicable, in the billing month.
+
Service credits are your sole and exclusive remedy for any performance or availability issues for any service under the agreement the SLA. Previews and online services or service tiers provided free of charge aren't included or eligible for SLA claims or credits. The service credits awarded in any billing month for a particular service or service resource don't under any circumstance exceed your monthly service fees for that service or service resource, as applicable, in the billing month.