Commit afc2abb

Merge pull request #218436 from MGoedtel/task1115
Reintroduce concepts-diagnostics article
2 parents 6f37919 + 1743148 commit afc2abb

3 files changed: +100 -1 lines changed
.openpublishing.redirection.json

Lines changed: 1 addition & 1 deletion
@@ -790,7 +790,7 @@
     },
     {
       "source_path_from_root": "/articles/aks/concepts-diagnostics.md",
-      "redirect_url": "/troubleshoot/azure/azure-kubernetes/welcome-azure-kubernetes",
+      "redirect_url": "/azure/aks/aks-diagnostics",
       "redirect_document_id": false
     },
     {

articles/aks/TOC.yml

Lines changed: 2 additions & 0 deletions
@@ -186,6 +186,8 @@
       href: cluster-configuration.md
     - name: Custom node configuration
       href: custom-node-configuration.md
+    - name: Configure AKS Diagnostics
+      href: aks-diagnostics.md
     - name: Integrate ACR with an AKS cluster
       href: cluster-container-registry-integration.md
     - name: Use Vertical Pod Autoscaler (preview)

articles/aks/aks-diagnostics.md

Lines changed: 97 additions & 0 deletions
Original file line numberDiff line numberDiff line change
@@ -0,0 +1,97 @@
---
title: Azure Kubernetes Service (AKS) Diagnostics Overview
description: Learn about self-diagnosing clusters in Azure Kubernetes Service.
services: container-service
ms.topic: conceptual
ms.date: 11/15/2022
---

# Azure Kubernetes Service Diagnostics (preview) overview

Troubleshooting Azure Kubernetes Service (AKS) cluster issues plays an important role in maintaining your cluster, especially if your cluster is running mission-critical workloads. AKS Diagnostics (preview) is an intelligent, self-diagnostic experience that:

* Helps you identify and resolve problems in your cluster.
* Is cloud-native.
* Requires no extra configuration or billing cost.

[!INCLUDE [preview features callout](./includes/preview/preview-callout.md)]

## Open AKS Diagnostics

To access AKS Diagnostics:

1. Sign in to the [Azure portal](https://portal.azure.com).
1. From **All services** in the Azure portal, select **Kubernetes Service**.
1. Select **Diagnose and solve problems** in the left navigation, which opens AKS Diagnostics.
1. Choose a category that best describes the issue with your cluster, like _Cluster Node Issues_, by:

    * Using the keywords in the homepage tile.
    * Typing a keyword that best describes your issue in the search bar.

![Homepage](./media/concepts-diagnostics/aks-diagnostics-homepage.png)

## View a diagnostic report

After you select a category, you can view a diagnostic report specific to your cluster. Diagnostic reports call out any issues in your cluster with status icons. You can drill down on each topic by selecting **More Info** to see a detailed description of:

* Issues
* Recommended actions
* Links to helpful docs
* Related metrics
* Logging data

Diagnostic reports are generated from the current state of your cluster after various checks run. They can help you pinpoint the problem with your cluster and understand the next steps to resolve it.

![Diagnostic Report](./media/concepts-diagnostics/diagnostic-report.png)

![Expanded Diagnostic Report](./media/concepts-diagnostics/node-issues.png)

## Cluster insights

The following diagnostic checks are available in **Cluster Insights**.

### Cluster Node Issues

Cluster Node Issues checks for node-related issues that cause your cluster to behave unexpectedly. Specifically:

- Node readiness issues
- Node failures
- Insufficient resources
- Node missing IP configuration
- Node CNI failures
- Node not found
- Node power off
- Node authentication failure
- Node kube-proxy stale
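
To make the first item in this list concrete, the following is a minimal Python sketch of what a node readiness check looks at. It is not how AKS Diagnostics is implemented. It assumes node data shaped like the `status.conditions` list in the Kubernetes Node API, where the `Ready` condition has a status of `True`, `False`, or `Unknown`. The function name `not_ready_nodes` and the sample nodes are hypothetical.

```python
def not_ready_nodes(nodes):
    """Return names of nodes whose Ready condition is missing or not "True"."""
    flagged = []
    for node in nodes:
        conditions = node.get("status", {}).get("conditions", [])
        # Find the Ready condition, if the node reports one.
        ready = next((c for c in conditions if c.get("type") == "Ready"), None)
        if ready is None or ready.get("status") != "True":
            flagged.append(node["metadata"]["name"])
    return flagged


# Hypothetical sample data, shaped like the items in `kubectl get nodes -o json`.
sample_nodes = [
    {
        "metadata": {"name": "node-a"},
        "status": {"conditions": [{"type": "Ready", "status": "True"}]},
    },
    {
        "metadata": {"name": "node-b"},
        "status": {"conditions": [{"type": "Ready", "status": "Unknown"}]},
    },
]

print(not_ready_nodes(sample_nodes))
```

In this sample, only `node-b` is flagged, because its `Ready` status is `Unknown` rather than `True`.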

### Create, read, update & delete (CRUD) operations

CRUD Operations checks for any CRUD operations that cause issues in your cluster. Specifically:

- In-use subnet delete operation error
- Network security group delete operation error
- In-use route table delete operation error
- Referenced resource provisioning error
- Public IP address delete operation error
- Deployment failure due to deployment quota
- Operation error due to organization policy
- Missing subscription registration
- VM extension provisioning error
- Subnet capacity
- Quota exceeded error

### Identity and security management

Identity and Security Management detects authentication and authorization errors that prevent communication to your cluster. Specifically:

- Node authorization failures
- 401 errors
- 403 errors
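
The last two items map onto standard HTTP semantics: a 401 response means the caller was not authenticated, and a 403 response means the caller was authenticated but not authorized. The following minimal Python sketch shows that classification. The helper name `classify_auth_error` is hypothetical and is not taken from AKS Diagnostics.

```python
from http import HTTPStatus


def classify_auth_error(status_code):
    """Map an HTTP status code to the kind of identity failure it signals."""
    if status_code == HTTPStatus.UNAUTHORIZED:  # 401
        return "authentication"
    if status_code == HTTPStatus.FORBIDDEN:  # 403
        return "authorization"
    return None  # Not an identity-related status code.


for code in (401, 403, 500):
    print(code, classify_auth_error(code))
```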

## Next steps

* Collect logs to help you further troubleshoot your cluster issues by using [AKS Periscope](https://aka.ms/aksperiscope).

* Read the [triage practices section](/azure/architecture/operator-guides/aks/aks-triage-practices) of the AKS day-2 operations guide.

* Post your questions or feedback at [UserVoice](https://feedback.azure.com/d365community/forum/aabe212a-f724-ec11-b6e6-000d3a4f0da0) by adding "[Diag]" in the title.
