scenarios/AKSClientSecretError/aks-client-secret-error.md (+2 -2: 2 additions & 2 deletions)
@@ -40,8 +40,8 @@ The issue that generates this service principal alert usually occurs for one of
 Use the following commands to retrieve the service principal profile for your AKS cluster and check the expiration date of the service principal. Make sure to set the appropriate variables for your AKS resource group and cluster name.

 ```azurecli
-SP_ID=$(az aks show --resource-group RESOURCE_GROUP_NAME \
-    --name AKS_CLUSTER_NAME \
+SP_ID=$(az aks show --resource-group $RESOURCE_GROUP_NAME \
scenarios/AKSHealthProbeMode/aks-health-probe-mode.md (+15 -12: 15 additions & 12 deletions)
@@ -42,7 +42,7 @@ To troubleshoot these issues, follow these steps:
 ```azurecli
 export RESOURCE_GROUP="aks-rg"
 export AKS_CLUSTER_NAME="aks-cluster"
-az aks show --resource-group $RESOURCE_GROUP --name $AKS_CLUSTER_NAME --query "loadBalancerProfile"
+az aks show --resource-group $RESOURCE_GROUP --name $AKS_CLUSTER_NAME --query "networkProfile.loadBalancerProfile"
 ```
 Results:

@@ -66,24 +66,27 @@ To troubleshoot these issues, follow these steps:
 }
 ```

-2. Check the *overlaymgr* log to see if the cloud provider secret is updated. The keyword to look for is `cloudConfigSecretResolver`. Or check the contents of the cloud-provider-config secret in the `ccp` namespace. You can use the `kubectl get secret` command to view the secret.
+2. Check the cloud provider configuration. In modern AKS clusters, the cloud provider configuration is managed internally and the `ccp` namespace doesn't exist. Instead, check for cloudprovider related resources and verify the cloud-node-manager pods are running properly:

-```shell
-kubectl get secret cloud-provider-config -n ccp -o yaml
+
+```bash
+# Check for cloud provider related ConfigMaps in kube-system
+kubectl get configmap -n kube-system | grep -i azure
+
+# Check if cloud-node-manager pods are running (indicates cloud provider integration is working)
+kubectl get pods -n kube-system | grep cloud-node-manager
+
+# Check the azure-ip-masq-agent-config if it exists
+kubectl get configmap azure-ip-masq-agent-config-reconciled -n kube-system -o yaml 2>/dev/null || echo "ConfigMap not found"
3. Check the chart or overlay daemonset cloud-node-manager to see if the health-probe-proxy sidecar container is enabled. You can use the `kubectl get ds` command to view the daemonset.
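The sidecar check in step 3 can be sketched as a small shell helper. This is a minimal sketch rather than the article's own command: the DaemonSet name `cloud-node-manager`, the `kube-system` namespace, and the container name `health-probe-proxy` are assumptions read from the wording above, `has_container` is a hypothetical helper, and the sample container list is illustrative rather than captured from a real cluster.

```bash
# has_container NAME LIST...
# Hypothetical helper: succeeds if NAME appears among the remaining arguments.
has_container() {
  local wanted="$1"; shift
  local name
  for name in "$@"; do
    [ "$name" = "$wanted" ] && return 0
  done
  return 1
}

# Illustrative stand-in for the space-separated container names that a query like
#   kubectl get ds cloud-node-manager -n kube-system \
#     -o jsonpath='{.spec.template.spec.containers[*].name}'
# would print (DaemonSet name and namespace are assumptions).
containers="cloud-node-manager health-probe-proxy"

# Word splitting of $containers is intentional: one argument per container name.
if has_container health-probe-proxy $containers; then
  echo "sidecar enabled"
else
  echo "sidecar not found"
fi
```

Comparing whole names instead of using `grep` avoids a false positive when another container name merely contains the string.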
scenarios/AKSPreviewAPILifecycle/aks-preview-api-lifecycle.md (+1 -1: 1 addition & 1 deletion)
@@ -30,7 +30,7 @@ API version as deprecation approaches.
 If you're unsure what client or tool is using this API version, check the [activity logs](/azure/azure-monitor/essentials/activity-log)
 using the following command:

-Set the API version you want to inspect for recent usage in the activity log.
+Set the API version you want to inspect for recent usage in the activity log. In this example, we are checking for the `2022-04-01-preview` API version.
scenarios/AzureCNIPodSubnet/azure-cni-pod-subnet.md (+2: 2 additions & 0 deletions)
@@ -23,6 +23,8 @@ Azure CNI Pod Subnet assigns IP addresses to pods from a separate subnet from yo
 - Azure CLI version `2.37.0` or later and the `aks-preview` extension version `2.0.0b2` or later.
 - Register the subscription-level feature flag for your subscription: 'Microsoft.ContainerService/AzureVnetScalePreview'.

+## Enable Container Insights (AKS monitoring)
+
 If you have an existing cluster, you can enable Container Insights (AKS monitoring) using the following command **only if your cluster was created with monitoring enabled or is associated with a valid Log Analytics Workspace in the same region**. Otherwise, refer to Microsoft Docs for additional workspace setup requirements.
scenarios/CSEErrorsAKS/cse-errors-aks.md (+3 -2: 3 additions & 2 deletions)
@@ -107,7 +107,8 @@ Set up your custom Domain Name System (DNS) server so that it can do name resolu
 > **Important:** You must specify the `--name` of a valid VM in an availability set in your resource group. Here is a template for running network checks.

 ```azurecli
-export DNS_IP_ADDRESS="10.0.0.10"
+export API_FQDN=$(az aks show --resource-group $RG_NAME --name $CLUSTER_NAME --query fqdn -o tsv)
+
 az vm run-command invoke \
     --resource-group $NODE_RESOURCE_GROUP \
     --name $AVAILABILITY_SET_VM \
@@ -121,7 +122,7 @@ Set up your custom Domain Name System (DNS) server so that it can do name resolu
     --command-id RunShellScript \
     --output tsv \
     --query "value[0].message" \
-    --scripts "nslookup <api-fqdn> $DNS_IP_ADDRESS"
+    --scripts "nslookup $API_FQDN $DNS_IP_ADDRESS"
 ```

 For more information, see [Name resolution for resources in Azure virtual networks](/azure/virtual-network/virtual-networks-name-resolution-for-vms-and-role-instances) and [Hub and spoke with custom DNS](/azure/aks/private-clusters#hub-and-spoke-with-custom-dns).
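The `value[0].message` text returned by the run command still has to be read for lookup failures. The following is a minimal sketch of that check, assuming the usual `nslookup` failure markers (`NXDOMAIN`, `SERVFAIL`, `connection timed out`, `no servers could be reached`). `dns_lookup_ok` is a hypothetical helper, and both sample outputs are illustrative rather than captured from a real node.

```bash
# dns_lookup_ok TEXT
# Hypothetical helper: fails if TEXT (nslookup output) contains a common failure marker.
dns_lookup_ok() {
  if printf '%s\n' "$1" | grep -Eq 'NXDOMAIN|SERVFAIL|connection timed out|no servers could be reached'; then
    return 1
  fi
  return 0
}

# Illustrative outputs; the host name and the 192.0.2.10 address are placeholders.
ok_output="Server:   10.0.0.10
Address:  10.0.0.10#53

Name:     aks-example.example.com
Address:  192.0.2.10"

fail_output="** server can't find aks-example.example.com: NXDOMAIN"

dns_lookup_ok "$ok_output"   && echo "resolved"
dns_lookup_ok "$fail_output" || echo "resolution failed"
```

When wiring this to the command in the hunks above, `$DNS_IP_ADDRESS` must already be exported, because the visible hunks no longer set it.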
This command will display binary archive data in the terminal if the download succeeds.
+This command checks if the endpoint is reachable and returns the HTTP headers. If you see a `200 OK` response, it indicates that the endpoint is accessible.

 Next, attempt a download with validation and save the file locally for further troubleshooting. This will help determine if SSL or outbound connectivity is correctly configured.
-rw-r--r-- 1 user user 6651392 Jun 20 10:30 azure-vnet-cni-linux-amd64-v1.0.25.tgz
+
+/tmp/cni-test/azure-vnet-cni-linux-amd64-v1.0.25.tgz: gzip compressed data, from Unix, original size modulo 2^32 20070400
+```
+
+Clean up the test files:
+
+```bash
+rm -rf /tmp/cni-test/
 ```

 If you can't download these files, make sure that traffic is allowed to the downloading endpoint. For more information, see [Azure Global required FQDN/application rules](/azure/aks/outbound-rules-control-egress#azure-global-required-fqdn--application-rules).
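The validation half of the download-and-validate step can be sketched as follows. This is a minimal sketch that assumes the archive has already been saved locally: `verify_tgz` is a hypothetical helper, and the demo builds a throwaway archive in a temporary directory instead of downloading anything.

```bash
# verify_tgz FILE
# Hypothetical helper: succeeds only if FILE is a non-empty, intact gzip-compressed tar archive.
verify_tgz() {
  [ -s "$1" ] && gzip -t "$1" 2>/dev/null && tar -tzf "$1" >/dev/null 2>&1
}

# Build one valid archive and one file that only has the .tgz name.
workdir=$(mktemp -d)
echo "payload" > "$workdir/file.txt"
tar -czf "$workdir/good.tgz" -C "$workdir" file.txt
echo "not an archive" > "$workdir/bad.tgz"

verify_tgz "$workdir/good.tgz" && echo "good.tgz: valid"
verify_tgz "$workdir/bad.tgz"  || echo "bad.tgz: invalid"

# Clean up the test files, as in the hunk above.
rm -rf "$workdir"
```

A truncated or proxy-rewritten download (for example, an HTML error page saved under the archive name) fails the `gzip -t` step, which helps separate connectivity problems from corrupt content.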
scenarios/ForbiddenErrorAKS/forbidden-error-aks.md (+15 -2: 15 additions & 2 deletions)
@@ -44,14 +44,28 @@ az aks show -g $RESOURCE_GROUP -n $CLUSTER_NAME --query aadProfile.enableAzureRb

 Results:

-<!-- expected_similarity=0.3 -->
 ```output
 false
 ```

+- If the result is **null** or empty, the cluster doesn't have Azure AD integration enabled. See [Solving permission issues in local Kubernetes RBAC clusters](#solving-permissions-issues-in-local-kubernetes-rbac-clusters).
 - If the result is **false**, the cluster uses Kubernetes RBAC. See [Solving permission issues in Kubernetes RBAC-based AKS clusters](#solving-permissions-issues-in-kubernetes-rbac-based-aks-clusters).
 - If the result is **true**, the cluster uses Azure RBAC. See [Solving permission issues in Azure RBAC-based AKS clusters](#solving-permissions-issues-in-azure-rbac-based-aks-clusters).

+### Solving permissions issues in local Kubernetes RBAC clusters
+
+If your cluster doesn't have Azure AD integration (result was null), it uses cluster admin credentials:
+
+```bash
+# Get admin credentials for full access
+az aks get-credentials --resource-group $RESOURCE_GROUP --name $CLUSTER_NAME --admin
+
+# Verify access
+kubectl get nodes
+```
+
+**Warning**: Admin credentials provide full cluster access. Use carefully and consider enabling Azure AD integration for better security.
+
 ### Solving permissions issues in Kubernetes RBAC-based AKS clusters

 If the cluster uses Kubernetes RBAC, permissions for the user account are configured through the creation of RoleBinding or ClusterRoleBinding Kubernetes resources. For more information, see [Kubernetes RBAC documentation](https://kubernetes.io/docs/reference/access-authn-authz/rbac/).
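The three-way reading of the `aadProfile.enableAzureRbac` result in the hunk above can be sketched as a `case` statement. This is a minimal sketch: `rbac_mode` is a hypothetical helper, the labels it prints are made up for illustration, and it assumes the query result was captured as plain text, treating an empty string and the literal `null` alike.

```bash
# rbac_mode VALUE
# Hypothetical helper: maps the enableAzureRbac query result to a troubleshooting path.
rbac_mode() {
  case "$1" in
    true)    echo "azure-rbac" ;;        # Azure RBAC-based cluster
    false)   echo "kubernetes-rbac" ;;   # Azure AD integration with Kubernetes RBAC
    ""|null) echo "local-accounts" ;;    # no Azure AD integration
    *)       echo "unknown"; return 1 ;; # unexpected value: do not guess
  esac
}

# Walk through each value the hunk distinguishes.
for value in true false null ""; do
  echo "[$value] -> $(rbac_mode "$value")"
done
```

Returning a non-zero status for unexpected input keeps a script from silently following the wrong troubleshooting path if the query shape changes.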
@@ -74,7 +88,6 @@ You can create a custom RoleBinding or ClusterRoleBinding resource to grant the
scenarios/KubeletIOTroubleshooting/kubelet-io-troubleshooting.md (+11: 11 additions & 0 deletions)
@@ -16,6 +16,17 @@ keywords:

 TCP timeouts can be caused by blockages of internal traffic that runs between nodes. To investigate TCP time-outs, verify that this traffic isn't being blocked, for example, by [network security groups](/azure/aks/concepts-security#azure-network-security-groups) (NSGs) on the subnet for your cluster nodes.

+## Connect to the cluster
+
+First, connect to your Azure Kubernetes Service (AKS) cluster by running the following command:
+
+```bash
+export RESOURCE_GROUP=<your-resource-group>
+export CLUSTER_NAME=<your-cluster-name>
+
+az aks get-credentials --resource-group $RESOURCE_GROUP --name $CLUSTER_NAME
+```
+
 ## Symptoms

 Tunnel functionalities, such as `kubectl logs` and code execution, work only for pods that are hosted on nodes on which tunnel service pods are deployed. Pods on other nodes that have no tunnel service pods cannot reach to the tunnel. When viewing the logs of these pods, you receive the following error message:
scenarios/NodeNotReadyAKS/node-not-ready-aks.md (+25 -6: 25 additions & 6 deletions)
@@ -51,12 +51,31 @@ The [kubelet](https://kubernetes.io/docs/reference/command-line-tools-reference/
 Examine the output of the `kubectl describe nodes` command to find the [Conditions](https://kubernetes.io/docs/reference/node/node-status/#condition) field and the [Capacity and Allocatable](https://kubernetes.io/docs/reference/node/node-status/#capacity) blocks. Do the content of these fields appear as expected? (For example, in the **Conditions** field, does the `message` property contain the "kubelet is posting ready status" string?) In this case, if you have direct Secure Shell (SSH) access to the node, check the recent events to understand the error. Look within the */var/log/syslog* file instead of */var/log/messages* (not available on all distributions). Or, generate the kubelet and container daemon log files by running the following shell commands:

 ```bash
-# To check syslog file (useful on Ubuntu-based AKS nodes),
-cat /var/log/syslog
-
-# To check kubelet and containerd daemon logs,
-journalctl -u kubelet > kubelet.log
-journalctl -u containerd > containerd.log
+# First, identify the NotReady node
+export NODE_NAME=$(kubectl get nodes --no-headers | grep NotReady | awk '{print $1}'| head -1)