ms.custom: sap:Create, Upgrade, Scale and Delete operations (cluster or nodepool)
---
# Pod is stuck in CrashLoopBackOff mode
If a pod has a `CrashLoopBackOff` status, then the pod probably failed or exited unexpectedly, and the log contains an exit code that isn't zero. Here are several possible reasons why your pod is stuck in `CrashLoopBackOff` mode:
1. **Application failure**: The application inside the container crashes shortly after starting, often due to misconfigurations, missing dependencies, or incorrect environment variables.
2. **Incorrect resource limits**: If the pod exceeds its CPU or memory resource limits, Kubernetes might kill the container. This issue can happen if resource requests or limits are set too low.
3. **Missing or misconfigured ConfigMaps/Secrets**: If the application relies on configuration files or environment variables stored in ConfigMaps or Secrets but they're missing or misconfigured, the application might crash.
4. **Image pull issues**: If there's an issue with the image (for example, it's corrupted or has an incorrect tag), the container might not start properly and fail repeatedly.
5. **Init containers failing**: If the pod has init containers and one or more fail to run properly, the pod will restart.
6. **Liveness/Readiness probe failures**: If liveness or readiness probes are misconfigured, Kubernetes might detect the container as unhealthy and restart it.
7. **Application dependencies not ready**: The application might depend on services that aren't yet ready, such as databases, message queues, or other APIs.
8. **Networking issues**: Network misconfigurations can prevent the application from communicating with necessary services, causing it to fail.
9. **Invalid commands or arguments**: The container might be started with an invalid `ENTRYPOINT`, command, or argument, leading to a crash.

For more information about the container status, see [Pod Lifecycle - Container states](https://kubernetes.io/docs/concepts/workloads/pods/pod-lifecycle/#container-states).

Consider the following options and their associated [kubectl](https://kubernetes.io/docs/reference/generated/kubectl/kubectl-commands) commands.
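
As a first step, you can confirm the exit code and the reason for the most recent restart by using a few standard kubectl commands (a sketch; the pod name and namespace are placeholders):

```bash
# Inspect the container state, last exit code, and restart count.
kubectl describe pod <pod-name> -n <namespace>

# View the logs of the previous (crashed) container instance.
kubectl logs <pod-name> -n <namespace> --previous

# List recent events for the pod, such as probe failures or scheduling problems.
kubectl get events -n <namespace> --field-selector involvedObject.name=<pod-name>
```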
support/azure/azure-kubernetes/storage/fail-to-mount-azure-disk-volume.md

---
title: Unable to Mount Azure Disk Volumes
description: Describes errors that occur when mounting Azure disk volumes fails, and provides solutions.
ms.date: 03/22/2025
author: genlin
ms.author: genli
ms.reviewer: chiragpa, akscsscic, v-weizhu
## Symptoms

You're trying to deploy a Kubernetes resource, such as a Deployment or a StatefulSet, in an Azure Kubernetes Service (AKS) environment. The deployment creates a pod that should mount a PersistentVolumeClaim (PVC) that references an Azure disk.

However, the pod stays in the **ContainerCreating** status. When you run the `kubectl describe pods` command, you might see one of the following errors that cause the mounting operation to fail:

- [Disk cannot be attached to the VM because it is not in the same zone as the VM](#error1)
- [Client '\<client-ID>' with object id '\<object-ID>' doesn't have authorization to perform action over scope '\<disk name>' or scope is invalid](#error2)
- [Volume is already used by pod](#error3)
- [StorageAccountType UltraSSD_LRS can be used only when additionalCapabilities.ultraSSDEnabled is set](#error4)
- [ApplyFSGroup failed for vol](#error5)
- [Node(s) exceed max volume count](#error6)

See the following sections for error details, possible causes, and solutions.

## <a id="error1"></a>Disk cannot be attached to the VM because it is not in the same zone as the VM
### Cause: Disk and node hosting pod are in different zones

In AKS, the default and other built-in storage classes for Azure disks use [locally redundant storage (LRS)](/azure/storage/common/storage-redundancy#locally-redundant-storage). These disks are deployed in [availability zones](/azure/aks/availability-zones). If you use an AKS node pool that spans availability zones and the pod is scheduled on a node that's in a different availability zone from the disk, you might experience this error.

To resolve this error, use one of the following solutions.

### Solution 1: Ensure disk and node hosting the pod are in the same zone

To make sure that the disk and the node that hosts the pod are in the same availability zone, use [node affinity](https://kubernetes.io/docs/tasks/configure-pod-container/assign-pods-nodes-using-node-affinity/).

Refer to the following script as an example:
```yaml
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
      - matchExpressions:
        - key: topology.kubernetes.io/zone   # zone label on the node
          operator: In
          values:
          - <region>-Y
```

\<region> is the region of the AKS cluster. `Y` represents the availability zone of the disk (for example, westeurope-3).
### Solution 2: Use zone-redundant storage (ZRS) disks
[ZRS](/azure/storage/common/storage-redundancy#zone-redundant-storage) disk volumes can be scheduled on all zone and non-zone agent nodes. For more information, see [Azure disk availability zone support](/azure/aks/availability-zones#azure-disk-availability-zone-support).

To use a ZRS disk, create a storage class by using `Premium_ZRS` or `StandardSSD_ZRS`, and then deploy the PersistentVolumeClaim (PVC) that references the storage.

For more information about parameters, see [Driver Parameters](/azure/aks/azure-csi-files-storage-provision#storage-class-parameters-for-dynamic-persistentvolumes).
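
For example, a storage class and a PVC along the following lines (a minimal sketch; the names and size are placeholders) request a ZRS-backed disk from the Azure disk CSI driver:

```yaml
apiVersion: storage.k8s.io/v1
kind: StorageClass
metadata:
  name: managed-csi-zrs            # placeholder name
provisioner: disk.csi.azure.com    # Azure disk CSI driver
parameters:
  skuName: Premium_ZRS             # or StandardSSD_ZRS
reclaimPolicy: Delete
volumeBindingMode: WaitForFirstConsumer
---
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: zrs-disk-pvc               # placeholder name
spec:
  accessModes:
    - ReadWriteOnce
  storageClassName: managed-csi-zrs
  resources:
    requests:
      storage: 10Gi                # placeholder size
```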
### Solution 3: Use Azure Files
[Azure Files](/azure/storage/files/storage-files-introduction) is mounted over the network by using NFS or SMB. It's not associated with availability zones.

For more information, see the following articles:
### Cause: AKS identity doesn't have required authorization over disk

The AKS cluster's identity doesn't have the required authorization over the Azure disk. This issue occurs if the disk is created in a resource group other than the infrastructure resource group of the AKS cluster.

### Solution: Create role assignment that includes required authorization

Create a role assignment that includes the authorization that the error message requires. We recommend that you use a [Contributor](/azure/role-based-access-control/built-in-roles/general#contributor) role. If you want to use another built-in role, see [Azure built-in roles](/azure/role-based-access-control/built-in-roles).

To assign a Contributor role, use one of the following methods:
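
For example, by using the Azure CLI (a sketch only; the object ID and the disk's resource ID are placeholders that come from the error message):

```azurecli
az role assignment create \
    --assignee "<object-ID>" \
    --role "Contributor" \
    --scope "/subscriptions/<subscription-ID>/resourceGroups/<resource-group>/providers/Microsoft.Compute/disks/<disk-name>"
```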
### Cause: Disk is mounted to multiple pods hosted on different nodes

An Azure disk can be mounted only as [ReadWriteOnce](https://kubernetes.io/docs/concepts/storage/persistent-volumes/#access-modes), which makes it available to only one node in AKS. That means the disk can be attached to only one node and mounted only to a pod that's hosted by that node. If you mount the same disk to a pod on another node, you get this error because the disk is already attached to a node.
### Solution: Make sure disk isn't mounted by multiple pods hosted on different nodes
To resolve this error, refer to [Multi-Attach error](https://github.com/andyzhangx/demo/blob/master/issues/azuredisk-issues.md#25-multi-attach-error).
### Cause: Ultra disk is attached to node pool with ultra disks disabled
This error indicates that an [ultra disk](/azure/virtual-machines/disks-enable-ultra-ssd) is being attached to a node pool that has ultra disks disabled. By default, ultra disks are disabled on AKS node pools.

### Solution: Create a node pool that can use ultra disks

To use ultra disks on AKS, create a node pool that supports ultra disks by using the `--enable-ultra-ssd` flag. For more information, see [Use Azure ultra disks on Azure Kubernetes Service](/azure/aks/use-ultra-disks).
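
For example, by using the Azure CLI (a sketch; the resource group, cluster, node pool name, and VM size are placeholders, and the VM size must support ultra disks):

```azurecli
az aks nodepool add \
    --resource-group <resource-group> \
    --cluster-name <cluster-name> \
    --name <nodepool-name> \
    --node-vm-size <vm-size-that-supports-ultra-disks> \
    --zones 1 2 3 \
    --enable-ultra-ssd
```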
## <a id="error5"></a>ApplyFSGroup failed for vol
### Cause: Changing ownership and permissions for large volume takes a long time

If a `securityContext` that uses `fsGroup` is in place and the volume already contains many files and directories, changing the group ID can take a long time, and this error might occur. The official Kubernetes documentation, [Configure volume permission and ownership change policy for Pods](https://kubernetes.io/docs/tasks/configure-pod-container/security-context/#configure-volume-permission-and-ownership-change-policy-for-pods), describes this situation:

"By default, Kubernetes recursively changes ownership and permissions for the contents of each volume to match the `fsGroup` specified in a Pod's `securityContext` when that volume is mounted. For large volumes, checking and changing ownership and permissions can take much time, slowing Pod startup. You can use the `fsGroupChangePolicy` field inside a `securityContext` to control the way that Kubernetes checks and manages ownership and permissions for a volume."
### Solution: Set fsGroupChangePolicy field to OnRootMismatch
To resolve this error, we recommend that you set `fsGroupChangePolicy: "OnRootMismatch"` in the `securityContext` of a Deployment, a StatefulSet, or a pod.

`OnRootMismatch`: Change permissions and ownership only if the permissions and ownership of the root directory don't match the expected permissions of the volume. This setting can help shorten the time that it takes to change the ownership and permissions of a volume.

For more information, see [Configure volume permission and ownership change policy for Pods](https://kubernetes.io/docs/tasks/configure-pod-container/security-context/#configure-volume-permission-and-ownership-change-policy-for-pods).
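
For example, a pod-level `securityContext` along these lines (a minimal sketch; the names, image, group ID, and PVC are placeholders) applies the policy:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: fsgroup-demo                      # placeholder name
spec:
  securityContext:
    fsGroup: 1000                         # placeholder group ID
    fsGroupChangePolicy: "OnRootMismatch" # skip the recursive ownership change when the root already matches
  containers:
    - name: app                           # placeholder container
      image: <image>                      # placeholder image
      volumeMounts:
        - name: data
          mountPath: /data
  volumes:
    - name: data
      persistentVolumeClaim:
        claimName: <pvc-name>             # placeholder PVC that references the Azure disk
```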
## <a id="error6"></a>Node(s) exceed max volume count

Here are details of this error:

```output
Events:
  Type     Reason            Age   From               Message
  ----     ------            ----  ----               -------
  Warning  FailedScheduling  25s   default-scheduler  0/8 nodes are available: 8 node(s) exceed max volume count. preemption: 0/8 nodes are available: 8 No preemption victims found for incoming pod..
```
### Cause: Maximum disk limit is reached

The node has reached the maximum number of disks that it can attach. In AKS, the number of data disks per node depends on the VM size that's configured for the node pool.

### Solution

To resolve the issue, use one of the following methods:

- Add a new node pool that uses a VM size that supports a higher disk limit.
- Scale the node pool.
- Delete existing disks from the node.

Additionally, make sure that the number of disks per node does not exceed the [Kubernetes default limits](https://kubernetes.io/docs/concepts/storage/storage-limits/#kubernetes-default-limits).
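
To check how close a node is to its limit, you can compare the CSI driver's allocatable volume count with the volumes that are currently attached (the node name is a placeholder, and the commands assume the Azure disk CSI driver):

```bash
# Show the per-node volume limit that the Azure disk CSI driver reports (spec.drivers[].allocatable.count).
kubectl get csinode <node-name> -o yaml

# List the volume attachments on that node.
kubectl get volumeattachments | grep <node-name>
```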
## More information
For more Azure Disk known issues, see [Azure disk plugin known issues](https://github.com/andyzhangx/demo/blob/master/issues/azuredisk-issues.md).

[!INCLUDE [Azure Help Support](../../../includes/azure-help-support.md)]