Commit 6abde9f

Update troubleshoot-container-storage.md
Applied suggestions from acrolinx checks.
1 parent def51dc

articles/storage/container-storage/troubleshoot-container-storage.md

Lines changed: 11 additions & 17 deletions
@@ -18,7 +18,7 @@ ms.topic: how-to
 
 After running `az aks create`, you might see the message *Azure Container Storage failed to install. AKS cluster is created. Please run `az aks update` along with `--enable-azure-container-storage` to enable Azure Container Storage*.
 
-This message means that Azure Container Storage wasn't installed, but your AKS cluster was created properly.
+This message means that Azure Container Storage wasn't installed, but your Azure Kubernetes Service (AKS) cluster was created properly.
 
 To install Azure Container Storage on the cluster and create a storage pool, run the following command. Replace `<cluster-name>` and `<resource-group>` with your own values. Replace `<storage-pool-type>` with `azureDisk`, `ephemeraldisk`, or `elasticSan`.
 
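For reference, a complete invocation of that update command might look like the following sketch; the cluster name, resource group, and pool type are placeholder values:

```
az aks update -n myCluster -g myResourceGroup --enable-azure-container-storage azureDisk
```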
@@ -28,7 +28,7 @@ az aks update -n <cluster-name> -g <resource-group> --enable-azure-container-sto
 
 ### Azure Container Storage fails to install due to Azure Policy restrictions
 
-Azure Container Storage might fail to install if Azure Policy restrictions are in place. Specifically, Azure Container Storage relies on privileged containers, which can be blocked by Azure Policy. When this happens, the installation of Azure Container Storage might timeout or fail, and you might see errors in the `gatekeeper-controller` logs such as:
+Azure Container Storage might fail to install if Azure Policy restrictions are in place. Specifically, Azure Container Storage relies on privileged containers, which can be blocked by Azure Policy. When they're blocked, the installation of Azure Container Storage might time out or fail, and you might see errors in the `gatekeeper-controller` logs such as:
 
 ```output
 $ kubectl logs -n gatekeeper-system deployment/gatekeeper-controller
@@ -41,7 +41,7 @@ $ kubectl logs -n gatekeeper-system deployment/gatekeeper-controller
 {"level":"info","ts":1722622449.2412128,"logger":"webhook","msg":"denied admission: Privileged container is not allowed: ndm, securityContext: {\"privileged\": true}","hookType":"validation","process":"admission","details":{},"event_type":"violation","constraint_name":"azurepolicy-k8sazurev2noprivilege-686dd8b209a774ba977c","constraint_group":"constraints.gatekeeper.sh","constraint_api_version":"v1beta1","constraint_kind":"K8sAzureV2NoPrivilege","constraint_action":"deny","resource_group":"","resource_api_version":"v1","resource_kind":"Pod","resource_namespace":"acstor","resource_name":"azurecontainerstorage-ndm-b5nfg","request_username":"system:serviceaccount:kube-system:daemon-set-controller"}
 ```
 
-To resolve this, you’ll need to add the `acstor` namespace to the exclusion list of your Azure Policy. Azure Policy is used to create and enforce rules for managing resources within Azure, including AKS clusters. In some cases, policies might block the creation of Azure Container Storage pods and components. You can find more details on working with Azure Policy for Kubernetes by consulting [Azure Policy for Kubernetes](/azure/governance/policy/concepts/policy-for-kubernetes).
+To resolve the issue, add the `acstor` namespace to the exclusion list of your Azure Policy. Azure Policy is used to create and enforce rules for managing resources within Azure, including AKS clusters. In some cases, policies might block the creation of Azure Container Storage pods and components. For more information, see [Azure Policy for Kubernetes](/azure/governance/policy/concepts/policy-for-kubernetes).
 
 To add the `acstor` namespace to the exclusion list, follow these steps:
 
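After the exclusion is in place and you retry the installation, you can verify that the Azure Container Storage pods in the `acstor` namespace are being admitted; a quick check:

```
kubectl get pods -n acstor
```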
@@ -55,7 +55,7 @@ To add the `acstor` namespace to the exclusion list, follow these steps:
 
 ### Can't install and enable Azure Container Storage in node pools with taints
 
-You may have configured [node taints](/azure/aks/use-node-taints) on the node pools to retrict pods from being scheduled on these node pools. When you try to install and enable Azure Container Storage on these noode pools, it will be blocked because the required pods can't be created in these node pools. This applies to both the system node pool when installing and the user node pools when enabling.
+You might have configured [node taints](/azure/aks/use-node-taints) on your node pools to restrict pods from being scheduled on them. When you install and enable Azure Container Storage on these node pools, the operation is blocked because the required pods can't be created there. The behavior applies to both the system node pool when installing and the user node pools when enabling.
 
 You can check the node taints with the following example:
 
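A sketch of such a check with the Azure CLI; the JMESPath `--query` filter shown here is illustrative rather than the article's exact example:

```
az aks nodepool list -g $resourceGroup --cluster-name $clusterName --query "[].{Name:name, NodeTaints:nodeTaints}" -o table
```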
@@ -89,7 +89,7 @@ $ az aks nodepool list -g $resourceGroup --cluster-name $clusterName --query "[]
 
 ```
 
-Retry the installing or enabling after you remove node taints successfully. After it's commpleted successfully, you can configure these node taints back to resume the pod scheduling restaints.
+Retry installing or enabling Azure Container Storage after you remove the node taints. After the operation completes successfully, you can configure the node taints back to resume the pod scheduling restrictions.
 
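One way to clear the taints and restore them afterward, assuming a hypothetical taint of `sku=gpu:NoSchedule`:

```
# Clear all taints on the node pool so the required pods can be scheduled.
az aks nodepool update -g <resource-group> --cluster-name <cluster-name> -n <nodepool-name> --node-taints ""

# After installation or enablement completes, reapply the original taint.
az aks nodepool update -g <resource-group> --cluster-name <cluster-name> -n <nodepool-name> --node-taints "sku=gpu:NoSchedule"
```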
 ### Can't set storage pool type to NVMe
 
@@ -103,15 +103,15 @@ To check the status of your storage pools, run `kubectl describe sp <storage-poo
 
 ### Error when trying to expand an Azure Disks storage pool
 
-If your existing storage pool is less than 4 TiB (4,096 GiB), you can only expand it up to 4,095 GiB. If you try to expand beyond that, the internal PVC will get an error message like "Only Disk CachingType 'None' is supported for disk with size greater than 4095 GB" or ""Disk 'xxx' of size 4096 GB (<=4096 GB) cannot be resized to 16384 GB (>4096 GB) while it is attached to a running VM. Please stop your VM or detach the disk and retry the operation."
+If your existing storage pool is smaller than 4 TiB (4,096 GiB), you can only expand it up to 4,095 GiB. If you try to expand beyond that, the internal PVC gets an error message like "Only Disk CachingType 'None' is supported for disk with size greater than 4095 GB" or "Disk 'xxx' of size 4096 GB (<=4096 GB) cannot be resized to 16384 GB (>4096 GB) while it is attached to a running VM. Please stop your VM or detach the disk and retry the operation."
 
 To avoid errors, don't attempt to expand your current storage pool beyond 4,095 GiB if it is initially smaller than 4 TiB (4,096 GiB). Storage pools larger than 4 TiB can be expanded up to the maximum storage capacity available.
 
 This limitation only applies when using `Premium_LRS`, `Standard_LRS`, `StandardSSD_LRS`, `Premium_ZRS`, and `StandardSSD_ZRS` Disk SKUs.
 
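To confirm a pool's current size and state before expanding it, use the describe command mentioned earlier; the pool name is a placeholder:

```
kubectl describe sp <storage-pool-name> -n acstor
```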
 ### Elastic SAN creation fails
 
-If you're trying to create an Elastic SAN storage pool, you might see the message *Azure Elastic SAN creation failed: Maximum possible number of Elastic SAN for the Subscription created already*. This means that you've reached the limit on the number of Elastic SAN resources that can be deployed in a region per subscription. You can check the limit here: [Elastic SAN scalability and performance targets](../elastic-san/elastic-san-scale-targets.md#elastic-san-scale-targets). Consider deleting any existing Elastic SAN resources on the subscription that are no longer being used, or try creating the storage pool in a different region.
+If you're trying to create an Elastic SAN storage pool, you might see the message *Azure Elastic SAN creation failed: Maximum possible number of Elastic SAN for the Subscription created already*. This means that you reached the limit on the number of Elastic SAN resources that can be deployed in a region per subscription. You can check the limit in [Elastic SAN scalability and performance targets](../elastic-san/elastic-san-scale-targets.md#elastic-san-scale-targets). Consider deleting any existing Elastic SAN resources on the subscription that are no longer being used, or try creating the storage pool in a different region.
 
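To find Elastic SAN resources that are candidates for deletion, one option is the `elastic-san` Azure CLI extension (an assumed approach; the Azure portal works too):

```
az elastic-san list -g <resource-group> -o table
```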
 ### No block devices found
 
@@ -131,12 +131,6 @@ When disabling a storage pool type via `az aks update --disable-azure-container-
 
 If you select Y, an automatic validation runs to ensure that there are no persistent volumes created from the storage pool. Selecting n bypasses this validation and disables the storage pool type, deleting any existing storage pools and potentially affecting your application.
 
-### Can't delete resource group containing AKS cluster
-
-If you created an Elastic SAN storage pool, you might not be able to delete the resource group in which your AKS cluster is located.
-
-To resolve this, sign in to the [Azure portal](https://portal.azure.com?azure-portal=true) and select **Resource groups**. Locate the resource group that AKS created (the resource group name starts with **MC_**). Select the SAN resource object within that resource group. Manually remove all volumes and volume groups. Then retry deleting the resource group that includes your AKS cluster.
-
 ## Troubleshoot volume issues
 
 ### Pod pending creation due to ephemeral volume size above available capacity
@@ -188,11 +182,11 @@ ephemeraldisk-temp-diskpool-xbtlj 75660001280 75031990272 628011008 5609
 
 In this example, the available capacity of temp disk for a single node is `75031990272` bytes or 69 GiB.
 
-Adjust the volume storage size below available capacity and re-deploy your pod. See [Deploy a pod with a generic ephemeral volume](use-container-storage-with-temp-ssd.md#3-deploy-a-pod-with-a-generic-ephemeral-volume).
+Adjust the volume storage size to below the available capacity and redeploy your pod. See [Deploy a pod with a generic ephemeral volume](use-container-storage-with-temp-ssd.md#3-deploy-a-pod-with-a-generic-ephemeral-volume).
 
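When you adjust the size, you can also confirm which storage class the ephemeral volume should request, since Azure Container Storage creates one storage class per storage pool (names vary by deployment):

```
kubectl get storageclass
```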
 ### Volume fails to attach due to metadata store offline
 
-Azure Container Storage uses `etcd`, a distributed, reliable key-value store, to store and manage metadata of volumes to support volume orchestration operations. For high availability and resiliency, `etcd` runs in three pods. When there are less than two `etcd` instances running, Azure Container Storage will halt volume orchestration operations while still allowing data access to the volumes. Azure Container Storage automatically detects when an `etcd` instance is offline and recovers it. However, if you notice volume orchestration errors after restarting an AKS cluster, it's possible that an `etcd` instance failed to auto-recover. Follow the instructions in this section to determine the health status of the `etcd` instances.
+Azure Container Storage uses `etcd`, a distributed, reliable key-value store, to store and manage volume metadata and support volume orchestration operations. For high availability and resiliency, `etcd` runs in three pods. When fewer than two `etcd` instances are running, Azure Container Storage halts volume orchestration operations while still allowing data access to the volumes. Azure Container Storage automatically detects when an `etcd` instance is offline and recovers it. However, if you notice volume orchestration errors after restarting an AKS cluster, it's possible that an `etcd` instance failed to autorecover. Follow the instructions in this section to determine the health status of the `etcd` instances.
 
 Run the following command to get a list of pods.
 
@@ -213,7 +207,7 @@ Describe the pod:
 kubectl describe pod fiopod
 ```
 
-Typically, you'll see volume failure messages if the metadata store is offline. In this example, **fiopod** is in **ContainerCreating** status, and the **FailedAttachVolume** warning indicates that the creation is pending due to volume attach failure.
+Typically, you see volume failure messages if the metadata store is offline. In this example, **fiopod** is in **ContainerCreating** status, and the **FailedAttachVolume** warning indicates that the creation is pending due to volume attach failure.
 
 ```output
 Name: fiopod
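To check how many `etcd` instances are running, a filter like the following sketch works; the pod name suffixes vary:

```
kubectl get pods -n acstor | grep etcd
```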
@@ -243,7 +237,7 @@ etcd-azurecontainerstorage-phf92lmqml 1/1 Running
 etcd-azurecontainerstorage-xznvwcgq4p 1/1 Running 0 4d19h
 ```
 
-If fewer than two instances are shown in the Running state, you can conclude that the volume is failing to attach due to the metadata store being offline, and the automated recovery wasn't successful. If this is the case, file a support ticket with [Azure Support]( https://azure.microsoft.com/support/).
+If fewer than two instances are shown in the Running state, you can conclude that the volume is failing to attach due to the metadata store being offline, and the automated recovery wasn't successful. If so, file a support ticket with [Azure Support](https://azure.microsoft.com/support/).
 
 ## See also
 