
Commit 72f1b32 (parent 2cb07c7)
Author: Soumya Maitra
Added documentation for resolving csn storage stuck pod

1 file changed: 61 additions, 0 deletions

---
title: Troubleshoot CSN storage pod container stuck in creating
description: Learn what to do when the CSN storage pod container remains in the creating state.
ms.service: azure-operator-nexus
ms.custom: troubleshooting
ms.topic: troubleshooting
ms.date: 12/21/2023
ms.author: soumyamaitra
author: neilverse
---
# CSN storage pod container stuck in `ContainerCreating`

## Cause

A runtime upgrade replaces the operating system of the bare metal nodes, which recreates the IQN (iSCSI Qualified Name) and, on rare occasions, causes iSCSI login failures. The failure occurs on the particular nodes where the portal logins aren't successful. This guide provides a solution for this particular issue.

The guide briefly lays out the process: delete the `VolumeAttachment` object and restart the pod to resolve the issue.
## Process

Check why the pod remains in the `ContainerCreating` state:

```console
Warning  FailedMapVolume  52s (x19 over 23m)  kubelet  MapVolume.SetUpDevice failed for volume "pvc-b38dcc54-5e57-435a-88a0-f91eac594e18" : rpc error: code = Internal desc = required at least 2 portals but found 0 portals
```

Here we focus only on the `baremetal_machine` where the issue has occurred.
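To see which pods are stuck and which nodes they're scheduled on, you can filter `kubectl get pods -n nc-system -o wide` output. The following sketch shows the same grep/awk filtering the remediation script uses, applied to a hypothetical sample of that output (pod and node names are made up for illustration):

```shell
# Hypothetical sample of `kubectl get pods -n nc-system -o wide` output.
sample='NAME          READY   STATUS              RESTARTS   AGE   IP         NODE
csi-pod-abc   0/1     ContainerCreating   0          23m   10.0.0.5   node-1
healthy-pod   1/1     Running             0          2h    10.0.0.6   node-2'

# Same filtering pattern as the remediation script: match the stuck pods,
# then print the pod name ($1) and its node ($7 in -o wide output).
stuck=$(printf '%s\n' "$sample" | grep -i containercreating | awk '{print $1, $7}')
echo "$stuck"   # csi-pod-abc node-1
```

Against a live cluster, `kubectl get pods -n nc-system -o wide | grep -i containercreating` gives the real pod and node names.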
Execute the following run command to resolve the pod stuck in `ContainerCreating`:

```azurecli
az networkcloud baremetalmachine run-command --bare-metal-machine-name <control-plane-baremetal-machine> \
  --subscription <subscription> \
  --resource-group <cluster-managed-resource-group> \
  --limit-time-seconds 60 \
  --script "cG9kcz0kKGt1YmVjdGwgZ2V0IHBvZHMgLW4gbmMtc3lzdGVtIHxncmVwIC1pIGNvbnRhaW5lcmNyZWF0aW5nIHwgYXdrICd7cHJpbnQgJDF9JykKCmZvciBwb2RuYW1lIGluICRwb2RzOyBkbwogICAga3ViZWN0bCBkZXNjcmliZSBwbyAkcG9kbmFtZSAtbiBuYy1zeXN0ZW0KCiAgICBwdmNuYW1lPSQoa3ViZWN0bCBnZXQgcG8gJHBvZG5hbWUgLW4gbmMtc3lzdGVtIC1vIGpzb24gfCBqcSAtciAnLnNwZWMudm9sdW1lc1swXS5wZXJzaXN0ZW50Vm9sdW1lQ2xhaW0uY2xhaW1OYW1lJykKCiAgICBwdm5hbWU9JChrdWJlY3RsIGdldCBwdmMgJHB2Y25hbWUgLW4gbmMtc3lzdGVtIC1vIGpzb24gfCBqcSAtciAnLnNwZWMudm9sdW1lTmFtZScpCgogICAgbm9kZW5hbWU9JChrdWJlY3RsIGdldCBwbyAkcG9kbmFtZSAtbiBuYy1zeXN0ZW0gLW9qc29uIHwganEgLXIgJy5zcGVjLm5vZGVOYW1lJykKCiAgICB2b2xhdHRhY2hOYW1lPSQoa3ViZWN0bCBnZXQgdm9sdW1lYXR0YWNobWVudCB8IGdyZXAgLWkgJHB2bmFtZSB8IGF3ayAne3ByaW50ICQxfScpCgogICAga3ViZWN0bCBkZWxldGUgdm9sdW1lYXR0YWNobWVudCAkdm9sYXR0YWNoTmFtZQoKICAgIGt1YmVjdGwgY29yZG9uICRub2RlbmFtZSAtbiBuYy1zeXN0ZW07a3ViZWN0bCBkZWxldGUgcG8gLW4gbmMtc3lzdGVtICRwb2RuYW1lCmRvbmU="
```
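The `--script` argument is a base64-encoded shell script. If you want to review what a run command will execute, or pass a script of your own, you can decode and encode with the standard `base64` tool. A minimal sketch (the one-line script here is a hypothetical example, not the remediation script):

```shell
# A hypothetical one-line script to pass via --script.
myscript='kubectl get pods -n nc-system'

# Encode it (tr strips line wrapping so the result is a single token).
encoded=$(printf '%s' "$myscript" | base64 | tr -d '\n')

# Decode it back to verify what would actually run on the machine.
decoded=$(printf '%s' "$encoded" | base64 -d)
echo "$decoded"   # kubectl get pods -n nc-system
```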
The run command executes the following script:

```console
pods=$(kubectl get pods -n nc-system |grep -i containercreating | awk '{print $1}')

for podname in $pods; do
    kubectl describe po $podname -n nc-system

    pvcname=$(kubectl get po $podname -n nc-system -o json | jq -r '.spec.volumes[0].persistentVolumeClaim.claimName')

    pvname=$(kubectl get pvc $pvcname -n nc-system -o json | jq -r '.spec.volumeName')

    nodename=$(kubectl get po $podname -n nc-system -ojson | jq -r '.spec.nodeName')

    volattachName=$(kubectl get volumeattachment | grep -i $pvname | awk '{print $1}')

    kubectl delete volumeattachment $volattachName

    kubectl cordon $nodename -n nc-system;kubectl delete po -n nc-system $podname
done
```

For each stuck pod, the script retrieves the PVC and its PV, deletes the corresponding `VolumeAttachment` object, cordons the node, and deletes the pod. The pod is then recreated on another node along with a successful volume attachment.
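The `jq` queries in the script walk the pod object to find the claim and node names. The following sketch shows how those same filters map onto a minimal, hypothetical pod JSON fragment (names are made up for illustration):

```shell
# Minimal hypothetical pod object with only the fields the script reads.
podjson='{"spec":{"nodeName":"node-1","volumes":[{"persistentVolumeClaim":{"claimName":"pvc-demo"}}]}}'

# Same jq filters as the remediation script.
pvcname=$(printf '%s' "$podjson" | jq -r '.spec.volumes[0].persistentVolumeClaim.claimName')
nodename=$(printf '%s' "$podjson" | jq -r '.spec.nodeName')

echo "$pvcname on $nodename"   # pvc-demo on node-1
```

Note that the script assumes the first volume (`volumes[0]`) is the persistent volume claim; pods with multiple volumes may need a more specific filter.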
