You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Copy file name to clipboardExpand all lines: articles/machine-learning/how-to-troubleshoot-kubernetes-extension.md
+48Lines changed: 48 additions & 0 deletions
Display the source diff
Display the rich diff
Original file line number
Diff line number
Diff line change
@@ -78,6 +78,54 @@ When you request support, we recommend that you run the following command and se
78
78
```bash
79
79
kubectl logs healthcheck -n azureml
80
80
```
81
+
## Extension-operator pod in azure-arc/kube-system namespace is crashing due to OOMKill
82
+
This issue happens if the extension's helm chart size is large and there are multiple Helm releases on the cluster. Here is a sample script to help clean up the helm history on the cluster:
83
+
```
84
+
#!/bin/bash
85
+
86
+
# Set release name and namespace
87
+
RELEASE_NAME=$1
88
+
NAMESPACE=$2
89
+
90
+
# Validate input
91
+
if [[ -z "$RELEASE_NAME" || -z "$NAMESPACE" ]]; then
92
+
echo "Usage: $0 <release-name> <namespace>"
93
+
exit 1
94
+
fi
95
+
96
+
echo "Fetching Helm history for release: $RELEASE_NAME in namespace: $NAMESPACE"
97
+
98
+
# Get stuck revisions (PENDING_ROLLBACK or PENDING_UPGRADE) using grep + awk for accurate parsing
0 commit comments