feat(chart): add automatic Skyhook resource cleanup on helm uninstall
Add pre-delete hook to automatically clean up Skyhook and DeploymentPolicy
resources during helm uninstall, eliminating the manual step previously
required to avoid reinstall issues.
Enabled by default with configurable timeout (120s). Can be disabled with
cleanup.enabled=false.
### ⚠️ Important: Before Uninstalling **(really only important if you want to reinstall)**
144
+
### Uninstalling
145
145
146
-
**Always delete all Skyhook Custom Resources before running `helm uninstall`:**
146
+
**Automatic Cleanup (Default):** By default, the Helm chart includes a pre-delete hook that automatically cleans up all Skyhook and DeploymentPolicy resources before uninstalling:
147
147
148
148
```bash
149
-
# Delete all Skyhook resources first (REQUIRED before uninstall)
150
-
kubectl delete skyhooks --all --all-namespaces
149
+
# Uninstall the chart (cleanup happens automatically)
150
+
helm uninstall skyhook --namespace skyhook
151
+
```
152
+
153
+
The pre-delete hook will:
154
+
- Delete all Skyhook resources
155
+
- Delete all DeploymentPolicy resources
156
+
- Complete quickly if no resources exist
157
+
- Wait for finalizers to be processed if resources exist
158
+
- Proceed with uninstall even if cleanup times out (job deadline: 2 minutes)
159
+
160
+
**Configuration Options:**
161
+
162
+
To disable automatic cleanup and manage resources manually:
If you disabled automatic cleanup or need to clean up resources manually:
178
+
179
+
```bash
180
+
# Delete all Skyhook resources first
181
+
kubectl delete skyhooks --all
182
+
183
+
# Delete all DeploymentPolicy resources
184
+
kubectl delete deploymentpolicies --all
151
185
152
186
# Then uninstall the chart
153
187
helm uninstall skyhook --namespace skyhook
154
188
```
155
189
156
-
**Why?** If you `helm uninstall` while Skyhook CRs still exist, finalizers will leave the CRD in a broken state, causing reinstalls to fail.
190
+
**Why cleanup matters:** If you uninstall while Skyhook CRs with finalizers still exist, it can leave resources in a broken state that may cause reinstall issues.
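
To see which resources still carry finalizers before uninstalling, something like the following sketch may help. The `list_finalizers` helper and the overridable `KUBECTL` variable are hypothetical names introduced here for illustration; the resource names `skyhooks` and `deploymentpolicies` are taken from the manual cleanup commands above, and the query only covers whatever scope a plain `kubectl get` covers in your context.

```bash
# KUBECTL can be overridden (e.g. with a stub) for testing; this is an
# assumption of the sketch, not something the chart provides.
KUBECTL="${KUBECTL:-kubectl}"

# list_finalizers KIND
# Prints one line per object of KIND: "<name><TAB><finalizers>".
list_finalizers() {
  "$KUBECTL" get "$1" \
    -o jsonpath='{range .items[*]}{.metadata.name}{"\t"}{.metadata.finalizers}{"\n"}{end}'
}
```

Running `list_finalizers skyhooks` and `list_finalizers deploymentpolicies` before `helm uninstall` shows whether any objects would still need the operator to process their finalizers.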
| imagePullSecret | the secret used to pull the operator controller image, agent image, and package images. | "" |
38
38
| estimatedPackageCount | estimated number of packages to be installed on the cluster, this is used to calculate the resources for the operator controller. | 1 |
39
39
| estimatedNodeCount | estimated number of nodes in the cluster, this is used to calculate the resources for the operator controller | 1 |
40
+
| cleanup.enabled | Automatically delete all Skyhook and DeploymentPolicy resources during helm uninstall. Recommended to prevent orphaned CRs. | true |
41
+
| cleanup.jobTimeoutSeconds | Hard deadline for the entire cleanup job during uninstall. The job will be killed if it exceeds this time. | 120 |
40
42
41
43
### NOTES
42
44
- **estimatedPackageCount** and **estimatedNodeCount** are used to size the resource requirements. Default setting should be good for nodes > 1000 and packages 1-2 or nodes > 500 and packages >= 4. If you're approaching this size of deployment, it would make sense to set these. You can also override them explicitly with `controllerManager.manager.resources`; the values file has an example.
@@ -70,3 +72,52 @@ This Helm chart follows independent versioning from the operator and agent compo
70
72
### Chart Version vs App Version
71
73
- **Chart version** (`version` in Chart.yaml): Tracks changes to chart templates, values, and configuration (NOTE: agent version is set in the values.)
72
74
- **App version** (`appVersion` in Chart.yaml): Recommended stable operator version for this chart release
75
+
76
+
## Uninstalling
77
+
78
+
### Automatic Cleanup (Default Behavior)
79
+
80
+
By default, the Helm chart includes a pre-delete hook that automatically cleans up all Skyhook and DeploymentPolicy custom resources before uninstalling. This prevents orphaned resources that could cause issues during reinstallation.
81
+
82
+
```bash
83
+
# Uninstall with automatic cleanup (default)
84
+
helm uninstall skyhook --namespace skyhook
85
+
```
86
+
87
+
The pre-delete hook will:
88
+
- Delete all Skyhook resources cluster-wide
89
+
- Delete all DeploymentPolicy resources cluster-wide
90
+
- Wait for finalizers to be processed
91
+
- Proceed with uninstall even if cleanup times out (job deadline: 2 minutes, configurable via `cleanup.jobTimeoutSeconds`)
92
+
93
+
### Disabling Automatic Cleanup
94
+
95
+
If you need to preserve Skyhook resources during uninstall (e.g., for backup/migration scenarios), disable the cleanup feature:
96
+
97
+
```yaml
98
+
# values.yaml
99
+
cleanup:
100
+
enabled: false
101
+
```
102
+
103
+
When disabled, you must manually delete resources before uninstalling to avoid issues:
104
+
105
+
```bash
106
+
# Manual cleanup when automatic cleanup is disabled
107
+
kubectl delete skyhooks --all
108
+
kubectl delete deploymentpolicies --all
109
+
helm uninstall skyhook --namespace skyhook
110
+
```
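
When automatic cleanup is disabled, it can be useful to confirm that the manual deletes actually finished before running `helm uninstall`. The following is a minimal sketch under stated assumptions: `count_remaining`, `safe_to_uninstall`, and the overridable `KUBECTL` variable are hypothetical helpers, not part of the chart, and the check uses the same scope as the manual `kubectl delete` commands above.

```bash
# KUBECTL can be overridden (e.g. with a stub) for testing; assumption of this sketch.
KUBECTL="${KUBECTL:-kubectl}"

# count_remaining KIND
# Prints how many objects of KIND still exist (0 if none or if the query fails).
count_remaining() {
  "$KUBECTL" get "$1" -o name 2>/dev/null | wc -l | tr -d ' '
}

# safe_to_uninstall
# Succeeds only when no Skyhook or DeploymentPolicy objects remain.
safe_to_uninstall() {
  for kind in skyhooks deploymentpolicies; do
    n=$(count_remaining "$kind")
    if [ "$n" -ne 0 ]; then
      echo "$n $kind still present" >&2
      return 1
    fi
  done
  return 0
}
```

A possible usage is `safe_to_uninstall && helm uninstall skyhook --namespace skyhook`, which skips the uninstall while resources are still waiting on finalizers.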
111
+
112
+
### Configuring Timeout Values
113
+
114
+
For large clusters or when resources have complex finalizers, you may need to adjust the job timeout:
115
+
116
+
```yaml
117
+
# values.yaml
118
+
cleanup:
119
+
enabled: true
120
+
jobTimeoutSeconds: 180 # 3 minutes total job deadline
121
+
```
122
+
123
+
**Note:** The job will be killed if it exceeds `jobTimeoutSeconds`. The default of 120 seconds (2 minutes) should be sufficient for most clusters.
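
One thing to keep in mind when raising `jobTimeoutSeconds`: `helm uninstall` has its own `--timeout` flag (default `5m0s`) that bounds how long Helm waits on individual operations such as hook Jobs. If the job deadline is set above that, Helm will likely stop waiting before the job does. The sketch below derives a `--timeout` value from the job deadline; `helm_timeout_for` is a hypothetical helper and the 30-second default buffer is an arbitrary assumption, not a value from the chart.

```bash
# helm_timeout_for JOB_TIMEOUT_SECONDS [BUFFER_SECONDS]
# Prints a duration for `helm uninstall --timeout` that exceeds the
# cleanup job deadline by BUFFER_SECONDS (default 30, an assumed margin).
helm_timeout_for() {
  job_timeout=$1
  buffer=${2:-30}
  echo "$((job_timeout + buffer))s"
}
```

For example, with `cleanup.jobTimeoutSeconds: 600` one might run `helm uninstall skyhook --namespace skyhook --timeout "$(helm_timeout_for 600)"`.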
k8s-tests/chainsaw/helm/helm-chart-test/README.md (11 additions & 8 deletions)
@@ -2,24 +2,27 @@
2
2
3
3
## Purpose
4
4
5
-
Validates that the Helm chart deploys correctly with custom configurations, including custom deployment names and tolerations.
5
+
Validates that the Helm chart deploys correctly with custom configurations, including custom deployment names, tolerations, and automatic cleanup on uninstall.
6
6
7
7
## Test Scenario
8
8
9
9
1. Reset state from previous runs
10
-
2. Install the Helm chart with custom values:
11
-
- Different deployment name than `skyhook-operator`
12
-
- Custom tolerations
13
-
3. Verify the operator is scheduled correctly
14
-
4. Apply a skyhook and verify it completes
15
-
5. Assert metrics and state are correct
10
+
2. Install the Helm chart with a "bad" node taint and verify pods don't schedule
11
+
3. Change to a "good" node taint that matches configured tolerations
12
+
4. Reinstall the Helm chart (tests uninstall + reinstall flow)
13
+
5. Verify the operator is scheduled correctly with tolerations
14
+
6. Apply a DeploymentPolicy and Skyhook
15
+
7. Verify the Skyhook completes successfully
16
+
8. Uninstall the Helm chart (verifies pre-delete hook cleans up Skyhook/DeploymentPolicy resources automatically)
16
17
17
18
## Key Features Tested
18
19
19
20
- Custom deployment name support
20
21
- Toleration configuration via Helm values
21
-
- Operator deployment and scheduling
22
+
- Operator deployment and scheduling with node taints
22
23
- End-to-end skyhook processing with Helm-deployed operator
24
+
- Automatic cleanup of Skyhook and DeploymentPolicy resources during helm uninstall
25
+
- Pre-delete hook tolerates node taints (can schedule on same nodes as operator)
**NOTE**: because there is a finalizer on it you need to need to delete the SCRs before uninstalling the CRD or operator. If you remove the operator first, delete the CRD or SCR can hang trying to finalize. Easiest way to fix is re install the operator. You can clean up by hand, but could be some work. cleaning up: configmaps, uncording nodes, removing taints, and deleting running pods
227
+
**NOTE**: The Helm chart includes automatic cleanup of Skyhook and DeploymentPolicy resources during uninstall (enabled by default). If you've disabled automatic cleanup (`cleanup.enabled: false`), you must manually delete SCRs before uninstalling to avoid finalizer issues. If you remove the operator before deleting SCRs with finalizers, they can hang. To fix: reinstall the operator, delete resources, then uninstall properly. Manual cleanup may require removing configmaps, uncordoning nodes, removing taints, and deleting running pods.