Commit bec107d

Merge pull request #35 from hughdanliu/startup-taint
Add support for node startup taints
2 parents a4cbcaa + 63193cb commit bec107d

11 files changed, +1551 -3 lines changed

charts/aws-fsx-openzfs-csi-driver/templates/clusterrole-csi-node.yaml

Lines changed: 1 addition & 1 deletion
@@ -8,4 +8,4 @@ metadata:
 rules:
   - apiGroups: [""]
     resources: ["nodes"]
-    verbs: ["get"]
+    verbs: ["get", "patch"]
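
Reviewer note: the added `patch` verb is what lets the node service modify the `Node` object, which removing a taint requires. The patch code itself is not part of the hunks shown here, so the following is only a minimal sketch of one plausible payload, assuming an RFC 6902 JSON Patch against `/spec/taints`. The `Taint` and `patchOp` types and the `buildTaintPatch` helper are hypothetical names introduced for illustration.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// Taint is a hypothetical, trimmed-down stand-in for the Kubernetes node taint object.
type Taint struct {
	Key    string `json:"key"`
	Value  string `json:"value,omitempty"`
	Effect string `json:"effect"`
}

// patchOp is a single RFC 6902 JSON Patch operation (assumed payload shape).
type patchOp struct {
	Op    string  `json:"op"`
	Path  string  `json:"path"`
	Value []Taint `json:"value"`
}

// buildTaintPatch returns a JSON Patch that replaces /spec/taints with desired.
// The leading "test" operation makes the patch fail if the taints changed
// since they were read, instead of silently overwriting the newer value.
func buildTaintPatch(current, desired []Taint) ([]byte, error) {
	return json.Marshal([]patchOp{
		{Op: "test", Path: "/spec/taints", Value: current},
		{Op: "replace", Path: "/spec/taints", Value: desired},
	})
}

func main() {
	current := []Taint{{Key: "fsx.openzfs.csi.aws.com/agent-not-ready", Effect: "NoExecute"}}
	// Pass an empty (non-nil) slice so the value serializes as [] rather than null.
	patch, err := buildTaintPatch(current, []Taint{})
	if err != nil {
		panic(err)
	}
	fmt.Println(string(patch))
}
```

A payload like this would be sent as a PATCH request on the node, which the previous `get`-only rule would not have permitted.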

charts/aws-fsx-openzfs-csi-driver/templates/node-daemonset.yaml

Lines changed: 2 additions & 0 deletions
@@ -43,6 +43,8 @@ spec:
         {{- with .Values.node.tolerations }}
         {{- toYaml . | nindent 8 }}
         {{- end }}
+        - key: "fsx.openzfs.csi.aws.com/agent-not-ready"
+          operator: "Exists"
       {{- end }}
       {{- with .Values.node.securityContext }}
       securityContext:

charts/aws-fsx-openzfs-csi-driver/values.yaml

Lines changed: 0 additions & 1 deletion
@@ -3,7 +3,6 @@
 # Declare variables to be passed into your templates.
 
 image:
-  # TODO: replace this temporary private ECR and regenerate the manifests when we are ready to release
   repository: public.ecr.aws/fsx-csi-driver/aws-fsx-openzfs-csi-driver
   # Overrides the image tag whose default is v{{ .Chart.AppVersion }}
   tag: v0.1.0

deploy/kubernetes/base/clusterrole-csi-node.yaml

Lines changed: 1 addition & 1 deletion
@@ -9,4 +9,4 @@ metadata:
 rules:
   - apiGroups: [""]
     resources: ["nodes"]
-    verbs: ["get"]
+    verbs: ["get", "patch"]

docs/install.md

Lines changed: 15 additions & 0 deletions
@@ -79,6 +79,21 @@ Additionally, the driver node tolerates all taints.
 If you do not wish to deploy the driver node on all nodes, please set Helm `Value.node.tolerateAllTaints` to false before deployment.
 Add policies to `Value.node.tolerations` to configure customized toleration for nodes.
 
+### Configure node startup taint
+There are potential race conditions on node startup (especially when a node is first joining the cluster)
+where pods/processes that rely on the FSx for OpenZFS CSI Driver can act on a node before the FSx for OpenZFS CSI Driver is able to start up and become fully ready.
+To combat this, the FSx for OpenZFS CSI Driver contains a feature to automatically remove a taint from the node on startup.
+Users can taint their nodes when they join the cluster and/or on startup.
+This will prevent other pods from running and/or being scheduled on the node prior to the FSx for OpenZFS CSI Driver becoming ready.
+
+This feature is activated by default. Cluster administrators should apply the taint `fsx.openzfs.csi.aws.com/agent-not-ready:NoExecute` to their nodes:
+```shell
+kubectl taint nodes $NODE_NAME fsx.openzfs.csi.aws.com/agent-not-ready:NoExecute
+```
+Note that any effect will work, but `NoExecute` is recommended.
+
+For example, EKS Managed Node Groups [support automatically tainting nodes](https://docs.aws.amazon.com/eks/latest/userguide/node-taints-managed-node-groups.html).
+
 ### Deploy driver
 You may deploy the FSx for OpenZFS CSI driver via Kustomize or Helm

hack/update-gomock

Lines changed: 4 additions & 0 deletions
@@ -22,3 +22,7 @@ mockgen -package=mocks -destination=./pkg/cloud/mocks/mock_metadata.go --build_f
 
 mockgen -package=mocks -destination=./pkg/cloud/mocks/mock_fsx.go --build_flags=--mod=mod ${IMPORT_PATH}/pkg/cloud FSx
 mockgen -package=mocks -destination=./pkg/driver/mocks/mock_cloud.go --build_flags=--mod=mod ${IMPORT_PATH}/pkg/cloud Cloud
+
+# Reflection-based mocking for external dependencies
+mockgen -package=mocks -destination=./pkg/driver/mocks/mock_k8s_client.go --build_flags=--mod=mod -mock_names='Interface=MockKubernetesClient' k8s.io/client-go/kubernetes Interface
+mockgen -package=mocks -destination=./pkg/driver/mocks/mock_k8s_corev1.go --build_flags=--mod=mod k8s.io/client-go/kubernetes/typed/core/v1 CoreV1Interface,NodeInterface

pkg/driver/constants.go

Lines changed: 6 additions & 0 deletions
@@ -6,3 +6,9 @@ package driver
 const (
 	DefaultCSIEndpoint = "unix://tmp/csi.sock"
 )
+
+// constants for node k8s API use
+const (
+	// AgentNotReadyNodeTaintKey contains the key of taints to be removed on driver startup
+	AgentNotReadyNodeTaintKey = "fsx.openzfs.csi.aws.com/agent-not-ready"
+)
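
Reviewer note: the constant above names the taint key, and the install docs in this commit state that any effect will work. That suggests the removal step matches on the key alone. The code that consumes the constant is not shown in these hunks, so the following is a minimal sketch of such a filter under that assumption. The `Taint` struct is a hypothetical stand-in for the Kubernetes core/v1 type, and `removeTaintByKey` is an illustrative helper name, not the driver's own function.

```go
package main

import "fmt"

// AgentNotReadyNodeTaintKey mirrors the constant added in pkg/driver/constants.go.
const AgentNotReadyNodeTaintKey = "fsx.openzfs.csi.aws.com/agent-not-ready"

// Taint is a hypothetical stand-in for the Kubernetes core/v1 Taint type,
// reduced to the fields this sketch needs.
type Taint struct {
	Key    string
	Value  string
	Effect string
}

// removeTaintByKey returns the taints whose key differs from key and reports
// whether anything was dropped. Matching on the key alone means the taint is
// removed regardless of its effect (NoExecute, NoSchedule, ...).
func removeTaintByKey(taints []Taint, key string) ([]Taint, bool) {
	kept := make([]Taint, 0, len(taints))
	for _, t := range taints {
		if t.Key != key {
			kept = append(kept, t)
		}
	}
	return kept, len(kept) != len(taints)
}

func main() {
	taints := []Taint{
		{Key: AgentNotReadyNodeTaintKey, Effect: "NoExecute"},
		{Key: "dedicated", Value: "gpu", Effect: "NoSchedule"},
	}
	kept, removed := removeTaintByKey(taints, AgentNotReadyNodeTaintKey)
	fmt.Println("removed:", removed, "remaining:", kept)
}
```

Returning the boolean lets a caller skip the API call entirely when the node carries no startup taint.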
