-
Hello, this is my Kafka custom resource:
apiVersion: kafka.strimzi.io/v1beta2
kind: Kafka
metadata:
name: kafka-staytus
namespace: kafka
spec:
kafka:
replicas: 3
template:
pod:
tolerations:
- key: 'dedicated'
operator: 'Equal'
value: 'kafka'
effect: 'NoExecute'
version: 3.3.1
listeners:
- name: plain # unique name
port: 9092
type: internal # can be internal, route, loadbalancer, nodeport or ingress
tls: false
- name: tls
port: 9093
type: internal
tls: true
config:
offsets.topic.replication.factor: 3
transaction.state.log.replication.factor: 3
transaction.state.log.min.isr: 2
default.replication.factor: 3
min.insync.replicas: 2
inter.broker.protocol.version: '3.3'
storage:
type: ephemeral
zookeeper:
replicas: 3
template:
pod:
tolerations:
- key: 'dedicated'
operator: 'Equal'
value: 'kafka'
effect: 'NoExecute'
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: app
operator: In
values:
- kafka
podAntiAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
- labelSelector:
matchExpressions:
- key: strimzi.io/cluster
operator: In
values:
- kafka-staytus
topologyKey: topology.kubernetes.io/zone
storage:
type: ephemeral
entityOperator:
topicOperator: {}
userOperator: {}
The ZooKeeper pods deploy fine, but the Kafka pods fail with FailedScheduling: 0/10 nodes are available: 1 node(s) had taint {type: cms}, that the pod didn't tolerate, 3 node(s) had taint {network: private}, that the pod didn't tolerate, 6 node(s) didn't satisfy existing pods anti-affinity rules.
The live manifest for the Kafka pod:
apiVersion: v1
kind: Pod
metadata:
annotations:
kubernetes.io/psp: eks.privileged
strimzi.io/broker-configuration-hash: 2b40af10
strimzi.io/clients-ca-cert-generation: '0'
strimzi.io/cluster-ca-cert-generation: '0'
strimzi.io/generation: '0'
strimzi.io/inter-broker-protocol-version: '3.3'
strimzi.io/kafka-version: 3.3.1
strimzi.io/log-message-format-version: '3.3'
strimzi.io/logging-appenders-hash: e893ac9f
strimzi.io/storage: '{"type":"ephemeral"}'
creationTimestamp: '2022-11-18T13:17:33Z'
generateName: kafka-staytus-kafka-
labels:
app.kubernetes.io/instance: kafka-staytus
app.kubernetes.io/managed-by: strimzi-cluster-operator
app.kubernetes.io/name: kafka
app.kubernetes.io/part-of: strimzi-kafka-staytus
argocd.argoproj.io/instance: kafka
controller-revision-hash: kafka-staytus-kafka-5bb78c59c9
statefulset.kubernetes.io/pod-name: kafka-staytus-kafka-0
strimzi.io/cluster: kafka-staytus
strimzi.io/kind: Kafka
strimzi.io/name: kafka-staytus-kafka
name: kafka-staytus-kafka-0
namespace: kafka
ownerReferences:
- apiVersion: apps/v1
blockOwnerDeletion: true
controller: true
kind: StatefulSet
name: kafka-staytus-kafka
uid: 454cb7d8-224f-4690-bdb8-3fbb276c01c3
resourceVersion: '5831044'
uid: 76444e6b-8fc0-4721-973e-7c113afe553c
spec:
affinity: {}
containers:
- command:
- /opt/kafka/kafka_run.sh
env:
- name: KAFKA_METRICS_ENABLED
value: 'false'
- name: STRIMZI_KAFKA_GC_LOG_ENABLED
value: 'false'
- name: KAFKA_HEAP_OPTS
value: '-Xms128M'
image: 'quay.io/strimzi/kafka:0.32.0-kafka-3.3.1'
imagePullPolicy: IfNotPresent
livenessProbe:
exec:
command:
- /opt/kafka/kafka_liveness.sh
failureThreshold: 3
initialDelaySeconds: 15
periodSeconds: 10
successThreshold: 1
timeoutSeconds: 5
name: kafka
ports:
- containerPort: 9090
name: tcp-ctrlplane
protocol: TCP
- containerPort: 9091
name: tcp-replication
protocol: TCP
- containerPort: 9092
name: tcp-clients
protocol: TCP
- containerPort: 9093
name: tcp-clientstls
protocol: TCP
readinessProbe:
exec:
command:
- test
- '-f'
- /var/opt/kafka/kafka-ready
failureThreshold: 3
initialDelaySeconds: 15
periodSeconds: 10
successThreshold: 1
timeoutSeconds: 5
resources: {}
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
volumeMounts:
- mountPath: /var/lib/kafka/data
name: data
- mountPath: /tmp
name: strimzi-tmp
- mountPath: /opt/kafka/cluster-ca-certs
name: cluster-ca
- mountPath: /opt/kafka/broker-certs
name: broker-certs
- mountPath: /opt/kafka/client-ca-certs
name: client-ca-cert
- mountPath: /opt/kafka/custom-config/
name: kafka-metrics-and-logging
- mountPath: /var/opt/kafka
name: ready-files
- mountPath: /var/run/secrets/kubernetes.io/serviceaccount
name: kube-api-access-fllqg
readOnly: true
dnsPolicy: ClusterFirst
enableServiceLinks: true
hostname: kafka-staytus-kafka-0
preemptionPolicy: PreemptLowerPriority
priority: 0
restartPolicy: Always
schedulerName: default-scheduler
securityContext: {}
serviceAccount: kafka-staytus-kafka
serviceAccountName: kafka-staytus-kafka
subdomain: kafka-staytus-kafka-brokers
terminationGracePeriodSeconds: 30
tolerations:
- effect: NoExecute
key: dedicated
operator: Equal
value: kafka
- effect: NoExecute
key: node.kubernetes.io/not-ready
operator: Exists
tolerationSeconds: 300
- effect: NoExecute
key: node.kubernetes.io/unreachable
operator: Exists
tolerationSeconds: 300
volumes:
- emptyDir: {}
name: data
- emptyDir:
medium: Memory
sizeLimit: 5Mi
name: strimzi-tmp
- name: cluster-ca
secret:
defaultMode: 292
secretName: kafka-staytus-cluster-ca-cert
- name: broker-certs
secret:
defaultMode: 292
secretName: kafka-staytus-kafka-brokers
- name: client-ca-cert
secret:
defaultMode: 292
secretName: kafka-staytus-clients-ca-cert
- configMap:
defaultMode: 420
name: kafka-staytus-kafka-config
name: kafka-metrics-and-logging
- emptyDir:
medium: Memory
sizeLimit: 1Ki
name: ready-files
- name: kube-api-access-fllqg
projected:
defaultMode: 420
sources:
- serviceAccountToken:
expirationSeconds: 3607
path: token
- configMap:
items:
- key: ca.crt
path: ca.crt
name: kube-root-ca.crt
- downwardAPI:
items:
- fieldRef:
apiVersion: v1
fieldPath: metadata.namespace
path: namespace
status:
conditions:
- lastProbeTime: null
lastTransitionTime: '2022-11-18T13:17:33Z'
message: >-
0/10 nodes are available: 1 node(s) had taint {type: cms}, that the pod
didn't tolerate, 3 node(s) had taint {network: private}, that the pod
didn't tolerate, 6 node(s) didn't satisfy existing pods anti-affinity
rules.
reason: Unschedulable
status: 'False'
type: PodScheduled
phase: Pending
qosClass: BestEffort
And the live manifest for the ZooKeeper pod:
apiVersion: v1
kind: Pod
metadata:
annotations:
kubernetes.io/psp: eks.privileged
strimzi.io/cluster-ca-cert-generation: '0'
strimzi.io/generation: '0'
strimzi.io/logging-hash: 0f057cb0
creationTimestamp: '2022-11-18T13:16:59Z'
generateName: kafka-staytus-zookeeper-
labels:
app.kubernetes.io/instance: kafka-staytus
app.kubernetes.io/managed-by: strimzi-cluster-operator
app.kubernetes.io/name: zookeeper
app.kubernetes.io/part-of: strimzi-kafka-staytus
argocd.argoproj.io/instance: kafka
controller-revision-hash: kafka-staytus-zookeeper-57c86b7996
statefulset.kubernetes.io/pod-name: kafka-staytus-zookeeper-0
strimzi.io/cluster: kafka-staytus
strimzi.io/kind: Kafka
strimzi.io/name: kafka-staytus-zookeeper
name: kafka-staytus-zookeeper-0
namespace: kafka
ownerReferences:
- apiVersion: apps/v1
blockOwnerDeletion: true
controller: true
kind: StatefulSet
name: kafka-staytus-zookeeper
uid: bebc02cc-a9ae-4c3b-b7e6-041b4a87ed15
resourceVersion: '5830975'
uid: 901a792d-f7e2-4e15-8ca3-720354681a2b
spec:
affinity:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: app
operator: In
values:
- kafka
podAntiAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
- labelSelector:
matchExpressions:
- key: strimzi.io/cluster
operator: In
values:
- kafka-staytus
topologyKey: topology.kubernetes.io/zone
containers:
- command:
- /opt/kafka/zookeeper_run.sh
env:
- name: ZOOKEEPER_METRICS_ENABLED
value: 'false'
- name: ZOOKEEPER_SNAPSHOT_CHECK_ENABLED
value: 'true'
- name: STRIMZI_KAFKA_GC_LOG_ENABLED
value: 'false'
- name: KAFKA_HEAP_OPTS
value: '-Xms128M'
- name: ZOOKEEPER_CONFIGURATION
value: |
tickTime=2000
initLimit=5
syncLimit=2
autopurge.purgeInterval=1
admin.enableServer=false
image: 'quay.io/strimzi/kafka:0.32.0-kafka-3.3.1'
imagePullPolicy: IfNotPresent
livenessProbe:
exec:
command:
- /opt/kafka/zookeeper_healthcheck.sh
failureThreshold: 3
initialDelaySeconds: 15
periodSeconds: 10
successThreshold: 1
timeoutSeconds: 5
name: zookeeper
ports:
- containerPort: 2888
name: tcp-clustering
protocol: TCP
- containerPort: 3888
name: tcp-election
protocol: TCP
- containerPort: 2181
name: tcp-clients
protocol: TCP
readinessProbe:
exec:
command:
- /opt/kafka/zookeeper_healthcheck.sh
failureThreshold: 3
initialDelaySeconds: 15
periodSeconds: 10
successThreshold: 1
timeoutSeconds: 5
resources: {}
terminationMessagePath: /dev/termination-log
terminationMessagePolicy: File
volumeMounts:
- mountPath: /tmp
name: strimzi-tmp
- mountPath: /var/lib/zookeeper
name: data
- mountPath: /opt/kafka/custom-config/
name: zookeeper-metrics-and-logging
- mountPath: /opt/kafka/zookeeper-node-certs/
name: zookeeper-nodes
- mountPath: /opt/kafka/cluster-ca-certs/
name: cluster-ca-certs
- mountPath: /var/run/secrets/kubernetes.io/serviceaccount
name: kube-api-access-bt7td
readOnly: true
dnsPolicy: ClusterFirst
enableServiceLinks: true
hostname: kafka-staytus-zookeeper-0
nodeName: ip-10-15-101-77.eu-central-1.compute.internal
preemptionPolicy: PreemptLowerPriority
priority: 0
restartPolicy: Always
schedulerName: default-scheduler
securityContext: {}
serviceAccount: kafka-staytus-zookeeper
serviceAccountName: kafka-staytus-zookeeper
subdomain: kafka-staytus-zookeeper-nodes
terminationGracePeriodSeconds: 30
tolerations:
- effect: NoExecute
key: dedicated
operator: Equal
value: kafka
- effect: NoExecute
key: node.kubernetes.io/not-ready
operator: Exists
tolerationSeconds: 300
- effect: NoExecute
key: node.kubernetes.io/unreachable
operator: Exists
tolerationSeconds: 300
volumes:
- emptyDir: {}
name: data
- emptyDir:
medium: Memory
sizeLimit: 5Mi
name: strimzi-tmp
- configMap:
defaultMode: 420
name: kafka-staytus-zookeeper-config
name: zookeeper-metrics-and-logging
- name: zookeeper-nodes
secret:
defaultMode: 292
secretName: kafka-staytus-zookeeper-nodes
- name: cluster-ca-certs
secret:
defaultMode: 292
secretName: kafka-staytus-cluster-ca-cert
- name: kube-api-access-bt7td
projected:
defaultMode: 420
sources:
- serviceAccountToken:
expirationSeconds: 3607
path: token
- configMap:
items:
- key: ca.crt
path: ca.crt
name: kube-root-ca.crt
- downwardAPI:
items:
- fieldRef:
apiVersion: v1
fieldPath: metadata.namespace
path: namespace
status:
conditions:
- lastProbeTime: null
lastTransitionTime: '2022-11-18T13:16:59Z'
status: 'True'
type: Initialized
- lastProbeTime: null
lastTransitionTime: '2022-11-18T13:17:30Z'
status: 'True'
type: Ready
- lastProbeTime: null
lastTransitionTime: '2022-11-18T13:17:30Z'
status: 'True'
type: ContainersReady
- lastProbeTime: null
lastTransitionTime: '2022-11-18T13:16:59Z'
status: 'True'
type: PodScheduled
containerStatuses:
- containerID: >-
docker://bac93bb84e98a868ec4ab34c5f83a372d7ab53d5ad5e8945555229968b4f65be
image: 'quay.io/strimzi/kafka:0.32.0-kafka-3.3.1'
imageID: >-
docker-pullable://quay.io/strimzi/kafka@sha256:680ae1958dbcb9da8ee4128a67c1163a6ee2744221f7d29f73d8bcc237fd0173
lastState: {}
name: zookeeper
ready: true
restartCount: 0
started: true
state:
running:
startedAt: '2022-11-18T13:17:15Z'
hostIP: 10.15.101.77
phase: Running
podIP: 10.15.101.175
podIPs:
- ip: 10.15.101.175
qosClass: BestEffort
startTime: '2022-11-18T13:16:59Z'
I also tried the same affinity that I used for ZooKeeper for the Kafka pods too, but again faced the same issue. Is it a bug, or am I doing something wrong?
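For context, trying "the same affinity" for the Kafka brokers would presumably mean something like the following under spec.kafka.template.pod in the Kafka resource (a sketch only; the exact manifest that was tried is not shown in this thread):
# Sketch of what was presumably tried -- not shown in the thread
kafka:
  template:
    pod:
      affinity:
        nodeAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            nodeSelectorTerms:
              - matchExpressions:
                  - key: app
                    operator: In
                    values:
                      - kafka
        podAntiAffinity:
          requiredDuringSchedulingIgnoredDuringExecution:
            - labelSelector:
                matchExpressions:
                  - key: strimzi.io/cluster
                    operator: In
                    values:
                      - kafka-staytus
              topologyKey: topology.kubernetes.io/zone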
-
All Strimzi does is pass the rules you configure in the Kafka custom resource to the Pods. That seems to have worked fine based on your YAMLs. Without knowing your cluster, how its worker nodes are set up, which zones they are in, and what pods are running there, it is impossible to tell you what your rules should look like. It is also not clear what exactly you want to achieve with your rules. The error message you got is fairly clear in explaining why the different nodes cannot be used:
0/10 nodes are available: 1 node(s) had taint {type: cms}, that the pod didn't tolerate, 3 node(s) had taint {network: private}, that the pod didn't tolerate, 6 node(s) didn't satisfy existing pods anti-affinity rules.
So you need to go through it and figure out what the different issues on the different nodes are: why do they have those other taints, and what other pods are running there with affinity rules that block these pods from being deployed. Keep in mind that what matters is not only the affinity rules of your own pod but also the affinity rules of the pods already running there. The rule you have there for ZooKeeper:
podAntiAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
- labelSelector:
matchExpressions:
- key: strimzi.io/cluster
operator: In
values:
- kafka-staytus
topologyKey: topology.kubernetes.io/zone
says that each ZooKeeper pod needs to run in its own zone. That means that you need the 6 nodes which have the right taint to be spread across at least 4 zones to be able to deploy the cluster. This in general does not seem to make sense, because you do not want to have your ZooKeeper pods run in zones a, b and c and then deploy all your Kafka pods to zone d. Did you really want that? Or should the
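For illustration: required pod anti-affinity is also enforced against pods that are already running, so the ZooKeeper rule above (which matches anything labelled strimzi.io/cluster=kafka-staytus, including the Kafka brokers) keeps other pods of this cluster out of every zone that already holds a ZooKeeper pod. One common pattern to spread only the ZooKeeper pods, one per node, without repelling the brokers is to match on the strimzi.io/name label visible in the live manifests above and use the kubernetes.io/hostname topology key. This is only a sketch of the pattern, not necessarily the rule you want:
podAntiAffinity:
  requiredDuringSchedulingIgnoredDuringExecution:
    - labelSelector:
        matchExpressions:
          - key: strimzi.io/name
            operator: In
            values:
              - kafka-staytus-zookeeper   # only repel other ZooKeeper pods of this cluster
      topologyKey: kubernetes.io/hostname  # "no two matching pods on the same node"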
-
@scholzj Thank you for your answer. I want to have 3 nodes in 3 different availability zones, with one ZooKeeper pod and one Kafka pod on each node. To achieve that I created 3 dedicated worker nodes tainted with dedicated=kafka and labeled with app=kafka (a rough sketch of such a node is shown at the end of this reply). These nodes are in 3 different availability zones: a, b and c. The rules for ZooKeeper: I set replicas to 3 and added a toleration to the ZooKeeper pods:
tolerations:
- key: 'dedicated'
operator: 'Equal'
value: 'kafka'
effect: 'NoExecute'
As far as I know, adding a toleration to pods doesn't force them to be scheduled on the tainted nodes; it only allows them to run there. So I added nodeAffinity to put the ZooKeeper pods on those 3 nodes:
nodeAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
nodeSelectorTerms:
- matchExpressions:
- key: app
operator: In
values:
- kafka
Finally, I added podAntiAffinity: the topologyKey field specifies that pods matching the selector (strimzi.io/cluster=kafka-staytus) should not share the same topology domain. So in the end each pod should land on a different node in a different availability zone.
podAntiAffinity:
requiredDuringSchedulingIgnoredDuringExecution:
- labelSelector:
matchExpressions:
- key: strimzi.io/cluster
operator: In
values:
- kafka-staytus
topologyKey: topology.kubernetes.io/zone
The ZooKeeper pods were scheduled on those 3 nodes, one pod per node, as I expected. For Kafka I configured:
kafka:
replicas: 3
template:
pod:
tolerations:
- key: 'dedicated'
operator: 'Equal'
value: 'kafka'
effect: 'NoExecute'
There is no pod anti-affinity for the Kafka pods, so they should be able to sit on any of those 3 dedicated nodes, or on any other node that is not tainted (there is no node selector) and has capacity for them. But scheduling failed: 0/10 nodes are available: 1 node(s) had taint {type: cms}, that the pod didn't tolerate, 3 node(s) had taint {network: private}, that the pod didn't tolerate, 6 node(s) didn't satisfy existing pods anti-affinity rules.
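For reference, each of the three dedicated worker nodes described at the start of this reply would look roughly like the following (a sketch only; the node name and zone are placeholders, the label and taint values are the ones mentioned in this thread):
apiVersion: v1
kind: Node
metadata:
  name: ip-10-15-x-x.eu-central-1.compute.internal   # placeholder node name
  labels:
    app: kafka                                  # label matched by the nodeAffinity rule
    topology.kubernetes.io/zone: eu-central-1a  # one node per zone (a, b, c)
spec:
  taints:
    - key: dedicated
      value: kafka
      effect: NoExecute                         # tolerated by the Kafka and ZooKeeper pods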