[Bug]: strimzi-cluster-operator keeps restarting with issue: WARN BlockedThreadChecker: - Thread Thread[vert.x-eventloop-thread-1,5,main] has been blocked for #10037

@vietnguyenmipro

Bug Description

The strimzi-cluster-operator pod keeps restarting and logs the following warning: `WARN BlockedThreadChecker: - Thread Thread[vert.x-eventloop-thread-1,5,main] has been blocked for xxxx ms, time limit is 2000 ms`, followed by `io.vertx.core.VertxException: Thread blocked`.

Steps to reproduce

  1. Install strimzi-kafka-operator as a dependency in an application Helm chart:
    `dependencies:
       - name: strimzi-kafka-operator
         version: "0.40.0"
         repository: "oci://quay.io/strimzi-helm"`
  2. In the Helm templates folder, create a new cluster with this definition:
    `apiVersion: kafka.strimzi.io/v1beta2
     kind: Kafka
     metadata:
       name: myapp
     spec:
       kafka:
         version: 3.7.0
         replicas: 1
         listeners:
           - name: plain
             port: 9092
             type: internal
             tls: false
           - name: tls
             port: 9093
             type: internal
             tls: true
         config:
           offsets.topic.replication.factor: 1
           transaction.state.log.replication.factor: 1
           transaction.state.log.min.isr: 1
           default.replication.factor: 1
           min.insync.replicas: 1
           inter.broker.protocol.version: "3.7"
         storage:
           type: persistent-claim
           class: {{ .Values.kafka.cluster.kafka.storage.class }}
           size: {{ .Values.kafka.cluster.kafka.storage.size }}
           deleteClaim: true
       zookeeper:
         livenessProbe:
           initialDelaySeconds: 120
           timeoutSeconds: 5
         readinessProbe:
           initialDelaySeconds: 120
           timeoutSeconds: 5
         replicas: {{ .Values.kafka.cluster.zookeeper.replicas }}
         storage:
           type: persistent-claim
           class: {{ .Values.kafka.cluster.zookeeper.storage.class }}
           size: {{ .Values.kafka.cluster.zookeeper.storage.size }}
           deleteClaim: true
       entityOperator:
         template:
           topicOperatorContainer:
             env:
               - name: STRIMZI_USE_ZOOKEEPER_TOPIC_STORE
                 value: "true"
         topicOperator: {}
         userOperator: {}`
  4. Install the app with its dependency: helm install
  5. The strimzi-cluster-operator keeps restarting with the log below:
`2024-04-29 19:19:36 WARN BlockedThreadChecker: - Thread Thread[vert.x-eventloop-thread-1,5,main] has been blocked for 128580 ms, time limit is 2000 ms
io.vertx.core.VertxException: Thread blocked
at jdk.internal.misc.Unsafe.park(Native Method) ~[?:?]
at java.util.concurrent.locks.LockSupport.park(LockSupport.java:211) ~[?:?]
at java.util.concurrent.CompletableFuture$Signaller.block(CompletableFuture.java:1864) ~[?:?]
at java.util.concurrent.ForkJoinPool.unmanagedBlock(ForkJoinPool.java:3465) ~[?:?]
at java.util.concurrent.ForkJoinPool.managedBlock(ForkJoinPool.java:3436) ~[?:?]
at java.util.concurrent.CompletableFuture.waitingGet(CompletableFuture.java:1898) ~[?:?]
at java.util.concurrent.CompletableFuture.get(CompletableFuture.java:2072) ~[?:?]
at io.fabric8.kubernetes.client.dsl.internal.OperationSupport.waitForResult(OperationSupport.java:491) ~[io.fabric8.kubernetes-client-6.10.0.jar:?]
at io.fabric8.kubernetes.client.dsl.internal.OperationSupport.handleResponse(OperationSupport.java:524) ~[io.fabric8.kubernetes-client-6.10.0.jar:?]
at io.fabric8.kubernetes.client.dsl.internal.OperationSupport.handlePatch(OperationSupport.java:419) ~[io.fabric8.kubernetes-client-6.10.0.jar:?]
at io.fabric8.kubernetes.client.dsl.internal.OperationSupport.handlePatch(OperationSupport.java:397) ~[io.fabric8.kubernetes-client-6.10.0.jar:?]
at io.fabric8.kubernetes.client.dsl.internal.BaseOperation.handlePatch(BaseOperation.java:763) ~[io.fabric8.kubernetes-client-6.10.0.jar:?]
at io.fabric8.kubernetes.client.dsl.internal.HasMetadataOperation.lambda$patch$2(HasMetadataOperation.java:232) ~[io.fabric8.kubernetes-client-6.10.0.jar:?]
at io.fabric8.kubernetes.client.dsl.internal.HasMetadataOperation$$Lambda$594/0x00007fc73c420c00.apply(Unknown Source) ~[?:?]
at io.fabric8.kubernetes.client.dsl.internal.HasMetadataOperation.patch(HasMetadataOperation.java:237) ~[io.fabric8.kubernetes-client-6.10.0.jar:?]
at io.fabric8.kubernetes.client.dsl.internal.HasMetadataOperation.patch(HasMetadataOperation.java:262) ~[io.fabric8.kubernetes-client-6.10.0.jar:?]
at io.fabric8.kubernetes.client.dsl.internal.HasMetadataOperation.patch(HasMetadataOperation.java:45) ~[io.fabric8.kubernetes-client-6.10.0.jar:?]
at io.strimzi.operator.common.operator.resource.AbstractNamespacedResourceOperator.patchOrReplace(AbstractNamespacedResourceOperator.java:262) ~[io.strimzi.operator-common-0.40.0.jar:0.40.0]
at io.strimzi.operator.common.operator.resource.AbstractNamespacedResourceOperator.internalUpdate(AbstractNamespacedResourceOperator.java:238) ~[io.strimzi.operator-common-0.40.0.jar:0.40.0]
at io.strimzi.operator.common.operator.resource.DeploymentOperator.internalUpdate(DeploymentOperator.java:96) ~[io.strimzi.operator-common-0.40.0.jar:0.40.0]
at io.strimzi.operator.common.operator.resource.DeploymentOperator.internalUpdate(DeploymentOperator.java:22) ~[io.strimzi.operator-common-0.40.0.jar:0.40.0]
at io.strimzi.operator.common.operator.resource.AbstractNamespacedResourceOperator.lambda$reconcile$0(AbstractNamespacedResourceOperator.java:108) ~[io.strimzi.operator-common-0.40.0.jar:0.40.0]
at io.strimzi.operator.common.operator.resource.AbstractNamespacedResourceOperator$$Lambda$825/0x00007fc73c627d28.apply(Unknown Source) ~[?:?]
at io.vertx.core.impl.future.Composition.onSuccess(Composition.java:38) ~[io.vertx.vertx-core-4.5.4.jar:4.5.4]
at io.vertx.core.impl.future.FutureBase.lambda$emitSuccess$0(FutureBase.java:60) ~[io.vertx.vertx-core-4.5.4.jar:4.5.4]
at io.vertx.core.impl.future.FutureBase$$Lambda$391/0x00007fc73c390fa8.run(Unknown Source) ~[?:?]
at io.netty.util.concurrent.AbstractEventExecutor.runTask(AbstractEventExecutor.java:173) ~[io.netty.netty-common-4.1.107.Final.jar:4.1.107.Final]
at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:166) ~[io.netty.netty-common-4.1.107.Final.jar:4.1.107.Final]
at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:470) ~[io.netty.netty-common-4.1.107.Final.jar:4.1.107.Final]
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:569) ~[io.netty.netty-transport-4.1.107.Final.jar:4.1.107.Final]
at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997) ~[io.netty.netty-common-4.1.107.Final.jar:4.1.107.Final]
at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) ~[io.netty.netty-common-4.1.107.Final.jar:4.1.107.Final]
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) ~[io.netty.netty-common-4.1.107.Final.jar:4.1.107.Final]
at java.lang.Thread.run(Thread.java:840) ~[?:?]
2024-04-29 19:19:37 ERROR AbstractOperator:284 - Reconciliation #1(watch) Kafka(forme/myapp): createOrUpdate failed
io.fabric8.kubernetes.client.KubernetesClientException: Operation: [patch] for kind: [Deployment] with name: [myapp-entity-operator] in namespace: [forme] failed.
at io.fabric8.kubernetes.client.KubernetesClientException.launderThrowable(KubernetesClientException.java:159) ~[io.fabric8.kubernetes-client-api-6.10.0.jar:?]
at io.fabric8.kubernetes.client.dsl.internal.HasMetadataOperation.lambda$patch$2(HasMetadataOperation.java:234) ~[io.fabric8.kubernetes-client-6.10.0.jar:?]
at io.fabric8.kubernetes.client.dsl.internal.HasMetadataOperation.patch(HasMetadataOperation.java:237) ~[io.fabric8.kubernetes-client-6.10.0.jar:?]
at io.fabric8.kubernetes.client.dsl.internal.HasMetadataOperation.patch(HasMetadataOperation.java:262) ~[io.fabric8.kubernetes-client-6.10.0.jar:?]
at io.fabric8.kubernetes.client.dsl.internal.HasMetadataOperation.patch(HasMetadataOperation.java:45) ~[io.fabric8.kubernetes-client-6.10.0.jar:?]
at io.strimzi.operator.common.operator.resource.AbstractNamespacedResourceOperator.patchOrReplace(AbstractNamespacedResourceOperator.java:262) ~[io.strimzi.operator-common-0.40.0.jar:0.40.0]
at io.strimzi.operator.common.operator.resource.AbstractNamespacedResourceOperator.internalUpdate(AbstractNamespacedResourceOperator.java:238) ~[io.strimzi.operator-common-0.40.0.jar:0.40.0]
at io.strimzi.operator.common.operator.resource.DeploymentOperator.internalUpdate(DeploymentOperator.java:96) ~[io.strimzi.operator-common-0.40.0.jar:0.40.0]
at io.strimzi.operator.common.operator.resource.DeploymentOperator.internalUpdate(DeploymentOperator.java:22) ~[io.strimzi.operator-common-0.40.0.jar:0.40.0]
at io.strimzi.operator.common.operator.resource.AbstractNamespacedResourceOperator.lambda$reconcile$0(AbstractNamespacedResourceOperator.java:108) ~[io.strimzi.operator-common-0.40.0.jar:0.40.0]
at io.vertx.core.impl.future.Composition.onSuccess(Composition.java:38) ~[io.vertx.vertx-core-4.5.4.jar:4.5.4]
at io.vertx.core.impl.future.FutureBase.lambda$emitSuccess$0(FutureBase.java:60) ~[io.vertx.vertx-core-4.5.4.jar:4.5.4]
at io.netty.util.concurrent.AbstractEventExecutor.runTask(AbstractEventExecutor.java:173) ~[io.netty.netty-common-4.1.107.Final.jar:4.1.107.Final]
at io.netty.util.concurrent.AbstractEventExecutor.safeExecute(AbstractEventExecutor.java:166) ~[io.netty.netty-common-4.1.107.Final.jar:4.1.107.Final]
at io.netty.util.concurrent.SingleThreadEventExecutor.runAllTasks(SingleThreadEventExecutor.java:470) ~[io.netty.netty-common-4.1.107.Final.jar:4.1.107.Final]
at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:569) ~[io.netty.netty-transport-4.1.107.Final.jar:4.1.107.Final]
at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997) ~[io.netty.netty-common-4.1.107.Final.jar:4.1.107.Final]
at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74) ~[io.netty.netty-common-4.1.107.Final.jar:4.1.107.Final]
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30) ~[io.netty.netty-common-4.1.107.Final.jar:4.1.107.Final]
at java.lang.Thread.run(Thread.java:840) ~[?:?]
Caused by: java.io.IOException: request timed out
at io.fabric8.kubernetes.client.dsl.internal.OperationSupport.waitForResult(OperationSupport.java:504) ~[io.fabric8.kubernetes-client-6.10.0.jar:?]
at io.fabric8.kubernetes.client.dsl.internal.OperationSupport.handleResponse(OperationSupport.java:524) ~[io.fabric8.kubernetes-client-6.10.0.jar:?]
at io.fabric8.kubernetes.client.dsl.internal.OperationSupport.handlePatch(OperationSupport.java:419) ~[io.fabric8.kubernetes-client-6.10.0.jar:?]
at io.fabric8.kubernetes.client.dsl.internal.OperationSupport.handlePatch(OperationSupport.java:397) ~[io.fabric8.kubernetes-client-6.10.0.jar:?]
at io.fabric8.kubernetes.client.dsl.internal.BaseOperation.handlePatch(BaseOperation.java:763) ~[io.fabric8.kubernetes-client-6.10.0.jar:?]
at io.fabric8.kubernetes.client.dsl.internal.HasMetadataOperation.lambda$patch$2(HasMetadataOperation.java:232) ~[io.fabric8.kubernetes-client-6.10.0.jar:?]
... 18 more
Caused by: java.net.http.HttpTimeoutException: request timed out
at jdk.internal.net.http.ResponseTimerEvent.handle(ResponseTimerEvent.java:63) ~[java.net.http:?]
at jdk.internal.net.http.HttpClientImpl.purgeTimeoutsAndReturnNextDeadline(HttpClientImpl.java:1270) ~[java.net.http:?]
at jdk.internal.net.http.HttpClientImpl$SelectorManager.run(HttpClientImpl.java:899) ~[java.net.http:?]
`
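
The Kafka resource in step 2 reads its storage class, storage size, and ZooKeeper replica count from Helm values that are not included in this report. For anyone trying to reproduce the issue, a `values.yaml` along the following lines should satisfy those template references. Only the key paths come from the template; every value shown here is a placeholder assumption, not the reporter's actual setting.

```yaml
# values.yaml sketch: key paths match the {{ .Values... }} references in the Kafka template.
# All values below are hypothetical placeholders and must be adapted to the target cluster.
kafka:
  cluster:
    kafka:
      storage:
        class: standard   # assumed StorageClass name
        size: 10Gi        # assumed volume size
    zookeeper:
      replicas: 1         # assumed replica count
      storage:
        class: standard   # assumed StorageClass name
        size: 5Gi         # assumed volume size
```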

Expected behavior

The strimzi-cluster-operator should run without the above issue.

Strimzi version

0.40.0

Kubernetes version

1.29

Installation method

helm

Infrastructure

No response

Configuration files and logs

No response

Additional context

No response
