-
It is quite hard to comment on the issue without the full log. As described in the FAQ in our docs, the Failed to acquire lock within 10000ms error does not necessarily mean anything bad, so it is hard to say what it means in your case without seeing the full log. Also, you seem to be using a fairly old Strimzi version - many things have been fixed since 0.22.
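For background, the message suggests a per-cluster lock is taken before each reconciliation so that two reconciliations of the same cluster never run in parallel. Roughly this pattern (a minimal sketch using the generic Vert.x shared-data API, not Strimzi's actual code; names and timeout are taken from the log message):
```java
import io.vertx.core.Vertx;

public class ReconcileLockSketch {
    // Timeout matching the "within 10000ms" in the warning; illustrative only
    private static final long LOCK_TIMEOUT_MS = 10_000L;

    static void reconcile(Vertx vertx, String namespace, String clusterName) {
        // One named lock per cluster, e.g. "lock::some-namespace::Kafka::some-kafka-cluster"
        String lockName = "lock::" + namespace + "::Kafka::" + clusterName;

        vertx.sharedData().getLockWithTimeout(lockName, LOCK_TIMEOUT_MS, res -> {
            if (res.succeeded()) {
                try {
                    // ... run the actual reconciliation ...
                } finally {
                    // The lock is released only when this reconciliation finishes
                    res.result().release();
                }
            } else {
                // If a previous reconciliation of the same cluster is still running
                // (or stuck), the lock cannot be acquired and the operator logs
                // "Failed to acquire lock lock::...::Kafka::... within 10000ms."
            }
        });
    }
}
```
So the warning usually just means an earlier reconciliation of the same cluster had not finished when the next one was triggered; whether that is harmless or a symptom of a stuck reconciliation is exactly what the full log would show.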
-
Hi, I'm using Strimzi 0.22.1 on the OpenShift 4.5 platform.
I noticed that sometimes the Cluster Operator gets "stuck", and only restarting the operator temporarily solves the problem.
This warning appears in the Cluster Operator logs for all the Kafka clusters in several namespaces, recurring many times:
2022-03-28 08:24:21 WARN AbstractOperator:247 - Reconciliation #30(timer) Kafka(some-namespace/some-kafka-cluster): Failed to acquire lock lock::some-namespace::Kafka::some-kafka-cluster within 10000ms.
It appears that threads of the operator are "stuck"; shouldn't the operator know how to deal with this problem and release the lock?
There are warnings that might be related:
io.vertx.core.impl.BlockedThreadChecker
WARNING: Thread Thread[vert.x-eventloop-thread-0, 5, main] =Thread[vert.x-eventloop-thread-0, 5, main] has been blocked for 7609 ms, time limit is 2000 ms
io.vertx.core.VertxException: Thread blocked
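As far as I understand, this is Vert.x's generic blocked-thread checker: it fires whenever code keeps running on an event-loop thread for longer than the 2000 ms limit. A minimal standalone reproduction of the same warning (plain Vert.x, nothing Strimzi-specific, class name is mine):
```java
import io.vertx.core.Vertx;

public class BlockedEventLoopDemo {
    public static void main(String[] args) throws InterruptedException {
        // Default max event-loop execute time is 2000 ms, matching the warning above
        Vertx vertx = Vertx.vertx();

        vertx.runOnContext(v -> {
            try {
                // Blocking an event-loop thread for longer than the limit makes
                // BlockedThreadChecker log "Thread blocked" with a stack trace
                Thread.sleep(5_000);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });

        Thread.sleep(10_000);
        vertx.close();
    }
}
```
So in the operator's case, something running on vert.x-eventloop-thread-0 did not return for about 7.6 seconds.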
Other common errors:
ERROR AbstractOperator:276 - Reconciliation #374(timer) Kafka(some-namespace/some-kafka-cluster): createOrUpdate failed
io.strimzi.operator.cluster.operator.resource.KafkaRoller$UnforceableProblem: Pod my-kafka-cluster-0 is currently not rollable
WARN WatcherWebSocketListener:102 - Exec Failure
java.net.SocketTimeoutException: sent ping but didn't receive pong within 30000ms (after 4 successful ping/pongs)
...
...
caused by:
javax.net.ssl.SSLException: Socket closed
...
caused by:
java.net.SocketException: Socket closed
SEVERE: Unhandled exception
io.fabric8.kubernetes.client.KubernetesClientException: Operation [get] for kind: [Kafka] with name: [my-kafka-cluster] in namespace: [my-namespace] failed
...
...
caused by:
java.net.ConnectException: Failed to connect to {myKubernetesApiIP}
...
caused by:
java.net.ConnectException: Connection refused (Connection refused)
The KubernetesClientException also occurs with other failed [get] operations, for example on Secrets and other resources.
It seems like the Kubernetes API does not respond, but as far as I know only Strimzi produces this error.
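To illustrate what I mean, a minimal sketch with a plain fabric8 client (not Strimzi's code; the namespace and secret name are made up): any get against an unreachable API server fails with a KubernetesClientException whose cause chain ends in the underlying connection error, so the exception itself only says that the connection to the API server failed, not which component is at fault.
```java
import io.fabric8.kubernetes.api.model.Secret;
import io.fabric8.kubernetes.client.DefaultKubernetesClient;
import io.fabric8.kubernetes.client.KubernetesClient;
import io.fabric8.kubernetes.client.KubernetesClientException;

public class ApiGetSketch {
    public static void main(String[] args) {
        try (KubernetesClient client = new DefaultKubernetesClient()) {
            // Hypothetical namespace and secret name, just to show the call shape
            Secret secret = client.secrets()
                    .inNamespace("my-namespace")
                    .withName("my-secret")
                    .get();
            System.out.println("Got: " + (secret == null ? "nothing" : secret.getMetadata().getName()));
        } catch (KubernetesClientException e) {
            // When the API server is unreachable, the cause chain ends in e.g.
            // java.net.ConnectException: Connection refused
            System.err.println("Operation [get] failed: " + e.getMessage() + ", cause: " + e.getCause());
        }
    }
}
```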
Does anyone know what might be the cause of this issue?
Thanks