Description
Issue submitter TODO list
- I've looked up my issue in FAQ
- I've searched for already existing issues here
- I've tried running the main-labeled docker image and the issue still persists there
- I'm running a supported version of the application which is listed here
Describe the bug (actual behavior)
I have already spent too long troubleshooting this issue, so please see my AI bug report below.
I'll have time next week to do more testing, but I'm very busy at the moment, so here is my data on the issues I had with v1.3.0 on KRaft MSK.
--- AI GENERATED BUG REPORT SORRY NOT SORRY ----
Bug report: Kafka UI v1.3.0 fails against MSK KRaft with NPE after successful connect
- Component: Kafka UI
- Version: v1.3.0 (latest as of report time)
- Deployment: Kubernetes (EKS), IRSA for auth (also reproduced with EKS Pod Identity)
- Target cluster: Amazon MSK in KRaft mode
- Authentication: SASL_SSL + AWS_MSK_IAM
Summary
Kafka UI v1.3.0 connects to an MSK KRaft cluster successfully (STS AssumeRoleWithWebIdentity works, TLS and SASL IAM handshake complete, brokers respond to DESCRIBE_CONFIGS and API_VERSIONS), but then fails with an IllegalStateException caused by a NullPointerException in ReactiveAdminClient$ConfigRelatedInfo.lambda$extract$2. The same configuration works against a ZooKeeper-based MSK cluster and also works against the same KRaft cluster when using Kafka UI v1.2.0. This appears to be a KRaft-specific regression in v1.3.0.
Expected behavior
Kafka UI connects to MSK KRaft cluster, loads broker metadata, and UI shows cluster state.
Actual behavior
- AdminClient initializes and authenticates.
- Receives DESCRIBE_CONFIGS and API_VERSIONS responses from brokers.
- Then AdminClient is closed and UI reports “cannot connect”; logs show an NPE in ReactiveAdminClient$ConfigRelatedInfo.
Configuration (sanitized)
kafka:
  clusters:
    - name: ehis-prod
      bootstrapServers: b-1.example.amazonaws.com:9098,b-3.example.amazonaws.com:9098,b-4.example.amazonaws.com:9098
      properties:
        security.protocol: SASL_SSL
        sasl.mechanism: AWS_MSK_IAM
        sasl.jaas.config: software.amazon.msk.auth.iam.IAMLoginModule required;
        sasl.client.callback.handler.class: software.amazon.msk.auth.iam.IAMClientCallbackHandler
Key log excerpts
- STS WebIdentity succeeds (IRSA):
Action=AssumeRoleWithWebIdentity ... RoleArn=arn:aws:iam::ACCOUNT:role/ehis-eks-kafkaui-role ...
HTTP/1.1 200 OK
<SubjectFromWebIdentityToken>system:serviceaccount:kafkaui:kafkaui</SubjectFromWebIdentityToken>
... Successfully refreshed cached value ... Using STS credentials ...
- TLS + SASL IAM succeed:
SSL handshake completed successfully ... peerHost 'b-*.amazonaws.com' protocol 'TLSv1.2'
SaslClientAuthenticator: Creating SaslClient ... mechs=[AWS_MSK_IAM]
SaslClientAuthenticator: Set SASL client state to INTERMEDIATE/COMPLETE
- Brokers respond to metadata/config:
AdminClientConfig values: bootstrap.controllers = []
bootstrap.servers = [b-1..., b-3..., b-4...]
Received DESCRIBE_CONFIGS response from node ...
Received API_VERSIONS response ... supportedFeatures include 'kraft.version'
- Then UI errors (repeats every poll):
ERROR ... StatisticsService: Failed to collect cluster ehis-prod info
java.lang.IllegalStateException: Error while creating AdminClient for the cluster ehis-prod
Caused by: java.lang.NullPointerException
at io.kafbat.ui.service.ReactiveAdminClient$ConfigRelatedInfo.lambda$extract$2(ReactiveAdminClient.java:166)
Repro steps
- Deploy Kafka UI v1.3.0 on EKS with IRSA (or EKS Pod Identity), using the config above.
- Point to an MSK KRaft cluster with AWS_MSK_IAM over SASL_SSL.
- Open the UI; cluster shows “cannot connect”. Logs show the NPE after successful metadata calls.
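To check whether the failure reproduces outside Kafka UI, the cluster properties from the configuration above could be expressed as plain Kafka client properties. This is only a sketch: the class name `MskIamClientProps` and the `build` helper are hypothetical, and the block deliberately stops at building the `Properties` object so it has no dependencies. With `kafka-clients` and the AWS MSK IAM auth library on the classpath, the result could then be passed to `AdminClient.create(props)`.

```java
import java.util.Properties;

public class MskIamClientProps {

    // Hypothetical helper: mirrors the "properties" block of the Kafka UI
    // cluster config as plain Kafka client properties.
    static Properties build(String bootstrapServers) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        props.setProperty("security.protocol", "SASL_SSL");
        props.setProperty("sasl.mechanism", "AWS_MSK_IAM");
        props.setProperty("sasl.jaas.config",
            "software.amazon.msk.auth.iam.IAMLoginModule required;");
        props.setProperty("sasl.client.callback.handler.class",
            "software.amazon.msk.auth.iam.IAMClientCallbackHandler");
        return props;
    }

    public static void main(String[] args) {
        // Placeholder broker address taken from the sanitized config.
        Properties props = build("b-1.example.amazonaws.com:9098");
        props.forEach((key, value) -> System.out.println(key + "=" + value));
    }
}
```

If a plain admin client built from these properties can describe the cluster and broker configs without error, that would support the reading that the problem is in Kafka UI's post-processing rather than in connectivity.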
What works
- Same Kafka UI v1.3.0 with ZooKeeper-based MSK cluster.
- Same KRaft cluster with Kafka UI v1.2.0 (no code/config changes besides image tag).
- Network, IAM role, and ACLs are correct (STS and broker responses confirm this).
Impact
Blocks Kafka UI usage against MSK KRaft on v1.3.0.
Workaround
- Downgrade Kafka UI image to v1.2.0.
Request
- Investigate the NPE in ReactiveAdminClient$ConfigRelatedInfo.lambda$extract$2 on the KRaft code path in v1.3.0.
- Provide a fix or guidance if additional KRaft-specific config is required in v1.3.0.
Notes:
- Failure occurs after successful STS, TLS, SASL IAM, and broker responses, so it appears to be application-side processing rather than connectivity or permissions.
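For illustration only, here is one general way an NPE can arise when broker config entries are post-processed. Kafka's `ConfigEntry.value()` is documented to return null when a config is unset or sensitive, so code that dereferences or wraps the value without a null check can fail on some brokers and not others. This is not the Kafka UI source: the `Entry` record, both methods, and the choice of `inter.broker.protocol.version` as the key are assumptions made for the sketch, and the report does not identify which entry is actually involved.

```java
import java.util.List;
import java.util.Optional;

public class NullSafeConfigExtract {

    // Hypothetical stand-in for a broker config entry; value may be null.
    record Entry(String name, String value) {}

    // Fragile variant: Optional.of throws NullPointerException on a null value.
    static Optional<String> fragileVersion(List<Entry> entries) {
        Optional<String> version = Optional.empty();
        for (Entry e : entries) {
            if ("inter.broker.protocol.version".equals(e.name())) {
                version = Optional.of(e.value());
            }
        }
        return version;
    }

    // Defensive variant: entries with a null value are skipped before mapping.
    static Optional<String> safeVersion(List<Entry> entries) {
        return entries.stream()
            .filter(e -> "inter.broker.protocol.version".equals(e.name()))
            .filter(e -> e.value() != null)
            .map(Entry::value)
            .findFirst();
    }

    public static void main(String[] args) {
        List<Entry> entries =
            List.of(new Entry("inter.broker.protocol.version", null));
        System.out.println("safe: " + safeVersion(entries));
        try {
            fragileVersion(entries);
        } catch (NullPointerException ex) {
            System.out.println("fragile variant threw NullPointerException");
        }
    }
}
```

Whether the v1.3.0 failure follows this pattern would need to be confirmed against the code at ReactiveAdminClient.java:166.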
Expected behavior
For the client to not err out.
Your installation details
It's in the report above:
EKS with an MSK KRaft cluster, using both Pod Identity and IRSA.
Steps to reproduce
Look at my infra and try to make the same connections.
I am free next week to test any fix branches requested.
Screenshots
Don't have time.
Logs
Logs are sensitive in my situation. I have exported them from test and can give back specifics if requested. I have the logs from DEBUG output, but here are the specific ones that should be OK to share.
2025-08-14T13:32:50.442931826Z 2025-08-14 13:32:50,442 DEBUG [kafka-admin-client-thread | kafbat-ui-admin-1755178369-6] o.a.k.c.n.SslTransportLayer: [SslTransportLayer channelId=3 key=channel=java.nio.channels.SocketChannel[connection-pending remote=b-3.ehiskafkaprod.w6xlhq.c17.kafka.us-east-1.amazonaws.com/10.200.139.29:9098], selector=sun.nio.ch.EPollSelectorImpl@51163b59, interestOps=8, readyOps=0] SSLEngine.closeInBound() raised an exception.
2025-08-14T13:32:50.442941186Z javax.net.ssl.SSLException: closing inbound before receiving peer's close_notify
2025-08-14T13:32:50.442945796Z at java.base/sun.security.ssl.SSLEngineImpl.closeInbound(SSLEngineImpl.java:794)
2025-08-14T13:32:50.442949646Z at org.apache.kafka.common.network.SslTransportLayer.close(SslTransportLayer.java:204)
2025-08-14T13:32:50.442959506Z at org.apache.kafka.common.utils.Utils.closeAll(Utils.java:1022)
2025-08-14T13:32:50.442963166Z at org.apache.kafka.common.network.KafkaChannel.close(KafkaChannel.java:155)
2025-08-14T13:32:50.442966757Z at org.apache.kafka.common.network.Selector.doClose(Selector.java:976)
2025-08-14T13:32:50.442970157Z at org.apache.kafka.common.network.Selector.close(Selector.java:960)
2025-08-14T13:32:50.442973377Z at org.apache.kafka.common.network.Selector.close(Selector.java:906)
2025-08-14T13:32:50.442976747Z at org.apache.kafka.common.network.Selector.lambda$null$0(Selector.java:388)
2025-08-14T13:32:50.442979877Z at org.apache.kafka.common.utils.Utils.closeQuietly(Utils.java:1146)
2025-08-14T13:32:50.442983447Z at org.apache.kafka.common.utils.Utils.closeAllQuietly(Utils.java:1161)
2025-08-14T13:32:50.442987007Z at org.apache.kafka.common.network.Selector.close(Selector.java:387)
2025-08-14T13:32:50.442990327Z at org.apache.kafka.clients.NetworkClient.close(NetworkClient.java:689)
2025-08-14T13:32:50.442993697Z at org.apache.kafka.common.utils.Utils.closeQuietly(Utils.java:1125)
2025-08-14T13:32:50.442996857Z at org.apache.kafka.common.utils.Utils.closeQuietly(Utils.java:1108)
2025-08-14T13:32:50.443000667Z at org.apache.kafka.clients.admin.KafkaAdminClient$AdminClientRunnable.run(KafkaAdminClient.java:1490)
2025-08-14T13:32:50.443004598Z at java.base/java.lang.Thread.run(Thread.java:1583)
2025-08-14T13:32:50.443262164Z 2025-08-14 13:32:50,443 DEBUG [kafka-admin-client-thread | kafbat-ui-admin-1755178369-6] o.a.k.c.n.SslTransportLayer: [SslTransportLayer channelId=-3 key=channel=java.nio.channels.SocketChannel[connection-pending remote=b-4.ehiskafkaprod.w6xlhq.c17.kafka.us-east-1.amazonaws.com/10.200.186.72:9098], selector=sun.nio.ch.EPollSelectorImpl@51163b59, interestOps=8, readyOps=0] SSLEngine.closeInBound() raised an exception.
2025-08-14T13:32:50.443270764Z javax.net.ssl.SSLException: closing inbound before receiving peer's close_notify
2025-08-14T13:32:50.443274664Z at java.base/sun.security.ssl.SSLEngineImpl.closeInbound(SSLEngineImpl.java:794)
2025-08-14T13:32:50.443278084Z at org.apache.kafka.common.network.SslTransportLayer.close(SslTransportLayer.java:204)
2025-08-14T13:32:50.443281334Z at org.apache.kafka.common.utils.Utils.closeAll(Utils.java:1022)
2025-08-14T13:32:50.443284434Z at org.apache.kafka.common.network.KafkaChannel.close(KafkaChannel.java:155)
2025-08-14T13:32:50.443287484Z at org.apache.kafka.common.network.Selector.doClose(Selector.java:976)
2025-08-14T13:32:50.443290734Z at org.apache.kafka.common.network.Selector.close(Selector.java:960)
2025-08-14T13:32:50.443294104Z at org.apache.kafka.common.network.Selector.close(Selector.java:906)
2025-08-14T13:32:50.443297304Z at org.apache.kafka.common.network.Selector.lambda$null$0(Selector.java:388)
2025-08-14T13:32:50.443300374Z at org.apache.kafka.common.utils.Utils.closeQuietly(Utils.java:1146)
2025-08-14T13:32:50.443303865Z at org.apache.kafka.common.utils.Utils.closeAllQuietly(Utils.java:1161)
2025-08-14T13:32:50.443307205Z at org.apache.kafka.common.network.Selector.close(Selector.java:387)
2025-08-14T13:32:50.443310765Z at org.apache.kafka.clients.NetworkClient.close(NetworkClient.java:689)
2025-08-14T13:32:50.443314065Z at org.apache.kafka.common.utils.Utils.closeQuietly(Utils.java:1125)
2025-08-14T13:32:50.443317385Z at org.apache.kafka.common.utils.Utils.closeQuietly(Utils.java:1108)
2025-08-14T13:32:50.443320695Z at org.apache.kafka.clients.admin.KafkaAdminClient$AdminClientRunnable.run(KafkaAdminClient.java:1490)
2025-08-14T13:32:50.443324145Z at java.base/java.lang.Thread.run(Thread.java:1583)
2025-08-14T13:32:50.443517070Z 2025-08-14 13:32:50,443 DEBUG [kafka-admin-client-thread | kafbat-ui-admin-1755178369-6] o.a.k.c.n.SslTransportLayer: [SslTransportLayer channelId=8 key=channel=java.nio.channels.SocketChannel[connection-pending remote=b-8.ehiskafkaprod.w6xlhq.c17.kafka.us-east-1.amazonaws.com/10.200.146.191:9098], selector=sun.nio.ch.EPollSelectorImpl@51163b59, interestOps=8, readyOps=0] SSLEngine.closeInBound() raised an exception.
2025-08-14T13:32:50.443525120Z javax.net.ssl.SSLException: closing inbound before receiving peer's close_notify
2025-08-14T13:32:50.443529330Z at java.base/sun.security.ssl.SSLEngineImpl.closeInbound(SSLEngineImpl.java:794)
2025-08-14T13:32:50.443532950Z at org.apache.kafka.common.network.SslTransportLayer.close(SslTransportLayer.java:204)
2025-08-14T13:32:50.443536140Z at org.apache.kafka.common.utils.Utils.closeAll(Utils.java:1022)
2025-08-14T13:32:50.443539410Z at org.apache.kafka.common.network.KafkaChannel.close(KafkaChannel.java:155)
2025-08-14T13:32:50.443542520Z at org.apache.kafka.common.network.Selector.doClose(Selector.java:976)
2025-08-14T13:32:50.443545540Z at org.apache.kafka.common.network.Selector.close(Selector.java:960)
2025-08-14T13:32:50.443548440Z at org.apache.kafka.common.network.Selector.close(Selector.java:906)
2025-08-14T13:32:50.443551330Z at org.apache.kafka.common.network.Selector.lambda$null$0(Selector.java:388)
2025-08-14T13:32:50.443554870Z at org.apache.kafka.common.utils.Utils.closeQuietly(Utils.java:1146)
2025-08-14T13:32:50.443558060Z at org.apache.kafka.common.utils.Utils.closeAllQuietly(Utils.java:1161)
2025-08-14T13:32:50.443576471Z at org.apache.kafka.common.network.Selector.close(Selector.java:387)
2025-08-14T13:32:50.443580391Z at org.apache.kafka.clients.NetworkClient.close(NetworkClient.java:689)
2025-08-14T13:32:50.443587271Z at org.apache.kafka.common.utils.Utils.closeQuietly(Utils.java:1125)
2025-08-14T13:32:50.443590901Z at org.apache.kafka.common.utils.Utils.closeQuietly(Utils.java:1108)
2025-08-14T13:32:50.443594371Z at org.apache.kafka.clients.admin.KafkaAdminClient$AdminClientRunnable.run(KafkaAdminClient.java:1490)
2025-08-14T13:32:50.443598791Z at java.base/java.lang.Thread.run(Thread.java:1583)
2025-08-14T13:32:50.444695667Z 2025-08-14 13:32:50,444 INFO  [kafka-admin-client-thread | kafbat-ui-admin-1755178369-6] o.a.k.c.m.Metrics: Metrics scheduler closed
2025-08-14T13:32:50.444711037Z 2025-08-14 13:32:50,444 INFO  [kafka-admin-client-thread | kafbat-ui-admin-1755178369-6] o.a.k.c.m.Metrics: Closing reporter org.apache.kafka.common.metrics.JmxReporter
2025-08-14T13:32:50.444714807Z 2025-08-14 13:32:50,444 INFO  [kafka-admin-client-thread | kafbat-ui-admin-1755178369-6] o.a.k.c.m.Metrics: Metrics reporters closed
2025-08-14T13:32:50.444718907Z 2025-08-14 13:32:50,444 DEBUG [kafka-admin-client-thread | kafbat-ui-admin-1755178369-6] o.a.k.c.a.KafkaAdminClient: [AdminClient clientId=kafbat-ui-admin-1755178369-6] Exiting AdminClientRunnable thread.
2025-08-14T13:32:50.444901342Z 2025-08-14 13:32:50,444 DEBUG [parallel-1] o.a.k.c.a.KafkaAdminClient: [AdminClient clientId=kafbat-ui-admin-1755178369-6] Kafka admin client closed.
2025-08-14T13:32:50.445130157Z 2025-08-14 13:32:50,444 ERROR [parallel-1] i.k.u.s.StatisticsService: Failed to collect cluster ehis-prod info
2025-08-14T13:32:50.445139117Z java.lang.IllegalStateException: Error while creating AdminClient for the cluster ehis-prod
2025-08-14T13:32:50.445143377Z at io.kafbat.ui.service.AdminClientServiceImpl.lambda$createAdminClient$5(AdminClientServiceImpl.java:58)
2025-08-14T13:32:50.445157838Z at reactor.core.publisher.Mono.lambda$onErrorMap$29(Mono.java:3862)
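Since the full DEBUG logs are sensitive, a small filter along these lines could make it easier to share more of them. It is a rough sketch under stated assumptions: console output often carries ANSI color escapes (sometimes exported with the escape byte replaced by U+FFFD), and private IPv4 addresses are among the details worth masking. The `LogSanitizer` class and its `sanitize` method are hypothetical, and hostnames, account IDs, and role names are not covered.

```java
import java.util.regex.Pattern;

public class LogSanitizer {

    // ESC (0x1B), or the U+FFFD replacement character some exports substitute
    // for it, followed by an SGR color sequence such as "[0;39m".
    private static final Pattern ANSI =
        Pattern.compile("[\\u001B\\uFFFD]\\[[0-9;]*m");

    // Dotted-quad IPv4 addresses; no range validation, this is only a mask.
    private static final Pattern IPV4 =
        Pattern.compile("\\b\\d{1,3}(\\.\\d{1,3}){3}\\b");

    // Strips color sequences, then masks IPv4 addresses in a single log line.
    static String sanitize(String line) {
        String withoutColor = ANSI.matcher(line).replaceAll("");
        return IPV4.matcher(withoutColor).replaceAll("x.x.x.x");
    }

    public static void main(String[] args) {
        String raw = "\u001B[39mDEBUG\u001B[0;39m remote=broker/10.0.0.1:9098";
        System.out.println(sanitize(raw));
    }
}
```

Applied line by line to an exported log file, this leaves timestamps, thread names, and stack frames untouched.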
Additional context
No response