Closed
Description
Search before asking
- I searched in the issues and found nothing similar.
Version
2.10.5
Minimal reproduce step
- Create the tenant, namespace, and partitioned topic
bin/pulsar-admin tenants create tenant1
bin/pulsar-admin namespaces create tenant1/namespace1
bin/pulsar-admin namespaces set-persistence --bookkeeper-ack-quorum 2 --bookkeeper-ensemble 3 --bookkeeper-write-quorum 3 --ml-mark-delete-max-rate 0 tenant1/namespace1
bin/pulsar-admin namespaces set-retention tenant1/namespace1 --size -1 --time 3d
bin/pulsar-admin namespaces set-message-ttl tenant1/namespace1 --messageTTL 604800
bin/pulsar-admin topics create-partitioned-topic tenant1/namespace1/topic1 -p 3
- Produce large, batched payloads with pulsar-perf over TLS
bin/pulsar-perf produce persistent://tenant1/namespace1/topic1 -mk autoIncrement -bb 5242880 -r 5000 -s 5242 -bm 1000 -threads 30 --auth-plugin org.apache.pulsar.client.impl.auth.AuthenticationTls --auth-params '{"tlsCertFile":"conf/user.cer","tlsKeyFile":"conf/user.key.pem"}'
- Stop the producer once it has produced around 1 million messages
- Wait until all the messages are in the BookKeeper backlog
- Start a consumer over TLS to consume all the messages
bin/pulsar-perf consume persistent://tenant1/namespace1/topic1 --auth-plugin org.apache.pulsar.client.impl.auth.AuthenticationTls --auth-params '{"tlsCertFile":"conf/user.cer","tlsKeyFile":"conf/user.key.pem"}' -sp Earliest -ss sub1
What did you expect to see?
The consumer should be able to consume all produced messages.
What did you see instead?
The consumer stopped receiving messages partway through, and the broker logs showed errors like:
2024-01-19T14:05:39,899+0000 [BookKeeperClientWorker-OrderedExecutor-4-0] ERROR org.apache.bookkeeper.proto.checksum.DigestManager - Mac mismatch for ledger-id: 852, entry-id: 35932
2024-01-19T14:05:39,902+0000 [BookKeeperClientWorker-OrderedExecutor-4-0] ERROR org.apache.bookkeeper.proto.checksum.DigestManager - Mac mismatch for ledger-id: 852, entry-id: 35932
2024-01-19T14:05:39,916+0000 [BookKeeperClientWorker-OrderedExecutor-4-0] ERROR org.apache.bookkeeper.proto.checksum.DigestManager - Mac mismatch for ledger-id: 852, entry-id: 35932
2024-01-19T14:05:39,916+0000 [BookKeeperClientWorker-OrderedExecutor-4-0] ERROR org.apache.bookkeeper.client.PendingReadOp - Read of ledger entry failed: L852 E35899-E35998, Sent to [100.87.157.209:3181, 100.111.147.236:3181, 100.96.184.253:3181], Heard from [100.87.157.209:3181, 100.111.147.236:3181, 100.96.184.253:3181] : bitset = {0, 1, 2}, Error = 'Entry digest does not match'. First unread entry is (35973, rc = 0)
2024-01-19T14:05:39,916+0000 [broker-topic-workers-OrderedExecutor-15-0] ERROR org.apache.pulsar.broker.service.persistent.PersistentDispatcherSingleActiveConsumer - [persistent://tenant1/namespace1/topic1-0 / sub1-Consumer{subscription=PersistentSubscription{topic=persistent://tenant1/namespace1/topic1-0, name=sub1}, consumerId=0, consumerName=383fd, address=/100.96.184.253:50090}] Error reading entries at 852:35899 : Entry digest does not match - Retrying to read in 15.0 seconds
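To see how many entries are affected, the broker log can be scanned for the DigestManager error lines. The following is a minimal sketch, not part of Pulsar or BookKeeper tooling: the function name `find_digest_mismatches` and the regular expression are assumptions based only on the log format shown above.

```python
import re
from collections import Counter

# Assumed pattern, derived from the DigestManager error lines quoted above.
MISMATCH_RE = re.compile(r"Mac mismatch for ledger-id: (\d+), entry-id: (\d+)")


def find_digest_mismatches(lines):
    """Count digest-mismatch reports per (ledger_id, entry_id) pair."""
    counts = Counter()
    for line in lines:
        match = MISMATCH_RE.search(line)
        if match:
            ledger_id = int(match.group(1))
            entry_id = int(match.group(2))
            counts[(ledger_id, entry_id)] += 1
    return counts


# Sample lines copied from the broker log in this report.
sample = [
    "2024-01-19T14:05:39,899+0000 [BookKeeperClientWorker-OrderedExecutor-4-0] ERROR "
    "org.apache.bookkeeper.proto.checksum.DigestManager - "
    "Mac mismatch for ledger-id: 852, entry-id: 35932",
    "2024-01-19T14:05:39,902+0000 [BookKeeperClientWorker-OrderedExecutor-4-0] ERROR "
    "org.apache.bookkeeper.proto.checksum.DigestManager - "
    "Mac mismatch for ledger-id: 852, entry-id: 35932",
    "2024-01-19T14:05:39,916+0000 [BookKeeperClientWorker-OrderedExecutor-4-0] ERROR "
    "org.apache.bookkeeper.proto.checksum.DigestManager - "
    "Mac mismatch for ledger-id: 852, entry-id: 35932",
    "2024-01-19T14:05:39,916+0000 [BookKeeperClientWorker-OrderedExecutor-4-0] ERROR "
    "org.apache.bookkeeper.client.PendingReadOp - "
    "Read of ledger entry failed: L852 E35899-E35998",
]

mismatches = find_digest_mismatches(sample)
for (ledger_id, entry_id), count in sorted(mismatches.items()):
    print(f"ledger {ledger_id} entry {entry_id}: {count} mismatch report(s)")
```

On a real broker the same function can be fed an open log file instead of the sample list; the file path depends on the deployment. In the lines above, the same entry is reported three times, which may correspond to the three bookies listed in the PendingReadOp error, although the log alone does not confirm that.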
Anything else?
This seems to happen only when an SSL exception occurs in the middle of producing, such as:
2024-01-19T13:39:13,450+0000 [pulsar-client-io-12-1] WARN org.apache.pulsar.client.impl.ClientCnx - Got exception io.netty.handler.codec.DecoderException: io.netty.handler.ssl.ReferenceCountedOpenSslEngine$OpenSslException: error:100003fc:SSL routines:OPENSSL_internal:SSLV3_ALERT_BAD_RECORD_MAC
at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:499)
at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:290)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)
at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:440)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919)
at io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe.epollInReady(AbstractEpollStreamChannel.java:800)
at io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:499)
at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:397)
at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: io.netty.handler.ssl.ReferenceCountedOpenSslEngine$OpenSslException: error:100003fc:SSL routines:OPENSSL_internal:SSLV3_ALERT_BAD_RECORD_MAC
at io.netty.handler.ssl.ReferenceCountedOpenSslEngine.newSSLExceptionForError(ReferenceCountedOpenSslEngine.java:1377)
at io.netty.handler.ssl.ReferenceCountedOpenSslEngine.shutdownWithError(ReferenceCountedOpenSslEngine.java:1089)
at io.netty.handler.ssl.ReferenceCountedOpenSslEngine.sslReadErrorResult(ReferenceCountedOpenSslEngine.java:1399)
at io.netty.handler.ssl.ReferenceCountedOpenSslEngine.unwrap(ReferenceCountedOpenSslEngine.java:1325)
at io.netty.handler.ssl.ReferenceCountedOpenSslEngine.unwrap(ReferenceCountedOpenSslEngine.java:1426)
at io.netty.handler.ssl.ReferenceCountedOpenSslEngine.unwrap(ReferenceCountedOpenSslEngine.java:1469)
at io.netty.handler.ssl.SslHandler$SslEngineType$1.unwrap(SslHandler.java:223)
at io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1353)
at io.netty.handler.ssl.SslHandler.decodeNonJdkCompatible(SslHandler.java:1257)
at io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1297)
at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:529)
at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:468)
... 15 more
or
2024-01-19T14:01:02,532+0000 [pulsar-client-io-6-1] WARN org.apache.pulsar.client.impl.ClientCnx - Got exception io.netty.handler.codec.DecoderException: io.netty.handler.ssl.ReferenceCountedOpenSslEngine$OpenSslException: error:10000438:SSL routines:OPENSSL_internal:TLSV1_ALERT_INTERNAL_ERROR
at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:499)
at io.netty.handler.codec.ByteToMessageDecoder.channelRead(ByteToMessageDecoder.java:290)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:444)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
at io.netty.channel.AbstractChannelHandlerContext.fireChannelRead(AbstractChannelHandlerContext.java:412)
at io.netty.channel.DefaultChannelPipeline$HeadContext.channelRead(DefaultChannelPipeline.java:1410)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:440)
at io.netty.channel.AbstractChannelHandlerContext.invokeChannelRead(AbstractChannelHandlerContext.java:420)
at io.netty.channel.DefaultChannelPipeline.fireChannelRead(DefaultChannelPipeline.java:919)
at io.netty.channel.epoll.AbstractEpollStreamChannel$EpollStreamUnsafe.epollInReady(AbstractEpollStreamChannel.java:800)
at io.netty.channel.epoll.EpollEventLoop.processReady(EpollEventLoop.java:499)
at io.netty.channel.epoll.EpollEventLoop.run(EpollEventLoop.java:397)
at io.netty.util.concurrent.SingleThreadEventExecutor$4.run(SingleThreadEventExecutor.java:997)
at io.netty.util.internal.ThreadExecutorMap$2.run(ThreadExecutorMap.java:74)
at io.netty.util.concurrent.FastThreadLocalRunnable.run(FastThreadLocalRunnable.java:30)
at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: io.netty.handler.ssl.ReferenceCountedOpenSslEngine$OpenSslException: error:10000438:SSL routines:OPENSSL_internal:TLSV1_ALERT_INTERNAL_ERROR
at io.netty.handler.ssl.ReferenceCountedOpenSslEngine.newSSLExceptionForError(ReferenceCountedOpenSslEngine.java:1377)
at io.netty.handler.ssl.ReferenceCountedOpenSslEngine.shutdownWithError(ReferenceCountedOpenSslEngine.java:1089)
at io.netty.handler.ssl.ReferenceCountedOpenSslEngine.sslReadErrorResult(ReferenceCountedOpenSslEngine.java:1399)
at io.netty.handler.ssl.ReferenceCountedOpenSslEngine.unwrap(ReferenceCountedOpenSslEngine.java:1325)
at io.netty.handler.ssl.ReferenceCountedOpenSslEngine.unwrap(ReferenceCountedOpenSslEngine.java:1426)
at io.netty.handler.ssl.ReferenceCountedOpenSslEngine.unwrap(ReferenceCountedOpenSslEngine.java:1469)
at io.netty.handler.ssl.SslHandler$SslEngineType$1.unwrap(SslHandler.java:223)
at io.netty.handler.ssl.SslHandler.unwrap(SslHandler.java:1353)
at io.netty.handler.ssl.SslHandler.decodeNonJdkCompatible(SslHandler.java:1257)
at io.netty.handler.ssl.SslHandler.decode(SslHandler.java:1297)
at io.netty.handler.codec.ByteToMessageDecoder.decodeRemovalReentryProtection(ByteToMessageDecoder.java:529)
at io.netty.handler.codec.ByteToMessageDecoder.callDecode(ByteToMessageDecoder.java:468)
... 15 more
Are you willing to submit a PR?
- I'm willing to submit a PR!