Description
Hello! I'm trying to do basic message backups to S3 using the S3 sink connector (https://docs.confluent.io/kafka-connectors/s3-sink/current/overview.html#schema-evolution).
When I try to use the s3-source connector to restore the messages into a brand-new Kafka cluster, I get this error:
[2023-08-17 16:03:31,521] ERROR [s3-source|task-0] WorkerSourceTask{id=s3-source-0} Task threw an uncaught and unrecoverable exception. Task is being killed and will not recover until manually restarted (org.apache.kafka.connect.runtime.WorkerTask:210)
org.apache.kafka.connect.errors.ConnectException: Error while executing read with record co-ordinates : RecordCoordinates [storagePartition=topics/staging-teambank-risk-assessment-calculated/, startOffset=0, endOffset=4904]
at io.confluent.connect.cloud.storage.errorhandler.handlers.ReThrowErrorHandler.handle(ReThrowErrorHandler.java:21)
at io.confluent.connect.cloud.storage.source.util.StorageObjectSourceReader.nextRecord(StorageObjectSourceReader.java:69)
at io.confluent.connect.cloud.storage.source.StorageSourceTask.poll(StorageSourceTask.java:161)
at org.apache.kafka.connect.runtime.AbstractWorkerSourceTask.poll(AbstractWorkerSourceTask.java:457)
at org.apache.kafka.connect.runtime.AbstractWorkerSourceTask.execute(AbstractWorkerSourceTask.java:351)
at org.apache.kafka.connect.runtime.WorkerTask.doRun(WorkerTask.java:202)
at org.apache.kafka.connect.runtime.WorkerTask.run(WorkerTask.java:257)
at org.apache.kafka.connect.runtime.AbstractWorkerSourceTask.run(AbstractWorkerSourceTask.java:75)
at org.apache.kafka.connect.runtime.isolation.Plugins.lambda$withClassLoader$1(Plugins.java:177)
at java.base/java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:515)
at java.base/java.util.concurrent.FutureTask.run(FutureTask.java:264)
at java.base/java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1128)
at java.base/java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:628)
at java.base/java.lang.Thread.run(Thread.java:829)
Caused by: java.lang.ClassCastException: class java.nio.HeapByteBuffer cannot be cast to class org.apache.avro.generic.GenericRecord (java.nio.HeapByteBuffer is in module java.base of loader 'bootstrap'; org.apache.avro.generic.GenericRecord is in unnamed module of loader org.apache.kafka.connect.runtime.isolation.PluginClassLoader @2beee7ff)
at io.confluent.connect.cloud.storage.source.format.CloudStorageAvroFormat.extractRecord(CloudStorageAvroFormat.java:75)
at io.confluent.connect.cloud.storage.source.StorageObjectFormat.nextRecord(StorageObjectFormat.java:72)
at io.confluent.connect.cloud.storage.source.util.StorageObjectSourceReader.nextRecord(StorageObjectSourceReader.java:65)
... 12 more
[2023-08-17 16:03:31,523] INFO [s3-source|task-0] Stopping storage source connector (io.confluent.connect.cloud.storage.source.StorageSourceTask:233)
Configs for cluster 1
This is the cluster where I'm backing up messages with the s3-sink connector. Connect runs in a separate pod from Kafka and is started with:
/opt/bitnami/kafka/bin/connect-standalone.sh /config/connect-standalone.properties /config/sink.properties

connect-standalone.properties
bootstrap.servers=kafka-0.kafka-headless.kafka.svc.cluster.local:9093,kafka-1.kafka-headless.kafka.svc.cluster.local:9093,kafka-2.kafka-headless.kafka.svc.cluster.local:9093
offset.flush.interval.ms=10000
offset.storage.file.filename=/tmp/connect.offsets
plugin.path=/opt/bitnami/kafka/plugins
key.converter=org.apache.kafka.connect.converters.ByteArrayConverter
value.converter=org.apache.kafka.connect.converters.ByteArrayConverter

sink.properties
name=s3-sink
connector.class=io.confluent.connect.s3.S3SinkConnector
topics.regex=.*
flush.size=10000
rotate.schedule.interval.ms=600000
locale=en_US
timezone=Europe/Berlin
format.class=io.confluent.connect.s3.format.avro.AvroFormat
schema.compatibility=NONE
schema.generator.class=io.confluent.connect.storage.hive.schema.DefaultSchemaGenerator
partition.duration.ms=600000
partitioner.class=io.confluent.connect.storage.partitioner.TimeBasedPartitioner
path.format='year'=YYYY/'month'=MM/'day'=dd/'hour'=HH
storage.class=io.confluent.connect.s3.storage.S3Storage
store.url=https://XXXXXXXXXXXXXXXX
s3.bucket.name=kafka-backup-testing
s3.bucket.tagging=true
s3.part.size=5242880
aws.access.key.id=XXXXXXXXXXXXXXXX
aws.secret.access.key=XXXXXXXXXXXXXXXX
behavior.on.error=fail

Configs for cluster 2
This is the cluster where I'm trying to restore messages with the s3-source connector. Connect runs in a separate pod from Kafka and is started with:
/opt/bitnami/kafka/bin/connect-standalone.sh /config/connect-standalone.properties /config/source.properties

connect-standalone.properties
(same as cluster 1)
source.properties
name=s3-source
connector.class=io.confluent.connect.s3.source.S3SourceConnector
confluent.topic.bootstrap.servers=kafka-0.kafka-headless.kafka.svc.cluster.local:9093,kafka-1.kafka-headless.kafka.svc.cluster.local:9093,kafka-2.kafka-headless.kafka.svc.cluster.local:9093
tasks.max=1
format.class=io.confluent.connect.s3.format.avro.AvroFormat
partitioner.class=io.confluent.connect.storage.partitioner.TimeBasedPartitioner
path.format='year'=YYYY/'month'=MM/'day'=dd/'hour'=HH
storage.class=io.confluent.connect.s3.storage.S3Storage
store.url=https://XXXXXXXXXXXXXXXX
s3.bucket.name=kafka-backup-testing
aws.access.key.id=XXXXXXXXXXXXXXXX
aws.secret.access.key=XXXXXXXXXXXXXXXX
behavior.on.error=fail

What am I doing wrong here?
By the way, the messages are Protobuf-encoded, but I don't want to lock the backup into that format by using the Protobuf converter. My understanding is that ByteArrayConverter is the right choice here, since I only want to back up and restore the messages as-is, byte for byte.
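In case it's relevant: as a fallback I've been considering switching both sides from AvroFormat to the byte-array format, so the S3 objects would hold the raw message bytes instead of Avro containers. This is only an untested sketch based on my reading of the docs; I'm assuming io.confluent.connect.s3.format.bytearray.ByteArrayFormat is supported by both the sink and the source connector:

# sink.properties (only the lines that would change) -- untested sketch
format.class=io.confluent.connect.s3.format.bytearray.ByteArrayFormat

# source.properties (only the lines that would change) -- untested sketch
format.class=io.confluent.connect.s3.format.bytearray.ByteArrayFormat

Would that be the recommended way to do raw backups, or should AvroFormat also work together with ByteArrayConverter?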