understanding kafka connect topics and configurations (offset.storage.topic, config.storage.topic, status.storage.topic) #7738
-
Hi all, I have a problem for which I now have a workaround, but I'm trying to figure out the root cause and, while doing so, understand whether I have misconfigured my internal Kafka Connect topics. As part of the resource that defines the KafkaConnect we have:
What I experienced is that I updated my connector, which depends on a file in the file system, and the connectors kept failing because the file exists at the location given in the new configuration but not at the one given in the old configuration. I checked the topic itself with the Kafka CLI and noticed it had these configurations:
Now the problem I was experiencing was that, at each restart, the tasks and the connector itself seemed to pick up the older configurations and not the ones defined in the "kind: KafkaConnector" resource. My question is: do I need to define any special kind of Kafka policy on these topics so that they always pick up the correct values? I was wondering if there is some KafkaTopic configuration I need to define to prevent this issue. To be honest, I could not find a way to configure this at the topic level with the topic resource, and I also don't know whether this is required or expected. I was looking at the KafkaTopic CRD
Not only did I not find anything obvious that I needed to configure, but I would also prefer not to tinker with these defaults. Can someone give me some hints on how I might be misconfiguring these internal topics? UPDATE: I now have a better explanation of the problem and of how I worked around it, but I'm still confused about whether this is really the expected result or whether I missed something. In short, the main problem is this:
Say we have a KafkaConnector YAML resource definition, let's call it version 1, that defines X tasks, and they reference a properties file located at, say
You update the YAML configuration to version 2, the application has changed slightly (new KafkaConnect version), and the properties file is now referenced somewhere else:
After this update the tasks go into an error state, reporting that the file from version 1 (connection-details/jdbc-details.properties) cannot be found. That is true: they should now be looking for the file from version 2. Initially I thought the problem was the connector and that my configurations were not being picked up from Kafka. To work around this, I tried restarting the tasks through the Kafka Connect REST API, but that did not work. I needed to change the YAML to reduce the number of tasks to, say, 1 instead of whatever was defined before.
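One way to confirm which configuration the tasks are actually running with is to look at the output of the Kafka Connect REST endpoint `GET /connectors/{name}/tasks`, which lists each task's id and config. The following is a minimal sketch that assumes the response has already been fetched and parsed as JSON; the function name, the connector name, and the `connection.url` key are hypothetical and only serve as illustration.

```python
from typing import Any


def tasks_with_stale_path(tasks: list[dict[str, Any]], stale_path: str) -> list[int]:
    """Return the ids of tasks whose config still mentions stale_path.

    `tasks` is expected to have the shape returned by
    GET /connectors/{name}/tasks:
    [{"id": {"connector": "...", "task": 0}, "config": {...}}, ...]
    """
    stale = []
    for entry in tasks:
        task_id = entry["id"]["task"]
        config = entry.get("config", {})
        # Check every config value, since the key holding the path is
        # connector-specific and not known in advance.
        if any(stale_path in str(value) for value in config.values()):
            stale.append(task_id)
    return sorted(stale)


if __name__ == "__main__":
    # Hypothetical response: task 0 still points at the version 1 file.
    response = [
        {"id": {"connector": "my-connector", "task": 0},
         "config": {"connection.url": "${file:connection-details/jdbc-details.properties:url}"}},
        {"id": {"connector": "my-connector", "task": 1},
         "config": {"connection.url": "${file:some/new/location.properties:url}"}},
    ]
    print(tasks_with_stale_path(response, "connection-details/jdbc-details.properties"))
```

If the list is non-empty after the connector was updated, the task configs stored by Connect were not replaced, which would match the behaviour described above.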
Replies: 2 comments 11 replies
-
TBH, I don't really follow the problem from your description. But Kafka Connect manages its topics on its own, so normally you should not interfere with them (it also validates some of the topic configurations at startup).
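For reference, a rough sketch of the kind of checks that are usually expected of the internal topics, assuming (from the Kafka Connect documentation as I understand it) that all three should use `cleanup.policy=compact` and that the config topic should have exactly one partition. The function name and arguments are hypothetical, and this is not the code Connect itself runs at startup.

```python
def internal_topic_problems(topic_config: dict, partitions: int,
                            require_single_partition: bool = False) -> list:
    """Return a list of human-readable problems for one internal topic.

    topic_config: topic-level settings, e.g. {"cleanup.policy": "compact"}.
    partitions: number of partitions of the topic.
    require_single_partition: set to True for the config.storage.topic.
    """
    problems = []
    # A topic without an explicit policy falls back to the broker default,
    # which is "delete" unless the broker was configured otherwise.
    policy = topic_config.get("cleanup.policy", "delete")
    policies = {p.strip() for p in policy.split(",") if p.strip()}
    if policies != {"compact"}:
        problems.append(f"cleanup.policy is '{policy}', expected 'compact' only")
    if require_single_partition and partitions != 1:
        problems.append(f"topic has {partitions} partitions, expected exactly 1")
    return problems


if __name__ == "__main__":
    print(internal_topic_problems({"cleanup.policy": "compact"}, 1, True))
    print(internal_topic_problems({"cleanup.policy": "compact,delete"}, 3, True))
```

If the topics were created by Connect itself, these settings should already be in place, so a stale task configuration is unlikely to be fixed by changing them.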
-
FWIW, we've been bitten by this issue a number of times as well. As you said, our only known workaround when configs seem not to propagate is to resize down to 1 task then back up again.
It seems the config propagation logic is quite complex within Kafka Connect. See https://github.com/apache/kafka/blob/3.3.2/connect/runtime/src/main/java/org/apache/kafka/connect/storage/KafkaConfigBackingStore.java
From the JavaDoc:
Perhaps the `commit` entry never occurs, or some Task config state remains inconsistent preventing the new config from be…
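To make that hypothesis concrete, here is a deliberately simplified model of how task configs in the config topic only become visible once a matching commit record is read. It assumes the key layout described in the `KafkaConfigBackingStore` JavaDoc (`task-<connector>-<id>` and `commit-<connector>`, with the commit value carrying the expected number of tasks). The function and variable names are hypothetical, and the real implementation handles many more cases.

```python
def replay_config_log(records):
    """Replay (key, value) records from a config-topic-like log.

    Returns (applied, inconsistent):
      applied: connector -> {task_id: config} as a worker would see it
      inconsistent: connectors whose last commit did not match the
                    task configs written before it
    """
    applied = {}
    deferred = {}  # task configs written but not yet committed
    inconsistent = set()
    for key, value in records:
        if key.startswith("task-"):
            # Connector names may contain '-', so split on the last one.
            connector, task_id = key[len("task-"):].rsplit("-", 1)
            deferred.setdefault(connector, {})[int(task_id)] = value
        elif key.startswith("commit-"):
            connector = key[len("commit-"):]
            pending = deferred.pop(connector, {})
            if set(pending) == set(range(value["tasks"])):
                applied[connector] = pending
                inconsistent.discard(connector)
            else:
                # Keep the previously applied configs and flag the connector.
                inconsistent.add(connector)
        # Other record types (e.g. connector-<name>) are ignored in this model.
    return applied, inconsistent


if __name__ == "__main__":
    old, new = {"file": "version-1"}, {"file": "version-2"}
    log = [("task-c-0", old), ("task-c-1", old), ("commit-c", {"tasks": 2}),
           ("task-c-0", new), ("task-c-1", new)]  # no commit for version 2
    print(replay_config_log(log)[0])  # tasks still see the version 1 config
```

In this model, new task configs without a following commit leave the old configs in effect, which is the symptom reported in this thread. Whether that is what actually happens in Connect here is not confirmed.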