Allow defining multiple source kafka topics #2060
Fadelis wants to merge 1 commit into GoogleCloudPlatform:main
This pull request has been marked as stale due to 180 days of inactivity. It will be closed in 1 week if no further activity occurs. If you think that’s incorrect or this pull request requires a review, please simply write any comment. If closed, you can revive the PR at any time. Thank you for your contributions.
Bump @GoogleCloudPlatform/dataflow-templates-wg
…ist kafka topic with the message
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #2060 +/- ##
============================================
+ Coverage 49.57% 49.58% +0.01%
- Complexity 4792 5142 +350
============================================
Files 941 941
Lines 57618 57628 +10
Branches 6235 6236 +1
============================================
+ Hits 28565 28577 +12
+ Misses 27010 27006 -4
- Partials 2043 2045 +2
The new Flex templates support reading from a single topic only (the example is incorrect; I'll fix it), because they have a lot of extra functionality compared to the legacy templates (e.g. reading messages with different schemas from the same topic, writing to different BigQuery tables based on schema, etc.). Supporting this while also reading from multiple topics is not straightforward and will require more changes than just parsing the input topic as a list. At the minimum, we would need better support for the …
All in all, IMO all this would make the template way too complicated. It's supposed to cover only some basic use cases. If someone has a complex pipeline that requires all this functionality while also reading from multiple topics, I'd recommend writing a custom pipeline (or a custom template) for this use case instead.
Being able to listen to multiple Kafka topics in Dataflow would be very valuable. It was supported in the non-Flex templates, and the base options example hints at it as well, but it was never implemented (there is a bug for it: #2038).
DataflowTemplates/v2/kafka-common/src/main/java/com/google/cloud/teleport/v2/kafka/options/KafkaReadOptions.java
Line 40 in ff60a5f
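To make the idea concrete, here is a minimal sketch of the parsing step, assuming the topics arrive as a single comma-separated string. The class name `TopicListParser` and the input format are hypothetical and not taken from this PR; the resulting list has the shape that Beam's `KafkaIO.Read#withTopics(List<String>)` accepts.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

/** Hypothetical helper (not part of the PR): splits a comma-separated topic parameter. */
public class TopicListParser {

  /** Returns the distinct, trimmed topic names in the order they were given. */
  public static List<String> parse(String topicsParam) {
    if (topicsParam == null) {
      throw new IllegalArgumentException("At least one Kafka topic is required");
    }
    // LinkedHashSet drops duplicates while preserving the original order.
    Set<String> topics = new LinkedHashSet<>();
    for (String raw : topicsParam.split(",")) {
      String topic = raw.trim();
      if (!topic.isEmpty()) {
        topics.add(topic);
      }
    }
    if (topics.isEmpty()) {
      throw new IllegalArgumentException("At least one Kafka topic is required");
    }
    return new ArrayList<>(topics);
  }

  public static void main(String[] args) {
    // Prints [orders, payments]
    System.out.println(parse("orders, payments,orders"));
  }
}
```

A single topic name passes through unchanged, so existing single-topic configurations would keep working under this assumption.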
With the possibility of listening to multiple topics, it would also be very beneficial to be able to enrich the stored record with the topic of the message, so I added this functionality in a similar fashion to the persistKafkaKey option.
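As a rough illustration of the enrichment described above, the sketch below adds the source topic to an output row only when a flag is set, mirroring how a key-persisting flag might behave. It uses plain Java maps rather than the template's real types; the class name `TopicEnricher`, the `persistTopic` flag, and the `_key` and `_topic` field names are all assumptions for illustration, not the PR's actual code.

```java
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch (not the PR's code): optionally adds key and topic to a row. */
public class TopicEnricher {

  // Assumed output field names, chosen only for this example.
  static final String KEY_FIELD = "_key";
  static final String TOPIC_FIELD = "_topic";

  /** Copies the payload and adds the key and/or topic fields when requested. */
  public static Map<String, Object> toRow(
      String topic,
      String key,
      Map<String, Object> payload,
      boolean persistKey,
      boolean persistTopic) {
    Map<String, Object> row = new LinkedHashMap<>(payload);
    if (persistKey) {
      row.put(KEY_FIELD, key);
    }
    if (persistTopic) {
      // With several source topics, this records where each message came from.
      row.put(TOPIC_FIELD, topic);
    }
    return row;
  }

  public static void main(String[] args) {
    Map<String, Object> payload = new LinkedHashMap<>();
    payload.put("amount", 42);
    // Prints {amount=42, _topic=orders}
    System.out.println(toRow("orders", "k1", payload, false, true));
  }
}
```

Keeping the topic behind its own flag means single-topic users see no change in the output schema unless they opt in.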