debezium/dbz#1691 Add Python example demonstrating Debezium Connect-mode #400
KMohnishM wants to merge 1 commit into debezium:main
@kmos This is definitely a good point. Unfortunately, that is how pydbzengine does it; see https://github.com/memiiso/pydbzengine/tree/main/pydbzengine/debezium/libs But yes, it would be really nice if the Python code could just obtain the Java libs somehow.
But @jpechane, unless I am missing something, isn't the In addition, I don't believe we need the
@Naros That's true. The Let me verify that the example works correctly without committing the JAR files and update the PR accordingly. I will also remove the I will push an update shortly.
# Fallback: unknown schema type, return string representation
return str(value)

if schema_type == "STRUCT":
Debezium supports many more data types, such as timestamps. Could you extend the mapping so that they are converted properly?
Good point, thanks for calling this out.
I have extended the conversion logic to support Kafka Connect and Debezium logical types, including Timestamp, Date, Time, Decimal, and Debezium-specific types like MicroTimestamp and ZonedTimestamp.
The implementation is in connect_message.py. Let me know if there are any additional types you'd like to see covered.
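For readers following along, here is a minimal sketch of what such a logical-type conversion could look like. It is not the code in connect_message.py. The function name `convert_logical` is hypothetical, and the sketch assumes values arrive as raw integers, strings, or bytes rather than as Java objects. The schema names are the semantic type names used by Debezium and Kafka Connect. Returning UTC-aware datetimes for epoch-based timestamps is a design choice of this sketch, not something stated in the PR.

```python
from datetime import date, datetime, time, timedelta, timezone
from decimal import Decimal

EPOCH = datetime(1970, 1, 1, tzinfo=timezone.utc)


def convert_logical(schema_name, value, parameters=None):
    """Convert a raw value to a Python type based on its schema name.

    Hypothetical helper: unknown schema names pass the value through unchanged.
    """
    if value is None:
        return None
    if schema_name == "io.debezium.time.Timestamp":
        # Milliseconds since the Unix epoch.
        return EPOCH + timedelta(milliseconds=value)
    if schema_name == "io.debezium.time.MicroTimestamp":
        # Microseconds since the Unix epoch.
        return EPOCH + timedelta(microseconds=value)
    if schema_name == "io.debezium.time.Date":
        # Days since the Unix epoch.
        return date(1970, 1, 1) + timedelta(days=value)
    if schema_name == "io.debezium.time.MicroTime":
        # Microseconds past midnight.
        return (datetime.min + timedelta(microseconds=value)).time()
    if schema_name == "io.debezium.time.ZonedTimestamp":
        # ISO 8601 string; older Python versions do not accept a "Z" suffix
        # or arbitrary fractional-second precision in fromisoformat.
        return datetime.fromisoformat(value.replace("Z", "+00:00"))
    if schema_name == "org.apache.kafka.connect.data.Decimal":
        # Big-endian two's-complement unscaled value plus a "scale" parameter.
        scale = int((parameters or {}).get("scale", 0))
        unscaled = int.from_bytes(value, byteorder="big", signed=True)
        return Decimal(unscaled).scaleb(-scale)
    return value
```

A lookup table keyed by schema name would work equally well; the chain of `if` statements is used here only to keep each unit conversion visible.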
elif schema_type == "STRING":
    return str(value)

elif schema_type in ("INT8", "INT16", "INT32", "INT64"):
I wonder if we can support conversion to either Python native types or NumPy types, depending on user needs.
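One way this could be approached is to make the cast table a parameter. The following is only a sketch of the idea, not code from the PR: `NATIVE_CASTS`, `NUMPY_CASTS`, and `convert_int` are hypothetical names, and NumPy is treated as an optional dependency.

```python
# Default: every Connect integer schema type maps to Python's built-in int.
NATIVE_CASTS = {"INT8": int, "INT16": int, "INT32": int, "INT64": int}

try:
    import numpy as np

    # Fixed-width NumPy scalars that match the Connect integer widths.
    NUMPY_CASTS = {
        "INT8": np.int8,
        "INT16": np.int16,
        "INT32": np.int32,
        "INT64": np.int64,
    }
except ImportError:
    # NumPy is optional in this sketch.
    NUMPY_CASTS = None


def convert_int(value, schema_type, casts=NATIVE_CASTS):
    """Cast an integer value using the caller-supplied cast table."""
    if value is None:
        return None
    if schema_type not in casts:
        raise ValueError(f"not an integer schema type: {schema_type}")
    return casts[schema_type](value)
```

A caller who wants NumPy scalars would pass `casts=NUMPY_CASTS`; everyone else gets plain `int` values without importing NumPy at all.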
    field_value = value.get(field)
except Exception:
    field_value = None
result[field_name] = struct_to_dict(field_value, field.schema())
For some tables we might prefer predefined static classes. Can we support both struct_to_dict and a configuration in which we define a topic/table name to class name mapping?
Yes, I have added support for that.
The default behavior still returns dictionaries via struct_to_dict, but I introduced an optional topic/table-to-class mapping.
If a mapping is provided, records can be materialized into typed Python objects. The mapping supports full topic name, short topic, or table name for flexibility.
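To illustrate the lookup order described above, here is a minimal sketch. It is not the PR's implementation: `resolve_class`, `materialize`, and the `Customer` dataclass are hypothetical, and the sketch assumes topic names of the form `prefix.schema.table` and records that have already been converted to dictionaries.

```python
from dataclasses import dataclass


@dataclass
class Customer:
    # Hypothetical typed record for a "customers" table.
    id: int
    email: str


def resolve_class(topic, class_mapping):
    """Look up a class by full topic, then short topic, then table name."""
    parts = topic.split(".")
    candidates = [topic, ".".join(parts[1:]), parts[-1]]
    for key in candidates:
        if key in class_mapping:
            return class_mapping[key]
    return None


def materialize(topic, record, class_mapping=None):
    """Return a typed object if a mapping matches, else the plain dict."""
    cls = resolve_class(topic, class_mapping or {})
    return cls(**record) if cls else record
```

With `{"customers": Customer}` as the mapping, a record from topic `pg.public.customers` would become a `Customer` instance, while records from unmapped topics stay dictionaries.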
@KMohnishM Please squash the commits so that the JARs are not recorded in the git history.
… Mode Signed-off-by: Mohnish <kmohnishm@gmail.com>
Description
This PR adds a Python example demonstrating how to use Debezium in Connect-mode using the pydbzengine integration.
The example shows how Debezium can be run in Connect mode without JSON serialization overhead, allowing direct access to structured change events from Python applications.
The example includes a minimal setup for running a PostgreSQL connector and consuming CDC events using Python. It also provides a utility script to download the required Debezium dependencies via Maven.
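As a rough idea of what a Maven-based download step can look like, the sketch below shells out to the `dependency:copy-dependencies` goal of the Maven Dependency Plugin. It is an assumption-laden illustration, not the utility script from this PR: the function names, the `pom.xml` location, and the `libs` output directory are all hypothetical.

```python
import subprocess
from pathlib import Path


def build_maven_command(pom_file, output_dir):
    """Build the Maven invocation that copies all declared dependencies."""
    return [
        "mvn",
        "-f", str(pom_file),
        "dependency:copy-dependencies",
        f"-DoutputDirectory={output_dir}",
    ]


def download_dependencies(pom_file="pom.xml", output_dir="libs"):
    """Run Maven; requires `mvn` on PATH and a POM listing the Debezium JARs."""
    target = Path(output_dir).resolve()  # absolute, so Maven does not reinterpret it
    target.mkdir(parents=True, exist_ok=True)
    subprocess.run(build_maven_command(pom_file, target), check=True)
```

Calling `download_dependencies()` once before starting the engine would populate the output directory, which keeps the JAR files out of version control.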
Added a Python example project under debezium-python/