Add MariaDB migration code written by Claude #278

GeoffMontee wants to merge 1 commit into scylladb:master
Conversation
build fail, fix it first
```cpp
// ScyllaDBConnection Implementation
// ============================================================================

ScyllaDBConnection::ScyllaDBConnection(const ScyllaDBConfig& config)
```
eh?
Scylla is supported in Spark, so we don't need a C++ connection to it.
Not to mention that the C++ driver for Scylla is obsolete and will be replaced by its cpp-over-rust variant.
So this code is completely useless and suboptimal.
```cpp
 * Licensed under Apache License 2.0
 */

#include "mariadb_scylla_migrator.h"
```
I find it hard to believe MariaDB doesn't have a Spark DataFrame connector.
It seems it does: https://mariadb.com/ja/resources/blog/hands-on-mariadb-columnstore-spark-connector/
Maybe that would be more useful than a native C++ call?
Hi @tarzanek,
MariaDB ColumnStore is different from native MariaDB. That blog post is also from 2018, which is ancient history in terms of MariaDB ColumnStore maturity. That was back when ColumnStore was just rebranded InfiniDB.
I chose MariaDB Connector/C for this because it is the only MariaDB connector that has an API for the binlog.
Perhaps the binlog streaming/applying functionality should be separate from the Spark functionality. That would allow us to use something like MariaDB Connector/J for the Spark side, but still use MariaDB Connector/C for the binlog streaming/applying. The binlog work probably has to occur on one node at a time anyway; it can't be divided up and handed to multiple workers, because commit ordering is very important.
I'd love to meet with you sometime and discuss the best way to implement all of this. Let me know if you're down.
Thanks!
tarzanek left a comment
Compiling a native library for Spark seems like overkill.
(Executors can be heterogeneous, so the only thing guaranteed to be the same is the JDK version; ideally you build against that rather than relying on the underlying OS and its libraries.)
This was entirely written by Claude. We might need to fix some stuff. It's more of an intellectual exercise than a production-ready feature.