Add MariaDB migration code written by Claude #278

GeoffMontee wants to merge 1 commit into scylladb:master
Conversation
build fail, fix it first
```cpp
// ScyllaDBConnection Implementation
// ============================================================================

ScyllaDBConnection::ScyllaDBConnection(const ScyllaDBConfig& config)
```
eh?
Scylla is supported in Spark, so we don't need a C++ connection to it.
Not to mention that the C++ driver for Scylla is obsolete and will be replaced by its cpp-over-rust variant.
So this code is completely useless and suboptimal.
```cpp
 * Licensed under Apache License 2.0
 */

#include "mariadb_scylla_migrator.h"
```
I find it hard to believe MariaDB doesn't have a Spark DataFrame connector.
It seems it does: https://mariadb.com/ja/resources/blog/hands-on-mariadb-columnstore-spark-connector/
Maybe that would be more useful than a native C++ call?
Hi @tarzanek,
MariaDB ColumnStore is different from native MariaDB. That blog post is also from 2018, which is ancient history in terms of MariaDB ColumnStore maturity. That was back when ColumnStore was just rebranded InfiniDB.
I chose MariaDB Connector/C for this because it is the only MariaDB connector that has an API for the binlog.
Perhaps the binlog streaming/applying functionality should be separate from the Spark functionality. That would allow us to use something like MariaDB Connector/J for the Spark side, but still use MariaDB Connector/C for the binlog streaming/applying. The binlog work probably has to occur on one node at a time anyway; it can't be divided up and handed to multiple workers, because commit ordering is very important.
I'd love to meet with you sometime and discuss the best way to implement all of this. Let me know if you're down.
Thanks!
tarzanek left a comment
Compiling a native library for Spark seems like overkill.
(Executors can be heterogeneous, so the only thing guaranteed to be the same is the JDK version; ideally you build against that rather than relying on the underlying OS and its libraries.)
This was entirely written by Claude. We might need to fix some stuff. It's more of an intellectual exercise than a production-ready feature.