Support for distributed migrations #130

rcampos87 · 2025-08-15T20:37:50Z

The main motivation for this PR is to fix how migrations run by Django behave through a load balancer. When a ClickHouse cluster with multiple nodes sits behind a load balancer with round-robin in effect, the results can be inconsistent. Making migrations distributed means all nodes are aware of the migration data, so running manage.py migrate gives much more consistent results. It also makes distributing the migration data automatic. (See discussion #114)

When distributed_migrations and migration_cluster are both set, new distributed and local tables are created for migrations, and all migration querysets are routed to the distributed table.
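As a rough sketch of what a local/distributed table pair looks like in ClickHouse terms, the snippet below builds the two CREATE TABLE statements. The table names (`django_migrations_local`, `django_migrations_distributed`), the column list, the `rand()` sharding key and the helper name are assumptions made for illustration; they are not necessarily what the backend generates.

```python
def migration_table_ddl(cluster, database="default",
                        local="django_migrations_local",
                        distributed="django_migrations_distributed"):
    """Return (local_ddl, distributed_ddl) for a hypothetical migrations table pair."""
    # Hypothetical columns, loosely modeled on Django's migration recorder table.
    columns = "id Int64, app String, name String, applied DateTime64(6, 'UTC')"
    # The local table stores the rows on each node.
    local_ddl = (
        f"CREATE TABLE IF NOT EXISTS {database}.{local} ON CLUSTER '{cluster}' "
        f"({columns}) ENGINE = MergeTree ORDER BY id"
    )
    # The Distributed table stores nothing itself; it routes reads and writes
    # to the local tables on the cluster.
    distributed_ddl = (
        f"CREATE TABLE IF NOT EXISTS {database}.{distributed} ON CLUSTER '{cluster}' "
        f"({columns}) ENGINE = Distributed('{cluster}', {database}, {local}, rand())"
    )
    return local_ddl, distributed_ddl


local_ddl, distributed_ddl = migration_table_ddl("cluster")
print(local_ddl)
print(distributed_ddl)
```

Queries against the migrations data would then target the distributed table, so the answer no longer depends on which node the load balancer happens to pick.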

To test the load balancing use case, a new docker compose service was added for HAProxy. For simplicity, the existing ClickHouse nodes were placed behind the HAProxy.

An example configuration would be:

{
    "ENGINE": "clickhouse_backend.backend",
    "HOST": "load-balancer.dns",
    "PORT": 9004,
    ....
    "OPTIONS": {
        "distributed_migrations": True,
        "migration_cluster": "cluster",
    }
}

In my case, a ClickHouse cluster with 3 nodes is behind an AWS ELB, and every time I ran makemigrations or migrate I could get a different result. Using distributed migrations made all of these issues go away.
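The effect can be illustrated with a toy model. This is only a sketch: the `Node` class and helper names are hypothetical, each node is treated as holding its own part of the data, and replication is ignored entirely.

```python
from itertools import cycle


class Node:
    """A node with its own local migrations table."""

    def __init__(self, name):
        self.name = name
        self.applied = set()  # rows in this node's local migrations table


def record_migrations(nodes, migrations):
    """Record each migration on whichever node round-robin picks next."""
    balancer = cycle(nodes)
    for migration in migrations:
        next(balancer).applied.add(migration)


def visible_from(entry_node, nodes, distributed):
    """Return the migrations a client sees when its query lands on entry_node."""
    if distributed:
        # A distributed table fans the read out to every node's local table.
        return set().union(*(n.applied for n in nodes))
    # A plain local table only knows about rows written to this node.
    return set(entry_node.applied)


nodes = [Node("node1"), Node("node2"), Node("node3")]
record_migrations(nodes, ["0001_initial", "0002_add_field", "0003_add_index"])

local_views = [sorted(visible_from(n, nodes, distributed=False)) for n in nodes]
distributed_views = [sorted(visible_from(n, nodes, distributed=True)) for n in nodes]
print(local_views)
print(distributed_views)
```

In this model every node reports a different set of applied migrations when local tables are used, while all three nodes agree once reads go through the distributed view.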

jayvynl · 2025-08-22T09:48:26Z

Hi, could you add some tests for "Undo a Migration That’s Already Been Applied" when distributed_migrations is on?

rcampos87 · 2025-08-25T15:53:54Z

@jayvynl OK, tests added. When running the whole test suite, I ran into failures in tests_datetime.py, perhaps due to my location.

tests/migrations/test_loader.py

jayvynl

Thank you for the PR. I have left some comments, mainly about improving the tests. And don't forget to write a changelog entry.

jayvynl · 2025-08-28T15:32:05Z

I have confirmed the problem, but I am unable to find the cause. It's very strange, because distributed tables are already tested in the existing test cases, where insertions and mutations can be queried immediately on all nodes.

jayvynl · 2025-09-05T15:21:40Z

@rcampos87 According to the ClickHouse Distributed engine documentation, it's recommended to use replicated tables as the underlying tables.

If internal_replication is set to false (the default), data is written to all replicas. In this case, the Distributed table replicates data itself. This is worse than using replicated tables because the consistency of replicas is not checked and, over time, they will contain slightly different data.

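In this project's clickhouse-config, internal_replication is set to true, so the Distributed table writes each insert to a single replica and leaves replication to the underlying table. If you use plain MergeTree as the underlying table, nothing replicates the data to the other replicas, so the lag occurs.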
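The interaction between internal_replication and the underlying engine can be modeled roughly as below. This is a simplified sketch of the behavior described in the ClickHouse documentation, not code from this PR; the function name is hypothetical and it assumes the first replica in the list is the healthy one chosen for the write.

```python
def replicas_receiving_insert(replicas, internal_replication, underlying_replicated):
    """Return the replicas of one shard that end up holding a row
    inserted through the Distributed table (simplified model)."""
    if not internal_replication:
        # The Distributed table itself writes the row to every replica.
        return list(replicas)
    if underlying_replicated:
        # One replica receives the write, and the Replicated*MergeTree
        # table then copies it to the others.
        return list(replicas)
    # One replica receives the write, and a plain MergeTree table never
    # replicates it, so the other replicas do not see the row.
    return [replicas[0]]


shard = ["replica1", "replica2", "replica3"]
print(replicas_receiving_insert(shard, internal_replication=True, underlying_replicated=False))
print(replicas_receiving_insert(shard, internal_replication=True, underlying_replicated=True))
```

Under this model, the combination of internal_replication set to true and a non-replicated underlying table is the only case where a read routed to another replica misses the row.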

rcampos87 · 2025-09-05T15:31:41Z

Ah, I see. Thanks for looking into it, @jayvynl.

rcampos87 · 2025-09-08T18:31:34Z

@jayvynl Using ReplicatedMergeTree does seem to have solved the lag. I also added the tests you asked for. From what I read, ReplicatedMergeTree only works if there are replicas, so migrations still need to support MergeTree too.

rcampos87 · 2025-09-08T19:17:28Z

Ok, added a simple check for replicas.
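One way such a check could look is sketched below. This is an illustration under stated assumptions, not the code merged in this PR: the function name is hypothetical, and the input is assumed to be (shard_num, replica_num) rows read from the system.clusters table for the configured migration_cluster.

```python
from collections import Counter

# The rows could come from a query such as:
#   SELECT shard_num, replica_num FROM system.clusters WHERE cluster = 'cluster'


def choose_migration_engine(cluster_rows):
    """Pick the engine for the local migrations table.

    cluster_rows is an iterable of (shard_num, replica_num) tuples.
    """
    replicas_per_shard = Counter(shard_num for shard_num, _ in cluster_rows)
    # Replication only makes sense if at least one shard has more than one replica.
    has_replicas = any(count > 1 for count in replicas_per_shard.values())
    return "ReplicatedMergeTree" if has_replicas else "MergeTree"


print(choose_migration_engine([(1, 1), (1, 2)]))  # one shard, two replicas
print(choose_migration_engine([(1, 1), (2, 1)]))  # two shards, no replicas
```

Falling back to MergeTree keeps clusters without replicas working, at the cost of not replicating the migrations data.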

Handle distributed migrations

5ce8834

add extra test and update README.md

f63869e

fix

b14830f

jayvynl reviewed Aug 26, 2025


fix

8cfd5e9

fix

a386db6

rcampos87 added 2 commits September 8, 2025 20:10

fix

1e15bc9

fix

b4ed144

rcampos87 requested a review from jayvynl September 8, 2025 19:17

rcampos87 added 3 commits September 8, 2025 20:30

lint

478e0e6

simplify

bbf6df8

remove unused

60572dc

jayvynl merged commit 0abe336 into jayvynl:main Sep 18, 2025
60 checks passed

rcampos87 deleted the feature/distributed-migrations branch September 18, 2025 13:33
