ref(migrations): Use ON CLUSTER for DDL statements #7668
Conversation
Refactor migration DDL operations to use ClickHouse's ON CLUSTER syntax instead of executing the same SQL on each node individually. This is more efficient and atomic for multi-node clusters.

Changes:
- Add `_get_on_cluster_clause()` helper to SqlOperation that returns the ON CLUSTER clause for multi-node clusters
- Add `_get_execution_node()` helper to get a single node for execution (ON CLUSTER handles distribution to the other nodes)
- Modify `execute()` to use single-node execution instead of per-node iteration
- Add `alter_sync=2` to MIGRATE settings so ClickHouse blocks until all replicas confirm completion (removes the need for mutation polling)
- Update all DDL operations (CreateTable, DropTable, AddColumn, etc.) to include the ON CLUSTER clause in their format_sql() methods
- Keep InsertIntoSelect on per-node execution (it's DML, not DDL)
- Remove _block_on_mutations() polling since alter_sync=2 handles this

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
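A minimal sketch of the idea, assuming a hypothetical `Cluster` stand-in with `is_single_node()` and a `cluster_name` attribute (the real Snuba cluster abstraction and method names may differ):

```python
from dataclasses import dataclass


@dataclass
class Cluster:
    """Hypothetical stand-in for the ClickHouse cluster abstraction."""

    cluster_name: str
    single_node: bool

    def is_single_node(self) -> bool:
        return self.single_node


def get_on_cluster_clause(cluster: Cluster) -> str:
    # On a single node there is nothing to coordinate, so no clause is emitted;
    # multi-node clusters get " ON CLUSTER <name>" spliced into the DDL.
    return "" if cluster.is_single_node() else f" ON CLUSTER {cluster.cluster_name}"


def format_add_column(cluster: Cluster, table: str, column_ddl: str) -> str:
    # With ON CLUSTER, ClickHouse distributes the DDL to every node itself,
    # so the migration only needs to execute the statement on one node.
    return f"ALTER TABLE {table}{get_on_cluster_clause(cluster)} ADD COLUMN {column_ddl}"


print(format_add_column(Cluster("snuba_cluster", single_node=False), "errors_local", "trace_id UUID"))
# -> ALTER TABLE errors_local ON CLUSTER snuba_cluster ADD COLUMN trace_id UUID
```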
Fix the ON CLUSTER clause to use the appropriate cluster name based on target:
- LOCAL target: uses cluster_name (for storage nodes)
- DISTRIBUTED target: uses distributed_cluster_name (for query nodes)

This fixes the test_distributed_migrations failures where distributed tables (like migrations_dist) weren't being created because we were using the wrong cluster name.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
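As a rough illustration of the target-based selection (the enum mirrors the LOCAL/DISTRIBUTED targets above; the function name is made up for this sketch):

```python
from enum import Enum


class OperationTarget(Enum):
    LOCAL = "local"
    DISTRIBUTED = "distributed"


def cluster_name_for_target(
    target: OperationTarget, cluster_name: str, distributed_cluster_name: str
) -> str:
    # LOCAL DDL (e.g. *_local tables) is coordinated across the storage cluster,
    # while DISTRIBUTED DDL (e.g. *_dist tables) targets the query-node cluster.
    if target == OperationTarget.DISTRIBUTED:
        return distributed_cluster_name
    return cluster_name
```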
Increase the MIGRATE client settings timeout from 10 seconds to 5 minutes (300000ms) to allow ON CLUSTER DDL operations to complete across all replicas. This is needed because alter_sync=2 blocks until all replicas confirm completion, which can take longer than the previous 10 second timeout on larger clusters. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
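Conceptually, the MIGRATE settings end up looking something like the sketch below (`alter_sync` is a real ClickHouse setting; the constant names here are placeholders, not the actual Snuba code):

```python
# alter_sync=2 makes ALTER/DDL block until every replica has applied the change,
# which is why the client-side timeout also has to grow from 10s to 5 minutes.
MIGRATE_CLIENT_SETTINGS = {"alter_sync": 2}
MIGRATE_TIMEOUT_MS = 300_000  # 5 minutes, up from the previous 10_000 ms
```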
RunSql runs arbitrary SQL statements that may not support ON CLUSTER syntax (e.g., queries, DML, or statements that already contain ON CLUSTER). Override execute() to use per-node execution similar to InsertIntoSelect. This fixes the distributed_migrations test failures where RunSql was only executing on one node but subsequent AddIndex operations using ON CLUSTER expected the column to exist on all nodes. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Add _execute_per_node() method to SqlOperation base class and use it in both RunSql and InsertIntoSelect instead of duplicating the same logic. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
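A sketch of what such a shared helper could look like, with generic node/connection types standing in for Snuba's real ones:

```python
from typing import Any, Callable, Iterable


def execute_per_node(
    nodes: Iterable[Any],
    get_connection: Callable[[Any], Any],
    statement: str,
) -> None:
    # Run the same statement once per node. Used by operations that cannot rely
    # on ON CLUSTER: InsertIntoSelect (DML) and RunSql statements without it.
    for node in nodes:
        get_connection(node).execute(statement)
```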
The previous approach of patching ClickhouseCluster class methods didn't work reliably because get_cluster() returns cached cluster instances. Instead, patch get_cluster directly to return a mock cluster with the desired configuration. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Update all tests in test_operations.py to mock get_cluster so they work consistently in both single-node and multi-node test environments. This ensures tests produce deterministic results regardless of cluster config. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
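A sketch of the mocking approach, assuming the lookup path `snuba.migrations.operations.get_cluster` and the helper name used for illustration; the real test code may differ:

```python
from unittest import mock


def _make_mock_cluster(single_node: bool, cluster_name: str = "test_cluster") -> mock.Mock:
    cluster = mock.Mock()
    cluster.is_single_node.return_value = single_node
    cluster.get_clickhouse_cluster_name.return_value = cluster_name
    return cluster


def test_add_column_uses_on_cluster() -> None:
    # Patch get_cluster where the operations module looks it up, so the cached
    # cluster instances returned by the real get_cluster() are never consulted.
    with mock.patch(
        "snuba.migrations.operations.get_cluster",  # assumed lookup path
        return_value=_make_mock_cluster(single_node=False),
    ):
        ...  # build the operation and assert "ON CLUSTER test_cluster" appears in its SQL
```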
snuba/migrations/operations.py
Outdated
def execute(self) -> None:
    self._execute_per_node()
If the statement contains ON CLUSTER syntax but you execute it per node, will that cause a problem?
The query will fail.
The class docstring is incorrect then, and this seems strange to me because it implies that RunSql migrations can only be run on a single-node basis. If that's the case, we should either validate at instantiation that the operation does not contain ON CLUSTER, or make running the query per node an optional behavior rather than the default.
Good point.
- Remove SQL truncation in migration logs (was truncating to 32 chars)
- Add test verifying ON CLUSTER DDL fails on single-node without Zookeeper

The test documents that ON CLUSTER operations require Zookeeper/Keeper for distributed DDL coordination, validating that our migration code correctly uses is_single_node() to avoid ON CLUSTER on single nodes.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
RunSql now checks whether the SQL statement contains ON CLUSTER:
- If present: uses single-node execution (the parent's execute())
- If absent: uses per-node execution (_execute_per_node())

This allows RunSql to properly handle both cases:
- SQL with an explicit ON CLUSTER that should only run once
- SQL without ON CLUSTER that needs to run on each node

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
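The detection itself boils down to something like this sketch (the two callables are placeholders for the parent's single-node execute() and the per-node helper):

```python
from typing import Callable


def run_sql(
    statement: str,
    execute_single_node: Callable[[str], None],
    execute_per_node: Callable[[str], None],
) -> None:
    # SQL that already carries ON CLUSTER must run exactly once and let
    # ClickHouse fan it out; everything else is executed on each node.
    if "ON CLUSTER" in statement.upper():
        execute_single_node(statement)
    else:
        execute_per_node(statement)
```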
volokluev
left a comment
Once the tests pass, feel free to merge
The test assumed Zookeeper wasn't configured, which isn't true in CI. The ON CLUSTER behavior is already covered by unit tests that mock the cluster configuration. Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Summary
Refactor migration DDL operations to use ClickHouse's ON CLUSTER syntax instead of executing the same SQL on each node individually. This is more efficient and atomic for multi-node clusters since ClickHouse handles distributing the DDL to all nodes.

Changes
Core refactoring:
- Add `_get_on_cluster_clause()` helper that returns the appropriate ON CLUSTER clause based on target type:
  - `LOCAL` target → uses `cluster_name` (for storage nodes)
  - `DISTRIBUTED` target → uses `distributed_cluster_name` (for query nodes)
- Add `_get_execution_node()` helper to get a single node for execution (ON CLUSTER handles distribution)
- Modify `execute()` to use single-node execution with ON CLUSTER syntax
- Add `_execute_per_node()` helper for operations that don't support ON CLUSTER

DDL operations updated to include ON CLUSTER:
- `CreateTable`, `CreateMaterializedView`, `RenameTable`, `DropTable`, `TruncateTable`
- `AddColumn`, `DropColumn`, `ModifyColumn`
- `ModifyTableTTL`, `RemoveTableTTL`
- `ModifyTableSettings`, `ResetTableSettings`
- `AddIndex`, `AddIndices`, `DropIndex`, `DropIndices`

Operations with smart execution:
- `RunSql` - detects whether the SQL contains `ON CLUSTER` (case-insensitive) and picks single-node or per-node execution accordingly
- `InsertIntoSelect` - DML operation, always uses per-node execution

Settings changes:
- Add `alter_sync=2` to MIGRATE settings to ensure ClickHouse blocks until all replicas confirm completion

Logging improvements:
- Remove SQL truncation in migration logs (was truncating to 32 chars)
Test improvements:
- Tests in `test_operations.py` now mock `get_cluster` for deterministic behavior
- Added `_make_single_node_mock_cluster()` and `_make_mock_cluster()` helpers
- Added tests for `RunSql` ON CLUSTER detection behavior

Test plan
- `pytest tests/migrations/test_operations.py -v`
- `pytest tests/migrations/ -v`
- `mypy`

🤖 Generated with Claude Code