
Commit 11b96f2

clickhouse: prevent replicated tables from starting in read-only mode.
On start, ClickHouse compares the local state of each replicated table to its distributed state. If it finds a discrepancy, it starts the table in read-only mode. When this happens, oximeter can't write new records to the affected table(s).

In the past, we've worked around this with the `force_restore_data` sentinel file, but that requires someone to notice and intervene manually each time a table starts up in read-only mode. This patch sets the `replicated_max_ratio_of_wrong_parts` setting to 1.0 so that ClickHouse always accepts local state and never starts tables in read-only mode.

As described in ClickHouse/ClickHouse#66527, this appears to be a bug, or at least an ergonomic flaw, in ClickHouse. One replica of a table can routinely fall behind the others, e.g. due to a restart or network partition, and shouldn't require manual intervention to start back up.

Part of #8595.
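To illustrate why a threshold of 1.0 disables the check, here is a minimal sketch of a ratio-based startup decision. It is a simplified model, not ClickHouse's actual implementation: the function name `starts_read_only`, the part counts, and the exact comparison are assumptions made for illustration, and 0.5 is only an example threshold.

```rust
/// Hypothetical model of the startup check described in the commit message.
/// Returns true if the table would start in read-only mode.
/// This is NOT ClickHouse code; names and comparison are assumptions.
fn starts_read_only(wrong_parts: u64, total_parts: u64, max_ratio_of_wrong_parts: f64) -> bool {
    // With no parts at all there is nothing to disagree about.
    if total_parts == 0 {
        return false;
    }
    // Fraction of local parts that do not match the distributed state.
    let ratio = wrong_parts as f64 / total_parts as f64;
    ratio > max_ratio_of_wrong_parts
}

fn main() {
    // Example threshold of 0.5: 6 of 10 mismatched parts trips the check.
    println!("{}", starts_read_only(6, 10, 0.5));
    // Threshold of 1.0: the ratio can never exceed it, so the check never trips.
    println!("{}", starts_read_only(6, 10, 1.0));
    println!("{}", starts_read_only(10, 10, 1.0));
}
```

Under this model, a ratio can be at most 1.0, so setting the threshold to 1.0 means the comparison is never true, which matches the intent stated above.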
1 parent f33aa73 commit 11b96f2

1 file changed: +5 -1 lines changed

clickhouse-admin/types/src/config.rs

Lines changed: 5 additions & 1 deletion
@@ -201,9 +201,13 @@ impl ReplicaConfig {
         <max_tasks_in_queue>1000</max_tasks_in_queue>
     </distributed_ddl>
 
-    <!-- Disable sparse column serialization, which we expect to not need -->
     <merge_tree>
+        <!-- Disable sparse column serialization, which we expect to not need -->
         <ratio_of_defaults_for_sparse_serialization>1.0</ratio_of_defaults_for_sparse_serialization>
+
+        <!-- Prevent ClickHouse from setting distributed tables to read-only. -->
+        <!-- See https://github.com/oxidecomputer/omicron/issues/8595 for details. -->
+        <replicated_max_ratio_of_wrong_parts>1.0</replicated_max_ratio_of_wrong_parts>
     </merge_tree>
     {macros}
     {remote_servers}
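The `{macros}` and `{remote_servers}` placeholders suggest that this XML lives in a Rust format string inside `ReplicaConfig`. As a rough sketch of that pattern, the `<merge_tree>` section after this change could be rendered as below. The helper `merge_tree_xml` and its parameter are hypothetical and do not appear in the commit; only the two setting names and their values come from the diff.

```rust
/// Hypothetical helper (not part of the commit): renders the <merge_tree>
/// section with the two settings shown in the diff.
fn merge_tree_xml(max_ratio_of_wrong_parts: f64) -> String {
    format!(
        "<merge_tree>
    <!-- Disable sparse column serialization, which we expect to not need -->
    <ratio_of_defaults_for_sparse_serialization>1.0</ratio_of_defaults_for_sparse_serialization>

    <!-- Prevent ClickHouse from setting tables to read-only on start. -->
    <replicated_max_ratio_of_wrong_parts>{ratio:.1}</replicated_max_ratio_of_wrong_parts>
</merge_tree>",
        ratio = max_ratio_of_wrong_parts
    )
}

fn main() {
    // The commit hard-codes 1.0; it is a parameter here only for illustration.
    println!("{}", merge_tree_xml(1.0));
}
```

Hard-coding the value in the template, as the commit does, keeps the generated config identical on every replica; parameterizing it would only make sense if different deployments needed different thresholds.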
