Commit 6ce899a

docs(dwh): add features and limitations page MTA-6745
1 parent ce03b77 commit 6ce899a

1 file changed

+62
-0
lines changed

@@ -0,0 +1,62 @@
1+
---
2+
title: Data Warehouse for ClickHouse® features and limitations
3+
description: This page presents the different features and limitations of Data Warehouse for ClickHouse®
4+
tags: data warehouse clickhouse features specs data sheet limits limitations
5+
dates:
6+
validation: 2025-11-19
7+
posted: 2025-11-19
8+
---
9+
10+
11+
## Features
12+
### Load balancer
13+
14+
Every Scaleway Data Warehouse deployment automatically includes a load balancer, even a deployment with a single node.
15+
16+
The load balancer distributes incoming queries across the nodes.
17+
18+
It would be nice to have a diagram of a Data Warehouse deployment with 1 load balancer over 3 nodes.
19+
### How we tweaked ClickHouse® to integrate seamlessly with a cluster
20+
21+
Because the load balancer distributes the queries, the user cannot know which node a given query will land on.
22+
23+
To keep this from causing confusion, such as an object created on one node appearing to be missing on another, we tweaked ClickHouse® so that everything the user creates is replicated across all nodes.
24+
25+
This is done by aliasing commands:
26+
27+
| Default command | Replaced by |
28+
|------------------------------|-------------|
29+
| `CREATE DATABASE <database>` | `CREATE DATABASE <database> ON CLUSTER <Scaleway cluster>` |
30+
| `DROP DATABASE <database>` | `DROP DATABASE <database> ON CLUSTER <Scaleway cluster>` |
31+
| `CREATE DICTIONARY <dictionary>` | `CREATE DICTIONARY <dictionary> ON CLUSTER <Scaleway cluster>` |
32+
| `DROP DICTIONARY <dictionary>` | `DROP DICTIONARY <dictionary> ON CLUSTER <Scaleway cluster>` |
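
The aliasing in the table above can be pictured as a small statement rewrite. The following Python sketch is only a toy model of that behavior, not the actual implementation: the function name `add_on_cluster` and the cluster name `scaleway_cluster` are hypothetical placeholders, and only plain database and dictionary statements are handled.

```python
import re

# Hypothetical placeholder: the real cluster name is defined by the deployment.
CLUSTER = "scaleway_cluster"

# Matches CREATE/DROP DATABASE|DICTIONARY, an optional IF [NOT] EXISTS,
# and the object name that follows.
_ALIASED = re.compile(
    r"^\s*(?P<head>(?:CREATE|DROP)\s+(?:DATABASE|DICTIONARY)"
    r"(?:\s+IF\s+(?:NOT\s+)?EXISTS)?\s+(?P<name>[\w.`]+))",
    re.IGNORECASE,
)


def add_on_cluster(statement: str, cluster: str = CLUSTER) -> str:
    """Return the statement with ON CLUSTER inserted after the object name."""
    if re.search(r"\bON\s+CLUSTER\b", statement, re.IGNORECASE):
        return statement  # already cluster-wide, leave untouched
    match = _ALIASED.match(statement)
    if match is None:
        return statement  # not one of the aliased statements
    end = match.end("head")
    return f"{statement[:end]} ON CLUSTER {cluster}{statement[end:]}"


print(add_on_cluster("CREATE DATABASE analytics"))
print(add_on_cluster("SELECT 1"))
```

In this model, statements that are not in the table, such as `SELECT`, pass through unchanged, and a statement that already carries an `ON CLUSTER` clause is not rewritten a second time.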
33+
34+
Similarly, creating a table with any engine of the `MergeTree` family is aliased so that the corresponding `Replicated` engine is used instead:
35+
36+
| Default table | Replaced by |
37+
|------------------------------|----------------------------------------|
38+
| MergeTree | ReplicatedMergeTree |
39+
| ReplacingMergeTree | ReplicatedReplacingMergeTree |
40+
| CoalescingMergeTree | ReplicatedCoalescingMergeTree |
41+
| SummingMergeTree | ReplicatedSummingMergeTree |
42+
| AggregatingMergeTree | ReplicatedAggregatingMergeTree |
43+
| CollapsingMergeTree | ReplicatedCollapsingMergeTree |
44+
| VersionedCollapsingMergeTree | ReplicatedVersionedCollapsingMergeTree |
45+
| GraphiteMergeTree | ReplicatedGraphiteMergeTree |
46+
47+
More info about the MergeTree table family: https://clickhouse.com/docs/engines/table-engines/mergetree-family
48+
Important note: if the user does not specify the table engine, the table will be created as a ReplicatedMergeTree by default.
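
The engine substitution and the default-engine rule can be summarized in a few lines of Python. This is an illustrative sketch only: the helper name `effective_engine` is hypothetical, and passing other engines through unchanged is an assumption, since this page only describes the `MergeTree` family.

```python
from typing import Optional

# Engines listed in the table above.
MERGETREE_FAMILY = (
    "MergeTree",
    "ReplacingMergeTree",
    "CoalescingMergeTree",
    "SummingMergeTree",
    "AggregatingMergeTree",
    "CollapsingMergeTree",
    "VersionedCollapsingMergeTree",
    "GraphiteMergeTree",
)


def effective_engine(requested: Optional[str] = None) -> str:
    """Return the engine a table ends up with for a requested engine name."""
    if requested is None:
        # No engine specified: the table defaults to ReplicatedMergeTree.
        return "ReplicatedMergeTree"
    if requested in MERGETREE_FAMILY:
        # MergeTree-family engines are swapped for their Replicated version.
        return "Replicated" + requested
    # Assumption: anything else (including already-Replicated engines)
    # is left as requested.
    return requested


print(effective_engine("SummingMergeTree"))  # ReplicatedSummingMergeTree
print(effective_engine())                    # ReplicatedMergeTree
```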
49+
50+
## Limitations
51+
52+
### Sharding
53+
54+
Sharding cannot be manually configured in Data Warehouse for ClickHouse®. All nodes in the cluster contain a full copy of the data, meaning the deployment operates in a replicated (or "replica") mode rather than a sharded (or "distributed") architecture.
55+
56+
The total data capacity of the cluster is therefore limited to the storage of a single node, and a single query cannot be parallelized across shards to improve performance. Each query runs entirely on one replica, so while high availability and read scalability are improved, compute resources do not scale horizontally for large analytical workloads that would benefit from data distribution.
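
As a rough illustration of the capacity limit, the Python sketch below compares the usable capacity of a fully replicated cluster with what a sharded layout of the same nodes would offer. The function names and the figures (3 nodes of 500 GB) are hypothetical and chosen only for the example. The sharded case is shown for comparison and is not available in this product.

```python
def replicated_capacity_gb(node_storage_gb: float, node_count: int) -> float:
    """Usable capacity when every node holds a full copy of the data."""
    if node_count < 1:
        raise ValueError("a deployment needs at least one node")
    # Adding replicas adds copies, not space: capacity stays at one node.
    return node_storage_gb


def sharded_capacity_gb(node_storage_gb: float, node_count: int) -> float:
    """Comparison only: capacity if data were split across the nodes."""
    if node_count < 1:
        raise ValueError("a deployment needs at least one node")
    return node_storage_gb * node_count


# Hypothetical deployment: 3 nodes with 500 GB each.
print(replicated_capacity_gb(500, 3))  # 500
print(sharded_capacity_gb(500, 3))     # 1500
```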
57+
58+
### Distributed table engine
59+
60+
Because there is no sharding, the `Distributed` table engine has no effect in a Data Warehouse for ClickHouse® deployment.
61+
62+
Refer to the [official ClickHouse® documentation](https://clickhouse.com/docs/engines/table-engines/special/distributed) for more information.
