62 changes: 62 additions & 0 deletions pages/data-warehouse/reference-content/features-limitations.mdx
@@ -0,0 +1,62 @@
---
title: Data Warehouse for ClickHouse® features and limitations
description: This page presents the different features and limitations of Data Warehouse for ClickHouse®
tags: data warehouse clickhouse features specs data sheet limits limitations
dates:
  validation: 2025-11-19
  posted: 2025-11-19
---


## Features

### Information about the load balancer

Every Data Warehouse for ClickHouse® deployment automatically includes a load balancer, even when the deployment has a single node.

This load balancer distributes incoming queries across the nodes of the deployment.

TODO: add a diagram of a Data Warehouse deployment with one load balancer in front of three nodes.

### How we tweaked ClickHouse® to integrate seamlessly with a cluster

Because the load balancer distributes queries, you cannot know in advance which node will handle a given query.

To prevent this from causing confusion, we tweaked ClickHouse® so that everything you create is replicated across all nodes.

This is done by aliasing the following commands:

| Default command                  | Replaced by                                                    |
|----------------------------------|----------------------------------------------------------------|
| `CREATE DATABASE <database>`     | `CREATE DATABASE <database> ON CLUSTER <Scaleway cluster>`     |
| `DROP DATABASE <database>`       | `DROP DATABASE <database> ON CLUSTER <Scaleway cluster>`       |
| `CREATE DICTIONARY <dictionary>` | `CREATE DICTIONARY <dictionary> ON CLUSTER <Scaleway cluster>` |
| `DROP DICTIONARY <dictionary>`   | `DROP DICTIONARY <dictionary> ON CLUSTER <Scaleway cluster>`   |
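
As a minimal sketch of what this aliasing means in practice, the statements below create a database without any `ON CLUSTER` clause and then check that it exists. The database name `analytics` is hypothetical, and the cluster name is left as the `<Scaleway cluster>` placeholder used in the table above.

```sql
-- What you type (the database name `analytics` is a hypothetical example):
CREATE DATABASE analytics;

-- According to the table above, this is handled as if you had written:
--   CREATE DATABASE analytics ON CLUSTER <Scaleway cluster>;

-- Whichever node the load balancer routes this query to,
-- the database should be listed:
SELECT name, engine
FROM system.databases
WHERE name = 'analytics';
```

Because the aliasing is applied for you, the same script can be used unchanged on a single-node deployment and on a multi-node deployment.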

In addition, creating a table with any engine of the `MergeTree` family is aliased to create its Replicated counterpart:

| Requested table engine       | Replaced by                            |
|------------------------------|----------------------------------------|
| MergeTree | ReplicatedMergeTree |
| ReplacingMergeTree | ReplicatedReplacingMergeTree |
| CoalescingMergeTree | ReplicatedCoalescingMergeTree |
| SummingMergeTree | ReplicatedSummingMergeTree |
| AggregatingMergeTree | ReplicatedAggregatingMergeTree |
| CollapsingMergeTree | ReplicatedCollapsingMergeTree |
| VersionedCollapsingMergeTree | ReplicatedVersionedCollapsingMergeTree |
| GraphiteMergeTree | ReplicatedGraphiteMergeTree |
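
The following sketch illustrates the engine substitution described above. It assumes that a database named `analytics` already exists. The database, table, and column names are hypothetical. `system.tables` is a standard ClickHouse system table, although whether your user can read it in this managed service is an assumption.

```sql
-- Request a plain MergeTree table (all names are hypothetical examples).
CREATE TABLE analytics.events
(
    event_time DateTime,
    user_id    UInt64,
    event_type LowCardinality(String)
)
ENGINE = MergeTree
ORDER BY (event_time, user_id);

-- Inspect the engine that was actually used. Per the table above,
-- the `engine` column is expected to show the Replicated counterpart.
SELECT name, engine
FROM system.tables
WHERE database = 'analytics' AND name = 'events';
```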

For more information about the `MergeTree` table family, refer to the [official ClickHouse® documentation](https://clickhouse.com/docs/engines/table-engines/mergetree-family).

Important note: if you do not specify a table engine, the table is created as a `ReplicatedMergeTree` by default.
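
As a hedged illustration of this default, the statement below omits the `ENGINE` clause entirely. The database and table names are hypothetical. The `ORDER BY` clause is kept because `MergeTree`-family engines normally expect a sorting key.

```sql
-- No ENGINE clause: the service is expected to pick ReplicatedMergeTree.
-- (`analytics.page_views` is a hypothetical name; the database must exist.)
CREATE TABLE analytics.page_views
(
    view_time DateTime,
    url       String
)
ORDER BY view_time;

-- Display the full definition, including the engine that was applied.
SHOW CREATE TABLE analytics.page_views;
```

On a stock ClickHouse server, omitting the engine only works when the `default_table_engine` setting is configured, so scripts that rely on this behavior may not be portable outside Data Warehouse for ClickHouse®.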

## Limitations

### Sharding

Sharding cannot be manually configured in Data Warehouse for ClickHouse®. All nodes in the cluster contain a full copy of the data, meaning the deployment operates in a replicated (or "replica") mode rather than a sharded (or "distributed") architecture.

The total data capacity of the cluster is therefore limited to the storage of a single node, and a single query cannot be parallelized across shards to improve performance. Each query runs entirely on the replica that receives it. Adding nodes therefore improves high availability and read scalability, but it does not scale compute horizontally for large analytical workloads that would benefit from data distribution.
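
One way to observe this layout is to list the cluster topology, as sketched below. `system.clusters` is a standard ClickHouse system table, but whether it is readable by your user in this managed service is an assumption.

```sql
-- List the shards and replicas that make up each cluster known to the node.
SELECT cluster, shard_num, replica_num, host_name
FROM system.clusters
ORDER BY cluster, shard_num, replica_num;
```

In a replicated-only deployment such as the one described above, you would expect a single `shard_num` value, with one row per node distinguished by `replica_num`.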

### Distributed table engine

Because deployments are not sharded, the `Distributed` table engine has no effect in Data Warehouse for ClickHouse®.

Refer to the [official ClickHouse® documentation](https://clickhouse.com/docs/engines/table-engines/special/distributed) for more information.