# ClickHouse Deployment and Performance Benchmarking on ECS

"Imagine being a Formula One driver, racing at breakneck speeds, but without any telemetry data to guide you. It’s a thrilling ride, but one wrong turn or overheating engine could lead to disaster. Just like a pit crew relies on performance metrics to optimize the car's speed and handling, we utilize observability in ClickHouse to monitor our data system's health. These metrics provide crucial insights, allowing us to identify bottlenecks, prevent outages, and fine-tune performance, ensuring our data engine runs as smoothly and efficiently as a championship-winning race car."

<p align="center">
  <img src="images/blog/clickhouse-benchmarking/clickhouse-storage.jpeg" alt="ClickHouse Storage">
</p>

In this blog, we'll dive into the process of deploying ClickHouse on AWS Elastic Container Service (ECS). We’ll also look at performance benchmarking to evaluate ClickHouse as a high-performance log storage backend. Our focus will be on its ingestion rates, query performance, scalability, and resource utilization.

## Architecture: Replication for Fault Tolerance

In this architecture, we use five servers to ensure data availability and reliability. Two of them host copies of the data, while the remaining three run ClickHouse Keeper to coordinate replication. We will create a database and a table using the **ReplicatedMergeTree** engine, which replicates data seamlessly across the two data nodes.

#### Key Terms:

- **Replica:** In ClickHouse, a replica is a copy of your data. There is always at least one copy (the original), and adding a second replica provides fault tolerance: if one copy fails, the other remains accessible.

- **Shard:** A shard is a subset of your data. If you do not split the data across multiple servers, all of it resides in a single shard. Sharding helps distribute the load when a single server's capacity is exceeded. The destination server for each row is determined by a sharding key, which can be random or derived from a hash function. In our examples, we use a random key for simplicity.

This architecture not only protects your data but also handles increased load better, making it a robust solution for data management in ClickHouse. For more details, refer to the official documentation on [ClickHouse Replication Architecture](https://clickhouse.com/docs/en/architecture/replication).

### Configuration Changes for ClickHouse Deployment

**Node Descriptions**

- **clickhouse-01**: Data node for storing data.
- **clickhouse-02**: Another data node for data storage.
- **clickhouse-keeper-01**: Responsible for distributed coordination.
- **clickhouse-keeper-02**: Responsible for distributed coordination.
- **clickhouse-keeper-03**: Responsible for distributed coordination.

### Installation Steps

- **ClickHouse Server**: We deployed the ClickHouse server and client on the data nodes, clickhouse-01 and clickhouse-02, using the `clickhouse/clickhouse-server` Docker image.

- **ClickHouse Keeper**: Installed on the three coordination servers (clickhouse-keeper-01, clickhouse-keeper-02, and clickhouse-keeper-03) using the `clickhouse/clickhouse-keeper` Docker image.

### Configuration Files and Best Practices

#### General Configuration Guidelines

- Add configuration files to `/etc/clickhouse-server/config.d/`.
- Add user configuration files to `/etc/clickhouse-server/users.d/`.
- Keep the original `/etc/clickhouse-server/config.xml` and `/etc/clickhouse-server/users.xml` files unchanged.

#### clickhouse-01 Configuration

The configuration for clickhouse-01 is split across five files for clarity, although they can be combined if desired. Here are the key elements:

- **Network and Logging Configuration**:
  - Sets the display name to "cluster_1S_2R node 1".
  - Configures ports for HTTP (8123) and TCP (9000).

```xml
<clickhouse>
    <logger>
        <level>debug</level>
        <log>/var/log/clickhouse-server/clickhouse-server.log</log>
        <errorlog>/var/log/clickhouse-server/clickhouse-server.err.log</errorlog>
        <size>1000M</size>
        <count>3</count>
    </logger>
    <display_name>cluster_1S_2R node 1</display_name>
    <listen_host>0.0.0.0</listen_host>
    <http_port>8123</http_port>
    <tcp_port>9000</tcp_port>
</clickhouse>
```

- **Macros Configuration**:
  - Simplifies DDL by substituting macros for the shard and replica numbers.

```xml
<clickhouse>
    <macros>
        <shard>01</shard>
        <replica>01</replica>
        <cluster>cluster_1S_2R</cluster>
    </macros>
</clickhouse>
```

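With these macros in place, the same DDL can be executed unchanged on every node, and ClickHouse substitutes each node's own values. A minimal illustrative sketch (the `demo.events` table is hypothetical, not part of our benchmark schema):

```sql
-- Runs unchanged on both data nodes; {shard} and {replica}
-- expand to the values defined in each node's macros file.
CREATE DATABASE IF NOT EXISTS demo ON CLUSTER cluster_1S_2R;

CREATE TABLE demo.events ON CLUSTER cluster_1S_2R
(
    `id` UInt64,
    `ts` DateTime
)
ENGINE = ReplicatedMergeTree('/clickhouse/tables/{shard}/demo.events', '{replica}')
ORDER BY id;
```
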
- **Replication and Sharding Configuration**:
  - Defines a cluster named `cluster_1S_2R` with one shard and two replicas.

```xml
<clickhouse>
    <remote_servers replace="true">
        <cluster_1S_2R>
            <secret>mysecretphrase</secret>
            <shard>
                <internal_replication>true</internal_replication>
                <replica>
                    <host>clickhouse-01.clickhouse</host>
                    <port>9000</port>
                </replica>
                <replica>
                    <host>clickhouse-02.clickhouse</host>
                    <port>9000</port>
                </replica>
            </shard>
        </cluster_1S_2R>
    </remote_servers>
</clickhouse>
```

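Once this definition is loaded, the cluster topology can be verified from any node via the `system.clusters` table; a quick sanity check might look like:

```sql
-- Expect one shard with two replicas listed for cluster_1S_2R
SELECT cluster, shard_num, replica_num, host_name, port
FROM system.clusters
WHERE cluster = 'cluster_1S_2R';
```
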
- **Using ClickHouse Keeper**:
  - Configures ClickHouse Server to coordinate with ClickHouse Keeper.

```xml
<clickhouse>
    <zookeeper>
        <node>
            <host>clickhouse-keeper-01.clickhouse</host>
            <port>9181</port>
        </node>
        <node>
            <host>clickhouse-keeper-02.clickhouse</host>
            <port>9181</port>
        </node>
        <node>
            <host>clickhouse-keeper-03.clickhouse</host>
            <port>9181</port>
        </node>
    </zookeeper>
</clickhouse>
```

#### clickhouse-02 Configuration

The configuration is mostly identical to clickhouse-01; the key differences are:

- **Network and Logging Configuration**:
  - Same as clickhouse-01 but with a different display name.

- **Macros Configuration**:
  - The replica macro is set to 02 on this node.

#### clickhouse-keeper Configuration

For ClickHouse Keeper, each node's configuration includes:

- **General Configuration**:
  - Ensure the server ID is unique across all instances.

```xml
<clickhouse>
    <logger>
        <level>trace</level>
        <log>/var/log/clickhouse-keeper/clickhouse-keeper.log</log>
        <errorlog>/var/log/clickhouse-keeper/clickhouse-keeper.err.log</errorlog>
        <size>1000M</size>
        <count>3</count>
    </logger>
    <listen_host>0.0.0.0</listen_host>
    <keeper_server>
        <tcp_port>9181</tcp_port>
        <server_id>1</server_id> <!-- Change for each keeper -->
        <log_storage_path>/var/lib/clickhouse/coordination/log</log_storage_path>
        <snapshot_storage_path>/var/lib/clickhouse/coordination/snapshots</snapshot_storage_path>
    </keeper_server>
</clickhouse>
```

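To confirm that a ClickHouse server can actually reach the Keeper ensemble, the `system.zookeeper` table can be queried from that server (it raises an exception when coordination is unavailable):

```sql
-- Lists the root znodes if the Keeper connection is healthy
SELECT name, path
FROM system.zookeeper
WHERE path = '/';
```
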
### Image Baking and Deployment

All configuration changes were baked into a custom Docker image built on the base ClickHouse image. The resulting image was pushed to ECR and used in our ECS tasks for repeatable deployment.

<p align="center">
  <img src="/images/blog/clickhouse-benchmarking/ecs-clickhouse-deployment.png" alt="ECS ClickHouse Deployment">
</p>

## ECS Cluster Setup and ClickHouse Deployment

#### ClickHouse Deployment Overview

- **AWS Partner Solution:** We leveraged the AWS Partner Solution Deployment Guide for ClickHouse to ensure a structured setup.

- **ECS Cluster:** ClickHouse was deployed on Amazon ECS using a multi-node configuration, following best practices to achieve high availability.

#### VPC Configuration

- **Custom VPC:** A dedicated Virtual Private Cloud (VPC) was created with subnets designated for public and private instances. This configuration enhances security and streamlines component communication.

#### Security Measures

- **Security Groups:** Network traffic was restricted by configuring security groups to allow only the necessary ports:
  - **8123** for HTTP
  - **9000** for TCP connections

#### Instance Type Selection

- **EC2 Instances:** Instance types were chosen based on the compute and memory needs of ClickHouse nodes, balancing performance and cost-effectiveness.

#### Database Configuration

- **Shards and Replicas:** The ClickHouse database was set up with **2 shards** and **1 replica**, following the static configuration recommended in the ClickHouse horizontal scaling documentation.

#### Container Image

- **Docker Image:** We used the `clickhouse/clickhouse-server` container image, which includes both the ClickHouse server and keeper.

#### Auto Scaling Configuration

- **Auto Scaling Group:** An Auto Scaling Group was set up with `m5.large` instances, allocating **3 GB of memory** to each container to ensure consistent performance under varying loads.

## Performance Benchmarking Metrics

Now that the deployment architecture is established, let's evaluate ClickHouse's performance through a series of benchmarking metrics.

### 2.1 Data Ingestion Performance Metrics

- **Average Queries per Second:** Measures the sustained rate of insert queries during heavy load, offering insight into how well ClickHouse handles log ingestion.

- **CPU Usage (Cores):** Tracks the average number of CPU cores used during ingestion, indicating how efficiently resources are utilized.

- **IO Wait Time:** Indicates the time ClickHouse spent waiting for I/O operations, such as disk reads or writes, which directly impacts ingestion throughput.

- **OS CPU Usage:** Differentiates between user-space and kernel-space CPU usage to offer a clearer picture of where processing power is consumed during data ingestion.

- **Disk Throughput:** Measures the average read/write throughput of the file system, crucial for understanding how efficiently data is ingested and written to storage.

- **Memory Usage (Tracked):** Monitors the memory consumed by ClickHouse over time to identify potential memory bottlenecks during sustained ingestion loads.

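Most of these metrics can be approximated directly from ClickHouse's own system tables. As one sketch, the sustained insert rate over a test window can be derived from `system.query_log` (enabled by default; the `query_kind` column is available on recent ClickHouse versions):

```sql
-- Per-minute insert query rate and rows written during ingestion
SELECT
    toStartOfMinute(event_time) AS minute,
    count() / 60 AS insert_queries_per_second,
    sum(written_rows) AS rows_written
FROM system.query_log
WHERE type = 'QueryFinish'
  AND query_kind = 'Insert'
GROUP BY minute
ORDER BY minute;
```
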
### 2.2 Query Execution Metrics

- **Response Times:** We measured average query execution times, focusing especially on complex operations such as joins and aggregations.

- **CPU Wait Time:** Captures the latency introduced by waiting for CPU cycles during query execution, giving insight into query performance under load.

- **Average Selected Rows/Second:** Tracks how many rows ClickHouse processes per second during `SELECT` queries, offering a gauge of query throughput.

- **Average Merges Running:** In ClickHouse's MergeTree engine, background merges are essential for keeping data optimized. Tracking the number of concurrent merges indicates how well ClickHouse is handling data restructuring.

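Two of these metrics can be sampled live from the `system.metrics` and `system.events` tables, which expose the merges currently running and the cumulative count of selected rows; for example:

```sql
-- Merges currently executing on this node
SELECT metric, value FROM system.metrics WHERE metric = 'Merge';

-- Total rows read by SELECT queries since server start
SELECT event, value FROM system.events WHERE event = 'SelectedRows';
```
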
### 2.3 Scalability Metrics

- **Load Average:** Tracks the system load over a 15-minute window, showing how ClickHouse handles varying loads over time.

- **Max Parts per Partition:** Reflects the largest number of parts within a partition, offering insight into how well the merge process is keeping up with the partitioning strategy.

- **TCP Connections:** The number of active TCP connections to the ClickHouse nodes reflects how well the system handles network traffic under high query loads.

- **Memory Efficiency:** Monitors memory allocation efficiency and tracks peak memory usage during both data ingestion and query execution.

## 3. Log Ingestion Testing

To benchmark log ingestion, we used the following table schema for the log data:

```sql
CREATE TABLE logs
(
    `remote_addr` String,
    `remote_user` String,
    `runtime` UInt64,
    `time_local` DateTime,
    `request_type` String,
    `request_path` String,
    `request_protocol` String,
    `status` UInt64,
    `size` UInt64,
    `referer` String,
    `user_agent` String
)
ENGINE = MergeTree
ORDER BY (toStartOfHour(time_local), status, request_path, remote_addr);
```

### Dataset

We used a public dataset containing 66 million records for the ingestion tests. The dataset is available at this [link](https://datasets-documentation.s3.eu-west-3.amazonaws.com/http_logs/data-66.csv.gz).

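One way to load this dataset is with the `url` table function, streaming the gzipped CSV straight into the table. A sketch, assuming the column order in the file matches the schema above:

```sql
INSERT INTO logs
SELECT *
FROM url(
    'https://datasets-documentation.s3.eu-west-3.amazonaws.com/http_logs/data-66.csv.gz',
    'CSV',
    'remote_addr String, remote_user String, runtime UInt64, time_local DateTime,
     request_type String, request_path String, request_protocol String,
     status UInt64, size UInt64, referer String, user_agent String'
);
```
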
### 3.1 Baseline Performance Testing

- **Initial Ingestion Rate:** We measured ingestion rates under normal load to evaluate whether real-time log ingestion was achievable.

- **Disk I/O:** Disk throughput was closely monitored to evaluate how well ClickHouse handles log writes and merges during ingestion.

### 3.2 High Load Performance

- **Stress Testing:** Simulating log bursts at peak traffic allowed us to analyze the stability and performance of the ingestion pipeline.

- **Monitoring:** During high-load testing, key metrics such as CPU, memory, and I/O usage were tracked to ensure no bottlenecks surfaced.

## 4. Query Performance Testing

To evaluate query performance, we designed several test queries ranging from simple `SELECT` statements to more complex join operations and aggregations.

### 4.1 Test Queries

- **Simple Select Queries:** Evaluating performance of basic queries that retrieve specific fields from the `logs` table.

```sql
SELECT * FROM logs;
```

```sql
SELECT toStartOfInterval(toDateTime(time_local), INTERVAL 900 second) AS time, count()
FROM logs
WHERE time_local >= '1548288000'
  AND time_local <= '1550966400'
  AND status = 404
  AND request_path = '/apple-touch-icon-precomposed.png'
  AND remote_addr = '2.185.223.153'
  AND runtime > 4000
GROUP BY time
ORDER BY time ASC
LIMIT 10000;
```

- **Joins:** To test more complex queries, we joined the `logs` table with a second table, `logs_local`:

```sql
SELECT toStartOfInterval(toDateTime(l.time_local), INTERVAL 900 second) AS time, count()
FROM logs l
JOIN logs_local ll ON l.remote_addr = ll.remote_addr AND l.time_local = ll.time_local
WHERE l.time_local >= '1548288000'
  AND l.time_local <= '1550966400'
  AND l.status = 404
  AND l.request_path = '/apple-touch-icon-precomposed.png'
  AND l.remote_addr = '2.185.223.153'
  AND l.runtime > 4000
GROUP BY time
ORDER BY time ASC
LIMIT 10000;
```

- **Aggregations:** Performance of aggregate queries was tested on fields such as status codes and request paths.

```sql
SELECT uniq(remote_addr) AS `unique ips`
FROM logs
WHERE time_local >= '1548288000'
  AND time_local <= '1550966400'
  AND status = 404
  AND request_path = '/apple-touch-icon-precomposed.png'
  AND remote_addr = '2.185.223.153'
  AND runtime > 4000;
```

```sql
SELECT
    toStartOfInterval(toDateTime(time_local), INTERVAL 900 second) AS time,
    avg(runtime) AS avg_request_time,
    quantile(0.99)(runtime) AS p99_runtime
FROM logs
WHERE time_local >= '1548288000'
  AND time_local <= '1550966400'
  AND status = 404
  AND request_path = '/apple-touch-icon-precomposed.png'
  AND remote_addr = '2.185.223.153'
  AND runtime > 4000
GROUP BY time
ORDER BY time ASC
LIMIT 100000;
```

### 4.2 Query Benchmarking Results

- **Response Time:** We documented the average response times for each type of query to understand performance under load.

- **Resource Utilization:** We tracked CPU, memory, and I/O usage during the execution of these queries to evaluate resource efficiency.

- **Throughput:** Finally, we measured how many queries could be executed per second under sustained load conditions.

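Response times and resource usage per query class can also be pulled retrospectively from `system.query_log`; a sketch of the kind of summary we collected:

```sql
-- Average duration, memory, and rows read per query kind over the last hour
SELECT
    query_kind,
    count() AS queries,
    avg(query_duration_ms) AS avg_duration_ms,
    avg(memory_usage) AS avg_memory_bytes,
    sum(read_rows) AS total_read_rows
FROM system.query_log
WHERE type = 'QueryFinish'
  AND event_time >= now() - INTERVAL 1 HOUR
GROUP BY query_kind
ORDER BY avg_duration_ms DESC;
```
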
🔍 **For detailed performance metrics and benchmarks, please refer to the full report [here](https://infraspec.getoutline.com/doc/clickhouse-deployment-and-performance-benchmarking-on-ecs-Stsim2Uoz1).**
