"Imagine being a Formula One driver, racing at breakneck speeds, but without any telemetry data to guide you. It’s a thrilling ride, but one wrong turn or overheating engine could lead to disaster. Just like a pit crew relies on performance metrics to optimize the car's speed and handling, we utilize observability in ClickHouse to monitor our data system's health. These metrics provide crucial insights, allowing us to identify bottlenecks, prevent outages, and fine-tune performance, ensuring our data engine runs as smoothly and efficiently as a championship-winning race car."
In this blog, we'll dive into the process of deploying ClickHouse on AWS Elastic Container Service (ECS). We’ll also look at performance benchmarking to evaluate ClickHouse as a high-performance log storage backend. Our focus will be on its ingestion rates, query performance, scalability, and resource utilization.
Now that the deployment architecture is established, let's move on to evaluating ClickHouse's performance through a series of benchmarking metrics.

### Data Ingestion Performance Metrics

- **Average Queries per Second:** This metric measures the sustained query ingestion rate during heavy load, offering insight into how well ClickHouse handles log ingestion.
- **Memory Usage (Tracked):** Monitoring the memory consumed by ClickHouse over time helps identify potential memory bottlenecks during sustained ingestion loads.
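
Both ingestion-side metrics can be sampled from ClickHouse's built-in system tables. A minimal sketch (the counter and metric names below are standard ClickHouse names, though exact availability varies by version):

```sql
-- Cumulative counters since server start: total queries, insert queries, rows inserted
SELECT event, value
FROM system.events
WHERE event IN ('Query', 'InsertQuery', 'InsertedRows');

-- Memory currently tracked by the server
SELECT value AS tracked_memory_bytes
FROM system.metrics
WHERE metric = 'MemoryTracking';
```

Sampling `system.events` at fixed intervals and differencing the counters yields an average queries-per-second figure over each window.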

### Query Execution Metrics

- **Response Times:** We measured the average query execution times, focusing especially on complex operations such as joins and aggregations.
- **Average Merges Running:** In ClickHouse's MergeTree engine, merges are essential for optimizing data. Tracking the number of concurrent merges gives an indication of how well ClickHouse is handling data restructuring.
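
Both figures are also visible through system tables; for example (a sketch, assuming the default `query_log` is enabled):

```sql
-- Average query duration over the last hour
SELECT avg(query_duration_ms) AS avg_duration_ms
FROM system.query_log
WHERE type = 'QueryFinish'
  AND event_time > now() - INTERVAL 1 HOUR;

-- Merges currently in flight across MergeTree tables
SELECT count() AS running_merges
FROM system.merges;
```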

### Scalability Metrics

- **Load Average:** This metric tracks the system load over a 15-minute window, providing a real-time view of how ClickHouse handles varying loads.
- **Memory Efficiency:** This metric monitors memory allocation efficiency and tracks peak memory usage during both data ingestion and query execution.
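
These can be read off the server itself rather than the host; a sketch using standard system tables (`LoadAverage15` is the usual asynchronous-metric name for the 15-minute load average):

```sql
-- 15-minute load average as observed by the ClickHouse server
SELECT value AS load_average_15m
FROM system.asynchronous_metrics
WHERE metric = 'LoadAverage15';

-- Peak per-query memory over the last hour
SELECT max(memory_usage) AS peak_query_memory_bytes
FROM system.query_log
WHERE type = 'QueryFinish'
  AND event_time > now() - INTERVAL 1 HOUR;
```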

## Log Ingestion Testing

To benchmark log ingestion, we used the following table schema to handle log data:
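
A minimal sketch of such a table (the sorting key matches the one used in the post; the remaining column names and types are illustrative assumptions):

```sql
-- Hypothetical access-log table; only the ORDER BY clause is taken from the post,
-- the column set and types are assumed for illustration.
CREATE TABLE logs
(
    remote_addr     String,
    time_local      DateTime,
    request_path    String,
    status          UInt16,
    body_bytes_sent UInt64
)
ENGINE = MergeTree
ORDER BY (toStartOfHour(time_local), status, request_path, remote_addr);
```

Sorting first by `toStartOfHour(time_local)` keeps each hour's logs physically together, which favors the time-ranged queries typical of log analysis.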
We used a public dataset containing 66 million records to perform ingestion tests. The dataset can be found at this [link](https://datasets-documentation.s3.eu-west-3.amazonaws.com/http_logs/data-66.csv.gz).
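
One convenient way to pull such a file straight into ClickHouse is the `url` table function. A sketch (the target table name and column layout are assumptions, and the format must match the file's actual header and column order):

```sql
-- Load the public HTTP-logs CSV directly from S3.
-- Compression is inferred from the .gz extension; switch CSVWithNames
-- to CSV with an explicit structure if the file has no header row.
INSERT INTO logs
SELECT *
FROM url(
    'https://datasets-documentation.s3.eu-west-3.amazonaws.com/http_logs/data-66.csv.gz',
    'CSVWithNames'
);
```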

### Baseline Performance Testing

- **Initial Ingestion Rate:** We measured ingestion rates under normal load to evaluate whether real-time log ingestion was achievable.
- **Disk I/O:** Disk throughput was closely monitored to evaluate how well ClickHouse handles log writes and merges during ingestion.
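
The achieved ingestion rate can also be reconstructed after the fact from the query log (a sketch; `written_rows` and `query_kind` are standard `system.query_log` columns in recent ClickHouse versions):

```sql
-- Rows ingested per minute, derived from completed INSERT queries
SELECT
    toStartOfMinute(event_time) AS minute,
    sum(written_rows)           AS rows_ingested
FROM system.query_log
WHERE type = 'QueryFinish'
  AND query_kind = 'Insert'
GROUP BY minute
ORDER BY minute;
```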

### High Load Performance

- **Stress Testing:** Simulating log bursts under peak traffic allowed us to analyze the stability and performance of the ingestion pipeline.
- **Monitoring:** During high-load testing, key metrics such as CPU, memory, and I/O usage were tracked to ensure no bottlenecks surfaced.

## Query Performance Testing

To evaluate query performance, we designed several test queries ranging from simple `SELECT` statements to more complex join operations and aggregations.

### Test Queries

- **Simple Select Queries:** Evaluating performance for basic queries that retrieve specific fields from the `logs` table.
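
As an illustration, a query of this shape (hypothetical: the column names are assumed from the log schema, and the `LIMIT` mirrors the bound used in the benchmark queries):

```sql
-- Simple select: fetch fields for error-status log lines
SELECT time_local, status, request_path, remote_addr
FROM logs
WHERE status >= 500
ORDER BY time_local ASC
LIMIT 100000;
```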

### Query Benchmarking Results

- **Response Time:** We documented the average response times for each type of query to understand performance under load.
0 commit comments