Name and Version
bitnamilegacy/clickhouse:25.7.5-debian-12-r0
What steps will reproduce the bug?
We are running ClickHouse via the Bitnami Helm chart to store OpenTelemetry telemetry (traces/logs/metrics). Under production traffic (100k–200k spans/sec), inserts create parts faster than they can be merged, leading to:
- frequent Too many parts ... merges slower than inserts errors
- very slow merge backlog
- expired partitions not fully dropped for 2–3 days, even with TTL enabled
- very high memory usage due to background merges
We need guidance on:
- How to speed up merges, especially for expired partitions
- Memory optimisation / stability tuning under high insert rates
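For reference, per-partition active part counts can be checked with a query along these lines (a minimal sketch against system.parts; otel.traces is the local table named in the error further down):

-- Sketch: active part counts and sizes per partition of the local traces table.
SELECT
    partition,
    count()                                AS active_parts,
    formatReadableSize(sum(bytes_on_disk)) AS size_on_disk
FROM system.parts
WHERE database = 'otel' AND table = 'traces' AND active
GROUP BY partition
ORDER BY partition DESC;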
Environment
- Deployment: Bitnami Helm ClickHouse
- ClickHouse image: bitnamilegacy/clickhouse:25.7.5-debian-12-r0
- Cluster topology: 3 shards × 2 replicas (ReplicatedMergeTree)
- Node resources per pod:
  - CPU: 32 vCPU
  - Memory: 268 GiB
- Workload: OpenTelemetry traces/logs/metrics
- Ingestion:
  - Dev traffic: ~5k spans/sec (works fine)
  - Prod traffic: 100k–200k spans/sec
- Hashing / sharding: we use sipHash64(TraceId) and the distribution across shards is even (see the quick check below)
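As a quick check of the shard distribution, a per-shard row count through the distributed table looks roughly like this (sketch; traces_dist is the distributed table described below, and hostName() reports the replica answering for each shard):

-- Sketch: rows per shard, queried through the distributed table.
SELECT hostName() AS host, count() AS row_count
FROM otel.traces_dist
GROUP BY host
ORDER BY host;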
Tables / Schema:
We have local replicated tables plus distributed tables for traces, metrics, and logs [see attached file].
Table design (example: traces; a rough DDL sketch follows below):
- Engine: ReplicatedMergeTree
- Partitioning: PARTITION BY toDate(IngestedAt)
- TTL: toDateTime(IngestedAt) + toIntervalDay(1)
- ttl_only_drop_parts = 1 enabled
- The exporter writes through the distributed table traces_dist
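For illustration, the layout above corresponds roughly to DDL like the following; the column list, ORDER BY, ZooKeeper path, and cluster name are placeholders/assumptions, and the real DDL is in the attached file:

-- Rough sketch only: columns, ORDER BY, ZooKeeper path and cluster name are assumptions.
CREATE TABLE otel.traces
(
    Timestamp  DateTime64(9),
    IngestedAt DateTime,
    TraceId    String,
    SpanId     String,
    SpanName   LowCardinality(String)
    -- ... remaining OTel columns omitted
)
ENGINE = ReplicatedMergeTree('/clickhouse/tables/{shard}/otel/traces', '{replica}')
PARTITION BY toDate(IngestedAt)
ORDER BY (SpanName, toUnixTimestamp(Timestamp), TraceId)
TTL toDateTime(IngestedAt) + toIntervalDay(1)
SETTINGS ttl_only_drop_parts = 1;

-- Distributed table the exporter writes to, sharded by sipHash64(TraceId)
-- ('otel_cluster' is a placeholder for the actual cluster name):
CREATE TABLE otel.traces_dist AS otel.traces
ENGINE = Distributed('otel_cluster', 'otel', 'traces', sipHash64(TraceId));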
Problem
In production, insert concurrency is very high and merges cannot keep up. This causes:
Too many parts errors
We hit this regularly:
code: 252, message: Too many parts (3001 with average size of 84.45 MiB) in table 'otel.traces (...UUID...)'. Merges are processing significantly slower than inserts
After this happens:
- exporter requests fail and retry
- the otel collector exporter queue grows
- the ingestion backlog accumulates
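If system.part_log is enabled (it is in the default configuration of recent images), the imbalance between part creation and merging can be seen with a sketch like this:

-- Sketch: new parts created vs. merges completed per minute for otel.traces.
SELECT
    toStartOfMinute(event_time)        AS minute,
    countIf(event_type = 'NewPart')    AS parts_created,
    countIf(event_type = 'MergeParts') AS merges_completed
FROM system.part_log
WHERE database = 'otel'
  AND table = 'traces'
  AND event_time > now() - INTERVAL 1 HOUR
GROUP BY minute
ORDER BY minute;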
TTL-expired partitions not dropping quickly
TTL does work and partitions gradually shrink; however, for very large partitions (for example a ~3 TB/day partition), fully dropping them can take 2–3 days because of the merge backlog.
We see:
- expired partitions still present (with active parts)
- partition size gradually decreasing over time instead of being dropped soon after expiry
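The expired-but-still-present partitions can be listed like this (sketch; max_date in system.parts is populated here because the table is partitioned by a date expression):

-- Sketch: partitions fully past the 1-day TTL that still have active parts.
SELECT
    partition,
    count()                                AS active_parts,
    formatReadableSize(sum(bytes_on_disk)) AS size_on_disk
FROM system.parts
WHERE database = 'otel'
  AND table = 'traces'
  AND active
  AND max_date < today() - 1
GROUP BY partition
ORDER BY partition;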
Memory pressure
A large share of pod memory is consumed by merges and background tasks, and overall memory stays heavily utilized while the merge backlog persists.
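The memory held by in-flight merges is visible in system.merges, for example with a sketch like:

-- Sketch: memory currently held by running merges, largest first.
SELECT
    table,
    round(elapsed)                   AS elapsed_s,
    round(progress, 2)               AS progress,
    formatReadableSize(memory_usage) AS merge_memory
FROM system.merges
ORDER BY memory_usage DESC;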
Mitigation/Tuning applied so far
We increased background merge capacity:
background_pool_size: 32
background_merges_mutations_concurrency_ratio: 2
This helped but did not resolve the issue completely under 100k–200k spans/sec.
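To confirm the effective values and how busy the merge pool is, queries along these lines can be used (setting and metric names as documented for current ClickHouse; system.server_settings is available in this version):

-- Sketch: effective server-side merge pool settings.
SELECT name, value
FROM system.server_settings
WHERE name IN ('background_pool_size', 'background_merges_mutations_concurrency_ratio');

-- Sketch: current merge/mutation pool utilization.
SELECT metric, value
FROM system.metrics
WHERE metric LIKE 'BackgroundMergesAndMutationsPool%';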
OTel ClickHouse exporter config:
clickhouse:
  endpoint: tcp://clickhouse.svc.cluster.local:9000
  username: default
  password: mysecurepassword
  create_schema: false
  database: otel
  traces_table_name: traces_dist
  ttl: 24h
  sending_queue:
    enabled: true
    num_consumers: 25
    queue_size: 20000
  retry_on_failure:
    enabled: true
    initial_interval: 5s
    max_interval: 30s
    max_elapsed_time: 300s
  timeout: 30s
batch/clickhouse:
  send_batch_max_size: 12000
  send_batch_size: 10000
  timeout: 5s
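Since the distributed table splits each exporter batch across the 3 shards by sipHash64(TraceId), each local insert is smaller than the configured batch size; the rows that actually land per part can be checked with a sketch like this (again assuming system.part_log is enabled):

-- Sketch: average rows per newly created part on the local traces table.
SELECT
    toStartOfTenMinutes(event_time) AS t,
    count()                         AS new_parts,
    round(avg(rows))                AS avg_rows_per_part
FROM system.part_log
WHERE event_type = 'NewPart'
  AND database = 'otel'
  AND table = 'traces'
  AND event_time > now() - INTERVAL 1 HOUR
GROUP BY t
ORDER BY t;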
Are you using any custom parameters or values?
No response
What is the expected behavior?
No response
What do you see instead?
Merges are slow and pod memory is almost fully utilized.
Additional information
No response