Description
Bug description
At some point, ClickHouse started showing sudden spikes in both memory and CPU usage, and we've even seen OOMs occur as a result (Image A: 5-minute interval).
When viewed over a 12-hour period, the issue appears even more severe (Image B: 12-hour interval).
When user access to the Web UI is blocked by IP, the system stabilizes almost immediately (Image C: 5-minute interval).
I don't know what's causing this issue. I can only speculate.
Would a user leaving Live Tail running also count toward the Web UI tokenizer idle timeout or max timeout?
We are currently operating with the following settings:
- Tokenizer idle timeout: 1 hour
- Max timeout: 4 hours
Also, if multiple users leave Live Tail running for several minutes or even hours and then cancel it, is there any chance those sessions are not being cleaned up properly inside SigNoz and just keep accumulating, for example due to some internal bug?
One thing I can reasonably infer is this: when access to the SigNoz Web UI is blocked for all users except my IP, the memory usage becomes stable on the Memory Dashboard. However, as soon as I allow user access again, the memory spikes return.
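One way to test the "sessions are not cleaned up" guess would be to list queries that have been running in ClickHouse for a long time. The following is only a minimal sketch, not something confirmed to match how Live Tail works: it assumes the ClickHouse HTTP interface is reachable at `http://localhost:8123/` without authentication, and the helper names `build_long_running_query` and `fetch_long_running` as well as the 300-second threshold are made up for illustration.

```python
import urllib.request

# Assumption: ClickHouse's HTTP interface is reachable here without
# credentials. Adjust host, port, and authentication for your deployment.
CLICKHOUSE_URL = "http://localhost:8123/"


def build_long_running_query(min_elapsed_seconds: int = 300) -> str:
    """Build SQL listing queries that have run longer than the threshold.

    Reads system.processes, which holds the queries currently executing.
    """
    if min_elapsed_seconds < 0:
        raise ValueError("min_elapsed_seconds must be non-negative")
    return (
        "SELECT query_id, user, elapsed, "
        "formatReadableSize(memory_usage) AS memory, "
        "substring(query, 1, 120) AS query_head "
        "FROM system.processes "
        f"WHERE elapsed > {int(min_elapsed_seconds)} "
        "ORDER BY memory_usage DESC "
        "FORMAT TabSeparatedWithNames"
    )


def fetch_long_running(min_elapsed_seconds: int = 300,
                       url: str = CLICKHOUSE_URL) -> str:
    """Send the query as an HTTP POST body and return the raw text result."""
    sql = build_long_running_query(min_elapsed_seconds)
    request = urllib.request.Request(url, data=sql.encode("utf-8"))
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.read().decode("utf-8")
```

Calling `print(fetch_long_running())` while users have access, and again after blocking them, would show whether long-lived queries pile up and how much memory each one holds. If nothing long-running appears during a spike, the cause is more likely short, heavy queries than leftover sessions.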
Expected behavior
ClickHouse memory and CPU usage should stay stable, without sudden memory spikes or OOMs.
How to reproduce
- At some point, for an unknown reason, ClickHouse started experiencing repeated memory spikes.
- I blocked access for all user IPs except my own as the administrator.
- ClickHouse then entered a stable state.
- I re-enabled access for all users.
- The ClickHouse issue occurred again.
Version information
- SigNoz version: v0.114.1
- Browser version: Chrome (latest)
- Your OS and version: Windows
- Your CPU Architecture(ARM/Intel):