Description
Hi,
We’ve observed an issue where the /ping endpoint of datarhei/core becomes unavailable or responds with ECONNRESET, but only when the process for one of the RTSP cameras is stuck or its camera is unreachable.
What we’re seeing
• When a camera drops (e.g., offline, frozen, or rebooting), the corresponding process in core starts hanging.
• At that same moment, /ping, which usually responds instantly, starts timing out or returning "read ECONNRESET".
• /ping is used as a health check by external systems such as n8n, so these failures make the whole instance look unhealthy.
• It seems like this only happens when the camera is down; otherwise, core works flawlessly.
Our assumptions and concerns
We suspect that /ping might internally reference or wait on information from active processes, even when those processes are stuck waiting for an RTSP source to respond. If this is the case, then:
A single faulty RTSP input could degrade or even block the responsiveness of the entire core instance.
We have observed a recurring issue when using UDP push sources with MPEG-TS streams.
This could become a hidden bottleneck or even a denial-of-service vector if multiple processes are misbehaving or unresponsive.
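To make the suspicion concrete, here is a minimal sketch of the kind of coupling we have in mind. It is not taken from core's source; state_lock, stuck_process, ping_coupled and ping_decoupled are hypothetical names, and we are only assuming that a handler might share a lock (or similar state) with a process that is blocked on its input.

```python
import threading

# Hypothetical stand-in for state shared between process supervision
# and HTTP handlers. This is an assumption, not core's actual design.
state_lock = threading.Lock()


def stuck_process(lock_held, source_responded):
    # Holds the shared lock while waiting on a source that never answers.
    with state_lock:
        lock_held.set()
        source_responded.wait()


def ping_coupled(timeout=0.2):
    # A ping that consults shared process state stalls while the lock is held.
    if state_lock.acquire(timeout=timeout):
        state_lock.release()
        return "pong"
    return "timeout"


def ping_decoupled():
    # A ping that touches no process state answers immediately.
    return "pong"


def demo():
    lock_held = threading.Event()
    source_responded = threading.Event()  # stays unset while the camera is down
    worker = threading.Thread(
        target=stuck_process, args=(lock_held, source_responded), daemon=True
    )
    worker.start()
    lock_held.wait()  # make sure the worker owns the lock before pinging
    results = (ping_coupled(), ping_decoupled())
    source_responded.set()  # the camera "comes back", the worker releases the lock
    worker.join()
    return results


if __name__ == "__main__":
    print(demo())
```

If something like ping_coupled is what happens inside core, a single stuck input would be enough to explain the timeouts we see; a handler shaped like ping_decoupled would not be affected.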
- We rely on /ping as a lightweight health check; if it fails, our automation (camera monitors, process supervisors, etc.) assumes core is down.
- Even though core is alive and trying to recover, the failed /ping causes upstream workflows to crash or retry.
- This creates system-wide instability, despite the failure being isolated to one RTSP process.
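As a stopgap on our side, the health check could tolerate a few failed pings before declaring core down. The following is only a sketch under our own assumptions: probe and HealthTracker are hypothetical helpers, and the URL, timeout and threshold are placeholders rather than values recommended by core.

```python
import http.client
import urllib.request


def probe(url, timeout=2.0):
    """Return True if the endpoint answers with HTTP 200 within the timeout."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status == 200
    except (OSError, http.client.HTTPException):
        # Covers timeouts, connection resets, refused connections and HTTP errors.
        return False


class HealthTracker:
    """Reports 'down' only after `threshold` consecutive failed probes."""

    def __init__(self, threshold=3):
        self.threshold = threshold
        self.failures = 0

    def record(self, ok):
        # Any successful probe resets the failure streak.
        self.failures = 0 if ok else self.failures + 1
        if self.failures == 0:
            return "up"
        if self.failures < self.threshold:
            return "degraded"
        return "down"


if __name__ == "__main__":
    tracker = HealthTracker(threshold=3)
    # Placeholder URL: adjust host and port to the actual core instance.
    print(tracker.record(probe("http://localhost:8080/ping")))
```

This only hides short stalls from upstream workflows. It does not address why /ping stops answering while a single process is stuck.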
Log:
ts=2025-08-06T06:55:59Z level=INFO component="Process" msg="Started" id="core2_sub-push"
ts=2025-08-06T06:56:00Z level=INFO component="Session" msg="Closed" id="HTTP" location="any" peer="any" reference="" rx_bitrate_kbit=0 rx_bytes=556 rx_maxbitrate_kbit=0.3229166666666667 tx_bitrate_kbit=0 tx_bytes=19282 tx_maxbitrate_kbit=0 type="http"
ts=2025-08-06T06:56:00Z level=INFO component="Process" msg="Failed" id="core2_sub-push"
ts=2025-08-06T06:56:00Z level=INFO component="Process" msg="Stopped" id="core2_sub-push"