[BUG] High CPU in UDP Worker - Suspected Deadlock Between dialog and cgrates Modules #3713

@amatangira

Hello OpenSIPS Team,

I am experiencing a high CPU issue where a single UDP receiver process consistently utilizes 100% of a CPU core.

Based on my analysis of the GDB backtrace, the UDP worker process appears to be spinning in get_lock. I suspect a race condition (possibly a deadlock) between the dialog and cgrates modules. The call stack shows a dialog timer triggering a destroy_dlg event, which in turn calls into the cgrates_acc function. It appears to be this cgrates callback that gets stuck waiting for the lock, which would explain why the process burns CPU instead of sleeping.
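To make the suspected pattern concrete, here is a minimal sketch of how a non-reentrant spinlock can spin indefinitely when a callback tries to take a lock that its caller already holds. This is only an illustration of the general failure mode, not OpenSIPS code and not a confirmed diagnosis: `SpinLock`, `timer_routine`, and `destroy_callback` are hypothetical names, and the spin budget simply borrows the `ADAPTIVE_WAIT_LOOPS` value from the build flags so that the example terminates. A real spinlock would keep retrying rather than give up.

```python
import threading

# Borrowed from the build flags in this report, used here only as a spin
# budget so the demo terminates (assumption: a real lock would not give up).
ADAPTIVE_WAIT_LOOPS = 1024


class SpinLock:
    """Toy non-reentrant spinlock (hypothetical, not the OpenSIPS one)."""

    def __init__(self):
        self._flag = threading.Lock()

    def try_get(self, max_spins):
        """Busy-wait up to max_spins attempts; return True if acquired."""
        for _ in range(max_spins):
            if self._flag.acquire(blocking=False):
                return True
        return False

    def release(self):
        self._flag.release()


def destroy_callback(lock, max_spins):
    """Hypothetical callback that needs the same lock as its caller."""
    acquired = lock.try_get(max_spins)
    if acquired:
        lock.release()
    return acquired


def timer_routine(lock, run_callback_under_lock, max_spins=ADAPTIVE_WAIT_LOOPS):
    """Hypothetical timer handler; returns whether the callback got the lock."""
    if not lock.try_get(max_spins):
        raise RuntimeError("lock unexpectedly busy")
    if run_callback_under_lock:
        # The callback runs while the lock is still held by this same
        # execution context, so it can never acquire it.
        try:
            return destroy_callback(lock, max_spins)
        finally:
            lock.release()
    # Releasing first lets the callback take the lock normally.
    lock.release()
    return destroy_callback(lock, max_spins)


if __name__ == "__main__":
    print("callback under lock:", timer_routine(SpinLock(), True))
    print("callback after release:", timer_routine(SpinLock(), False))
```

With an unbounded spin, the first case would never return and the process would sit at 100% of one core, which matches the symptom described here. Whether the dialog timer and the cgrates callback actually contend on the same lock is something only the backtrace and the module sources can confirm.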

This issue has been observed when calls originate from a WSS client. As a temporary workaround, the CPU usage returns to normal if I comment out the cgrates_acc() function call in my configuration and restart OpenSIPS.

The GDB backtrace is attached for your review.

gdbtrace.txt

Version Information:

version: opensips 3.5.7 (x86_64/linux)
flags: STATS: Off, USE_TCP, USE_TLS, USE_SCTP, USE_ASYNC, DISABLE_NAGLE, USE_MCAST, DNS_IP_HACK, SHM_MEM, SHM_MMAN, PKG_MALLOC, Q_MALLOC, F_MALLOC, DBG_MALLOC, FAST_LOCK-ADAPTIVE_LOCK
ADAPTIVE_WAIT_LOOPS=1024, MAX_RECV_BUFFER_SIZE 262144, MAX_LISTEN 16, MAX_URI_SIZE 1024, BUF_SIZE 65535
poll method support: poll, epoll_lt, epoll_et, sigio_rt, select.
main poll method: epoll_lt
tls engine: WolfSSL

Operating System:
CentOS 9 Stream
