
[Bug] v7.0.128 stuck in 100% CPU and no data being sent in SRT publishing #4587

@felipewd

Description


Hi,

I'm using v7.0.128 to receive 2 SRT streams; the ffmpeg processing happens outside SRS.

However, during the night those streams reconnected and SRS entered a bad state: spinning at 100% CPU, stuck, refusing any new stream publishes, and not letting me ffprobe the current ones. I had to kill and restart it to get it going.

I was able to capture perf data, attached as srs.txt.

It was gathered with:

perf record -F 199 -g -p $(pidof srs) -- sleep 30
perf script > srs.txt

The offending stack seems to be:

srt::CUDT::tsbpd(void*)
srt::sync::Condition::wait_until()
srt::sync::steady_clock::now()
pthread_mutex_unlock
clock_gettime

I also got a perf top (screenshot attached).

Now, all signs point to a TSBPD issue, which is SRT-related.
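To illustrate why that stack can pin a core: this is not SRS/SRT code, just a minimal sketch assuming the spin is a timed condition-variable wait whose deadline already lies in the past. In that situation `wait_until()` cannot block; it returns a timeout immediately, and the loop hammers `steady_clock::now()` / `clock_gettime()` exactly as in the captured stack. The function name `spin_count_demo` is hypothetical.

```cpp
// Sketch only: a wait_until() loop with a past deadline degenerates
// into a busy loop -- every call returns cv_status::timeout at once.
#include <chrono>
#include <condition_variable>
#include <mutex>

int spin_count_demo() {
    std::mutex m;
    std::condition_variable cv;
    std::unique_lock<std::mutex> lk(m);

    // Deadline already in the past: wait_until() cannot sleep.
    auto deadline = std::chrono::steady_clock::now() - std::chrono::seconds(1);

    int spins = 0;
    while (spins < 100000) {  // bounded here; the real loop is unbounded
        if (cv.wait_until(lk, deadline) == std::cv_status::timeout)
            ++spins;  // immediate timeout -> 100% CPU in the real bug
    }
    return spins;
}
```

If the tsbpd thread's wakeup time stops advancing (for instance after a reconnect resets the time base), this pattern would match both the perf profile and the fact that the rest of the process was starved.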

The main problem for me was that SRS couldn't accept any other SRT publishes, wouldn't drop the troubled stream (for whatever reason), and I couldn't ffprobe any streams.

There weren't many useful log lines; however, I noticed the last ones were from last night, so it seems even the log file wasn't being updated anymore:

[2025-11-18 23:03:40.220][INFO][784764][87185zl5] -> SRT_PLA Transport Stats # pktSent=48434, pktSndLoss=0, pktRetrans=0, pktSndDrop=0
[2025-11-18 23:03:40.220][INFO][784764][87185zl5] -> SRT_PLA time=3656994, packets=48434, okbps=0,10199,10199, ikbps=0,0,0
[2025-11-18 23:03:42.648][INFO][784764][057080l6] SRS: cpu=13.99%,48MB, cid=2,0, timer=56,0,0, clock=1,27,7,3,2,1,1,0,0, objs=(pkt:0,raw:0,fua:0,msg:5810,oth:0,buf:0)
[2025-11-18 23:03:47.650][INFO][784764][057080l6] SRS: cpu=13.99%,48MB, cid=2,0, timer=62,0,0, clock=0,42,6,0,0,0,0,0,0, objs=(pkt:0,raw:0,fua:0,msg:5811,oth:0,buf:0)
[2025-11-18 23:03:48.193][INFO][784764][32m36124] <- SRT_CPB Transport Stats # pktRecv=9681, pktRcvLoss=0, pktRcvRetrans=0, pktRcvDrop=0
[2025-11-18 23:03:48.193][INFO][784764][32m36124] <- SRT_CPB time=3671853, packets=9680, okbps=0,0,0, ikbps=0,10193,10199
[2025-11-18 23:03:50.236][INFO][784764][585vt76u] -> SRT_PLA Transport Stats # pktSent=29069, pktSndLoss=0, pktRetrans=0, pktSndDrop=0
[2025-11-18 23:03:50.236][INFO][784764][585vt76u] -> SRT_PLA time=3660920, packets=29069, okbps=0,10199,10200, ikbps=0,0,0
[2025-11-18 23:03:52.654][INFO][784764][057080l6] SRS: cpu=14.99%,48MB, cid=2,0, timer=62,0,0, clock=0,42,6,0,0,0,0,0,0, objs=(pkt:0,raw:0,fua:0,msg:5811,oth:0,buf:0)
[2025-11-18 23:03:57.657][INFO][784764][057080l6] SRS: cpu=13.99%,48MB, cid=2,0, timer=62,0,0, clock=0,40,7,0,0,0,0,0,0, objs=(pkt:0,raw:0,fua:0,msg:5815,oth:0,buf:0)
[2025-11-18 23:03:58.207][INFO][784764][6534sfev] <- SRT_CPB Transport Stats # pktRecv=38775, pktRcvLoss=0, pktRcvRetrans=0, pktRcvDrop=0
[2025-11-18 23:03:58.207][INFO][784764][6534sfev] <- SRT_CPB time=3681861, packets=38768, okbps=0,0,0, ikbps=0,10199,10200
[2025-11-18 23:04:00.659][INFO][784764][057080l6] CircuitBreaker: cpu=200.60%,48MB, break=1,1,0, cond=200.60%
[2025-11-18 23:04:01.659][INFO][784764][057080l6] CircuitBreaker: cpu=201.00%,48MB, break=1,1,0, cond=201.00%
[2025-11-18 23:04:02.659][INFO][784764][057080l6] CircuitBreaker: cpu=200.00%,48MB, break=1,1,0, cond=200.00%
[2025-11-18 23:04:02.659][INFO][784764][057080l6] SRS: cpu=200.00%,48MB, cid=2,0, timer=62,0,0, clock=0,40,7,0,0,0,0,0,0, objs=(pkt:0,raw:0,fua:0,msg:5815,oth:0,buf:0)
[2025-11-18 23:04:03.660][INFO][784764][057080l6] CircuitBreaker: cpu=200.80%,48MB, break=1,1,0, cond=200.80%

Here's my full config:

listen              1955;
max_connections     1000;
srs_log_tank        file;
srs_log_file        ./log/srs.log;
pid                 log/srs.pid;
daemon              on;
http_api {
    enabled         on;
    listen          1985;
}

stats {
    network         0;
}

srt_server {
    enabled          on;
    listen           1935;
    maxbw           -1;
    mss              1456;
    latency          800;
    recvlatency      800;
    peerlatency      800;
    tlpktdrop        off;
    sendbuf          8388608;
    recvbuf          8388608;
    peer_idle_timeout 15000;
    connect_timeout    8000;
    tsbpdmode       on;
    default_app     live;
}


vhost __defaultVhost__ {
    min_latency off;
    tcp_nodelay off;
    chunk_size 128;
    in_ack_size 0;
    out_ack_size 2500000;
    publish {
        mr off;
        mr_latency 350;
        firstpkt_timeout 20000;
        normal_timeout 7000;
        parse_sps on;
        try_annexb_first on;
        kickoff_for_idle 0;
    }
    play {
        gop_cache off;
        gop_cache_max_frames 2500;
        queue_length 10;
        time_jitter full;
        atc off;
        mix_correct off;
        atc_auto off;
        mw_latency 350;
        mw_msgs 8;
        send_min_interval 10.0;
        reduce_sequence_header on;
    }
    srt {
        enabled on;
        srt_to_rtmp off;
    }
}

I can provide my current 2 streams if you want, but I don't know what packet/network scenario caused things to go haywire in SRS, so I can't reproduce the issue.
