jitter: Add automatic buffer reset on large RTP discontinuities #29
base: main
Conversation
This commit improves the resilience of the jitter buffer by automatically detecting and handling large RTP discontinuities in both sequence numbers and timestamps.

New features:
- Detect large sequence number jumps (>1000) and timestamp jumps (>30s)
- Automatically reset buffer state when discontinuities are detected
- Add warning logs for operational visibility when jumps occur
- Track previous timestamp to enable timestamp discontinuity detection

Bug fixes:
- Prevent buffer from staying in an invalid state after large gaps
- Improve packet expiration handling after discontinuities to reduce premature drops and delayed emission

The reset mechanism clears all buffered packets and reinitializes the buffer state, ensuring clean recovery from stream interruptions.

Tests:
- Add TestLargeSequenceJump to verify sequence discontinuity handling
- Add TestLargeTimestampJump to verify timestamp discontinuity handling
- Add TestSequenceWraparound to ensure wraparound boundaries work correctly
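The detection logic described above can be sketched as follows. This is a minimal illustration, not the PR's actual code: the threshold constants match the diff, but the wraparound-aware distance math and function bodies are assumptions.

```go
package main

import "fmt"

// Thresholds taken from the PR description; everything else is illustrative.
const (
	maxSequenceJump  = 1000       // packets
	maxTimestampJump = 48000 * 30 // 30 seconds at a 48kHz clock rate
)

// isLargeSequenceJump treats both large forward and large backward gaps as
// discontinuities, while letting normal uint16 wraparound (65535 -> 0) pass.
func isLargeSequenceJump(current, prev uint16) bool {
	d := int(int16(current - prev)) // signed, wraparound-aware distance
	if d < 0 {
		d = -d
	}
	return d > maxSequenceJump
}

// isLargeTimestampJump does the same for the 32-bit RTP timestamp.
func isLargeTimestampJump(current, prev uint32) bool {
	d := int64(int32(current - prev))
	if d < 0 {
		d = -d
	}
	return d > maxTimestampJump
}

func main() {
	fmt.Println(isLargeSequenceJump(5, 65530))     // false: normal wraparound, distance 11
	fmt.Println(isLargeSequenceJump(2000, 100))    // true: gap of 1900 packets
	fmt.Println(isLargeTimestampJump(48000*31, 0)) // true: 31 seconds at 48kHz
}
```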
alexlivekit
left a comment
lgtm!
jitter/buffer.go (Outdated)

    }

    func (b *Buffer) isLargeTimestampJump(current, prev uint32) bool {
        const MAX_TIMESTAMP_JUMP = 48000 * 30 // 30 seconds at 48kHz
just a note: the same buffer is also used for video packets, for which we have a clock rate of 90kHz, so the comment isn't entirely accurate.
How do we handle camera mute / no-update screenshare / etc.? IIRC it's entirely possible to go seconds without any RTP packet if the screenshare stays static.
    b.logger.Infow("resetting jitter buffer due to RTP discontinuity")

    for b.head != nil {
        next := b.head.next
At this point packets are dropped. It might be good to update the PacketsDropped stats field and invoke the packet loss callback, which an API consumer might use to hook up, e.g., sending a packet loss indicator.
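The suggestion above could look roughly like this in the reset path. This is a sketch under assumptions: the field name PacketsDropped appears in the discussion, but the packet/Buffer layout and the onPacketLoss callback name are illustrative.

```go
package main

import "fmt"

type packet struct{ next *packet }

type bufferStats struct{ PacketsDropped uint64 }

type Buffer struct {
	head         *packet
	stats        bufferStats
	onPacketLoss func(dropped int) // hypothetical consumer hook
}

// reset drains the queue, counts what it dropped into the stats, and notifies
// the consumer so it can, e.g., send a packet loss indicator.
func (b *Buffer) reset() {
	dropped := 0
	for b.head != nil {
		next := b.head.next
		b.head = next
		dropped++
	}
	if dropped > 0 {
		b.stats.PacketsDropped += uint64(dropped)
		if b.onPacketLoss != nil {
			b.onPacketLoss(dropped)
		}
	}
}

func main() {
	b := &Buffer{head: &packet{next: &packet{}}}
	b.onPacketLoss = func(n int) { fmt.Println("dropped:", n) }
	b.reset()
	fmt.Println(b.stats.PacketsDropped)
}
```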
jitter/buffer.go (Outdated)

    }

    func (b *Buffer) isLargeSequenceJump(current, prev uint16) bool {
        const MAX_SEQUENCE_JUMP = 1000
High-bitrate video key frames can span hundreds of packets, so this specific value would make the buffer too sensitive even for relatively small time gaps.
Could you describe what you mean by invalid state? Packet loss is to be expected under unstable network conditions; based on the description provided I am not entirely sure what the sequence of events is and how exactly it leads to that invalid state. Would appreciate a few more details 🙏
milos-lk
left a comment
Hardcoded values are too rigid and could cause false positive state resets (esp. for video).
Would also like to get a better understanding of the issue we are trying to fix.
Thank you for the thorough review! Based on your concerns about hardcoded thresholds causing false positives, especially for video streams, I've reconsidered the approach.

Problem Description

When clients experience intermittent network issues, CPU starvation, or SDK problems, packet transmission can be interrupted for several seconds. Upon recovery, sequence numbers and timestamps show large gaps. The jitter buffer, designed for normal packet reordering (withinRange() up to 3000 packets), continues waiting for packets that will never arrive, causing the following issues:
Logs

2025-11-02T01:31:01.383Z WARN large sequence number gap
{
"currSN": 17570,
"gapSN": 152,
"currTS": 1990786961,
"gapTS": 145920,
"timeSinceHighest": "3.040274159s"
}
Stats over 22 seconds:
{
"packetsExpected": 1122,
"packetsSeenPrimary": 245,
"packetsLost": 877,
"packetLostPercentage": 78.163994,
"gapHistogram": "[54:1, 101:2]"
}

Solution

Backward Compatibility Guarantee.
    if dropped > 0 {
        b.stats.PacketsDropped += uint64(dropped)
        if b.onPacketLoss != nil {
Thinking about this again: if there was a huge gap in seen timestamps or sequence numbers, it might be good to call the onPacketLoss callback regardless of whether we actually had and dropped packets. It might be more robust to move it outside of the "if dropped > 0" block.
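The refinement suggested here amounts to signalling loss whenever a discontinuity triggered the reset, even when the local queue happened to be empty. A minimal sketch; the Buffer layout and method name are assumptions for illustration:

```go
package main

import "fmt"

type Buffer struct {
	queued       int
	onPacketLoss func(dropped int) // hypothetical consumer hook
}

// resetOnDiscontinuity drains the (simplified) queue and always fires the
// callback: a large gap implies loss upstream regardless of whether any
// packets were buffered locally at that moment.
func (b *Buffer) resetOnDiscontinuity() {
	dropped := b.queued
	b.queued = 0
	// Note the callback is outside any `if dropped > 0` check.
	if b.onPacketLoss != nil {
		b.onPacketLoss(dropped)
	}
}

func main() {
	b := &Buffer{queued: 0}
	b.onPacketLoss = func(n int) { fmt.Println("loss signalled, dropped:", n) }
	b.resetOnDiscontinuity() // fires even with zero queued packets
}
```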
Abstract
- Track previous timestamp (prevTS) to enable timestamp-based discontinuity detection

Problem
Jitter buffer could get stuck in invalid state after large gaps (e.g., stream source switches, network reconnections), causing choppy playback and incorrect packet handling.
Solution
Added automatic detection and reset:
Testing

- TestLargeSequenceJump: Verifies recovery after sequence jump
- TestLargeTimestampJump: Verifies recovery after timestamp jump
- TestSequenceWraparound: Ensures normal wraparound still works

Impact
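The wraparound case the tests guard is worth spelling out: sequence numbers crossing the uint16 boundary must not be mistaken for a discontinuity. A minimal sketch of the property TestSequenceWraparound checks, with an assumed implementation of the helper (the PR's actual internals may differ):

```go
package main

import "fmt"

// isLargeSequenceJump mirrors the PR's helper name; the signed-distance
// implementation here is an assumption for illustration.
func isLargeSequenceJump(current, prev uint16) bool {
	d := int(int16(current - prev)) // wraparound-aware signed distance
	if d < 0 {
		d = -d
	}
	return d > 1000
}

func main() {
	// Crossing 65535 -> 0 is a distance of 1, not 65535.
	fmt.Println(isLargeSequenceJump(0, 65535)) // false
	fmt.Println(isLargeSequenceJump(4, 65533)) // false: distance 7 across the boundary
}
```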