11 changes: 10 additions & 1 deletion opus/opus.go
@@ -64,7 +64,7 @@ func Decode(w media.PCM16Writer, targetChannels int, logger logger.Logger) (Writ
}, nil
}

-func Encode(w Writer, channels int, logger logger.Logger) (media.PCM16Writer, error) {
+func Encode(w Writer, channels int, useDtx bool, logger logger.Logger) (media.PCM16Writer, error) {
enc, err := opus.NewEncoder(w.SampleRate(), channels, opus.AppVoIP)
if err != nil {
return nil, err
@@ -73,6 +73,7 @@ func Encode(w Writer, channels int, logger logger.Logger) (media.PCM16Writer, er
w: w,
enc: enc,
buf: make([]byte, w.SampleRate()/rtp.DefFramesPerSec*channels),
useDtx: useDtx,
logger: logger,
}, nil
}
@@ -162,6 +163,7 @@ type encoder struct {
w Writer
enc *opus.Encoder
buf Sample
useDtx bool
logger logger.Logger
}

@@ -178,6 +180,13 @@ func (e *encoder) WriteSample(in media.PCM16Sample) error {
if err != nil {
return err
}

// opus_encode() returns the number of bytes actually written to the packet.
// The return value can be negative, which indicates that an error has occurred.
// If the return value is 1 byte, then the packet does not need to be transmitted (DTX).
if n == 1 && e.useDtx {
return nil
}
Contributor

@boks1971 boks1971 Jun 1, 2025

:TIL: I thought the 1-byte packet should be transmitted to signal the background noise level. I think I have seen 1-byte packets come in to the server. But libwebrtc mentions this too (https://source.chromium.org/chromium/chromium/src/+/main:media/audio/audio_opus_encoder.cc;drc=dfb49a410ff213451f72db8d792973550025033e;l=288). So where are the 1-byte packets arriving at the server coming from? I have to check again whether I am remembering this correctly.

Member Author

@anunaym14 anunaym14 Jun 1, 2025

Maybe some clients are not negotiating DTX? meet.livekit.io does not seem to negotiate it.

Contributor

JS SDK sample app does DTX. I think I checked with that. Will test out and check packet sizes with DTX enabled.

Contributor

DTX packets (or the packet sent just before DTX kicks in) are 1 byte when trying with the JS SDK sample app (setting red: false in the demo app) and logging on the SFU side. I am puzzled about what opus_encode returns and how that is getting sent. It definitely does not jibe with the opus_encode() doc or the libwebrtc code.

return e.w.WriteSample(e.buf[:n])
}
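
For readers following the thread, the skip decision in the hunk above can be isolated as a small predicate. This is only a sketch: `shouldTransmit` and `dtxThreshold` are hypothetical names that do not appear in the PR, and the cutoff is kept as a constant because the discussion above has not settled which packet sizes actually mark a DTX frame. Unlike the `n == 1` check in the diff, the sketch uses `<=`, so a 0-byte result is skipped as well.

```go
package main

import "fmt"

// dtxThreshold is a hypothetical cutoff (not taken from the PR): encoded
// packets of this many bytes or fewer are treated as DTX frames.
const dtxThreshold = 1

// shouldTransmit reports whether an encoded packet of n bytes should be
// forwarded to the writer. n is assumed to be a successful (non-negative)
// return value from the encoder; errors are handled before this point.
func shouldTransmit(n int, useDtx bool) bool {
	if useDtx && n <= dtxThreshold {
		// DTX is enabled and the encoder produced a placeholder packet.
		return false
	}
	return true
}

func main() {
	for _, n := range []int{1, 3, 80} {
		fmt.Printf("n=%d dtx=on send=%v dtx=off send=%v\n",
			n, shouldTransmit(n, true), shouldTransmit(n, false))
	}
}
```

Keeping the decision in one function would make it easy to change the cutoff once the source of the 1-byte packets seen on the SFU is understood.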

13 changes: 12 additions & 1 deletion pcm.go
@@ -182,6 +182,7 @@ type sampleWriter[T ~[]byte] struct {
w MediaSampleWriter
sampleRate int
sampleDur time.Duration
lastWrite time.Time
}

func (w *sampleWriter[T]) String() string {
@@ -199,7 +200,17 @@ func (w *sampleWriter[T]) Close() error {
func (w *sampleWriter[T]) WriteSample(in T) error {
data := make([]byte, len(in))
copy(data, in)
-return w.w.WriteSample(media.Sample{Data: data, Duration: w.sampleDur})

var droppedPackets uint16
if !w.lastWrite.IsZero() {
timeSinceLastWrite := time.Since(w.lastWrite)
if timeSinceLastWrite > w.sampleDur {
Contributor

Maybe this needs to be a bit fuzzier? As written, a write coming through after 19.99 ms will not insert a silence packet in between. I guess that is okay, but it is worth thinking about whether some fuzz factor should be applied (for example, anything over 90% of the duration could be considered a full duration).

Member Author

Makes sense, added a 10% tolerance

Contributor

Left a comment in the specific commit. I may be reading it wrong, but it looked like an empty packet would not be inserted if the gap is 19.5 ms for a sample duration of 10ms.

Member Author

You're right, missed that it's an integer division, not float. Converting it to float and rounding it off to the nearest integer instead.

Contributor

That could add a dropped packet at 15.1 ms. I am thinking it should add a packet at > 19ms only. So, you would have to include that tolerance in the number of packets calculation too.

Member Author

Hmm okay, let me adjust that

Member Author

Force-pushed, it should be correct now!

droppedPackets = uint16((timeSinceLastWrite - w.sampleDur) / w.sampleDur)
}
}

w.lastWrite = time.Now()
return w.w.WriteSample(media.Sample{Data: data, Duration: w.sampleDur, PrevDroppedPackets: droppedPackets})
}

// MonoToStereo converts mono PCM from src to stereo PCM in dst.