fix(smallwebrtc): respect audio_out_10ms_chunks parameter in RawAudioTrack #3645
…Track

The RawAudioTrack class was hardcoded to always produce 10ms audio frames regardless of the audio_out_10ms_chunks transport parameter. This caused firmware clients to receive 20ms chunks even when 40ms was configured.

Changes:
- Add num_10ms_chunks parameter to RawAudioTrack constructor
- Update add_audio_bytes to chunk audio based on configured size
- Update recv() to produce frames of the configured size
- Pass audio_out_10ms_chunks from TransportParams when creating track
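The chunk-size arithmetic behind these changes can be sketched as follows. This is a minimal illustration, not the code from the PR: the helper names `chunk_size_bytes` and `split_into_chunks` are hypothetical, and 16-bit mono PCM is assumed.

```python
def chunk_size_bytes(
    sample_rate: int,
    num_10ms_chunks: int = 1,
    channels: int = 1,
    sample_width: int = 2,  # assumption: 16-bit PCM
) -> int:
    """Return the byte length of one chunk spanning num_10ms_chunks * 10 ms."""
    samples_per_10ms = sample_rate * 10 // 1000
    return samples_per_10ms * num_10ms_chunks * channels * sample_width


def split_into_chunks(audio: bytes, chunk_bytes: int) -> list:
    """Split audio into equal chunks, rejecting input that is not a whole multiple."""
    if len(audio) % chunk_bytes != 0:
        raise ValueError(
            f"audio length {len(audio)} is not a multiple of {chunk_bytes} bytes"
        )
    return [audio[i : i + chunk_bytes] for i in range(0, len(audio), chunk_bytes)]


# 16 kHz mono, 40 ms chunks: 160 samples per 10 ms * 4 * 2 bytes per sample
size_40ms = chunk_size_bytes(16000, num_10ms_chunks=4)
chunks = split_into_chunks(bytes(size_40ms * 2), size_40ms)
```

With the default `num_10ms_chunks=1` the same helper yields the old 10ms frame size, which is how a default value can preserve backward compatibility.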
Pull request overview
This PR fixes a bug where SmallWebRTCTransport was ignoring the audio_out_10ms_chunks parameter from TransportParams, always producing 10ms audio frames instead of respecting the configured chunk size (e.g., 40ms for audio_out_10ms_chunks=4).
Changes:
- Added num_10ms_chunks parameter to RawAudioTrack class with proper calculation of chunk sizes
- Updated validation and frame generation logic to respect configurable chunk sizes
- Modified track instantiation to pass through the configured parameter from TransportParams
Reviewed changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.
| File | Description |
|---|---|
| src/pipecat/transports/smallwebrtc/transport.py | Modified RawAudioTrack to accept and respect num_10ms_chunks parameter, updating initialization, validation, and frame generation logic to support configurable audio chunk sizes |
| changelog/3645.fixed.md | Added changelog entry documenting the fix |
Ensure num_10ms_chunks is a positive integer to prevent division by zero or invalid audio chunk sizes. Raises ValueError if value is less than 1.
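A guard of this kind might look like the sketch below. The function name `validate_num_10ms_chunks` is hypothetical; only the "raise ValueError when the value is less than 1" behaviour comes from the commit message above.

```python
def validate_num_10ms_chunks(value: int) -> int:
    """Return value unchanged if it is at least 1, otherwise raise ValueError."""
    if value < 1:
        raise ValueError(f"num_10ms_chunks must be >= 1, got {value}")
    return value


validated = validate_num_10ms_chunks(4)
```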
Requesting that @filipi87 take a look at this one when he has a moment.
… guard)

Migrate from pytest-style to unittest.IsolatedAsyncioTestCase to match the pattern used by other transport tests (e.g. test_livekit_transport.py). Guard the aiortc/av import with try/except and skipUnless so tests gracefully skip when webrtc dependencies aren't installed. Add pyright suppressions for false positives inherent to testing internals of optional-dependency classes.

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
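The import guard and skip pattern described in this commit could be sketched roughly as below. The class name, flag name, and test body are illustrative assumptions, not the tests added in the PR; the test only checks frame-size arithmetic so the example stays self-contained.

```python
import unittest

# Guard the optional WebRTC dependencies so the module still imports without them.
try:
    import aiortc  # noqa: F401
    import av  # noqa: F401

    WEBRTC_AVAILABLE = True
except ImportError:
    WEBRTC_AVAILABLE = False


@unittest.skipUnless(WEBRTC_AVAILABLE, "aiortc/av are not installed")
class TestChunkSizeSketch(unittest.IsolatedAsyncioTestCase):
    """Hypothetical test case; skipped as a whole when the guard flag is False."""

    async def test_40ms_frame_has_expected_sample_count(self):
        sample_rate = 16000
        num_10ms_chunks = 4
        samples_per_frame = sample_rate // 100 * num_10ms_chunks
        self.assertEqual(samples_per_frame, 640)
```

Running `python -m unittest` against a file like this reports the class as skipped, rather than failing at import time, when the optional packages are missing.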
@filipi87 I went through line by line and this all looks right to me, however my SmallWebRTC Vibe isn't strong enough to 100% confirm 🙏
filipi87 left a comment
The audio_out_10ms_chunks parameter is used inside BaseOutputTransport to chunk the audio we receive from the TTS, store it in a queue, optionally mix it with other audio, and then send it to the appropriate transport (Daily, SmallWebRTC, WebSocket, ...).
As far as I remember, creating 40 ms chunks improved CPU usage inside Pipecat. But I am not sure that when we implemented this, we intended for transports to preserve that chunk size when actually sending audio to WebRTC. I don’t think that was the original intention.
For example, in Daily we always send 10 ms of audio, regardless of what is passed to DailyTransport.
The reason is that WebRTC works best when sending 10–20 ms audio frames to keep latency low:
- 10 ms: ultra low latency
- 20 ms: low latency
This is why we implemented the same logic in SmallWebRTC.
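The split described here, larger chunks inside the pipeline and fixed 10 ms frames on the wire, could be sketched like this. `WireFrameBuffer` is a hypothetical name and not code from Pipecat, Daily, or aiortc; 16-bit PCM is assumed.

```python
from typing import Optional


class WireFrameBuffer:
    """Accepts chunks of any size and hands out fixed 10 ms frames."""

    def __init__(self, sample_rate: int, channels: int = 1, sample_width: int = 2):
        # Bytes in one 10 ms frame (assumption: 16-bit PCM by default).
        self._frame_bytes = sample_rate // 100 * channels * sample_width
        self._buffer = bytearray()

    def push(self, chunk: bytes) -> None:
        """Queue a chunk produced upstream, e.g. 40 ms of audio."""
        self._buffer.extend(chunk)

    def pop_frame(self) -> Optional[bytes]:
        """Return the next 10 ms frame, or None if less than 10 ms is buffered."""
        if len(self._buffer) < self._frame_bytes:
            return None
        frame = bytes(self._buffer[: self._frame_bytes])
        del self._buffer[: self._frame_bytes]
        return frame


buf = WireFrameBuffer(sample_rate=16000)
buf.push(bytes(1280))  # one 40 ms chunk at 16 kHz mono
frames = []
while (frame := buf.pop_frame()) is not None:
    frames.append(frame)
```

In this arrangement the upstream chunk size affects how often the queue is fed, while the frame duration seen by the WebRTC sender stays at 10 ms.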
My final concern is that if we send 40 ms frames to WebRTC, this will impact latency. So if we want to support that behavior, we should introduce a new property to explicitly control it.
cc: @aconchillo for thoughts on this, who implemented similar logic inside Daily.
Summary
Fixed an issue where SmallWebRTCTransport was not respecting the audio_out_10ms_chunks parameter from TransportParams. The RawAudioTrack class was hardcoded to always produce 10ms audio frames, causing firmware clients to receive incorrect chunk sizes (e.g., receiving 20ms chunks when 40ms was configured).
Problem
When using SmallWebRTCTransport on Pipecat Cloud with audio_out_10ms_chunks=4 (40ms), firmware clients were still receiving 20ms audio chunks. This happened because:
- … audio_out_10ms_chunks
- RawAudioTrack in SmallWebRTC was re-chunking the audio back to 10ms for WebRTC transmission
- The recv() method always produced 10ms frames regardless of configuration
Solution
Updated RawAudioTrack to accept and respect a configurable chunk size:
- Added num_10ms_chunks parameter to RawAudioTrack constructor (default 1 for backward compatibility)
- Updated add_audio_bytes() to validate and chunk audio using the configured chunk size
- Updated recv() to produce audio frames of the configured size and advance timestamps correctly
- Updated _handle_client_connected() to pass audio_out_10ms_chunks from TransportParams when creating the track
Testing
- audio_out_10ms_chunks=4 now correctly produces 40ms audio frames over WebRTC
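To illustrate what "advance timestamps correctly" means for a configurable frame size, here is a small sketch of the arithmetic. It assumes the presentation timestamp is counted in samples with a time base of 1/sample_rate, which is a common convention for audio tracks; `frame_timestamps` is a hypothetical helper, not part of the PR.

```python
from fractions import Fraction


def frame_timestamps(sample_rate: int, num_10ms_chunks: int, frame_count: int) -> list:
    """Return (pts_in_samples, start_time_in_seconds) for consecutive frames."""
    samples_per_frame = sample_rate // 100 * num_10ms_chunks
    time_base = Fraction(1, sample_rate)  # assumption: one tick per sample
    return [
        (i * samples_per_frame, i * samples_per_frame * time_base)
        for i in range(frame_count)
    ]


# 48 kHz with 40 ms frames: each frame advances the timestamp by 1920 samples.
stamps = frame_timestamps(48000, num_10ms_chunks=4, frame_count=3)
```

If the timestamp advanced by a 10 ms step while each frame carried 40 ms of samples, the receiver would see overlapping frames, so the step has to scale with the configured chunk count.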