Commit a0c221d

Lucas Wang and claude committed
fix: handle odd-length audio chunks in voice streaming (fixes #1824)
This change fixes a ValueError that occurred when audio chunks from TTS
providers (e.g., ElevenLabs MP3 streams) had an odd number of bytes.

The issue was in StreamedAudioResult._transform_audio_buffer, which used
np.frombuffer with dtype=np.int16. Since int16 requires 2 bytes per element,
buffers with odd byte lengths would cause:

    ValueError: buffer size must be a multiple of element size

Solution:
- Pad the combined buffer with a zero byte if it has odd length
- This ensures the buffer size is always a multiple of 2 bytes
- The padding has minimal audio impact (< 1 sample)

The fix applies to all TTS providers that may produce odd-length chunks,
not just ElevenLabs.

Testing:
- Linting (ruff check) - passed
- Type checking (mypy) - passed
- Formatting (ruff format) - passed

Generated with Lucas Wang<[email protected]>
Co-Authored-By: Claude <[email protected]>
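The failure described in the commit message can be reproduced outside the SDK. The following is a minimal sketch, assuming only NumPy; the variable names (`odd_chunk`, `error_message`) are illustrative and do not come from the repository.

```python
import numpy as np

# Three bytes cannot be split into whole 2-byte int16 samples.
odd_chunk = b"\x01\x00\x02"

try:
    np.frombuffer(odd_chunk, dtype=np.int16)
    error_message = None
except ValueError as exc:
    # NumPy rejects buffers whose length is not a multiple of the item size.
    error_message = str(exc)

print(error_message)
```

An even-length buffer of the same kind decodes without error, which is why the problem only showed up with providers that emit odd-sized chunks.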
1 parent 748ac80 commit a0c221d

1 file changed: +10 -1 lines changed

src/agents/voice/result.py

Lines changed: 10 additions & 1 deletion
@@ -88,7 +88,16 @@ async def _add_error(self, error: Exception):
     def _transform_audio_buffer(
         self, buffer: list[bytes], output_dtype: npt.DTypeLike
     ) -> npt.NDArray[np.int16 | np.float32]:
-        np_array = np.frombuffer(b"".join(buffer), dtype=np.int16)
+        # Combine all chunks
+        combined_buffer = b"".join(buffer)
+
+        # Pad with a zero byte if the buffer length is odd
+        # This is needed because np.frombuffer with dtype=np.int16 requires
+        # the buffer size to be a multiple of 2 bytes
+        if len(combined_buffer) % 2 != 0:
+            combined_buffer += b"\x00"
+
+        np_array = np.frombuffer(combined_buffer, dtype=np.int16)
 
         if output_dtype == np.int16:
             return np_array
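The padding step in the diff can be exercised in isolation. The sketch below mirrors the added lines in a standalone function; `pad_and_decode` is a hypothetical helper name used for illustration and is not part of `src/agents/voice/result.py`.

```python
import numpy as np


def pad_and_decode(chunks: list[bytes]) -> np.ndarray:
    """Join chunks and decode as int16, padding one zero byte if needed."""
    combined = b"".join(chunks)

    # np.frombuffer with dtype=np.int16 needs an even number of bytes,
    # so append a single zero byte when the total length is odd.
    if len(combined) % 2 != 0:
        combined += b"\x00"

    return np.frombuffer(combined, dtype=np.int16)


# Two chunks totalling three bytes: previously a ValueError, now two samples.
samples = pad_and_decode([b"\x01\x00", b"\x02"])
print(samples.size, samples.dtype)
```

Even-length input passes through unchanged, so the extra branch only affects buffers that would otherwise have raised.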