Description
Check for existing issues
- I have searched the existing issues and checked that my issue is not a duplicate.
What happened?
LiteLLM proxy handles normal /v1/audio/speech requests correctly, but requests with stream_format="sse" are not handled correctly.
I verified that a simple non-streaming TTS request through LiteLLM works:
- model="gpt-4o-mini-tts"
- no explicit stream_format
- response: 200 OK with x-litellm-model-api-base: https://api.openai.com
However, when testing /v1/audio/speech with stream_format="sse" against an OpenAI-compatible TTS backend behind LiteLLM, the proxy does not preserve SSE behavior.
Instead of returning Content-Type: text/event-stream and data: {...} events, the proxy returns a binary audio response.
Expected behavior:
stream_format="sse" should return an SSE stream of audio events.
Actual behavior:
- LiteLLM returns a normal binary audio response instead of SSE.
Steps to Reproduce
- Verify that normal TTS works through LiteLLM:
curl -i -sS "https://<litellm-host>/audio/speech" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer <litellm-key>" \
-d '{
"input":"Hello from gpt-4o-mini-tts.",
"voice":"alloy",
"model":"gpt-4o-mini-tts",
"response_format":"pcm"
}'
- Verify that the same endpoint behaves incorrectly when SSE is requested:
curl -i -sS "https://<litellm-host>/audio/speech" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer <litellm-key>" \
-d '{
"input":"Hello from gpt-4o-mini-tts.",
"voice":"alloy",
"model":"gpt-4o-mini-tts",
"response_format":"pcm",
"stream_format":"sse"
}'
- Observe that the response does not behave like OpenAI speech SSE. In my setup, LiteLLM returns HTTP 500 Internal Server Error for this request instead of an SSE stream.
- Compare this with an OpenAI-compatible upstream TTS backend that supports SSE directly: when called without LiteLLM, the same /audio/speech request shape returns Content-Type: text/event-stream and data: {"type":"speech.audio.delta", ...} events.
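For reference, this is the event shape I expect from a conforming SSE response. The sketch below is mine, not from LiteLLM or the OpenAI SDK: `parse_sse_events` is a made-up helper, and the sample payload values (the base64 audio and the usage numbers) are illustrative stand-ins that mirror the `speech.audio.delta` / `speech.audio.done` events described above. It does not call LiteLLM itself.

```python
import json

def parse_sse_events(body: str) -> list[dict]:
    """Parse `data: {...}` lines from a raw SSE body into JSON events."""
    events = []
    for line in body.splitlines():
        line = line.strip()
        if line.startswith("data:"):
            payload = line[len("data:"):].strip()
            if payload and payload != "[DONE]":
                events.append(json.loads(payload))
    return events

# Sample body mimicking what an OpenAI-compatible TTS backend streams
# when stream_format="sse" is honored (values are illustrative).
sample = (
    'data: {"type":"speech.audio.delta","audio":"UklGRg=="}\n\n'
    'data: {"type":"speech.audio.done","usage":{"input_tokens":8,"output_tokens":0,"total_tokens":8}}\n\n'
)

events = parse_sse_events(sample)
print([e["type"] for e in events])
# → ['speech.audio.delta', 'speech.audio.done']
```

When pointed at the LiteLLM response instead of this sample, the parser finds no events at all, since the body is raw audio bytes rather than `data:` lines.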
Relevant log output
Working non-streaming request through LiteLLM:
HTTP/2 200
content-type: audio/mpeg
x-litellm-model-api-base: https://api.openai.com
x-litellm-response-cost: 2.75e-05
x-litellm-version: 1.81.0
Unexpected response for stream_format="sse" through LiteLLM:
HTTP/2 200
content-type: audio/mpeg
x-litellm-version: 1.81.0
<raw audio bytes...>
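The quickest way to tell the two outcomes apart is the Content-Type header: audio/mpeg in the buggy case versus text/event-stream in the expected case. A minimal check (the helper name `is_sse` is mine, not from any SDK):

```python
def is_sse(content_type: str) -> bool:
    """True when a response's Content-Type indicates an SSE stream."""
    # Strip parameters such as "; charset=utf-8" before comparing.
    return content_type.split(";")[0].strip().lower() == "text/event-stream"

# Headers observed in the two cases above:
print(is_sse("audio/mpeg"))                       # → False (what LiteLLM returned)
print(is_sse("text/event-stream; charset=utf-8")) # → True  (what SSE should return)
```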
Expected SSE shape:
HTTP/1.1 200 OK
content-type: text/event-stream; charset=utf-8
data: {"type":"speech.audio.delta","audio":"..."}
data: {"type":"speech.audio.done","usage":{...}}

What part of LiteLLM is this about?
SDK (litellm Python package)
What LiteLLM version are you on?
v1.81.0