Description of the Bug
When using the Gemini native audio model for real-time voice generation, the model occasionally stops outputting audio partway through a sentence.
The behavior is inconsistent: sometimes the audio completes normally, and other times it cuts off abruptly without finishing the response.
Model & Configuration
from google.genai import types

model = "gemini-2.5-flash-preview-native-audio-dialog"

# SYSTEM_INSTRUCTION is defined elsewhere in the application.
CONFIG = types.LiveConnectConfig(
    response_modalities=["AUDIO"],
    media_resolution=types.MediaResolution.MEDIA_RESOLUTION_LOW,
    # Context window compression (disabled because it made the stops more frequent)
    # context_window_compression=types.ContextWindowCompressionConfig(
    #     trigger_tokens=28000,
    #     sliding_window=types.SlidingWindow(target_tokens=13774),
    # ),
    speech_config=types.SpeechConfig(
        language_code="en-US",
        voice_config=types.VoiceConfig(
            prebuilt_voice_config=types.PrebuiltVoiceConfig(voice_name="puck")
        ),
    ),
    realtime_input_config=types.RealtimeInputConfig(
        automatic_activity_detection=types.AutomaticActivityDetection(disabled=False),
        activity_handling=types.ActivityHandling.NO_INTERRUPTION,
    ),
    input_audio_transcription=types.AudioTranscriptionConfig(),
    output_audio_transcription=types.AudioTranscriptionConfig(),
    system_instruction=SYSTEM_INSTRUCTION,
    proactivity=types.ProactivityConfig(proactive_audio=True),
)
Notes:
- Context window compression is commented out because enabling it seemed to make the stops even more frequent.
- Output is inconsistent: sometimes it is fine, and sometimes it stops after a single sentence in the middle of an interaction.
Expected Behavior
The model should complete the entire generated audio without stopping unless explicitly interrupted.
Actual Behavior
The audio output randomly stops mid-sentence, requiring a retry or manual continuation.
Frequency
- Happens more often after several responses have already been exchanged, and rarely at the very start of a session.
- Can also occur with short responses.
Possible Causes (Not Confirmed)
- API issue (most likely)
- Context window handling
- Streaming bug
- Session resumption issue
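To help narrow down which of these is responsible, each turn could be summarized by how it ended. The following is only a sketch and is not part of the original repro. It assumes messages expose `data` (raw audio bytes) and `server_content` with `output_transcription.text`, `interrupted`, `generation_complete`, and `turn_complete`, mirroring the server message fields in the google-genai SDK. `TurnSummary` and `summarize_turn` are hypothetical names, and the duration estimate assumes 24 kHz, 16-bit mono PCM output. Fake messages are used here so the snippet runs without a live session.

```python
from dataclasses import dataclass
from types import SimpleNamespace
from typing import Any, Iterable

# Assumption: output audio is 24 kHz, 16-bit (2 bytes per sample), mono PCM.
BYTES_PER_SECOND = 24000 * 2


@dataclass
class TurnSummary:
    """What was observed for one model turn (hypothetical helper)."""
    audio_bytes: int = 0
    transcript: str = ""
    interrupted: bool = False
    generation_complete: bool = False
    turn_complete: bool = False

    @property
    def audio_seconds(self) -> float:
        return self.audio_bytes / BYTES_PER_SECOND

    @property
    def verdict(self) -> str:
        if self.interrupted:
            return "interrupted"
        if not self.turn_complete:
            return "stream ended without turn_complete"
        return "turn_complete"


def summarize_turn(messages: Iterable[Any]) -> TurnSummary:
    """Accumulate audio size, transcript, and end-of-turn flags for one turn."""
    summary = TurnSummary()
    for msg in messages:
        data = getattr(msg, "data", None)
        if data:
            summary.audio_bytes += len(data)

        content = getattr(msg, "server_content", None)
        if content is None:
            continue

        transcription = getattr(content, "output_transcription", None)
        text = getattr(transcription, "text", None)
        if text:
            summary.transcript += text

        summary.interrupted |= bool(getattr(content, "interrupted", False))
        summary.generation_complete |= bool(
            getattr(content, "generation_complete", False)
        )
        summary.turn_complete |= bool(getattr(content, "turn_complete", False))
        if summary.turn_complete:
            break  # one turn per call
    return summary


# Fake messages standing in for a real receive stream (illustrative data only).
fake_turn = [
    SimpleNamespace(
        data=b"\x00" * 48000,
        server_content=SimpleNamespace(
            output_transcription=SimpleNamespace(text="Hello, I was"),
            interrupted=False, generation_complete=False, turn_complete=False,
        ),
    ),
    SimpleNamespace(
        data=None,
        server_content=SimpleNamespace(
            output_transcription=None,
            interrupted=True, generation_complete=False, turn_complete=True,
        ),
    ),
]
summary = summarize_turn(fake_turn)
print(summary.verdict, summary.audio_seconds, repr(summary.transcript))
```

In a real session the same function could be fed the messages from one turn of the receive loop. With `NO_INTERRUPTION` configured, an `interrupted` verdict on a truncated turn would point toward activity detection, whereas a transcript that is longer than the audio duration suggests would point toward the streaming path. Both readings are guesses to be checked against actual logs.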
Request
Could you please confirm whether this is a known issue and share any updates or recommended workarounds?