Text-Only Chat Mode: Agent Disconnects When Microphone Track Not Published #7
Summary
When using text-only chat mode (agent-level or runtime override), the WebRTC connection disconnects immediately after the agent sends its first response if no microphone audio track is published by the client.
Environment
- SDK Version: 0.3.0 (or latest)
- Flutter Version: 3.x
- Platform: Android (tested on emulator sdk gphone64 arm64)
- Agent Configuration: "Enable chat mode" enabled in Advanced settings
Agent Configuration
The agent has the following settings enabled in the ElevenLabs dashboard:
- Advanced > Automatic Speech Recognition: "Enable chat mode" = ON
- Security > Overrides: "Text only" override enabled
Expected Behavior
When using text-only chat mode:
- No microphone permission should be required
- Connection should remain stable for text-based conversation
- Messages should be exchanged via the data channel without audio tracks
Actual Behavior
The connection disconnects immediately after the agent sends its first text response, with the agent participant leaving the LiveKit room.
Detailed Investigation
Attempt 1: Runtime Override with textOnly: true
Code:
await client.startSession(
  agentId: 'agent_id',
  overrides: ConversationOverrides(
    conversation: ConversationSettingsOverrides(textOnly: true),
  ),
);
Result: Connection disconnects mid-response. The server receives the override but still initializes audio infrastructure. When disposing the audio track, the connection fails.
Logs:
[TextChat] Status: connected
[TextChat] Debug: {conversation_initiation_metadata_event: {conversation_id: conv_xxx, agent_output_audio_format: pcm_48000, user_input_audio_format: pcm_48000}, type: conversation_initiation_metadata}
[TextChat] Agent text part [start]: ""
[TextChat] Agent text part [delta]: "greeting message..."
trackDispose() track is null
[TextChat] Disconnected: agent
Observation: Even with textOnly: true override, the metadata still shows agent_output_audio_format: pcm_48000, indicating the server still sets up audio infrastructure.
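The check behind this observation can be sketched in plain Dart. The message shape is taken from the debug log above; the function name `announcedAudioFormats` is hypothetical and not part of the SDK. It only reports which audio formats the server announced, which is how the report infers that audio is still being set up.

```dart
/// Hypothetical helper (not an SDK API): returns the audio formats announced
/// in a `conversation_initiation_metadata` message, or an empty map if the
/// message has a different type or carries no audio format fields.
Map<String, String> announcedAudioFormats(Map<String, dynamic> message) {
  if (message['type'] != 'conversation_initiation_metadata') return const {};
  final event = message['conversation_initiation_metadata_event'];
  if (event is! Map) return const {};
  const keys = ['agent_output_audio_format', 'user_input_audio_format'];
  return {
    for (final key in keys)
      if (event[key] is String) key: event[key] as String,
  };
}

void main() {
  // Same shape as the debug log entry in this report.
  final message = <String, dynamic>{
    'type': 'conversation_initiation_metadata',
    'conversation_initiation_metadata_event': {
      'conversation_id': 'conv_xxx',
      'agent_output_audio_format': 'pcm_48000',
      'user_input_audio_format': 'pcm_48000',
    },
  };
  final formats = announcedAudioFormats(message);
  print(formats.isEmpty
      ? 'no audio formats announced'
      : 'audio formats announced: $formats');
}
```

A non-empty result for a session started with textOnly: true would match the behavior reported here.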
Attempt 2: Agent-Level Chat Mode Only (No Runtime Override)
Removed the runtime textOnly: true override to avoid potential conflicts, relying solely on the agent-level "Enable chat mode" setting.
Code:
await client.startSession(
  agentId: 'agent_id',
  // No overrides - using agent-level chat mode
);
Result:
- With microphone enabled: Connection works perfectly. Ping/pong keepalive functions, messages exchange successfully.
- With microphone disabled (skipMicrophone): Connection disconnects after first agent response.
Attempt 3: Skip Microphone Setup
Modified the SDK to add a skipMicrophone parameter that prevents setMicrophoneEnabled(true) from being called in LiveKitManager.connect().
Code:
// In LiveKitManager.connect()
if (!textOnly) {
  await _room!.localParticipant?.setMicrophoneEnabled(
    true,
    audioCaptureOptions: const AudioCaptureOptions(...),
  );
}
Result: Same disconnection issue. The server expects an audio track to be published.
Root Cause Analysis
Based on the investigation:
- The ElevenLabs server expects audio tracks even when chat mode is enabled at the agent level
- The trackDispose() track is null error occurs when the server tries to handle/dispose audio infrastructure but no track exists
- The ParticipantDisconnectedEvent fires because the agent participant leaves the room after the track disposal failure
- The connection only remains stable when the client publishes a microphone audio track
Working vs Non-Working Scenarios
| Scenario | Microphone Track | textOnly Override | Result |
|---|---|---|---|
| Voice mode | Published | None | ✅ Works |
| Chat mode (agent-level) | Published | None | ✅ Works |
| Chat mode (agent-level) | NOT Published | None | ❌ Disconnects |
| Chat mode (runtime) | NOT Published | textOnly: true | ❌ Disconnects |
| Chat mode (both) | NOT Published | textOnly: true | ❌ Disconnects |
Request
For true text-only chat mode without microphone permission requirements, could the server be updated to:
- Not require audio track publication when chat mode is enabled
- Handle the absence of client audio track gracefully
- Use only the data channel for text-based communication
Workaround (Current)
The only working solution is to enable the microphone (which triggers permission request) even when using text-only chat. This defeats the purpose of chat mode for applications that want to avoid microphone permissions entirely.
Reproduction Steps
- Create an agent with "Enable chat mode" in Advanced settings
- Connect using the Flutter SDK without enabling microphone:
// Modified SDK to skip microphone
await _liveKitManager.connect(wsUrl, token, textOnly: true);
// Where textOnly skips setMicrophoneEnabled(true)
- Observe that connection disconnects after agent's first response
Additional Context
- Ping/pong keepalive mechanism works correctly
- Data channel messages are received successfully
- The disconnection is triggered by ParticipantDisconnectedEvent with agent identity
- The issue occurs consistently across multiple connection attempts
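For regression testing, the failure pattern described in this report (the agent participant leaves after its first text response) could be detected from a recorded event sequence. The sketch below is self-contained Dart with hypothetical types: `SessionEvent`, `SessionEventKind`, and `agentLeftAfterFirstResponse` are illustrative names, not SDK or LiveKit classes, and the identity string `'agent'` is an assumption based on the `Disconnected: agent` log line.

```dart
/// Hypothetical, simplified session events; these are not SDK types.
enum SessionEventKind { connected, agentTextDelta, participantDisconnected }

class SessionEvent {
  const SessionEvent(this.kind, {this.identity});

  final SessionEventKind kind;

  /// Identity of the participant that left, if any.
  final String? identity;
}

/// Returns true when the agent participant disconnects after at least one
/// agent text delta was received, which is the pattern in this report.
bool agentLeftAfterFirstResponse(
  List<SessionEvent> events, {
  String agentIdentity = 'agent', // assumed identity string
}) {
  var sawAgentText = false;
  for (final event in events) {
    switch (event.kind) {
      case SessionEventKind.agentTextDelta:
        sawAgentText = true;
        break;
      case SessionEventKind.participantDisconnected:
        if (sawAgentText && event.identity == agentIdentity) return true;
        break;
      case SessionEventKind.connected:
        break;
    }
  }
  return false;
}

void main() {
  // Mirrors the order of events in the logs from Attempt 1.
  const events = [
    SessionEvent(SessionEventKind.connected),
    SessionEvent(SessionEventKind.agentTextDelta),
    SessionEvent(SessionEventKind.participantDisconnected, identity: 'agent'),
  ];
  print(agentLeftAfterFirstResponse(events)); // prints: true
}
```

In a real app, the events would be recorded from the SDK's callbacks and the room's participant events, then passed to a check like this one.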