Problem: Speech recognition not detecting voice input on mobile devices

Root Causes:
- Mobile browsers (especially iOS Safari) require user gesture to enable microphone
- Speech Recognition API may not be supported on some mobile browsers
- Permissions may not be properly requested
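A quick way to act on these causes is to feature-detect before starting anything. The sketch below is illustrative: the helper names `getSpeechRecognitionCtor` and `checkMicPermission` are made up, and `window`/`navigator` are passed in as parameters so the functions can be exercised outside a browser.

```javascript
// Hypothetical helpers; win/nav are injected so these are testable.
function getSpeechRecognitionCtor(win) {
  // Chrome and Safari expose the webkit-prefixed constructor;
  // the unprefixed name is the standard one.
  return win.SpeechRecognition || win.webkitSpeechRecognition || null
}

async function checkMicPermission(nav) {
  // The Permissions API is not available everywhere; treat failures as "prompt".
  try {
    const status = await nav.permissions.query({ name: 'microphone' })
    return status.state // 'granted' | 'denied' | 'prompt'
  } catch {
    return 'prompt'
  }
}
```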
Problem: Text-to-Speech audio received but not playing

Root Cause:
- Browser autoplay policy blocks audio from playing without user interaction
- AudioContext starts in "suspended" state until the user interacts with the page
- The code does resume AudioContext, but the timing might be off
Problem: Browsers block audio playback until the user interacts with the page
Solution: Add a "Start Session" button that users must click before live mode activates. This user gesture will:
- Unlock AudioContext for audio playback
- Request microphone permissions
- Start speech recognition
Implementation: Add an initial screen with "Start Live Mode" button that triggers all initialization.
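One way to wire that button's click handler, sketched with injected dependencies (the parameter names here are assumptions, not the app's real ones):

```javascript
// Runs once from the "Start Live Mode" click handler (a user gesture).
async function startLiveSession({ audioContext, getUserMedia, startRecognition }) {
  // 1. Unlock AudioContext - allowed because we're inside a click handler
  if (audioContext.state === 'suspended') {
    await audioContext.resume()
  }
  // 2. Request microphone permission
  const stream = await getUserMedia({ audio: true })
  // 3. Start speech recognition
  startRecognition(stream)
  return stream
}
```

In the real app, `getUserMedia` would be `navigator.mediaDevices.getUserMedia.bind(navigator.mediaDevices)` and `startRecognition` the app's own setup function.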
Problem: Mobile browsers have limited speech recognition support
Current Detection:
```javascript
if (!('webkitSpeechRecognition' in window) && !('SpeechRecognition' in window)) {
  console.error("Speech recognition not supported")
  return
}
```

Enhanced Solution:
- Check for mobile browser and show appropriate message
- Use Media Recorder API as fallback for mobile (send audio to backend)
- Add better error handling and user feedback
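A sketch of the MediaRecorder fallback: record a short clip and POST it to the backend for transcription. The endpoint `/api/transcribe`, the timeout, and the JSON response shape (`{ text: "..." }`) are all assumptions for illustration.

```javascript
// Record `ms` milliseconds from a mic MediaStream and send it to the backend.
// uploadUrl is a hypothetical endpoint; adjust to the real API.
function recordAndTranscribe(stream, { uploadUrl = '/api/transcribe', ms = 4000 } = {}) {
  return new Promise((resolve, reject) => {
    const recorder = new MediaRecorder(stream)
    const chunks = []
    recorder.ondataavailable = (e) => chunks.push(e.data)
    recorder.onstop = async () => {
      const blob = new Blob(chunks, { type: recorder.mimeType })
      const body = new FormData()
      body.append('audio', blob, 'clip.webm')
      try {
        const res = await fetch(uploadUrl, { method: 'POST', body })
        resolve(await res.json()) // assumed shape: { text: "..." }
      } catch (err) {
        reject(err)
      }
    }
    recorder.onerror = reject
    recorder.start()
    setTimeout(() => recorder.stop(), ms)
  })
}
```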
Current Code:
```javascript
if (audioContext.state === 'suspended') {
  await audioContext.resume()
  console.log("🔊 AudioContext resumed")
}
```

Issue: AudioContext.resume() sometimes needs to be called from within a user gesture handler
Enhanced Solution:
```javascript
// In the toggleMicrophone function (which IS a user gesture):
const unlockAudio = async () => {
  if (audioContextRef.current?.state === 'suspended') {
    await audioContextRef.current.resume()
    console.log("🔊 AudioContext unlocked via user gesture")
  }
}

// Call this when the user enables the mic (user gesture)
await unlockAudio()
```

In the toggleMicrophone() function, add:
```javascript
const toggleMicrophone = async () => { // Make it async
  if (isMicOn) {
    // Turn off mic
    if (recognitionRef.current) {
      recognitionRef.current.stop()
    }
    setIsMicOn(false)
    setIsListening(false)
  } else {
    // Turn on mic - THIS IS A USER GESTURE
    // UNLOCK AUDIOCONTEXT HERE
    if (audioContextRef.current?.state === 'suspended') {
      try {
        await audioContextRef.current.resume()
        console.log("🔊 AudioContext unlocked!")
      } catch (e) {
        console.error("Failed to unlock AudioContext:", e)
      }
    }
    setIsMicOn(true)
    initializeSpeechRecognition()
  }
}
```

Add this helper function:
```javascript
const isMobileDevice = () => {
  return /Android|webOS|iPhone|iPad|iPod|BlackBerry|IEMobile|Opera Mini/i.test(navigator.userAgent)
}

const initializeSpeechRecognition = useCallback(() => {
  // Check if mobile
  if (isMobileDevice()) {
    console.log("📱 Mobile device detected")
    // Check for speech recognition support
    if (!('webkitSpeechRecognition' in window) && !('SpeechRecognition' in window)) {
      alert("Speech recognition is not supported on this mobile browser. Please use Chrome or Safari on desktop, or try the regular chat mode.")
      setIsMicOn(false)
      return
    }
  }
  // ... rest of the code
}, []) // useCallback needs a dependency array; without one the memoization does nothing
```

Add a toast/notification when:
- AudioContext is unlocked: "✅ Audio enabled"
- Speech recognition starts: "🎤 Listening..."
- TTS starts playing: "🔊 Speaking..."
- No speech detected: "No speech detected, please try again"
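A minimal sketch of such a helper, assuming the page has an element with id `toast` (that id, and the helper name `showToast`, are made up for illustration; `document` is passed in so the function can run outside a browser):

```javascript
// Show a transient status message in a #toast element (the id is an assumption).
function showToast(doc, message, duration = 2000) {
  const el = doc.getElementById('toast')
  if (!el) return false
  el.textContent = message
  el.hidden = false
  setTimeout(() => { el.hidden = true }, duration)
  return true
}

// Usage at the events listed above, e.g.:
// showToast(document, "✅ Audio enabled")
// showToast(document, "🎤 Listening...")
```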
To test speech recognition on mobile:
- Open live mode on mobile
- Click microphone button (user gesture)
- Grant microphone permission
- Speak clearly
- Check browser console for errors
- Verify speech recognition events are firing
To test TTS playback:
- Open live mode
- Click microphone button FIRST (user gesture to unlock audio)
- Speak something
- Wait for response text
- Listen for TTS audio
- Check console for "🔊 Audio playback started"
Add this test button temporarily:
```jsx
<button onClick={async () => {
  console.log("Testing audio...")
  if (audioContextRef.current) {
    console.log("AudioContext state:", audioContextRef.current.state)
    if (audioContextRef.current.state === 'suspended') {
      await audioContextRef.current.resume()
      console.log("Resumed! New state:", audioContextRef.current.state)
    }
  }
  // Test with a beep
  const ctx = audioContextRef.current || new AudioContext()
  const osc = ctx.createOscillator()
  osc.connect(ctx.destination)
  osc.start()
  osc.stop(ctx.currentTime + 0.2)
  console.log("Beep should play!")
}}>
  Test Audio
</button>
```

| Feature | Chrome Desktop | Chrome Mobile | Safari Desktop | Safari iOS | Firefox |
|---|---|---|---|---|---|
| Speech Recognition | ✅ | ✅ | ✅ (with prefix) | ❌ | ❌ |
| AudioContext | ✅ | ✅ | ✅ | ✅ | ✅ |
| Autoplay Policy | Requires gesture | Requires gesture | Requires gesture | Strict | Requires gesture |
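Rather than hard-coding the table above, the app can probe support at runtime. A sketch (taking `window` as a parameter so it can run outside a browser):

```javascript
// Build a small support report for the features in the table above.
function featureReport(win) {
  return {
    speechRecognition: !!(win.SpeechRecognition || win.webkitSpeechRecognition),
    audioContext: !!(win.AudioContext || win.webkitAudioContext),
    mediaRecorder: typeof win.MediaRecorder === 'function',
  }
}

// In the browser: console.table(featureReport(window))
```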
Main Issue: Browser autoplay policies require user interaction before audio can play
Solution: Ensure AudioContext.resume() is called from a user gesture event (like clicking the microphone button)
Implementation: Add await audioContextRef.current.resume() inside the toggleMicrophone() function when turning the mic ON.
This will unlock audio playback for the entire session!