AudioData.get_wav_data() only processes partial audio data, causing incomplete transcriptions #848

@ftnext

Bug Description

When using the SpeechRecognition library with the OpenAI Whisper API, only the first few seconds of an audio file are transcribed, regardless of the file's actual duration or size.

Steps to Reproduce

  1. Use a WAV audio file longer than ~30 seconds
  2. Run transcription using: python -m speech_recognition.recognizers.whisper_api.openai --model gpt-4o-transcribe audio_file.wav
  3. Observe that only the first portion is transcribed

Example invocation:

% uv run --python 3.12 --with 'SpeechRecognition[openai]==3.14.2' -- python -m speech_recognition.recognizers.whisper_api.openai -l ja long_audio.wav

An example long_audio.wav is available here:
https://notebooklm.google.com/notebook/e7297b2e-e363-4e77-bff3-8d71e104d5a2

Expected Behavior

The entire audio file should be transcribed.

Actual Behavior

Only the first few seconds are transcribed (e.g., 18 characters from a 7.6-minute file).

Root Cause Analysis

The issue appears to be in the AudioData.get_wav_data() method. When processing an audio file, the method converts only a small portion of the audio data:

  • Original file: 21.89 MB, 456 seconds
  • WAV conversion result: 0.08 MB (abnormally small)
  • This suggests only about 2 seconds of audio are being processed (0.08 MB at 48,000 bytes per second for 24kHz, 16-bit, mono PCM)
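
The size-to-duration estimate above can be checked with a little arithmetic. This is only a rough sketch: it assumes the 24kHz, 16-bit, mono format listed under Environment, treats 1 MB as 10^6 bytes, and ignores the small WAV header. The helper name pcm_seconds is made up for illustration.

```python
# Back-of-the-envelope check using the figures reported in this issue.
# Assumptions: 24 kHz, 16-bit, mono PCM; 1 MB = 10**6 bytes; WAV header ignored.
SAMPLE_RATE_HZ = 24_000
SAMPLE_WIDTH_BYTES = 2  # 16-bit samples
CHANNELS = 1            # mono


def pcm_seconds(num_bytes: float) -> float:
    """Return the playback time of num_bytes of uncompressed PCM audio."""
    bytes_per_second = SAMPLE_RATE_HZ * SAMPLE_WIDTH_BYTES * CHANNELS
    return num_bytes / bytes_per_second


original_seconds = pcm_seconds(21.89e6)   # reported original file size
converted_seconds = pcm_seconds(0.08e6)   # reported get_wav_data() output size

print(f"original:  {original_seconds:.0f} s")
print(f"converted: {converted_seconds:.1f} s")
```

Under these assumptions the original size matches the reported 456 seconds, while 0.08 MB corresponds to well under 2 seconds of audio, in the same ballpark as the estimate above.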

Evidence

Transcribing the same audio file directly with the OpenAI Python SDK returns the complete text:

  • SpeechRecognition library: 18 characters transcribed
  • Direct OpenAI API: 2,829 characters (complete transcription)
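
To tell truncation inside the library apart from API behavior, one option is to compare the duration of the file on disk with the duration of the bytes returned by AudioData.get_wav_data(). The sketch below is an assumption about how one might check this, not the code path the command-line recognizer uses: it loads the file with Recognizer.record(), and the helper names wav_duration_seconds and compare_durations are hypothetical.

```python
import io
import wave


def wav_duration_seconds(wav_bytes: bytes) -> float:
    """Return the duration of an in-memory WAV file, read from its header."""
    with wave.open(io.BytesIO(wav_bytes), "rb") as wav:
        return wav.getnframes() / wav.getframerate()


def compare_durations(path: str) -> tuple[float, float]:
    """Return (duration on disk, duration after get_wav_data()) in seconds.

    Assumption: the audio is loaded with Recognizer.record(), which may
    differ from how the recognizer module in this issue loads it.
    """
    import speech_recognition as sr  # imported lazily; requires SpeechRecognition

    with open(path, "rb") as f:
        original = wav_duration_seconds(f.read())

    recognizer = sr.Recognizer()
    with sr.AudioFile(path) as source:
        audio = recognizer.record(source)  # reads the source until it ends

    converted = wav_duration_seconds(audio.get_wav_data())
    return original, converted
```

Calling compare_durations("long_audio.wav") and printing both values shows whether the audio is already shortened before it reaches the API. If the two durations match here, the truncation more likely happens where the audio is captured than in get_wav_data() itself.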

Environment

  • SpeechRecognition version: 3.14.2
  • Python: 3.12
  • Audio format: 24kHz, 16-bit, mono PCM WAV
  • Models tested: whisper-1, gpt-4o-transcribe, gpt-4o-mini-transcribe (all show same issue)

Workaround

Use the OpenAI Python SDK directly instead of the SpeechRecognition library.
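
A minimal sketch of that workaround, assuming the openai package is installed and OPENAI_API_KEY is set in the environment. The function name transcribe_file and the file name long_audio.wav are placeholders; the model can be any of those listed under Environment.

```python
def transcribe_file(client, path: str, model: str = "gpt-4o-transcribe") -> str:
    """Upload the whole file to the transcription endpoint and return its text.

    client is expected to be an openai.OpenAI instance.
    """
    with open(path, "rb") as audio_file:
        result = client.audio.transcriptions.create(model=model, file=audio_file)
    return result.text


def main() -> None:
    # Imported here so the helper above can be reused without the SDK installed.
    from openai import OpenAI

    client = OpenAI()  # reads OPENAI_API_KEY from the environment
    print(transcribe_file(client, "long_audio.wav"))
```

Calling main() sends the file as-is, so nothing is re-encoded through AudioData before upload.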
