OpenAI's Whisper unfortunately does not support live transcription out of the box. This repo aims to fix that using a sliding window method.
Feel free to check the detailed article series below for an explanation of the sliding window method.
How to make Whisper STT live transcription: Part 1 Part 2 Part 3
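The core of the sliding window idea can be sketched roughly as follows. Note that the names, window sizes, and helper function here are illustrative assumptions, not this repo's actual API: incoming audio chunks accumulate in a rolling buffer, and on every tick only the most recent few seconds are re-transcribed, so the newest words always have full context around them.

```python
# Illustrative sketch of sliding-window buffering (NOT this repo's API).
import numpy as np

SAMPLE_RATE = 16000   # Whisper models expect 16 kHz mono audio
WINDOW_SEC = 5        # seconds of audio re-transcribed on each tick
STEP_SEC = 2          # seconds of new audio arriving per chunk

def sliding_windows(stream, window_sec=WINDOW_SEC):
    """Yield overlapping windows over an incoming chunked audio stream."""
    buf = np.zeros(0, dtype=np.float32)
    max_len = window_sec * SAMPLE_RATE
    for chunk in stream:
        # Append the new chunk, then keep only the newest `max_len` samples.
        buf = np.concatenate([buf, chunk])[-max_len:]
        yield buf.copy()  # in the real pipeline, this window goes to Whisper

# Simulated stream: three 2-second chunks of silence.
chunks = [np.zeros(STEP_SEC * SAMPLE_RATE, dtype=np.float32) for _ in range(3)]
for i, win in enumerate(sliding_windows(chunks)):
    print(i, len(win) / SAMPLE_RATE)  # window grows until capped at WINDOW_SEC
```

In the real system, each yielded window would be passed to the model, and the overlapping transcripts merged back into a stable running text (the "sentence builder" mentioned in the roadmap below).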
- Create an installable package
- Create examples showing how to subclass the library to leverage custom models or custom audio streams
- Fix the sentence builder algorithm; it deletes text when single words are cycled through
- Add an interactive GUI
pip install -U openai-whisper
pip install pyaudio
pip install librosa
pip install numpy

Whisper also requires ffmpeg, so an additional installation is needed:
# on Ubuntu or Debian
sudo apt update && sudo apt install ffmpeg
# on Arch Linux
sudo pacman -S ffmpeg
# on MacOS using Homebrew (https://brew.sh/)
brew install ffmpeg
# on Windows using Chocolatey (https://chocolatey.org/)
choco install ffmpeg
# on Windows using Scoop (https://scoop.sh/)
scoop install ffmpeg

Alternatively, you can copy the ffmpeg binary file to the working directory of your project.
import os
import time
from live_stt_2 import LiveSTT

print("Initializing...")
stt = LiveSTT()
# stt.calculate_recommended_settings(5)
print("Running")

def clear_screen():
    # 'cls' on Windows
    if os.name == 'nt':
        _ = os.system('cls')
    # 'clear' on macOS and Linux
    else:
        _ = os.system('clear')

stt.start()
while True:
    time.sleep(2)
    clear_screen()
    print("Predicted Text:")
    print(stt.confirmed_text)

This error occurs when ffmpeg is not found on the environment PATH or in the working directory. Check the installation guide above.
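A quick way to diagnose this (a minimal standalone check, not part of the repo's API) is to ask Python where, if anywhere, it can locate the ffmpeg binary:

```python
# Check whether ffmpeg is reachable on PATH or in the current directory.
import os
import shutil

found = shutil.which("ffmpeg") or shutil.which("ffmpeg", path=os.getcwd())
print("ffmpeg found at:", found or "NOT FOUND - see the installation guide")
```

If this prints NOT FOUND, install ffmpeg for your platform as shown above, or drop the ffmpeg binary into your project's working directory.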