Whisper Streaming for Wowza Streaming Engine

This repository provides a Docker container that runs a Whisper service and integrates with the Wowza Streaming Engine module wse-plugin-caption-handlers. It can also run in standalone mode and pull in an RTMP stream using ffmpeg.

Usage

Files

Dockerfile

Builds a Python application that uses OpenAI Whisper. The application listens on a port for raw audio and returns JSON for detected speech, which is integrated with the video feed as WebVTT or embedded 608/708 captions. It also calls a LibreTranslate service to translate the detected text into another language and reports the translation back.
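As an illustration of the translation step, here is a minimal sketch of a call to LibreTranslate's `/translate` endpoint using only the Python standard library. The function names `build_translate_request` and `translate` are hypothetical and are not taken from this repository; the host and port defaults mirror LIBRETRANSLATE_HOST and LIBRETRANSLATE_PORT, and the sketch assumes the LibreTranslate instance does not require an API key.

```python
import json
import urllib.request


def build_translate_request(text, source, target, host="localhost", port=5000):
    """Build a POST request for LibreTranslate's /translate endpoint.

    Hypothetical helper: the application in this repository may build its
    requests differently.
    """
    payload = {"q": text, "source": source, "target": target, "format": "text"}
    return urllib.request.Request(
        f"http://{host}:{port}/translate",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def translate(text, source, target, **kwargs):
    """Send the request and return the translated string."""
    request = build_translate_request(text, source, target, **kwargs)
    with urllib.request.urlopen(request, timeout=10) as response:
        return json.loads(response.read())["translatedText"]
```

With a LibreTranslate service running, `translate("this is text from whisper", "en", "de")` would return the German text.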

docker-compose.yaml

A Docker Compose file that includes Whisper and LibreTranslate.

Environment Variables

Variable	Default	Description
BACKEND	faster-whisper	[faster-whisper,whisper_timestamped,openai-api] Load only this backend for Whisper processing.
MODEL	tiny.en	[tiny.en,tiny,base.en,base,small.en,small,medium.en,medium,large-v1,large-v2,large-v3,large,large-v3-turbo] Name (size) of the Whisper model to use. The model is downloaded automatically from the model hub if it is not already in the model cache directory (/tmp).
USE_GPU	False	Use the GPU if available and installed.
LANGUAGE	auto	Source language code, e.g. en,de,cs, or 'auto' for language detection.
LOG_LEVEL	INFO	[DEBUG,INFO,WARNING,ERROR,CRITICAL] The level for logging
SOURCE_STREAM	none	An RTMP URL to pull a stream from. ffmpeg captures the audio and forwards it as raw audio to the service.
MIN_CHUNK_SIZE	1	Minimum audio chunk size in seconds. The service waits up to this long before processing. If processing finishes sooner, it waits; otherwise it processes the whole segment received by that time.
SAMPLING_RATE	16000	Sample rate of the audio, in Hz.
SOURCE_LANGUAGE	en	Language of the audio received from WSE
REPORT_LANGUAGES	en	Languages to report back to WSE
LIBRETRANSLATE_HOST	localhost	Host name of the LibreTranslate service
LIBRETRANSLATE_PORT	5000	Port of the LibreTranslate service
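
To make the defaults in the table concrete, the following sketch shows one way a Python process could read these variables and fall back to the documented defaults. It is not the repository's actual startup code: `load_config` is a hypothetical helper, and treating REPORT_LANGUAGES as a comma-separated list is an assumption.

```python
import os

# Defaults copied from the table above; the loader itself is hypothetical.
DEFAULTS = {
    "BACKEND": "faster-whisper",
    "MODEL": "tiny.en",
    "USE_GPU": "False",
    "LANGUAGE": "auto",
    "LOG_LEVEL": "INFO",
    "SOURCE_STREAM": "none",
    "MIN_CHUNK_SIZE": "1",
    "SAMPLING_RATE": "16000",
    "SOURCE_LANGUAGE": "en",
    "REPORT_LANGUAGES": "en",
    "LIBRETRANSLATE_HOST": "localhost",
    "LIBRETRANSLATE_PORT": "5000",
}


def load_config(env=None):
    """Return a dict of settings, using the documented default when unset."""
    env = os.environ if env is None else env
    raw = {name: env.get(name, default) for name, default in DEFAULTS.items()}
    return {
        **raw,
        # Accept common spellings of "true"; anything else means False.
        "USE_GPU": raw["USE_GPU"].strip().lower() in ("1", "true", "yes"),
        "MIN_CHUNK_SIZE": float(raw["MIN_CHUNK_SIZE"]),
        "SAMPLING_RATE": int(raw["SAMPLING_RATE"]),
        "LIBRETRANSLATE_PORT": int(raw["LIBRETRANSLATE_PORT"]),
        # Assumption: several languages are given as a comma-separated list.
        "REPORT_LANGUAGES": [
            code.strip()
            for code in raw["REPORT_LANGUAGES"].split(",")
            if code.strip()
        ],
    }
```

Calling `load_config()` with no arguments reads from the process environment; passing a plain dict is convenient for testing.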

JSON

The service returns a JSON object to the WebSocket in the following format:

{
    "language": "en",
    "start": "7.580",
    "end": "8.540",
    "text": "this is text from whisper"
}
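
A consumer of these messages might parse them as sketched below. Note that `start` and `end` arrive as strings, so they are converted to floats (seconds). The `Caption` class and both helper functions are hypothetical names used for illustration; the timestamp helper formats seconds in the `hh:mm:ss.ttt` form used by WebVTT cues.

```python
import json
from dataclasses import dataclass


@dataclass
class Caption:
    language: str
    start: float  # seconds
    end: float    # seconds
    text: str


def parse_caption(message):
    """Parse one JSON message from the service into a Caption."""
    data = json.loads(message)
    return Caption(
        language=data["language"],
        start=float(data["start"]),
        end=float(data["end"]),
        text=data["text"],
    )


def to_vtt_timestamp(seconds):
    """Format a time in seconds as a WebVTT timestamp (hh:mm:ss.ttt)."""
    total_ms = round(seconds * 1000)
    hours, rem = divmod(total_ms, 3_600_000)
    minutes, rem = divmod(rem, 60_000)
    secs, ms = divmod(rem, 1000)
    return f"{hours:02d}:{minutes:02d}:{secs:02d}.{ms:03d}"
```

For the sample message above, `to_vtt_timestamp(parse_caption(message).start)` gives `00:00:07.580`.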

TEST

ffmpeg -hide_banner -loglevel error -f flv -i rtmp://localhost/live/myStream -c:a pcm_s16le -ac 1 -ar 16000 -f s16le - | nc localhost 3000

ffmpeg -hide_banner -loglevel error -re -i <video_file.mp4> -c:a pcm_s16le -ac 1 -ar 16000 -f s16le - | nc localhost 3000
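
The same test can be scripted in Python. The sketch below builds the ffmpeg command shown above and forwards its raw PCM output to the service over TCP, as `nc` does. The function names are hypothetical, and the sketch only sends audio; it does not read any reply from the service. One second of 16 kHz, 16-bit mono audio is 32000 bytes, which is the read size used here.

```python
import socket
import subprocess


def build_ffmpeg_cmd(source, sampling_rate=16000, realtime=False, input_format=None):
    """Build the ffmpeg command that decodes `source` to raw s16le mono PCM."""
    cmd = ["ffmpeg", "-hide_banner", "-loglevel", "error"]
    if realtime:
        cmd.append("-re")  # read the input at its native frame rate
    if input_format:
        cmd += ["-f", input_format]  # e.g. "flv" for an RTMP source
    cmd += ["-i", source, "-c:a", "pcm_s16le", "-ac", "1",
            "-ar", str(sampling_rate), "-f", "s16le", "-"]
    return cmd


def chunk_bytes(seconds, sampling_rate=16000):
    """Bytes in `seconds` of 16-bit (2-byte) mono audio."""
    return int(seconds * sampling_rate) * 2


def stream_audio(source, host="localhost", port=3000, **ffmpeg_options):
    """Pipe ffmpeg's stdout to the service, one second of audio at a time."""
    proc = subprocess.Popen(build_ffmpeg_cmd(source, **ffmpeg_options),
                            stdout=subprocess.PIPE)
    try:
        with socket.create_connection((host, port)) as sock:
            while True:
                data = proc.stdout.read(chunk_bytes(1))
                if not data:  # ffmpeg reached end of input
                    break
                sock.sendall(data)
    finally:
        proc.stdout.close()
        proc.terminate()
        proc.wait()
```

For example, `stream_audio("rtmp://localhost/live/myStream", input_format="flv")` corresponds to the first command, and `stream_audio("video_file.mp4", realtime=True)` to the second.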

GPU

This container and Whisper support NVIDIA GPUs for increased performance with larger models:

Install torch and triton python libraries in the Dockerfile.
Install cudnn9-cuda-12 package in the Dockerfile.
Run the docker container with --gpus all
Run the docker container with environment variables -e USE_GPU=True and -e FP16=true
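
After following the steps above, a quick check inside the container can confirm that PyTorch sees the GPU. This is a minimal sketch and not necessarily how the service itself selects a device; `gpu_available` and `pick_device` are hypothetical helpers, and the check returns False when torch is not installed.

```python
import importlib.util


def gpu_available():
    """Return True only if torch is installed and reports a usable CUDA device."""
    if importlib.util.find_spec("torch") is None:
        return False
    import torch
    return bool(torch.cuda.is_available())


def pick_device(use_gpu):
    """Choose "cuda" when the GPU is requested and usable, otherwise "cpu"."""
    return "cuda" if use_gpu and gpu_available() else "cpu"
```

If `pick_device(True)` returns `"cpu"`, the container was most likely started without `--gpus all` or without the CUDA libraries installed.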

Acknowledgments

This project builds upon the work from:

Original README.md

Contact

Wowza Media Systems, LLC

License

This code is distributed under the Wowza Public License.
