Podcast generation app using Chatterbox-Turbo TTS model

patrykkozuch/ChatterCast

ChatterCast

ChatterCast generates a podcast from uploaded files. It was built for the Neural Networks Architectures course at AGH University. It is not meant to be a fully functional podcast generator, but rather a proof of concept that demonstrates the capabilities of the ChatterBox model for generating audio content.

The project uses Streamlit for the (vibe-coded) web interface and the ChatterBox Turbo model for podcast generation.

Examples

Examples of generated podcasts can be found in the examples folder. The audio files are in .wav format and cover the topic of Spatio-Temporal Graph Convolutional Networks (ST-GCN) for hand gesture recognition.

Usage

ChatterCast is built and run with Docker. First, set the HF_TOKEN environment variable in the compose.yml file to your HuggingFace token. Then, run the following command in the terminal:

docker compose up --build

This will build and start two containers: one for the Streamlit application and one for the ChatterBox model. The Streamlit application will be available at http://localhost:8501.
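The first build can take a while, so it may help to check that the Streamlit container is answering before opening the browser. The following is a minimal sketch, not part of the project: the helper name `is_app_up` is hypothetical, and it only assumes that the application responds to a plain HTTP GET at the address given above.

```python
import urllib.error
import urllib.request


def is_app_up(url: str = "http://localhost:8501", timeout: float = 3.0) -> bool:
    """Return True if a GET request to `url` succeeds with a 2xx status.

    Hypothetical helper for illustration; the default URL is the Streamlit
    address mentioned in the Usage section.
    """
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, DNS failure, or a non-2xx HTTP error.
        return False
```

Calling `is_app_up()` after `docker compose up --build` returns `False` while the containers are still starting and `True` once the page is being served.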

Once the application is running, enter your OpenAI API key in the input field on the settings page.

ChatterBox Model

To generate a podcast, we use the ChatterBox Turbo multilingual text-to-audio model. The model generates high-quality audio in multiple languages from the input text and a reference voice to clone. Both voices, Alice and Frank, are taken from open-source TTS datasets available online.
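As an illustration of how a two-voice podcast could be assembled from text and reference voices, the sketch below splits a script into speaker turns and pairs each turn with a reference clip. It is not the project's actual code. The `Speaker: text` script format, the file paths in `VOICES`, and the `synthesize` callable (a stand-in for whatever function wraps the TTS model) are all assumptions made for this example.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical mapping from speaker name to a reference clip for voice cloning.
VOICES: Dict[str, str] = {
    "Alice": "voices/alice.wav",
    "Frank": "voices/frank.wav",
}


def split_turns(script: str) -> List[Tuple[str, str]]:
    """Parse lines of the assumed form 'Speaker: text' into (speaker, text) pairs."""
    turns: List[Tuple[str, str]] = []
    for raw in script.splitlines():
        line = raw.strip()
        if not line:
            continue  # skip blank lines between turns
        speaker, sep, text = line.partition(":")
        speaker = speaker.strip()
        if not sep or speaker not in VOICES:
            raise ValueError(f"Unrecognized script line: {line!r}")
        turns.append((speaker, text.strip()))
    return turns


def render(script: str, synthesize: Callable[[str, str], bytes]) -> List[bytes]:
    """Call synthesize(text, reference_clip_path) once per turn, in order."""
    return [synthesize(text, VOICES[speaker]) for speaker, text in split_turns(script)]
```

Keeping `synthesize` as a parameter means the turn handling can be tried with a dummy function before any model is loaded.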
