Demo video: `demo_h264.mp4`
Interactive console app that plays text-to-speech audio using Orpheus-3B. Can be used to "vocalize" your favorite chatbot's text responses.
When in "chat mode", uses a system prompt to elicit the model's special vocalizations like `<laugh>`, `<sigh>`, `<gasp>`, etc.
Requires setting up the Orpheus model to be served locally (see below).
Core decoding logic adapted from orpheus-tts-local by isaiahbjork.
`git clone [github repo clone url]`
`cd [repo name]`
Initialize a virtual environment and activate it (requires Python 3.13+). E.g.:
`python -m venv venv`
`venv\Scripts\activate`
Install dependencies:
`pip install -r requirements.txt`
Optionally, install PyTorch with CUDA support on top.
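For example, using the official PyTorch wheel index (the CUDA 12.1 index shown here is just one option; pick the index that matches your driver from pytorch.org):
`pip install torch --index-url https://download.pytorch.org/whl/cu121`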
Download a quantized build of the finetuned Orpheus-3B model. For example: here (Q8) or here (Q4).
Run an LLM server and select the Orpheus model, just as you would when serving any LLM.
Example command using llama.cpp (LM Studio also works):
`llama-server.exe -m path/to/Orpheus-3b-FT-Q8_0.gguf -ngl 99 -c 4096 --host 0.0.0.0`
Required: Edit `orpheus_llm.url` to point to your LLM server's endpoint. For llama-server, that would normally be `http://127.0.0.1:8080/v1/completions`.
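As a rough sketch, assuming the setting lives in a Python-dict-style config (the surrounding structure is an assumption; only the `url` property is documented here):

```python
# Hypothetical sketch; only "url" is a documented property.
orpheus_llm = {
    "url": "http://127.0.0.1:8080/v1/completions",  # llama-server's default completions endpoint
}
```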
Required for LLM chat functionality: Update the properties of the `chatbot_llm` object. The `url` should be a `chat/completions`-compatible endpoint (e.g., an OpenRouter service). Populate either `api_key` or `api_key_environment_variable` as needed. Lastly, the inner `request_dict` object can be populated with properties which will get merged into the service request's JSON data (e.g., "model", "temperature", etc.). A sketch follows.
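A hedged sketch of what `chatbot_llm` might look like, again assuming a Python-dict config; the endpoint, key names, model, and temperature shown are illustrative, not project defaults:

```python
# Hypothetical sketch; property names come from the section above, values are examples.
chatbot_llm = {
    # Any chat/completions-compatible endpoint, e.g. OpenRouter:
    "url": "https://openrouter.ai/api/v1/chat/completions",
    "api_key": "",  # set the key directly here...
    "api_key_environment_variable": "OPENROUTER_API_KEY",  # ...or name an env var holding it
    "request_dict": {
        # Everything here is merged into the JSON body of each chat request.
        "model": "openai/gpt-4o-mini",
        "temperature": 0.7,
    },
}
```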
`python app.py`
The chat LLM system prompt can be edited via `system_prompt.txt`.
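As a purely illustrative example (not the shipped prompt), a line like the following in `system_prompt.txt` is the kind of instruction that elicits the vocal tags mentioned earlier:

```
You may include vocal tags such as <laugh>, <sigh>, and <gasp> in your replies where they feel natural.
```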
Bear in mind that Orpheus model inference plus SNAC decoding is not a lightweight task.
If you're having trouble achieving stutter-free audio, try offloading Orpheus LLM inference duties to another machine on the local network.
Anecdotally, my dev system (Ryzen 7700 + 3080 Ti) generates audio about 1.5x faster than real time, using the Orpheus-3B Q8 model with the LLM server running on the same machine. On an M1 MacBook Pro with the LLM server on a different machine.
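Offloading amounts to pointing `orpheus_llm.url` at the remote machine; for example (the LAN address is illustrative): `"url": "http://192.168.1.50:8080/v1/completions"`.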
There is a prompt-toolkit-related bug that manifests inconsistently in Mac Terminal and corrupts the display. If you are afflicted by this, please leave details under Issues.
2025-04-30
- New command, "!redraw", redraws the display (useful if display is corrupted by unexpected console debug text, etc).
2025-04-23
- Orpheus gen refactor; updated colors; buffer underflow recovery logic
2025-04-21
- Syntax for setting voice has changed to: "!voice=tara", etc. This allows for arbitrary voice names when using custom Orpheus finetunes.
2025-04-20
- The sentence or phrase of the currently playing audio segment now highlights in realtime.
2025-04-13
- User settings now persist.
2025-04-11
- Can now save audio output to disk. Toggle with `!save`. This opens up some use cases.
2025-04-09
- TTS text now displays in sync with the audio segment being played. Toggle with `!sync`.
2025-04-08
- Chat response now streams, allowing for audio generation to begin after the first several words are received.
- Web service layer for audio generation?
- Voice cloning (will have to wait for official support first)