feat(tts): add multi-provider TTS support with Deepinfra and custom OpenAI-compatible endpoints
Add comprehensive multi-provider TTS support enabling users to choose between OpenAI, Deepinfra, and custom OpenAI-compatible endpoints. Implement provider-specific voice management with automatic voice restoration per provider-model combination, and migrate package manager to pnpm for improved dependency handling.
Key changes:
- Add TTS provider selection (OpenAI, Deepinfra, custom-openai) in settings UI
- Implement provider-specific model and voice lists with dynamic fetching
- Add voice persistence per provider-model combination in savedVoices
- Support Deepinfra models: Kokoro-82M, Orpheus-3B, Sesame-1B with their voice libraries
- Migrate to pnpm with frozen lockfile for reproducible builds
- Update Docker configuration to use pnpm and Deepinfra API defaults
- Add migration logic for existing users to infer provider from stored baseUrl
- Update test helpers and Playwright configuration for Deepinfra API
- Add example docker-compose.yml with Kokoro-FastAPI integration
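
The migration step in the list above (inferring a provider from a stored `baseUrl`) could look roughly like the sketch below. It is only an illustration: the function name `inferProvider`, the host table, and the fallback to `custom-openai` for missing or unrecognised URLs are assumptions, not code taken from this commit.

```typescript
// Provider identifiers as named in the commit message.
type TTSProvider = "openai" | "deepinfra" | "custom-openai";

// Assumed mapping from API hostnames to providers; the real migration may differ.
const KNOWN_HOSTS: Record<string, TTSProvider> = {
  "api.openai.com": "openai",
  "api.deepinfra.com": "deepinfra",
};

// Infer the provider from a previously stored base URL.
// Anything unrecognised (for example a self-hosted server) is treated as custom.
function inferProvider(baseUrl: string | undefined): TTSProvider {
  if (!baseUrl) return "custom-openai"; // assumption: no stored URL means custom
  try {
    const host = new URL(baseUrl).hostname;
    return Object.prototype.hasOwnProperty.call(KNOWN_HOSTS, host)
      ? KNOWN_HOSTS[host]
      : "custom-openai";
  } catch {
    // The stored value is not a parseable URL: fall back to the custom provider.
    return "custom-openai";
  }
}
```

Matching on the hostname rather than the full string keeps the check independent of path suffixes such as `/v1`.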
BREAKING CHANGE: Voice selection is now provider-model specific. Previously saved voices will be migrated to the new savedVoices structure, but users may need to reselect voices if switching providers.
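
As a rough sketch of what provider-model specific voice persistence might look like, the snippet below keeps one remembered voice per provider-model pair. The `provider:model` key format, the helper names, and the fallback to the first available voice are all assumptions for illustration; only the name `savedVoices` comes from the commit message.

```typescript
// Hypothetical shape: one remembered voice per "provider:model" key.
type SavedVoices = Record<string, string>;

// Build the lookup key for a provider-model pair (assumed format).
function voiceKey(provider: string, model: string): string {
  return `${provider}:${model}`;
}

// Record the voice chosen for a provider-model pair without mutating the input.
function saveVoice(
  saved: SavedVoices,
  provider: string,
  model: string,
  voice: string
): SavedVoices {
  return { ...saved, [voiceKey(provider, model)]: voice };
}

// Restore the remembered voice if the provider still offers it;
// otherwise fall back to the first available voice (undefined if none).
function restoreVoice(
  saved: SavedVoices,
  provider: string,
  model: string,
  available: string[]
): string | undefined {
  const remembered = saved[voiceKey(provider, model)];
  if (remembered !== undefined && available.indexOf(remembered) !== -1) {
    return remembered;
  }
  return available[0];
}
```

Under this scheme, switching from one provider to another looks up a different key, which is why a user may need to pick a voice again the first time a new provider-model pair is used.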
README.md: 77 additions & 44 deletions
@@ -9,75 +9,83 @@
 
 # OpenReader WebUI 📄🔊
 
-OpenReader WebUI is a document reader with Text-to-Speech capabilities, offering a TTS read along experience with narration for EPUB, PDF, TXT, MD, and DOCX documents. It can use any OpenAI compatible TTS endpoint, including [Kokoro-FastAPI](https://github.com/remsky/Kokoro-FastAPI) and [Orpheus-FastAPI](https://github.com/Lex-au/Orpheus-FastAPI)
+OpenReader WebUI is a document reader with Text-to-Speech capabilities, offering a TTS read along experience with narration for EPUB, PDF, TXT, MD, and DOCX documents. It supports multiple TTS providers including OpenAI, Deepinfra, and custom OpenAI-compatible endpoints like [Kokoro-FastAPI](https://github.com/remsky/Kokoro-FastAPI) and [Orpheus-FastAPI](https://github.com/Lex-au/Orpheus-FastAPI)
 
-- 🎯 **TTS API Integration**:
-  - Compatible with OpenAI text to speech API and GPT-4o Mini TTS, Kokoro-FastAPI TTS, Orpheus FastAPI or any other compatible service
-  - Support for TTS models (tts-1, tts-1-hd, gpt-4o-mini-tts, kokoro, and custom)
 - [ ] **Support non-OpenAI TTS APIs**: ElevenLabs, etc.
 - [ ] **Accessibility Improvements**
 
 ## 🐳 Docker Quick Start
 
 ### Prerequisites
 - Recent version of Docker installed on your machine
-- A TTS API server (Kokoro-FastAPI, Orpheus-FastAPI, or OpenAI API)
-
-```bash
-docker run --name openreader-webui \
-  -p 3003:3003 \
-  -v openreader_docstore:/app/docstore \
-  ghcr.io/richardr1126/openreader-webui:latest
-```
-
-(Optionally): Set the TTS `API_BASE` URL and/or `API_KEY` to be default for all devices
-```bash
-docker run --name openreader-webui \
-  -e API_BASE=http://host.docker.internal:8880/v1 \
-  -p 3003:3003 \
-  -v openreader_docstore:/app/docstore \
-  ghcr.io/richardr1126/openreader-webui:latest
-```
-
-> Requesting audio from the TTS API happens on the Next.js server not the client. So the base URL for the TTS API should be accessible and relative to the Next.js server. If it is in a Docker you may need to use `host.docker.internal` to access the host machine, instead of `localhost`.
-
-Visit [http://localhost:3003](http://localhost:3003) to run the app and set your settings.
-
-> **Note:** The `openreader_docstore` volume is used to store server-side documents. You can mount a local directory instead. Or remove it if you don't need server-side documents.
-
-### ⬆️ Update Docker Image
+- A TTS API server (Kokoro-FastAPI, Orpheus-FastAPI, Deepinfra, OpenAI, etc.) running and accessible
+
+### 1. 🐳 Start the Docker container:
+```bash
+docker run --name openreader-webui \
+  -p 3003:3003 \
+  -v openreader_docstore:/app/docstore \
+  ghcr.io/richardr1126/openreader-webui:latest
+```
+
+(Optionally): Set the TTS `API_BASE` URL and/or `API_KEY` to be default for all devices
+```bash
+docker run --name openreader-webui \
+  -e API_KEY=none \
+  -e API_BASE=http://host.docker.internal:8880/v1 \
+  -p 3003:3003 \
+  -v openreader_docstore:/app/docstore \
+  ghcr.io/richardr1126/openreader-webui:latest
+```
+
+> **Note:** Requesting audio from the TTS API happens on the Next.js server not the client. So the base URL for the TTS API should be accessible and relative to the Next.js server. If it is in a Docker you may need to use `host.docker.internal` to access the host machine, instead of `localhost`.
+
+Visit [http://localhost:3003](http://localhost:3003) to run the app and set your settings.
+
+> **Note:** The `openreader_docstore` volume is used to store server-side documents. You can mount a local directory instead. Or remove it if you don't need server-side documents.
+
+### 2. ⚙️ Configure the app settings in the UI:
+- Set the TTS Provider and Model in the Settings modal
+- Set the TTS API Base URL and API Key if needed (more secure to set in env vars)
+- Select your model's voice from the dropdown (voices try to be fetched from TTS Provider API)
-### Adding to a Docker Compose (i.e. with open-webui or Kokoro-FastAPI)
+### (Alternate) 🐳 Configuration with Docker Compose and Kokoro-FastAPI
 
-> Note: This is an example of how to add OpenReader WebUI to a docker-compose file. You can add it to your existing docker-compose file or create a new one in this directory. Then run `docker-compose up --build` to start the services.
+A complete example docker-compose file with Kokoro-FastAPI and OpenReader WebUI is available in [`examples/docker-compose.yml`](examples/docker-compose.yml). You can download and use it: