
Commit fc78074

Bug fixes and readme update
1 parent 5edb990 commit fc78074

4 files changed: +43 -38 lines changed

README.md

Lines changed: 25 additions & 22 deletions
@@ -6,45 +6,47 @@
 [![GitHub Release](https://img.shields.io/github/v/release/richardr1126/OpenReader-WebUI)](../../releases)

 [![Discussions](https://img.shields.io/badge/Discussions-Ask%20a%20Question-blue)](../../discussions)
-[![Bluesky](https://img.shields.io/badge/Bluesky-Chat%20with%20me-blue)](https://bsky.app/profile/richardr.dev)
-

 # OpenReader WebUI 📄🔊

-OpenReader WebUI is a document reader with Text-to-Speech capabilities, offering a TTS read along experience with narration for both PDF and EPUB documents. It can use any OpenAI compatible TTS endpoint, including [Kokoro-FastAPI](https://github.com/remsky/Kokoro-FastAPI).
+OpenReader WebUI is a document reader with Text-to-Speech capabilities, offering a TTS read along experience with narration for both PDF and EPUB documents. It can use any OpenAI compatible TTS endpoint, including [Kokoro-FastAPI](https://github.com/remsky/Kokoro-FastAPI) and [Orpheus-FastAPI](https://github.com/Lex-au/Orpheus-FastAPI)
+
+> Highly available demo currently available at [https://openreader.richardr.dev/](https://openreader.richardr.dev/)

 - 🎯 **TTS API Integration**:
-  - Compatible with OpenAI text to speech API and GPT-4o Mini TTS, Kokoro-FastAPI TTS, or any other compatible service
-  - Support for multiple TTS models (tts-1, tts-1-hd, gpt-4o-mini-tts, kokoro)
-  - Custom model support for experimental or self-hosted models
-  - Model-specific instructions support (for gpt-4o-mini-tts)
+  - Compatible with OpenAI text to speech API and GPT-4o Mini TTS, Kokoro-FastAPI TTS, Orpheus FastAPI or any other compatible service
+  - Support for TTS models (tts-1, tts-1-hd, gpt-4o-mini-tts, kokoro, and custom)
 - 💾 **Local-First Architecture**: Uses IndexedDB browser storage for documents
 - 🛜 **Optional Server-side documents**: Manually upload documents to the next backend for all users to download
 - 📖 **Read Along Experience**: Follow along with highlighted text as the TTS narrates
 - 📄 **Document formats**: EPUB, PDF, DOCX (with libreoffice installed)
-- 🎧 **Audiobook Creation**: Create and export audiobooks from PDF and ePub files with m4b format
+- 🎧 **Audiobook Creation**: Create and export audiobooks from PDF and ePub files **(in m4b format with ffmpeg and aac TTS output)**
 - 📲 **Mobile Support**: Works on mobile devices, and can be added as a PWA web app
 - 🎨 **Customizable Experience**:
   - 🔑 Set TTS API base URL (and optional API key)
-  - 🤖 Choose from multiple TTS models or use custom models
   - 🎯 Set model-specific instructions for GPT-4o Mini TTS
   - 🏎️ Adjustable playback speed
   - 📐 Customize PDF text extraction margins
   - 🗣️ Multiple voice options (checks `/v1/audio/voices` endpoint)
   - 🎨 Multiple app theme options
-
+
+> Orpheus-FastAPI will only work through a [fork of Orpheus-FastAPI](https://github.com/richardr1126/LlamaCpp-Orpheus-FastAPI)

 ### 🛠️ Work in progress
 - [x] **Audiobook creation and download** (m4b format)
-- [x] **Get PDFs on iOS 17 and below working 🤞**
 - [x] **Support for GPT-4o Mini TTS with instructions**
-- [ ] **End-to-end Testing**: More playwright tests (in progress)
-- [ ] **More document formats**: .txt, .md
-- [ ] **Support more TTS APIs**: ElevenLabs, etc.
+- [x] **Intial e2e testing**: More playwright tests (in progress)
+- [x] **Orpheus-FastAPI support**: (in progress, submitted PR to Orpheus-FastAPI)
+- [ ] **More document formats**: .txt, .md, native .docx support
+- [ ] **Support non-OpenAI TTS APIs**: ElevenLabs, etc.
 - [ ] **Accessibility Improvements**

 ## 🐳 Docker Quick Start

+### Prerequisites
+- Recent version of Docker installed on your machine
+- A TTS API server (Kokoro-FastAPI, Orpheus-FastAPI, or OpenAI API)
+
 ```bash
 docker run --name openreader-webui \
   -p 3003:3003 \
@@ -69,11 +71,17 @@ Visit [http://localhost:3003](http://localhost:3003) to run the app and set your

 ### ⬆️ Update Docker Image
 ```bash
-docker stop openreader-webui && docker rm openreader-webui
+docker stop openreader-webui && \
+docker rm openreader-webui && \
 docker pull ghcr.io/richardr1126/openreader-webui:latest
 ```

 ### Adding to a Docker Compose (i.e. with open-webui or Kokoro-FastAPI)
+
+> Note: This is an example of how to add OpenReader WebUI to a docker-compose file. You can add it to your existing docker-compose file or create a new one in this directory. Then run `docker-compose up --build` to start the services.
+
+```bash
+
 Create or add to a `docker-compose.yml`:
 ```yaml
 volumes:
@@ -92,11 +100,6 @@ services:
     restart: unless-stopped
 ```

-## [**Demo**](https://openreader.richardr.dev/)
-
-
-https://github.com/user-attachments/assets/262b9a01-c608-4fee-893c-9461dd48c99b
-
 ## Dev Installation

 ### Prerequisites
@@ -147,7 +150,7 @@ For feature requests or ideas you have for the project, please use the [Discussi

 ## 🙋‍♂️ Support and issues

-For general questions, you can reach out to me on [Bluesky](https://bsky.app/profile/richardr.dev). If you encounter issues, please open an issue on GitHub following the template (which is very simple).
+If you encounter issues, please open an issue on GitHub following the template (which is very light).

 ## 👥 Contributing

@@ -173,7 +176,7 @@ Contributions are welcome! Fork the repository and submit a pull request with yo
   - [react-pdf](https://github.com/wojtekmaj/react-pdf)
   - [pdf.js](https://mozilla.github.io/pdf.js/)
 - **EPUB:**
-  - [react-reader](https://github.com/happyr/react-reader)
+  - [react-reader](https://github.com/gerhardsletten/react-reader)
   - [epubjs](https://github.com/futurepress/epub.js/)
 - **UI:**
   - [Tailwind CSS](https://tailwindcss.com)

src/components/player/Navigator.tsx

Lines changed: 3 additions & 3 deletions
@@ -5,13 +5,13 @@ import { Button } from '@headlessui/react';
 export const Navigator = ({ currentPage, numPages, skipToLocation }: {
   currentPage: number;
   numPages: number | undefined;
-  skipToLocation: (location: string | number) => void;
+  skipToLocation: (location: string | number, shouldPause?: boolean) => void;
 }) => {
   return (
     <div className="flex items-center space-x-1">
       {/* Page back */}
       <Button
-        onClick={() => skipToLocation(currentPage - 1)}
+        onClick={() => skipToLocation(currentPage - 1, true)}
         disabled={currentPage <= 1}
         className="relative p-2 rounded-full text-foreground hover:bg-offbase data-[hover]:bg-offbase data-[active]:bg-offbase/80 transition-colors duration-200 focus:outline-none disabled:opacity-50"
         aria-label="Previous page"
@@ -30,7 +30,7 @@ export const Navigator = ({ currentPage, numPages, skipToLocation }: {

       {/* Page forward */}
       <Button
-        onClick={() => skipToLocation(currentPage + 1)}
+        onClick={() => skipToLocation(currentPage + 1, true)}
         disabled={currentPage >= (numPages || 1)}
         className="relative p-2 rounded-full text-foreground hover:bg-offbase data-[hover]:bg-offbase data-[active]:bg-offbase/80 transition-colors duration-200 focus:outline-none disabled:opacity-50"
         aria-label="Next page"

src/contexts/PDFContext.tsx

Lines changed: 1 addition & 1 deletion
@@ -134,7 +134,7 @@ export function PDFProvider({ children }: { children: ReactNode }) {
       // This prevents unnecessary resets of the sentence index
       if (text !== currDocText || text === '') {
         setCurrDocText(text);
-        setTTSText(text, true);
+        setTTSText(text);
       }
     } catch (error) {
       console.error('Error loading PDF text:', error);

src/contexts/TTSContext.tsx

Lines changed: 14 additions & 12 deletions
@@ -23,6 +23,7 @@ import {
   useRef,
   useMemo,
   ReactNode,
+  ReactElement
 } from 'react';
 import { Howl } from 'howler';
 import toast from 'react-hot-toast';
@@ -73,7 +74,7 @@ interface TTSContextType {
   setCurrDocPages: (num: number | undefined) => void;
   setSpeedAndRestart: (speed: number) => void;
   setVoiceAndRestart: (voice: string) => void;
-  skipToLocation: (location: string | number) => void;
+  skipToLocation: (location: string | number, shouldPause?: boolean) => void;
   registerLocationChangeHandler: (handler: (location: string | number) => void) => void; // EPUB-only: Handles chapter navigation
   setIsEPUB: (isEPUB: boolean) => void;
 }
@@ -89,7 +90,7 @@ const TTSContext = createContext<TTSContextType | undefined>(undefined);
  * @param {ReactNode} props.children - Child components to be wrapped by the provider
  * @returns {JSX.Element} TTSProvider component
  */
-export function TTSProvider({ children }: { children: ReactNode }) {
+export function TTSProvider({ children }: { children: ReactNode }): ReactElement {
   // Configuration context consumption
   const {
     apiKey: openApiKey,
@@ -196,16 +197,26 @@ export function TTSProvider({ children }: { children: ReactNode }) {
     }
   }, [activeHowl]);

+  /**
+   * Pauses the current audio playback
+   * Used for external control of playback state
+   */
+  const pause = useCallback(() => {
+    abortAudio();
+    setIsPlaying(false);
+  }, [abortAudio]);
+
   /**
    * Navigates to a specific location in the document
    * Works for both PDF pages and EPUB locations
    *
    * @param {string | number} location - The target location to navigate to
    * @param {boolean} keepPlaying - Whether to maintain playback state
    */
-  const skipToLocation = useCallback((location: string | number) => {
+  const skipToLocation = useCallback((location: string | number, shouldPause = false) => {
     // Reset state for new content in correct order
     abortAudio();
+    if (shouldPause) setIsPlaying(false);
     setCurrentIndex(0);
     setSentences([]);
     setCurrDocPage(location);
@@ -342,15 +353,6 @@ export function TTSProvider({ children }: { children: ReactNode }) {
     });
   }, [abortAudio]);

-  /**
-   * Pauses the current audio playback
-   * Used for external control of playback state
-   */
-  const pause = useCallback(() => {
-    abortAudio();
-    setIsPlaying(false);
-  }, [abortAudio]);
-

   /**
    * Moves forward one sentence in the text
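
Read together, the Navigator.tsx and TTSContext.tsx hunks give `skipToLocation` an optional `shouldPause` flag: the page back/forward buttons pass `true`, and any caller that omits the argument gets the default `false` and leaves `isPlaying` untouched. The following is only a rough sketch of that logic as a pure function. The `PlaybackState` shape and the `skipToLocationModel` name are assumptions made for illustration and do not appear in the commit; the real hook also calls `abortAudio()` and updates React state through setters rather than returning a new object.

```typescript
// Hypothetical, simplified stand-in for the state that skipToLocation resets.
interface PlaybackState {
  isPlaying: boolean;
  currentIndex: number;
  sentences: string[];
  currDocPage: string | number;
}

// Sketch of the post-commit behavior: sentence state is always reset,
// but playback is stopped only when shouldPause is true.
function skipToLocationModel(
  state: PlaybackState,
  location: string | number,
  shouldPause = false,
): PlaybackState {
  return {
    isPlaying: shouldPause ? false : state.isPlaying,
    currentIndex: 0,
    sentences: [],
    currDocPage: location,
  };
}

const playing: PlaybackState = {
  isPlaying: true,
  currentIndex: 4,
  sentences: ['First sentence.', 'Second sentence.'],
  currDocPage: 3,
};

// What a Navigator button click now does: skipToLocation(page, true).
const viaButton = skipToLocationModel(playing, 4, true);

// A caller that omits the flag keeps the playback state it had.
const viaDefault = skipToLocationModel(playing, 4);
```

In this model `viaButton.isPlaying` is `false` while `viaDefault.isPlaying` stays `true`, which mirrors the intent of the one-line `if (shouldPause) setIsPlaying(false);` addition.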
