README.md: 9 additions & 3 deletions
@@ -22,7 +22,7 @@ OpenReader WebUI is an open source text to speech document reader web app built
 - 🚀 **(New) Optimized TTS Pipeline**: Next.js TTS backend with in-memory LRU audio cache, ETag-aware responses, and in-flight request de-duplication for faster repeat playback
 - 💾 **Local-First Architecture**: IndexedDB browser storage for documents and settings (now using Dexie.js)
 - 🛜 **Optional Server-side documents**: Manually upload documents to the Next.js backend (and Docker `docstore`) for all users to download
-- 📖 **Read Along Experience**: Follow along with highlighted text as the TTS narrates PDF files, with per-sentence navigation and skip controls
+- 📖 **Read Along Experience**: Follow along with real-time highlighted text as the TTS narrates PDF files, using an overlay-based highlighter, per-sentence navigation, and skip controls
 - 📄 **Document formats**: EPUB, PDF, TXT, MD, DOCX (with libreoffice installed, plus hardened DOCX→PDF conversion for better reliability)
 - 🎨 **Customizable Experience**:
 - 🔑 Select TTS provider (OpenAI, Deepinfra, or Custom OpenAI-compatible)
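The TTS pipeline bullet in the hunk above names an in-memory LRU audio cache and in-flight request de-duplication. The following is a minimal sketch of how those two pieces can fit together, not the project's actual code: `LruCache`, `getAudio`, and the `synthesize` callback are hypothetical names, and ETag handling is left out.

```typescript
// Hypothetical LRU cache: a Map keeps insertion order, so the first key is the oldest.
class LruCache<V> {
  private map = new Map<string, V>();
  constructor(private readonly maxEntries: number) {}

  get(key: string): V | undefined {
    if (!this.map.has(key)) return undefined;
    const value = this.map.get(key) as V;
    // Re-insert so this key becomes the most recently used.
    this.map.delete(key);
    this.map.set(key, value);
    return value;
  }

  set(key: string, value: V): void {
    this.map.delete(key);
    this.map.set(key, value);
    if (this.map.size > this.maxEntries) {
      // Evict the least recently used entry.
      const oldest = this.map.keys().next().value as string;
      this.map.delete(oldest);
    }
  }
}

// Requests that have started but not yet settled, keyed like the cache.
const inFlight = new Map<string, Promise<Uint8Array>>();

// Hypothetical entry point: `synthesize` stands in for the real TTS call.
function getAudio(
  key: string,
  cache: LruCache<Uint8Array>,
  synthesize: () => Promise<Uint8Array>
): Promise<Uint8Array> {
  const cached = cache.get(key);
  if (cached) return Promise.resolve(cached);

  const pending = inFlight.get(key);
  if (pending) return pending; // A concurrent caller shares the same request.

  const request = Promise.resolve()
    .then(synthesize)
    .then((audio) => {
      cache.set(key, audio);
      return audio;
    });
  inFlight.set(key, request);

  // Drop the entry whether the request succeeded or failed.
  const cleanup = () => {
    inFlight.delete(key);
  };
  request.then(cleanup, cleanup);
  return request;
}
```

In this sketch the cache key would typically be derived from the text, voice, and speed of the request, so that repeat playback of the same sentence hits the cache.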
@@ -38,8 +38,14 @@ OpenReader WebUI is an open source text to speech document reader web app built
 </summary>
 
 - 🧠 **Smart sentence continuation**
+- Improved NLP handling of complex structures and quoted dialogue provides more natural sentence boundaries and a smoother audio-text flow.
 - EPUB and PDF playback now use smarter sentence splitting and continuation metadata so sentences that cross page/chapter boundaries are merged before hitting the TTS API.
-- This yields more natural narration and fewer awkward pauses when a sentence spans multiple pages or EPUB spine items
+- This yields more natural narration and fewer awkward pauses when a sentence spans multiple pages or EPUB spine items.
+- 📄 **Modernized PDF text highlighting pipeline**
+- Real-time PDF text highlighting is now offloaded to a dedicated Web Worker so scrolling and playback controls remain responsive during narration.
+- A new overlay-based highlighting system draws independent highlight layers on top of the PDF, avoiding interference with the underlying text layer.
+- Upgraded fuzzy matching with Dice-based similarity improves the accuracy of mapping spoken words to on-screen text.
+- A new per-device setting lets you enable or disable real-time PDF highlighting during playback for a more tailored reading experience.
 - 🎧 **Chapter/page-based audiobook export with resume & regeneration**
 - Per-chapter/per-page generation to disk with persistent `bookId`
 - Resumable generation (can cancel and continue later)
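The highlighting bullets in the hunk above mention Dice-based similarity for mapping spoken words to on-screen text. As a rough illustration, the Sørensen–Dice coefficient over character bigrams can be computed as below. This is a sketch under assumptions: `bigrams`, `diceSimilarity`, `bestMatch`, and the `0.5` threshold are hypothetical and are not taken from the project's source.

```typescript
// Count character bigrams of a lowercased, whitespace-normalized string.
function bigrams(text: string): Map<string, number> {
  const s = text.toLowerCase().replace(/\s+/g, " ").trim();
  const counts = new Map<string, number>();
  for (let i = 0; i < s.length - 1; i++) {
    const gram = s.slice(i, i + 2);
    counts.set(gram, (counts.get(gram) ?? 0) + 1);
  }
  return counts;
}

// Dice coefficient: 2 * |A ∩ B| / (|A| + |B|), with bigrams treated as multisets.
function diceSimilarity(a: string, b: string): number {
  const ga = bigrams(a);
  const gb = bigrams(b);
  let sizeA = 0;
  let sizeB = 0;
  let overlap = 0;
  ga.forEach((n) => { sizeA += n; });
  gb.forEach((n) => { sizeB += n; });
  if (sizeA + sizeB === 0) {
    // Both inputs are too short to have bigrams; fall back to exact comparison.
    return a.trim().toLowerCase() === b.trim().toLowerCase() ? 1 : 0;
  }
  ga.forEach((n, gram) => { overlap += Math.min(n, gb.get(gram) ?? 0); });
  return (2 * overlap) / (sizeA + sizeB);
}

// Pick the on-screen candidate most similar to the spoken text, if any clears the threshold.
function bestMatch(
  spoken: string,
  candidates: string[],
  threshold = 0.5 // Assumed cutoff, chosen only for illustration.
): { index: number; score: number } | null {
  let best: { index: number; score: number } | null = null;
  candidates.forEach((candidate, index) => {
    const score = diceSimilarity(spoken, candidate);
    if (score >= threshold && (best === null || score > best.score)) {
      best = { index, score };
    }
  });
  return best;
}
```

For example, `diceSimilarity("night", "nacht")` shares one bigram (`ht`) out of four on each side, giving `2 * 1 / 8 = 0.25`. A scoring function like this tolerates small differences between the text sent to TTS and the text extracted from the PDF layer, such as a stray character or changed whitespace.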
@@ -61,7 +67,7 @@ OpenReader WebUI is an open source text to speech document reader web app built
 - PDF/EPUB/HTML readers use a full-height app shell with a sticky bottom TTS bar, improved scrollbars, and refined focus styles.
 - ✅ **End-to-end Playwright test suite with TTS mocks**
 - Deterministic TTS responses in tests via a reusable Playwright route mock.
-- Coverage for accessibility, upload, navigation, folder management, deletion flows, and playback across all document types.
+- Coverage for accessibility, upload, navigation, folder management, deletion flows, audiobook generation/export and playback across all document types.