1- # OpenReader v1 Issue Triage and Mapping
1+ # OpenReader Issue Triage and Mapping
22
33Repository: https://github.com/richardr1126/OpenReader-WebUI/issues
44Reviewed via gh at 2025-11-10.
@@ -10,7 +10,7 @@ Summary of open items included:
1010- #44 Bug: Dialog not chunked together
1111- #40 Bug: PDF left/right extraction margins not working
1212
13- Global 1.0 guardrails
13+ Global guardrails
1414- Streaming-first playback (replace Howler with HTMLAudioElement/MSE)
1515- Dexie DB replacing vanilla IndexedDB
1616- Single engine cut-over (no dual engines)
@@ -20,14 +20,9 @@ Issue #59 — Chapter-Based MP3 Export
2020Type: Feature
2121Hypothesis / intent:
2222- Users want one-mp3-per-chapter output (besides full-book m4b).
23- v1 mapping:
23+ Ideas:
2424- Add chapterized MP3 pipeline that chunks by adapter “chapter” units.
2525- Provide ZIP export for many MP3 files (streamed).
26- Proposed modules:
27- - [src/v1/playback/audiobook.ts](src/v1/playback/audiobook.ts:1)
28- - [src/v1/adapters/EpubAdapter.ts](src/v1/adapters/EpubAdapter.ts:1)
29- - [src/app/api/audio/convert/route.ts](src/app/api/audio/convert/route.ts:1) (extend to “mp3-per-chapter” mode)
30- - [src/v1/db/repositories/DocumentsRepo.ts](src/v1/db/repositories/DocumentsRepo.ts:1)
3126API design:
3227- POST /api/audio/convert?mode=chapters&format=mp3 -> returns stream of a ZIP.
3328Logging to add:
@@ -44,15 +39,13 @@ Likely root causes:
4439- Final m4b delivery uses single huge arrayBuffer; browser memory/timeout.
4540- Missing Content-Disposition/Range; no resumable download.
4641- Temp-file lifecycle cleanup racing with response.
47- v1 mitigation:
42+ Ideas for remediation:
4843- Serve final artifact as file on disk with streaming and Range:
4944 - New endpoint: GET /api/audio/convert/download?bookId=… that streams file with Accept-Ranges.
5045 - UI performs streamed download; no arrayBuffer buffering.
5146- Optionally support S3-compatible offload in future.
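
The Range-enabled download idea above can be sketched as a pair of pure helpers that a route handler might call before streaming the file from disk. This is a minimal sketch under stated assumptions: `parseByteRange`, `downloadHeaders`, and the `audio/mp4` content type are hypothetical choices for illustration, not code from the repository, and only single-range requests are handled.

```typescript
// Hypothetical helpers; names and shapes are illustrative, not from the repository.
interface ByteRange {
  start: number; // inclusive byte offset
  end: number;   // inclusive byte offset
}

// Parse a single-range "bytes=..." header value against a file of `size` bytes.
// Returns null when the header is absent, malformed, multi-range, or unsatisfiable.
function parseByteRange(header: string | null, size: number): ByteRange | null {
  if (!header || size <= 0) return null;
  const match = /^bytes=(\d*)-(\d*)$/.exec(header.trim());
  if (!match) return null;
  const rawStart = match[1];
  const rawEnd = match[2];
  if (rawStart === "" && rawEnd === "") return null;
  if (rawStart === "") {
    // Suffix form "bytes=-N": the last N bytes of the file.
    const suffix = Number(rawEnd);
    if (suffix === 0) return null;
    return { start: Math.max(size - suffix, 0), end: size - 1 };
  }
  const start = Number(rawStart);
  const end = rawEnd === "" ? size - 1 : Math.min(Number(rawEnd), size - 1);
  if (start >= size || start > end) return null;
  return { start, end };
}

// Build the status and headers for a full (200) or partial (206) response.
// Assumes `filename` has already been sanitized by the caller.
function downloadHeaders(
  size: number,
  range: ByteRange | null,
  filename: string,
): { status: number; headers: Record<string, string> } {
  const headers: Record<string, string> = {
    "Accept-Ranges": "bytes",
    "Content-Type": "audio/mp4", // assumed MIME type for m4b
    "Content-Disposition": `attachment; filename="${filename}"`,
  };
  if (range) {
    headers["Content-Range"] = `bytes ${range.start}-${range.end}/${size}`;
    headers["Content-Length"] = String(range.end - range.start + 1);
    return { status: 206, headers };
  }
  headers["Content-Length"] = String(size);
  return { status: 200, headers };
}
```

A real handler would probably also distinguish an absent header from an unsatisfiable one (answering the latter with 416) and pass `start`/`end` to a file read stream so the artifact is never buffered whole in memory.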
5247Touched modules:
5348- [src/app/api/audio/convert/route.ts](src/app/api/audio/convert/route.ts:1)
54- - [src/v1/playback/audiobook.ts](src/v1/playback/audiobook.ts:1)
55- - [src/v1/components/Progress/MigrationProgress.tsx](src/v1/components/Progress/MigrationProgress.tsx:1)
5649Instrumentation to add:
5750- Book ID, file size on disk, stream chunk counts, and client-abort detection.
5851- Add explicit log/error surfaces at:
@@ -64,15 +57,10 @@ Issue #47 — Kokoro combined voices (“+” syntax)
6457Type: Feature
6558Intent:
6659- Support voice strings like “bf_emma+af_heart” when provider is Kokoro/FastAPI or DeepInfra Kokoro.
67- v1 plan:
60+ Ideas for implementation:
6861- Allow free-form voice string entry and pass-through if not in known set.
6962- Validate known providers; if “+” present, skip voice validation list.
70- Components:
71- - [src/v1/tts/types.ts](src/v1/tts/types.ts:1)
72- - [src/v1/tts/providers/DeepinfraProvider.ts](src/v1/tts/providers/DeepinfraProvider.ts:1)
73- - [src/v1/tts/providers/CustomOpenAIProvider.ts](src/v1/tts/providers/CustomOpenAIProvider.ts:1)
74- - [src/v1/components/player/VoicesControl.tsx](src/v1/components/player/VoicesControl.tsx:1) (v1)
75- - [src/app/api/tts/route.ts](src/app/api/tts/route.ts:1) ensure it does not coerce voice when provider is deepinfra/custom.
63+
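
The pass-through rule for combined voices might look like the sketch below. The provider identifiers, the `resolveVoice` name, and the fallback behavior are assumptions made for illustration; the repository's actual provider keys and validation flow may differ.

```typescript
// Hypothetical provider keys for Kokoro-capable backends; not taken from the repository.
const PASS_THROUGH_PROVIDERS: ReadonlySet<string> = new Set(["deepinfra", "custom-openai"]);

// Decide which voice string to send to the TTS backend.
// - Known voices are always accepted.
// - For pass-through providers, free-form and "+"-combined voices are forwarded
//   without checking them against the known list.
// - Anything else falls back to `fallback`.
function resolveVoice(
  provider: string,
  requested: string,
  known: readonly string[],
  fallback: string,
): string {
  const voice = requested.trim();
  if (voice === "") return fallback;
  if (known.includes(voice)) return voice;
  if (!PASS_THROUGH_PROVIDERS.has(provider)) return fallback;
  if (voice.includes("+")) {
    // Combined voice: only require that every part is non-empty.
    const parts = voice.split("+").map((part) => part.trim());
    return parts.every((part) => part.length > 0) ? parts.join("+") : fallback;
  }
  return voice; // free-form single voice, passed through unchanged
}
```

For example, `resolveVoice("deepinfra", "bf_emma+af_heart", ["af_heart"], "af_heart")` returns the combined string untouched, while the same request against a provider outside the pass-through set returns the fallback.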
7664Logging:
7765- Emit provider, model, raw voice string in request (no PII).
7866Tests:
@@ -85,8 +73,8 @@ Observations:
8573Proposed improvements:
8674- Extend [splitIntoSentences()](src/utils/nlp.ts:34) to apply quote-aware grouping.
8775- When a sentence begins with an opening quote and the next ends with a closing quote, join them before MAX_BLOCK_LENGTH checks.
88- v1 mapping:
89- - Provide composable splitter strategy in [src/v1/nlp/sentences.ts](src/v1/nlp/sentences.ts:1) wrapping existing utility.
76+ Ideas for remediation:
77+ - Provide composable splitter strategy wrapping existing utility.
9078- Add “dialog-preserve” flag toggled in settings.
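
One way the quote-aware grouping could wrap an existing splitter is sketched below: it post-processes an array of sentences and merges a run that opens a quote with the sentences that close it. `joinDialog`, `hasUnclosedQuote`, and the `maxJoin` cap are hypothetical names and parameters, and the quote detection is deliberately simple (straight and curly double quotes only).

```typescript
// Hypothetical post-processing step; not the repository's splitIntoSentences().

// True when the text opens a double quote that it does not close.
function hasUnclosedQuote(text: string): boolean {
  const opens = (text.match(/“/g) ?? []).length;
  const closes = (text.match(/”/g) ?? []).length;
  const straight = (text.match(/"/g) ?? []).length;
  return opens > closes || straight % 2 === 1;
}

// Merge sentences that belong to one quoted passage.
// `maxJoin` caps how many sentences one merge may span, so a stray
// opening quote cannot swallow the rest of the page.
function joinDialog(
  sentences: string[],
  maxJoin = 5,
): { sentences: string[]; joined: number } {
  const out: string[] = [];
  let joined = 0; // number of sentence boundaries removed, for logging
  let i = 0;
  while (i < sentences.length) {
    let current = sentences[i];
    let consumed = 1;
    if (/^["“]/.test(current.trimStart()) && hasUnclosedQuote(current)) {
      let candidate = current;
      for (let n = 1; n < maxJoin && i + n < sentences.length; n++) {
        candidate = `${candidate} ${sentences[i + n]}`;
        if (!hasUnclosedQuote(candidate)) {
          // The quote closes here: accept the merged run.
          current = candidate;
          consumed = n + 1;
          break;
        }
      }
    }
    joined += consumed - 1;
    out.push(current);
    i += consumed;
  }
  return { sentences: out, joined };
}
```

If no closing quote is found within `maxJoin` sentences, the input is left unmerged. Running this before any MAX_BLOCK_LENGTH check keeps the block-length logic unchanged, and the returned `joined` count is a natural value to log.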
9179Logging to add:
9280- Count of quote-joined sentences; per-page example of before/after lengths.
@@ -106,55 +94,7 @@ Remediation:
10694- Add visual debug overlay (dev mode) to draw margin boxes while extracting.
10795Code touchpoints:
10896- [src/utils/pdf.ts](src/utils/pdf.ts:60) extractTextFromPDF(): margin math and width fallback.
109- - v1 adapter: [src/v1/adapters/PdfAdapter.ts](src/v1/adapters/PdfAdapter.ts:1)
11097Diagnostics to add:
11198- Per-page: kept/filtered counts, extremes of x positions, computed margins.
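
The per-page diagnostics could be collected in the same pass that applies the margins. The sketch below assumes margins are expressed as fractions of the page width and that each text item exposes an `x` position and a `width`; `TextItem`, `filterByMargins`, and the diagnostics shape are hypothetical and are not the data structures used by extractTextFromPDF().

```typescript
// Hypothetical simplified text item; real PDF text items carry more fields.
interface TextItem {
  x: number;     // left edge, in page units
  width: number; // item width, in page units
  text: string;
}

interface MarginDiagnostics {
  kept: number;
  filtered: number;
  minX: number | null; // smallest left edge seen on the page
  maxX: number | null; // largest right edge seen on the page
  leftBound: number;   // computed left cut-off
  rightBound: number;  // computed right cut-off
}

// Keep items that lie fully inside the left/right margins and report what was dropped.
// `leftMargin` and `rightMargin` are fractions of `pageWidth` in the range [0, 1).
function filterByMargins(
  items: TextItem[],
  pageWidth: number,
  leftMargin: number,
  rightMargin: number,
): { kept: TextItem[]; diagnostics: MarginDiagnostics } {
  const leftBound = pageWidth * leftMargin;
  const rightBound = pageWidth * (1 - rightMargin);
  const kept: TextItem[] = [];
  let minX: number | null = null;
  let maxX: number | null = null;
  for (const item of items) {
    const right = item.x + item.width;
    minX = minX === null ? item.x : Math.min(minX, item.x);
    maxX = maxX === null ? right : Math.max(maxX, right);
    if (item.x >= leftBound && right <= rightBound) kept.push(item);
  }
  return {
    kept,
    diagnostics: {
      kept: kept.length,
      filtered: items.length - kept.length,
      minX,
      maxX,
      leftBound,
      rightBound,
    },
  };
}
```

Logging the `diagnostics` object once per page would make it easy to see whether the margin sliders actually move `leftBound` and `rightBound`, and whether any items fall outside them.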
11299Acceptance:
113- - Test PDFs show left/right trimming correctly; e2e highlight still robust.
114-
115- Cross-cutting improvements from issues
116- - Abort discipline: Any config change cancels in-flight TTS and preloads with unique request keys. Implemented in [src/v1/playback/engine.ts](src/v1/playback/engine.ts:1).
117- - Streaming-first: Introduce [src/app/api/tts/stream/route.ts](src/app/api/tts/stream/route.ts:1) for chunked speech where supported; fallback to progressive MP3/AAC.
118- - Dexie-backed caches: Audio buffers cached with TTL and LRU in [src/v1/db/repositories/AudioCacheRepo.ts](src/v1/db/repositories/AudioCacheRepo.ts:1).
119- - Resume positions: store stable location tokens per adapter in [src/v1/playback/positionStore.ts](src/v1/playback/positionStore.ts:1).
120-
121- Minimal logging plan (to validate assumptions)
122- - Export path (#48):
123- - Log: bookId, tmp file sizes, final m4b size; response headers; client disconnects.
124- - NLP chunking (#44):
125- - Log: dialog-join counts; sample joined strings (first 80 chars).
126- - PDF margins (#40):
127- - Log: normalized x min/max; filtered vs kept counts per page.
128- - Voice combo (#47):
129- - Log: provider/model/voice string; server echo ensures pass-through.
130- - Chapter MP3 export (#59):
131- - Log: chapter count, per-chapter byte sizes, total ZIP size.
132-
133- Acceptance test inventory (added to 1.0)
134- - Streaming playback start-to-speech under 500ms on cached sentences.
135- - Voice switch mid-playback produces single cancellation and single rebuffer.
136- - 1–2 GB m4b export succeeds and downloads via streaming.
137- - Chapterized MP3 ZIP streams and extracts correctly.
138- - Dialog detection joins quotes for typical novels.
139- - PDF margin sliders alter extracted text deterministically.
140-
141- Action items added to v1 backlog
142- - Implement chapterized MP3 export (ZIP) path.
143- - Add range-enabled download endpoint for m4b artifacts.
144- - Add custom voice string input and provider pass-through.
145- - Implement quote-aware sentence grouping in splitter.
146- - Harden PDF x/width margin filtering and add debug overlay.
147-
148- References (current code)
149- - Export pipeline: [src/app/api/audio/convert/route.ts](src/app/api/audio/convert/route.ts:1)
150- - TTS routing: [src/app/api/tts/route.ts](src/app/api/tts/route.ts:1)
151- - PDF extraction: [src/utils/pdf.ts](src/utils/pdf.ts:60)
152- - NLP: [src/utils/nlp.ts](src/utils/nlp.ts:34)
153-
154- References (v1 new modules)
155- - [src/v1/playback/engine.ts](src/v1/playback/engine.ts:1)
156- - [src/v1/playback/media/MediaController.ts](src/v1/playback/media/MediaController.ts:1)
157- - [src/v1/tts/providers/DeepinfraProvider.ts](src/v1/tts/providers/DeepinfraProvider.ts:1)
158- - [src/v1/adapters/PdfAdapter.ts](src/v1/adapters/PdfAdapter.ts:1)
159- - [src/v1/nlp/sentences.ts](src/v1/nlp/sentences.ts:1)
160- - [src/v1/db/repositories/AudioCacheRepo.ts](src/v1/db/repositories/AudioCacheRepo.ts:1)
100+ - Test PDFs show left/right trimming correctly; e2e highlight still robust.