Commit e7ce1a3

Update docs
1 parent 452bc2a commit e7ce1a3

5 files changed, 13 additions and 973 deletions


README.md

Lines changed: 4 additions & 3 deletions
@@ -74,12 +74,13 @@ docker pull ghcr.io/richardr1126/openreader-webui:latest
 
 ### (Alternate) 🐳 Configuration with Docker Compose and Kokoro-FastAPI
 
-A complete example docker-compose file with Kokoro-FastAPI and OpenReader WebUI is available in [`examples/docker-compose.yml`](examples/docker-compose.yml). You can download and use it:
+A complete example docker-compose file with Kokoro-FastAPI and OpenReader WebUI is available in [`docs/examples/docker-compose.yml`](docs/examples/docker-compose.yml). You can download and use it:
 
 ```bash
-mkdir -p openreader-compose
+# Download example docker-compose.yml
+curl --create-dirs -L -o openreader-compose/docker-compose.yml https://raw.githubusercontent.com/richardr1126/OpenReader-WebUI/main/docs/examples/docker-compose.yml
+
 cd openreader-compose
-curl -O https://raw.githubusercontent.com/richardr1126/OpenReader-WebUI/main/examples/docker-compose.yml
 docker compose up -d
 ```

Lines changed: 9 additions & 69 deletions
@@ -1,4 +1,4 @@
-# OpenReader v1 Issue Triage and Mapping
+# OpenReader Issue Triage and Mapping
 
 Repository: https://github.com/richardr1126/OpenReader-WebUI/issues
 Reviewed via gh at 2025-11-10.
@@ -10,7 +10,7 @@ Summary of open items included:
 - #44 Bug: Dialog not chunked together
 - #40 Bug: PDF left/right extraction margins not working
 
-Global 1.0 guardrails
+Global guardrails
 - Streaming-first playback (replace Howler with HTMLAudioElement/MSE)
 - Dexie DB replacing vanilla IndexedDB
 - Single engine cut-over (no dual engines)
@@ -20,14 +20,9 @@ Issue #59 — Chapter-Based MP3 Export
 Type: Feature
 Hypothesis / intent:
 - Users want one-mp3-per-chapter output (besides full-book m4b).
-v1 mapping:
+Ideas:
 - Add chapterized MP3 pipeline that chunks by adapter “chapter” units.
 - Provide ZIP export for many MP3 files (streamed).
-Proposed modules:
-- [src/v1/playback/audiobook.ts](src/v1/playback/audiobook.ts:1)
-- [src/v1/adapters/EpubAdapter.ts](src/v1/adapters/EpubAdapter.ts:1)
-- [src/app/api/audio/convert/route.ts](src/app/api/audio/convert/route.ts:1) (extend to “mp3-per-chapter” mode)
-- [src/v1/db/repositories/DocumentsRepo.ts](src/v1/db/repositories/DocumentsRepo.ts:1)
 API design:
 - POST /api/audio/convert?mode=chapters&format=mp3 -> returns stream of a ZIP.
 Logging to add:
@@ -44,15 +39,13 @@ Likely root causes:
 - Final m4b delivery uses single huge arrayBuffer; browser memory/timeout.
 - Missing Content-Disposition/Range; no resumable download.
 - Temp-file lifecycle cleanup racing with response.
-v1 mitigation:
+Ideas for remediation:
 - Serve final artifact as file on disk with streaming and Range:
 - New endpoint: GET /api/audio/convert/download?bookId=… that streams file with Accept-Ranges.
 - UI performs streamed download; no arrayBuffer buffering.
 - Optionally support S3-compatible offload in future.
 Touched modules:
 - [src/app/api/audio/convert/route.ts](src/app/api/audio/convert/route.ts:1)
-- [src/v1/playback/audiobook.ts](src/v1/playback/audiobook.ts:1)
-- [src/v1/components/Progress/MigrationProgress.tsx](src/v1/components/Progress/MigrationProgress.tsx:1)
 Instrumentation to add:
 - Book ID, file size on disk, stream chunk counts, and client-abort detection.
 - Add explicit log/error surfaces at:
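
Reviewer sketch (not part of the commit): the "streams file with Accept-Ranges" idea in the hunk above depends on turning a `Range` request header into a byte window over the file on disk. The TypeScript below is a minimal illustration of that step under stated assumptions. The names `ByteRange`, `parseRangeHeader`, and `buildDownloadResponse` are hypothetical and do not come from the repository, only single-range requests are handled, and an unusable header falls back to the full file instead of answering 416.

```typescript
// Illustrative sketch only: names and types here are hypothetical, not from the repository.
interface ByteRange {
  start: number; // first byte offset, inclusive
  end: number; // last byte offset, inclusive
}

// Parse a single-range "bytes=..." header value against a known file size.
// Returns null when the header is absent, malformed, multi-range, or
// unsatisfiable; this sketch then serves the whole file instead of a 416.
function parseRangeHeader(header: string | null, size: number): ByteRange | null {
  if (!header || size <= 0) return null;
  const match = /^bytes=(\d*)-(\d*)$/.exec(header.trim());
  if (!match) return null;
  const rawStart = match[1] ?? "";
  const rawEnd = match[2] ?? "";
  if (rawStart === "" && rawEnd === "") return null;

  let start: number;
  let end: number;
  if (rawStart === "") {
    // Suffix form "bytes=-N": the last N bytes of the file.
    const suffixLength = Number(rawEnd);
    if (suffixLength === 0) return null;
    start = Math.max(size - suffixLength, 0);
    end = size - 1;
  } else {
    start = Number(rawStart);
    // Open-ended form "bytes=N-" runs to the end; oversized ends are clamped.
    end = rawEnd === "" ? size - 1 : Math.min(Number(rawEnd), size - 1);
  }
  if (start > end || start >= size) return null;
  return { start, end };
}

// Build the status code and headers for a full (200) or partial (206) download.
function buildDownloadResponse(range: ByteRange | null, size: number) {
  const headers: Record<string, string> = { "Accept-Ranges": "bytes" };
  if (range === null) {
    headers["Content-Length"] = String(size);
    return { status: 200, headers };
  }
  headers["Content-Range"] = `bytes ${range.start}-${range.end}/${size}`;
  headers["Content-Length"] = String(range.end - range.start + 1);
  return { status: 206, headers };
}
```

A route handler would pass `start` and `end` to whatever file-streaming call it uses, so the server never buffers the whole m4b in memory and an interrupted download can resume from the last received byte.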
@@ -64,15 +57,10 @@ Issue #47 — Kokoro combined voices (“+” syntax)
 Type: Feature
 Intent:
 - Support voice strings like “bf_emma+af_heart” when provider is Kokoro/FastAPI or DeepInfra Kokoro.
-v1 plan:
+Ideas for implementation:
 - Allow free-form voice string entry and pass-through if not in known set.
 - Validate known providers; if “+” present, skip voice validation list.
-Components:
-- [src/v1/tts/types.ts](src/v1/tts/types.ts:1)
-- [src/v1/tts/providers/DeepinfraProvider.ts](src/v1/tts/providers/DeepinfraProvider.ts:1)
-- [src/v1/tts/providers/CustomOpenAIProvider.ts](src/v1/tts/providers/CustomOpenAIProvider.ts:1)
-- [src/v1/components/player/VoicesControl.tsx](src/v1/components/player/VoicesControl.tsx:1) (v1)
-- [src/app/api/tts/route.ts](src/app/api/tts/route.ts:1) ensure it does not coerce voice when provider is deepinfra/custom.
+
 Logging:
 - Emit provider, model, raw voice string in request (no PII).
 Tests:
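
Reviewer sketch (not part of the commit): the pass-through rule described in this hunk could look roughly like the TypeScript below. Everything named here is an assumption for illustration. The provider identifiers `"custom-openai"` and `"deepinfra"`, the helper `isWellFormedVoice`, and the `resolveVoice` signature are hypothetical, and the check that every "+"-separated part is non-empty is an added safeguard the notes do not ask for.

```typescript
// Illustrative sketch only: provider identifiers and function names are hypothetical.
const PASS_THROUGH_PROVIDERS: ReadonlySet<string> = new Set(["custom-openai", "deepinfra"]);

// A (possibly combined) voice string is well formed when every
// "+"-separated part is non-empty, e.g. "bf_emma+af_heart".
function isWellFormedVoice(voice: string): boolean {
  return voice.split("+").every((part) => part.trim() !== "");
}

// Decide which voice string to send to the TTS provider.
function resolveVoice(
  provider: string,
  requested: string,
  knownVoices: readonly string[],
  fallback: string,
): string {
  const voice = requested.trim();
  if (voice === "") return fallback;
  // Known voices are always accepted as-is.
  if (knownVoices.includes(voice)) return voice;
  // Pass-through providers accept free-form and combined strings untouched.
  if (PASS_THROUGH_PROVIDERS.has(provider) && isWellFormedVoice(voice)) return voice;
  // Any other provider is coerced back to a voice it is known to support.
  return fallback;
}
```

With a known list of `["af_heart", "bf_emma"]`, the string `"bf_emma+af_heart"` reaches a pass-through provider unchanged, while any other provider receives the fallback voice.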
@@ -85,8 +73,8 @@ Observations:
 Proposed improvements:
 - Extend [splitIntoSentences()](src/utils/nlp.ts:34) to apply quote-aware grouping.
 - When a sentence begins with an opening quote and the next ends with a closing quote, join them before MAX_BLOCK_LENGTH checks.
-v1 mapping:
-- Provide composable splitter strategy in [src/v1/nlp/sentences.ts](src/v1/nlp/sentences.ts:1) wrapping existing utility.
+Ideas for remediation:
+- Provide composable splitter strategy wrapping existing utility.
 - Add “dialog-preserve” flag toggled in settings.
 Logging to add:
 - Count of quote-joined sentences; per-page example of before/after lengths.
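
Reviewer sketch (not part of the commit): the quote-aware grouping rule in this hunk can be illustrated as a post-processing pass over already-split sentences. The function `joinDialog` and its helpers are hypothetical names, not the repository's `splitIntoSentences()`. The pass only implements the stated two-sentence rule (an opening quote on one sentence, a closing quote on the next), so longer quoted runs and sentences with attribution after the quote would need more than this.

```typescript
// Illustrative sketch only: function names are hypothetical, not the repository's API.
const OPENING_QUOTES = ['"', "\u201C"];
const CLOSING_QUOTES = ['"', "\u201D"];

function startsWithOpeningQuote(text: string): boolean {
  return OPENING_QUOTES.some((quote) => text.startsWith(quote));
}

function endsWithClosingQuote(text: string): boolean {
  // Require more than one character so a lone quote mark is not "closed".
  return text.length > 1 && CLOSING_QUOTES.some((quote) => text.endsWith(quote));
}

// Join a sentence that opens a quote with the following sentence that closes it.
// Intended to run before any MAX_BLOCK_LENGTH checks are applied.
function joinDialog(sentences: readonly string[]): string[] {
  const result: string[] = [];
  let index = 0;
  while (index < sentences.length) {
    const current = (sentences[index] ?? "").trim();
    const next = index + 1 < sentences.length ? (sentences[index + 1] ?? "").trim() : undefined;
    const opensUnclosedQuote = startsWithOpeningQuote(current) && !endsWithClosingQuote(current);
    if (next !== undefined && opensUnclosedQuote && endsWithClosingQuote(next)) {
      result.push(`${current} ${next}`);
      index += 2; // both sentences were consumed by the join
    } else {
      result.push(current);
      index += 1;
    }
  }
  return result;
}
```

The difference between the input and output array lengths gives the "count of quote-joined sentences" that the logging item asks for.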
@@ -106,55 +94,7 @@ Remediation:
 - Add visual debug overlay (dev mode) to draw margin boxes while extracting.
 Code touchpoints:
 - [src/utils/pdf.ts](src/utils/pdf.ts:60) extractTextFromPDF(): margin math and width fallback.
-- v1 adapter: [src/v1/adapters/PdfAdapter.ts](src/v1/adapters/PdfAdapter.ts:1)
 Diagnostics to add:
 - Per-page: kept/filtered counts, extremes of x positions, computed margins.
 Acceptance:
-- Test PDFs show left/right trimming correctly; e2e highlight still robust.
-
-Cross-cutting improvements from issues
-- Abort discipline: Any config change cancels in-flight TTS and preloads with unique request keys. Implemented in [src/v1/playback/engine.ts](src/v1/playback/engine.ts:1).
-- Streaming-first: Introduce [src/app/api/tts/stream/route.ts](src/app/api/tts/stream/route.ts:1) for chunked speech where supported; fallback to progressive MP3/AAC.
-- Dexie-backed caches: Audio buffers cached with TTL and LRU in [src/v1/db/repositories/AudioCacheRepo.ts](src/v1/db/repositories/AudioCacheRepo.ts:1).
-- Resume positions: store stable location tokens per adapter in [src/v1/playback/positionStore.ts](src/v1/playback/positionStore.ts:1).
-
-Minimal logging plan (to validate assumptions)
-- Export path (#48):
-- Log: bookId, tmp file sizes, final m4b size; response headers; client disconnects.
-- NLP chunking (#44):
-- Log: dialog-join counts; sample joined strings (first 80 chars).
-- PDF margins (#40):
-- Log: normalized x min/max; filtered vs kept counts per page.
-- Voice combo (#47):
-- Log: provider/model/voice string; server echo ensures pass-through.
-- Chapter MP3 export (#59):
-- Log: chapter count, per-chapter byte sizes, total ZIP size.
-
-Acceptance test inventory (added to 1.0)
-- Streaming playback start-to-speech under 500ms on cached sentences.
-- Voice switch mid-playback produces single cancellation and single rebuffer.
-- 1–2 GB m4b export succeeds and downloads via streaming.
-- Chapterized MP3 ZIP streams and extracts correctly.
-- Dialog detection joins quotes for typical novels.
-- PDF margin sliders alter extracted text deterministically.
-
-Action items added to v1 backlog
-- Implement chapterized MP3 export (ZIP) path.
-- Add range-enabled download endpoint for m4b artifacts.
-- Add custom voice string input and provider pass-through.
-- Implement quote-aware sentence grouping in splitter.
-- Harden PDF x/width margin filtering and add debug overlay.
-
-References (current code)
-- Export pipeline: [src/app/api/audio/convert/route.ts](src/app/api/audio/convert/route.ts:1)
-- TTS routing: [src/app/api/tts/route.ts](src/app/api/tts/route.ts:1)
-- PDF extraction: [src/utils/pdf.ts](src/utils/pdf.ts:60)
-- NLP: [src/utils/nlp.ts](src/utils/nlp.ts:34)
-
-References (v1 new modules)
-- [src/v1/playback/engine.ts](src/v1/playback/engine.ts:1)
-- [src/v1/playback/media/MediaController.ts](src/v1/playback/media/MediaController.ts:1)
-- [src/v1/tts/providers/DeepinfraProvider.ts](src/v1/tts/providers/DeepinfraProvider.ts:1)
-- [src/v1/adapters/PdfAdapter.ts](src/v1/adapters/PdfAdapter.ts:1)
-- [src/v1/nlp/sentences.ts](src/v1/nlp/sentences.ts:1)
-- [src/v1/db/repositories/AudioCacheRepo.ts](src/v1/db/repositories/AudioCacheRepo.ts:1)
+- Test PDFs show left/right trimming correctly; e2e highlight still robust.
