feat: better byte handling #479

Merged
yihong0618 merged 1 commit into yihong0618:main from leslieo2:feat-Better-byte-handling
Nov 15, 2025

@leslieo2 leslieo2 commented Nov 13, 2025

Summary

  • Parse item content as bytes in BeautifulSoup; avoid premature decoding.
  • Search via soup.get_text(); write back with UTF-8 encoding.
  • Small refactor: use content = item.content; drop ad-hoc decoding.
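
The bullets above can be sketched roughly as follows. This is a minimal illustration, not the code from `epub_loader.py`: the helper name `parse_search_and_reencode`, the sample markup, and the choice of the `html.parser` backend are all assumptions made for the example.

```python
from typing import Tuple

from bs4 import BeautifulSoup


def parse_search_and_reencode(raw: bytes, needle: str) -> Tuple[bool, bytes]:
    """Hypothetical helper: parse raw bytes, search the text, return UTF-8 bytes."""
    # Hand the bytes straight to BeautifulSoup so it can pick the encoding
    # from the document's own declaration instead of assuming UTF-8.
    soup = BeautifulSoup(raw, "html.parser")
    # Search the decoded, tag-free text.
    found = needle in soup.get_text()
    # Serialize as UTF-8 regardless of the input encoding.
    return found, soup.encode("utf-8")


# Assumed sample: a chapter body stored as GBK, with a matching declaration.
html = '<html><head><meta charset="gbk"></head><body><p>你好世界</p></body></html>'
raw = html.encode("gbk")
found, out = parse_search_and_reencode(raw, "世界")
```

Here `raw` plays the role of `item.content`; the returned bytes are what would be written back to the item.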

Scope

  • book_maker/loader/epub_loader.py only.

Why

  • Prevent UnicodeDecodeError; handle mixed encodings consistently.
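
As a small illustration of the failure mode, decoding non-UTF-8 bytes as UTF-8 raises `UnicodeDecodeError`. The sample text and the function name `decode_eagerly` are assumptions for this sketch, not code from the project.

```python
# Assumed sample: chapter text stored as GBK bytes rather than UTF-8.
raw = "你好世界".encode("gbk")


def decode_eagerly(data: bytes) -> str:
    """Decode assuming UTF-8, i.e. the premature-decoding pattern."""
    return data.decode("utf-8")


try:
    decode_eagerly(raw)
    failed = False
except UnicodeDecodeError:
    # GBK byte sequences are generally not valid UTF-8.
    failed = True
```

Keeping the content as bytes until the parser has determined the encoding avoids this branch entirely.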

Risk/Testing

  • Low risk. To test, process a non-UTF-8 EPUB and verify that search works and the output renders correctly.

@leslieo2

@codex review

(cherry picked from commit 9f5b2b1)
(cherry picked from commit 42323b6)
leslieo2 force-pushed the feat-Better-byte-handling branch from 93e4b4d to 66358cb on November 13, 2025 06:39
yihong0618 merged commit 28a4999 into yihong0618:main on Nov 15, 2025
2 checks passed