@@ -13,6 +13,22 @@ Detect and fix malicious content embedded in EPUB files. Scans entirely in-memor
 - **External URL detection** — passive tracking via `src`, `action`, CSS `url()` flagged as WARNING; safe `<a href>` links kept as INFO
 - **Auto-fix mode** — remove threats and repack as `[fixed] filename.epub`
 - **Markdown report** — export detailed scan results to `.md` file
+- **Claude.ai Skill** — package as a skill for use in Claude.ai
+
+## Quick Start
+
+```bash
+git clone https://github.com/stanwu/epub-safety-scanner.git
+cd epub-safety-scanner
+
+# Scan all EPUBs on your Desktop
+python3 epub_safety_scanner.py --path ~/Desktop/
+
+# Fix any threats found
+python3 epub_safety_scanner.py --path ~/Desktop/ --fix
+```
+
+No external dependencies required — Python 3.9+ stdlib only.

 ## Requirements

@@ -22,51 +38,103 @@ Detect and fix malicious content embedded in EPUB files. Scans entirely in-memor
 ## Installation

 ```bash
-git clone <repo-url>
+git clone https://github.com/stanwu/epub-safety-scanner.git
 cd epub-safety-scanner
+```
+
+For development (linting, testing):
+
+```bash
 python3 -m venv .venv
 source .venv/bin/activate
 pip install -r requirements-dev.txt
 ```

-## Usage
+## CLI Reference
+
+```
+python3 epub_safety_scanner.py --path PATH [OPTIONS]
+```
+
+| Flag | Description |
+|------|-------------|
+| `--path PATH` | **(Required)** EPUB file, directory, or glob pattern. Supports `~` expansion. |
+| `--fix` | Remove threats and save as `[fixed] filename.epub` in the same directory. |
+| `--report FILE` | Write a detailed Markdown report to the specified file. |
+| `-v, --verbose` | Show INFO-level findings (external hyperlinks, hidden by default). |
+| `--no-color` | Disable colored terminal output. |
+
+**Exit codes:** `1` if any CRITICAL findings, `0` otherwise.
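
Reviewer note: the documented exit codes make the scanner easy to gate on from a script or CI job. The following is a minimal sketch, not code from the repository; `scan_passed` is a hypothetical helper name, and it treats any nonzero exit (including a failure to start the scanner) as "not passed", since the README only documents `1` and `0`.

```python
import subprocess


def scan_passed(cmd: list[str]) -> bool:
    """Run a scanner command and report whether it exited with code 0.

    Per the README, exit code 1 signals CRITICAL findings; any other
    nonzero code is treated as a failure here as well (an assumption).
    """
    result = subprocess.run(cmd, check=False)
    return result.returncode == 0
```

A caller might invoke it as `scan_passed(["python3", "epub_safety_scanner.py", "--path", "book.epub"])` and abort a pipeline when it returns `False`.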
+
+## Usage Scenarios
+
+### Scenario 1: Scan a Single EPUB

 ```bash
-# Scan a single file
-python3 epub_safety_scanner.py --path book.epub
+python3 epub_safety_scanner.py --path ~/Desktop/book.epub
+```
+
+Outputs a severity summary per file. CLEAN means no threats detected.

-# Scan a directory (auto-finds *.epub)
+### Scenario 2: Batch Scan a Directory
+
+```bash
 python3 epub_safety_scanner.py --path ~/Desktop/
+```

-# Scan with glob pattern
-python3 epub_safety_scanner.py --path "books/*.epub"
+Automatically finds all `*.epub` files in the directory. Displays per-file results and a final summary showing how many files have issues.
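
Reviewer note: the README says `--path` accepts a file, a directory, or a glob pattern, with `~` expansion. One plausible way to resolve such an argument with the standard library is sketched below. This is an illustration under assumptions, not the scanner's actual code: `resolve_epub_paths` is a hypothetical name, and the directory search here is non-recursive, which the README does not specify.

```python
import glob
from pathlib import Path


def resolve_epub_paths(path_arg: str) -> list[Path]:
    """Expand a --path style argument into a sorted list of EPUB files."""
    expanded = Path(path_arg).expanduser()  # handles a leading "~"
    if expanded.is_dir():
        # Directory: collect *.epub directly inside it (assumed non-recursive).
        return sorted(expanded.glob("*.epub"))
    if expanded.is_file():
        return [expanded]
    # Anything else is treated as a glob pattern; no match yields [].
    return sorted(Path(p) for p in glob.glob(str(expanded)))
```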

-# Show INFO-level findings (external links, hidden by default)
-python3 epub_safety_scanner.py --path ~/Desktop/ -v
+### Scenario 3: Fix Threats and Verify

-# Fix threats and repack as [fixed] filename.epub
+```bash
+# Step 1: Fix all threats
 python3 epub_safety_scanner.py --path ~/Desktop/ --fix

-# Export Markdown report
+# Step 2: Verify the fixed files are clean
+python3 epub_safety_scanner.py --path ~/Desktop/"[fixed]*"
+```
+
+Each fixed EPUB is saved as `[fixed] original.epub` in the same directory. The original file is left untouched.
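
Reviewer note: the quoted pattern `"[fixed]*"` in Step 2 deserves a check. In Python's `glob` module, `[fixed]` is a character class matching a single `f`, `i`, `x`, `e`, or `d`, not the literal text `[fixed]`. Whether the scanner expands patterns with `glob` is an assumption (the README does not say), but if it does, the literal prefix needs escaping. A minimal sketch with a hypothetical helper `find_fixed_epubs`:

```python
import glob
import os


def find_fixed_epubs(directory: str) -> list[str]:
    """Return EPUBs in `directory` whose names start with the literal '[fixed] '."""
    # glob.escape turns "[" into "[[]" so the brackets match literally.
    pattern = os.path.join(
        glob.escape(directory),
        glob.escape("[fixed] ") + "*.epub",
    )
    return sorted(glob.glob(pattern))
```

On the command line, the equivalent escaped pattern under the same assumption would be `"[[]fixed]*"`.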
+
+### Scenario 4: Generate a Report for Review
+
+```bash
 python3 epub_safety_scanner.py --path ~/Desktop/ --report report.md
+```
+
+Creates a Markdown report with:
+- Scan date and file count
+- Summary table (status per file)
+- Per-file details grouped by threat category
+- Evidence snippets for each finding

-# Combine: scan, fix, and report
+### Scenario 5: Full Workflow (Scan + Fix + Report)
+
+```bash
 python3 epub_safety_scanner.py --path ~/Desktop/ --fix --report report.md
 ```

-## Output
+Scans all EPUBs, fixes threats, and exports a report — all in one command.
+
+### Scenario 6: Verbose Mode — Inspect External Links
+
+```bash
+python3 epub_safety_scanner.py --path ~/Desktop/ -v
+```

-Findings are categorized by severity:
+Shows INFO-level findings (external `<a href>` links) that are hidden by default. Useful for auditing what external URLs an EPUB references. URLs are color-coded: **green** for safe `<a href>` links, **red** for suspicious external resources.
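
Reviewer note: the INFO versus WARNING split for external URLs can be illustrated with the standard-library HTML parser. This is a sketch of the rule as the README describes it (`<a href>` is INFO; `src` and `action` are WARNING), not the scanner's implementation. The class and function names are hypothetical, and only `http://` and `https://` URLs are treated as external here.

```python
from html.parser import HTMLParser

EXTERNAL_PREFIXES = ("http://", "https://")  # assumed definition of "external"


class ExternalUrlClassifier(HTMLParser):
    """Collect (severity, tag, url) tuples for external URLs in markup."""

    def __init__(self) -> None:
        super().__init__()
        self.findings: list[tuple[str, str, str]] = []

    def handle_starttag(self, tag, attrs):
        for name, value in attrs:
            if not value or not value.lower().startswith(EXTERNAL_PREFIXES):
                continue  # relative or empty values are not external
            if tag == "a" and name == "href":
                self.findings.append(("INFO", tag, value))
            elif name in ("src", "action"):
                self.findings.append(("WARNING", tag, value))


def classify_external_urls(markup: str) -> list[tuple[str, str, str]]:
    parser = ExternalUrlClassifier()
    parser.feed(markup)
    parser.close()
    return parser.findings
```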
+
+## Understanding the Output

 | Severity | Meaning | Default |
 |----------|---------|---------|
-| **CRITICAL** | High risk — JavaScript, executables, disguised files | Shown |
-| **WARNING** | Medium risk — external resources, suspicious CSS, nested archives | Shown |
+| **CRITICAL** | High risk — JavaScript, executables, disguised files, `<iframe>`, `<applet>` | Shown |
+| **WARNING** | Medium risk — external resource loading (`src`, `action`), suspicious CSS, nested archives | Shown |
 | **INFO** | Low risk — external hyperlinks (`<a href>`) | Hidden (use `-v`) |

-URLs are color-coded in terminal output: **green** for safe `<a href>` links, **red** for suspicious external resources.
-
-Exit code `1` if any CRITICAL findings, `0` otherwise.
+- Files with only INFO findings display as **CLEAN** by default
+- URLs in the output are color-coded: **green** = safe `<a href>`, **red** = suspicious
+- Each finding includes the internal file path and an evidence snippet
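
Reviewer note: the severity rules above (INFO hidden unless `-v`, INFO-only files shown as CLEAN, exit code `1` only on CRITICAL) can be summarized in a few lines. The sketch below is a hypothetical model of those rules, not the scanner's code. Labeling a file by its highest visible severity is an assumption, since the README only states the CLEAN case.

```python
from enum import IntEnum


class Severity(IntEnum):
    INFO = 1
    WARNING = 2
    CRITICAL = 3


def display_status(severities: list[Severity], verbose: bool = False) -> str:
    """Return CLEAN, or the name of the highest severity that is visible."""
    threshold = Severity.INFO if verbose else Severity.WARNING
    visible = [s for s in severities if s >= threshold]
    return max(visible).name if visible else "CLEAN"


def exit_code(severities: list[Severity]) -> int:
    """Exit code 1 if any CRITICAL finding exists, 0 otherwise."""
    return 1 if Severity.CRITICAL in severities else 0
```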

 ## Fix Mode

@@ -87,13 +155,37 @@ Exit code `1` if any CRITICAL findings, `0` otherwise.
 | CSS `url(https://...)`, `@import url(https://...)` | Removed |
 | `<a href="https://...">` | **Preserved** (normal for ebooks) |

+If no threats are found, no fixed file is created.
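
Reviewer note: the output naming rule (`[fixed] filename.epub` in the same directory, and no file when nothing was fixed) is simple to express with `pathlib`. A minimal sketch with a hypothetical helper name; how the scanner counts threats internally is not shown in this diff.

```python
from pathlib import Path
from typing import Optional


def fixed_output_path(original: Path, threats_found: int) -> Optional[Path]:
    """Return the path for the fixed copy, or None when there is nothing to fix."""
    if threats_found == 0:
        return None  # README: no fixed file is created for clean EPUBs
    # Same directory as the original, with the "[fixed] " prefix on the name.
    return original.with_name(f"[fixed] {original.name}")
```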
+
+## Claude.ai Skill
+
+You can package this scanner as a Claude.ai Skill for use in the web interface.
+
+### Build the Skill ZIP
+
+```bash
+make skill
+```
+
+This creates `scan-epub-skill.zip` containing the scanner and skill definition.
+
+### Upload to Claude.ai
+
+1. Go to [claude.ai](https://claude.ai)
+2. Navigate to **Customize > Skills**
+3. Click **+** and select **Upload a skill**
+4. Upload `scan-epub-skill.zip`
+
+Once uploaded, Claude will use the scanner when you ask it to check EPUB files for security threats.
+
 ## Development

 ```bash
-make test    # Run unit tests
+make test    # Run unit tests (111 tests)
 make lint    # Run linters (ruff, bandit, mypy)
 make check   # Run all checks (lint + test)
 make format  # Auto-format code
+make skill   # Package claude.ai skill ZIP
 ```

 ## License