Commit 78e9ef1

stanwu and claude committed
feat: add claude.ai skill packaging and expand README
Add SKILL.md for claude.ai skill upload. `make skill` packages the scanner as a ZIP for uploading to Customize > Skills on claude.ai. Expand README with quick start, CLI reference, 6 usage scenarios, output guide, and skill upload instructions.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
1 parent 6d60438 commit 78e9ef1

4 files changed, +195 -21 lines changed

.gitignore

Lines changed: 4 additions & 0 deletions
@@ -41,3 +41,7 @@ Thumbs.db
 
 # Pre-commit
 .pre-commit-cache/
+
+# Skill packaging
+*.zip
+skills/scan-epub/epub_safety_scanner.py

Makefile

Lines changed: 8 additions & 1 deletion
@@ -1,4 +1,4 @@
-.PHONY: test lint format check install clean
+.PHONY: test lint format check install clean skill
 
 PYTHON := .venv/bin/python
 PIP := .venv/bin/pip
@@ -26,6 +26,13 @@ install:
 	$(PIP) install -r requirements-dev.txt
 	$(PYTHON) -m pre_commit install
 
+## Package claude.ai skill as ZIP
+skill:
+	cp epub_safety_scanner.py skills/scan-epub/epub_safety_scanner.py
+	cd skills && zip -r ../scan-epub-skill.zip scan-epub/
+	rm skills/scan-epub/epub_safety_scanner.py
+	@echo "Created scan-epub-skill.zip"
+
 ## Clean build artifacts
 clean:
 	find . -type d -name __pycache__ -exec rm -rf {} +
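
Review note on the `skill` target: the recipe copies the scanner into `skills/scan-epub/`, zips the directory, and then deletes the copy, so an interrupted run can leave the copy behind (the new `.gitignore` entry covers that case). Below is a minimal Python sketch of the same packaging step that writes the scanner straight into the archive. The function name `build_skill_zip` and its parameters are hypothetical and not part of this commit. Unlike `zip -r`, the sketch writes no directory entries.

```python
import zipfile
from pathlib import Path


def build_skill_zip(repo_root: Path, out_name: str = "scan-epub-skill.zip") -> Path:
    """Package skills/scan-epub plus the scanner script into one ZIP.

    Hypothetical helper: mirrors the layout produced by `make skill`
    without creating a temporary copy of the scanner on disk.
    """
    skill_dir = repo_root / "skills" / "scan-epub"
    scanner = repo_root / "epub_safety_scanner.py"
    out_path = repo_root / out_name

    with zipfile.ZipFile(out_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for path in sorted(skill_dir.rglob("*")):
            # Skip a stale scanner copy left behind by an interrupted `make skill`.
            if path.is_file() and path.name != scanner.name:
                zf.write(path, path.relative_to(skill_dir.parent).as_posix())
        # Store the scanner next to SKILL.md inside the archive.
        zf.write(scanner, "scan-epub/epub_safety_scanner.py")
    return out_path
```

Calling `build_skill_zip(Path("."))` from the repository root would write `scan-epub-skill.zip` there, matching the output name used by the Makefile.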

README.md

Lines changed: 112 additions & 20 deletions
@@ -13,6 +13,22 @@ Detect and fix malicious content embedded in EPUB files. Scans entirely in-memor
 - **External URL detection** — passive tracking via `src`, `action`, CSS `url()` flagged as WARNING; safe `<a href>` links kept as INFO
 - **Auto-fix mode** — remove threats and repack as `[fixed] filename.epub`
 - **Markdown report** — export detailed scan results to `.md` file
+- **Claude.ai Skill** — package as a skill for use in Claude.ai
+
+## Quick Start
+
+```bash
+git clone https://github.com/stanwu/epub-safety-scanner.git
+cd epub-safety-scanner
+
+# Scan all EPUBs on your Desktop
+python3 epub_safety_scanner.py --path ~/Desktop/
+
+# Fix any threats found
+python3 epub_safety_scanner.py --path ~/Desktop/ --fix
+```
+
+No external dependencies required — Python 3.9+ stdlib only.
 
 ## Requirements
 
@@ -22,51 +38,103 @@ Detect and fix malicious content embedded in EPUB files. Scans entirely in-memor
 ## Installation
 
 ```bash
-git clone <repo-url>
+git clone https://github.com/stanwu/epub-safety-scanner.git
 cd epub-safety-scanner
+```
+
+For development (linting, testing):
+
+```bash
 python3 -m venv .venv
 source .venv/bin/activate
 pip install -r requirements-dev.txt
 ```
 
-## Usage
+## CLI Reference
+
+```
+python3 epub_safety_scanner.py --path PATH [OPTIONS]
+```
+
+| Flag | Description |
+|------|-------------|
+| `--path PATH` | **(Required)** EPUB file, directory, or glob pattern. Supports `~` expansion. |
+| `--fix` | Remove threats and save as `[fixed] filename.epub` in the same directory. |
+| `--report FILE` | Write a detailed Markdown report to the specified file. |
+| `-v, --verbose` | Show INFO-level findings (external hyperlinks, hidden by default). |
+| `--no-color` | Disable colored terminal output. |
+
+**Exit codes:** `1` if any CRITICAL findings, `0` otherwise.
+
+## Usage Scenarios
+
+### Scenario 1: Scan a Single EPUB
 
 ```bash
-# Scan a single file
-python3 epub_safety_scanner.py --path book.epub
+python3 epub_safety_scanner.py --path ~/Desktop/book.epub
+```
+
+Outputs a severity summary per file. CLEAN means no threats detected.
 
-# Scan a directory (auto-finds *.epub)
+### Scenario 2: Batch Scan a Directory
+
+```bash
 python3 epub_safety_scanner.py --path ~/Desktop/
+```
 
-# Scan with glob pattern
-python3 epub_safety_scanner.py --path "books/*.epub"
+Automatically finds all `*.epub` files in the directory. Displays per-file results and a final summary showing how many files have issues.
 
-# Show INFO-level findings (external links, hidden by default)
-python3 epub_safety_scanner.py --path ~/Desktop/ -v
+### Scenario 3: Fix Threats and Verify
 
-# Fix threats and repack as [fixed] filename.epub
+```bash
+# Step 1: Fix all threats
 python3 epub_safety_scanner.py --path ~/Desktop/ --fix
 
-# Export Markdown report
+# Step 2: Verify the fixed files are clean
+python3 epub_safety_scanner.py --path ~/Desktop/"[fixed]*"
+```
+
+Each fixed EPUB is saved as `[fixed] original.epub` in the same directory. The original file is left untouched.
+
+### Scenario 4: Generate a Report for Review
+
+```bash
 python3 epub_safety_scanner.py --path ~/Desktop/ --report report.md
+```
+
+Creates a Markdown report with:
+- Scan date and file count
+- Summary table (status per file)
+- Per-file details grouped by threat category
+- Evidence snippets for each finding
 
-# Combine: scan, fix, and report
+### Scenario 5: Full Workflow (Scan + Fix + Report)
+
+```bash
 python3 epub_safety_scanner.py --path ~/Desktop/ --fix --report report.md
 ```
 
-## Output
+Scans all EPUBs, fixes threats, and exports a report — all in one command.
+
+### Scenario 6: Verbose Mode — Inspect External Links
+
+```bash
+python3 epub_safety_scanner.py --path ~/Desktop/ -v
+```
 
-Findings are categorized by severity:
+Shows INFO-level findings (external `<a href>` links) that are hidden by default. Useful for auditing what external URLs an EPUB references. URLs are color-coded: **green** for safe `<a href>` links, **red** for suspicious external resources.
+
+## Understanding the Output
 
 | Severity | Meaning | Default |
 |----------|---------|---------|
-| **CRITICAL** | High risk — JavaScript, executables, disguised files | Shown |
-| **WARNING** | Medium risk — external resources, suspicious CSS, nested archives | Shown |
+| **CRITICAL** | High risk — JavaScript, executables, disguised files, `<iframe>`, `<applet>` | Shown |
+| **WARNING** | Medium risk — external resource loading (`src`, `action`), suspicious CSS, nested archives | Shown |
 | **INFO** | Low risk — external hyperlinks (`<a href>`) | Hidden (use `-v`) |
 
-URLs are color-coded in terminal output: **green** for safe `<a href>` links, **red** for suspicious external resources.
-
-Exit code `1` if any CRITICAL findings, `0` otherwise.
+- Files with only INFO findings display as **CLEAN** by default
+- URLs in the output are color-coded: **green** = safe `<a href>`, **red** = suspicious
+- Each finding includes the internal file path and an evidence snippet
 
 ## Fix Mode
 
@@ -87,13 +155,37 @@ Exit code `1` if any CRITICAL findings, `0` otherwise.
 | CSS `url(https://...)`, `@import url(https://...)` | Removed |
 | `<a href="https://...">` | **Preserved** (normal for ebooks) |
 
+If no threats are found, no fixed file is created.
+
+## Claude.ai Skill
+
+You can package this scanner as a Claude.ai Skill for use in the web interface.
+
+### Build the Skill ZIP
+
+```bash
+make skill
+```
+
+This creates `scan-epub-skill.zip` containing the scanner and skill definition.
+
+### Upload to Claude.ai
+
+1. Go to [claude.ai](https://claude.ai)
+2. Navigate to **Customize > Skills**
+3. Click **+** and select **Upload a skill**
+4. Upload `scan-epub-skill.zip`
+
+Once uploaded, Claude will use the scanner when you ask it to check EPUB files for security threats.
+
 ## Development
 
 ```bash
-make test # Run unit tests
+make test # Run unit tests (111 tests)
 make lint # Run linters (ruff, bandit, mypy)
 make check # Run all checks (lint + test)
 make format # Auto-format code
+make skill # Package claude.ai skill ZIP
 ```
 
 ## License
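
Review note on Scenario 3 in the README: in shell-style glob syntax `[fixed]` is a character class, so a pattern such as `[fixed]*` matches names beginning with `f`, `i`, `x`, `e`, or `d` rather than the literal prefix `[fixed]`. Whether this affects the scanner depends on how it expands `--path`, which this diff does not show. Assuming Python's `glob` module is used, the sketch below matches the literal prefix with `glob.escape`. `find_fixed_epubs` is a hypothetical helper, not part of the scanner.

```python
import glob
import os


def find_fixed_epubs(directory: str) -> list[str]:
    """Return EPUBs in `directory` whose names start with the literal '[fixed] '."""
    # glob.escape rewrites '[' as '[[]', so the bracket is matched literally
    # instead of opening a character class.
    pattern = os.path.join(
        glob.escape(directory), glob.escape("[fixed] ") + "*.epub"
    )
    return sorted(glob.glob(pattern))
```

If the scanner does treat `[fixed]*` as a character class, passing the directory and filtering on the literal prefix, as sketched here, would be one way to select only the repacked files.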

skills/scan-epub/SKILL.md

Lines changed: 71 additions & 0 deletions
@@ -0,0 +1,71 @@
+---
+name: scan-epub
+description: Scan EPUB files for malicious content (JavaScript, tracking URLs, dangerous files) and fix threats. Use when the user wants to check EPUB safety, remove threats, or generate a security report.
+dependencies: python>=3.9
+---
+
+# EPUB Safety Scanner
+
+Scan EPUB files for malicious content and optionally fix threats.
+
+## How to Use
+
+Run the scanner using the `epub_safety_scanner.py` file included in this skill:
+
+```bash
+python3 epub_safety_scanner.py --path <path> [options]
+```
+
+## Options
+
+| Flag | Description |
+|------|-------------|
+| `--path PATH` | **(Required)** EPUB file, directory, or glob pattern |
+| `--fix` | Remove threats and save as `[fixed] filename.epub` |
+| `--report FILE` | Write a Markdown report to the specified file |
+| `-v, --verbose` | Show INFO-level findings (external links, hidden by default) |
+| `--no-color` | Disable colored terminal output |
+
+## Examples
+
+**Scan a single file:**
+```bash
+python3 epub_safety_scanner.py --path ~/Desktop/book.epub
+```
+
+**Scan all EPUBs in a directory:**
+```bash
+python3 epub_safety_scanner.py --path ~/Desktop/
+```
+
+**Fix threats and generate a report:**
+```bash
+python3 epub_safety_scanner.py --path ~/Desktop/ --fix --report report.md
+```
+
+**Show all findings including external links:**
+```bash
+python3 epub_safety_scanner.py --path ~/Desktop/ -v
+```
+
+## Understanding the Output
+
+| Severity | Meaning |
+|----------|---------|
+| **CRITICAL** | High risk — JavaScript, executables, disguised files, iframe, applet |
+| **WARNING** | Medium risk — external resource loading (tracking pixels), suspicious CSS |
+| **INFO** | Low risk — external hyperlinks (`<a href>`), normal for ebooks |
+
+- INFO findings are hidden by default (use `-v` to show)
+- Files with only INFO findings display as CLEAN
+- Exit code `1` if any CRITICAL findings, `0` otherwise
+
+## What --fix Does
+
+- **Removes entirely:** `.js` files, executables, nested archives, path traversal entries
+- **Strips from HTML:** `<script>`, `<iframe>`, `<applet>`, `<object>`, `<embed>`, event handlers, `<meta refresh>`, `<base>`
+- **Neutralizes:** `javascript:` URIs → `#`, `data:text/html` URIs → `#`
+- **Removes from CSS:** `expression()`, `-moz-binding`, `behavior`, external `url()`, `@import url()`
+- **Removes external:** `src`, `action`, `poster`, `data` attributes pointing to external URLs
+- **Preserves:** `<a href="https://...">` (normal for ebooks)
+- **Output:** `[fixed] original.epub` in the same directory
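
Review note on the exit-code contract stated in SKILL.md and the README (`1` if any CRITICAL findings, `0` otherwise): a caller such as a CI job can branch on it. Below is a minimal sketch, assuming the scanner is invoked as a script with the documented `--path` and `--no-color` flags. `has_critical_findings` is a hypothetical wrapper, and treating any other exit status as an error is an assumption, not documented behavior.

```python
import subprocess
import sys


def has_critical_findings(path: str, scanner: str = "epub_safety_scanner.py") -> bool:
    """Run the scanner on `path` and report whether it exited with status 1."""
    result = subprocess.run(
        [sys.executable, scanner, "--path", path, "--no-color"],
        capture_output=True,
        text=True,
    )
    if result.returncode not in (0, 1):
        # Assumption: any other status means the scan itself did not run cleanly.
        raise RuntimeError(
            f"scanner exited with {result.returncode}: {result.stderr.strip()}"
        )
    return result.returncode == 1
```

Because WARNING-only results also exit `0` under this contract, a caller that wants to gate on warnings would need to parse the output or the Markdown report instead.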
