# Iteration 003: Download/Import System Separation

**Status**: In Review
**Started**: 2026-01-29
**Goal**: Separate downloading from importing into distinct subsystems with explicit state machines, error classification, and immutable audit trails

---

## Problem Statement

The download system combined downloading and importing into a single monolithic flow, which caused several problems:

1. **Combined concerns** - `download_job` tracked both download states (created, downloading) and import states (importing, import_failed, import_complete) in one entity
2. **Single-file assumption** - the schema assumed one file per download, but season packs contain many episodes
3. **No retry granularity** - if 1 of 10 episodes failed to import, the entire job was marked failed
4. **No reimport capability** - a failed import couldn't be retried without re-downloading
5. **Implicit state transitions** - no validation that state changes were valid (e.g., "created" → "completed" directly)
6. **No error classification** - all errors were treated equally; permanent failures (invalid magnet) couldn't be distinguished from transient ones (network timeout)
7. **No audit trail** - no visibility into what happened during the download/import lifecycle
---

## Design Decisions

### 1. Separate Download Jobs from Import Tasks

**Before:**

```
download_job
├── id, status (8 states including import states)
├── download fields (protocol, link, external_id, progress)
├── import fields (dest_path, import_method)
└── media references (media_item_id, season_id, episode_id)
```

**After:**

```
download_job (6 states, download only)
├── id, status (created, enqueued, downloading, completed, failed, cancelled)
├── download fields (protocol, link, external_id, progress, save_path, content_path)
├── retry logic (attempt_count, next_run_at, last_error, error_category)
└── media references

import_task (5 states, per-file import)
├── id, status (pending, in_progress, completed, failed, cancelled)
├── download_job_id (nullable FK)
├── previous_task_id (reimport chain)
├── source_path, dest_path
├── retry logic (attempt_count, max_attempts, next_run_at, last_error, error_category)
└── media references
```

### 2. One Import Task Per File

Season packs now spawn multiple `import_task` rows:

- A 10-episode pack → 10 import tasks
- Each tracks independently: if 8 succeed and 2 fail, the job's computed status is `partial_failure`
- Individual files can be reimported without re-downloading
### 3. Explicit State Machines

State transitions are validated before execution:

```go
type Status string

var DownloadJobTransitions = map[Status][]Status{
	"created":     {"enqueued", "failed", "cancelled"},
	"enqueued":    {"downloading", "failed", "cancelled"},
	"downloading": {"completed", "failed", "cancelled"},
	"completed":   {}, // terminal
	"failed":      {}, // terminal
	"cancelled":   {}, // terminal
}

var ImportTaskTransitions = map[Status][]Status{
	"pending":     {"in_progress", "cancelled"},
	"in_progress": {"completed", "failed"},
	"completed":   {}, // terminal - reimport creates new task
	"failed":      {}, // terminal - reimport creates new task
	"cancelled":   {}, // terminal
}
```
### 4. Error Categories

Errors are classified for intelligent retry behavior:

```go
type Category string

const (
	Transient Category = "transient" // retry: network timeout, 5xx
	Permanent Category = "permanent" // fail fast: 401/403, invalid magnet
)
```

Default: unknown errors are classified as transient, so they remain eligible for retry.
### 5. Event Tables for Audit Trail

Every state change and error is logged:

```sql
download_job_event (created, status_changed, error, retry_scheduled)
import_task_event  (created, status_changed, error, retry_scheduled, reimport_requested)
```

A combined timeline query aggregates both tables into a unified history view.

### 6. Computed Import Status

`GetDownloadJobWithImportSummary` returns an aggregate status:

- `download_pending` - download not yet completed
- `awaiting_import` - download completed, no import tasks yet
- `importing` - tasks in progress
- `fully_imported` - all tasks completed
- `partial_failure` - some completed, some failed
- `import_failed` - all tasks failed
### 7. Cancel Cascade

Cancelling a download job automatically cancels all pending import tasks for that job.

### 8. Reimport Chain

Import tasks carry a `previous_task_id` FK:

- Original import: `previous_task_id = NULL`
- Reimport: `previous_task_id = <original task ID>`
- A recursive CTE retrieves the full chain history

Reimport handling: if the destination file exists and this is a reimport, the old file is removed first.
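The chain lookup could be written roughly like this (a sketch: the parameter syntax and the assumption that `import_task` exposes `id` and `previous_task_id` columns follow the schema described above):

```sql
-- Walk the reimport chain from a given task back to the original import.
WITH RECURSIVE chain AS (
    SELECT * FROM import_task WHERE id = :task_id
    UNION ALL
    SELECT t.*
    FROM import_task t
    JOIN chain c ON t.id = c.previous_task_id
)
SELECT * FROM chain;
```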
---

## Architecture

```
┌─────────────────────────────────────────────────────────────┐
│ internal/jobs/download/worker.go │
│ │
│ Polls download clients, manages download_job lifecycle │
│ Config: 3s interval, 20 claim limit, 10 max attempts │
│ │
│ created → enqueue to client → enqueued │
│ enqueued/downloading → poll status → downloading/completed│
│ completed → spawn import_task(s) │
└─────────────────────────────────────────────────────────────┘

┌─────────────────────────────────────────────────────────────┐
│ internal/jobs/import/worker.go │
│ │
│ Processes import tasks: hardlink/copy to library │
│ Config: 2s interval, 10 claim limit, 5 max attempts │
│ │
│ pending → claim (atomic) → in_progress │
│ validate source → compute dest → hardlink/copy → completed│
└─────────────────────────────────────────────────────────────┘
```
---

## Schema Changes

### New Tables

| Table | Purpose |
|-------|---------|
| `import_task` | Per-file import tracking with reimport chain |
| `download_job_event` | Download job audit log |
| `import_task_event` | Import task audit log |

### Modified Tables

| Table | Change |
|-------|--------|
| `download_job` | Simplified to 6 states, added `error_category`, removed import fields |
| `media_file_import` | Added `import_task_id` FK |

### Deleted Tables/Columns

| Change | Reason |
|--------|--------|
| `download_job_media_file` | Replaced by `import_task` |
| `download_job.import_*` columns | Moved to `import_task` |
---

## New API Endpoints

```
# Import Tasks
GET  /v1/import-tasks               - List import tasks (paginated, filterable by status)
GET  /v1/import-tasks/counts        - Get counts by status
GET  /v1/import-tasks/:id           - Get import task with details
GET  /v1/import-tasks/:id/timeline  - Get event log
GET  /v1/import-tasks/:id/history   - Get reimport chain
POST /v1/import-tasks/:id/reimport  - Create new task for reimport
POST /v1/import-tasks/:id/cancel    - Cancel pending task

# Download Jobs (enhanced)
GET /v1/download-jobs/:id              - Now returns import summary
GET /v1/download-jobs/:id/timeline     - Combined download + import events
GET /v1/download-jobs/:id/import-tasks - List import tasks for job
```
---

## Files Created

| File | Purpose |
|------|---------|
| `internal/errors/category.go` | Error categorization (transient/permanent) |
| `internal/jobs/state/machine.go` | State machine definitions and validation |
| `internal/jobs/download/worker.go` | Download worker (replaces old worker) |
| `internal/jobs/import/worker.go` | Import worker |
| `internal/service/import_tasks.go` | Import tasks service |
| `internal/http/handlers/import_tasks.go` | Import tasks HTTP handler |
| `internal/repo/import_tasks.go` | Import tasks repository |
| `internal/db/queries/import_tasks.sql` | Import task SQLC queries |
| `internal/db/queries/download_job_events.sql` | Download event SQLC queries |
| `internal/db/queries/import_task_events.sql` | Import event SQLC queries |

---

## Files Modified

| File | Change |
|------|--------|
| `internal/db/migrations/0008_download_job.up.sql` | Complete rewrite with new schema |
| `internal/db/queries/download_jobs.sql` | Simplified states, added timeline/summary queries |
| `internal/service/download_jobs.go` | Added cancel cascade, timeline, import summary |
| `internal/http/handlers/download_jobs.go` | Added timeline, import-tasks endpoints |
| `internal/http/http.go` | Register import tasks handler |
| `cmd/api/main.go` | Start both download and import workers |

---

## Files Deleted

| File | Reason |
|------|--------|
| `internal/jobs/downloadjobs/worker.go` | Replaced by separate download/import workers |
---
## Worker Configuration

| Worker | Poll Interval | Claim Limit | Max Attempts |
|--------|---------------|-------------|--------------|
| Download | 3s | 20 | 10 |
| Import | 2s | 10 | 5 |

Both use exponential backoff: the retry delay is 2^attempt seconds.
---
## Testing Checklist

- [ ] Nuke DB, verify migrations run clean
- [ ] `sqlc generate` succeeds
- [ ] `go build ./...` succeeds
- [ ] Search → enqueue download → verify `download_job` created with status `created`
- [ ] Watch status progress: `created` → `enqueued` → `downloading` → `completed`
- [ ] Verify `import_task`(s) spawned when download completes
- [ ] Verify import progresses: `pending` → `in_progress` → `completed`
- [ ] Check `GET /v1/download-jobs/:id` returns `import_status: fully_imported`
- [ ] Test reimport: `POST /v1/import-tasks/:id/reimport`
- [ ] Verify new task created with `previous_task_id` set
- [ ] Test timeline: `GET /v1/download-jobs/:id/timeline`
- [ ] Test cancel cascade: cancel download, verify pending imports cancelled
- [ ] Test season pack: download a season, verify multiple import tasks created
- [ ] Simulate partial failure: verify `import_status: partial_failure`
- [ ] Test retry exhaustion (stop qBittorrent): verify job fails after max attempts
- [ ] Test transient error: verify retry with exponential backoff
---

## Known Issues / TODOs

1. **qBittorrent error categorization not implemented** - the client wrapper should mark 401/403 responses as permanent errors
2. **No "created" event logged for spawned import tasks** - the download worker creates tasks but doesn't log events for them
3. **`ListDownloadJobs` has no pagination** - could be slow with many jobs

---

## Future Work (Out of Scope)

- Import task cleanup/retention policy (delete old completed tasks)
- Frontend UI for import task management
- Configurable worker limits via settings
- Metrics/monitoring for worker health
- Manual import (create an import task without a download job)
