Commit 1f773fa

docs: minor updates
1 parent dcfeaf5 commit 1f773fa

1 file changed, +10 -9 lines changed

docs/jobs_orchestration.md

Lines changed: 10 additions & 9 deletions
@@ -72,21 +72,22 @@ The `populate()` method orchestrates the job execution process:
    - For reserved jobs:
      - Updates job status to `reserved` during processing
      - Records execution metrics (duration, version)
-     - Updates status to `success` or `error` on completion
+     - On successful completion: remove job from the jobs table
+     - On error: update job status to `error`
      - Records errors and execution metrics

 4. **Cleanup**:
-   - Optionally purges invalid jobs
+   - Optionally purges orphaned/outdated jobs

 ## Job Cleanup Process

-The `purge_invalid_jobs` method maintains database consistency by removing invalid jobs:
+The `purge_jobs` method maintains database consistency by removing orphaned jobs:

-1. **Invalid Success Jobs**:
+1. **Orphaned Success Jobs**:
    - Identifies jobs marked as `success` but not present in the target table
    - These typically occur when target table entries are deleted

-2. **Invalid Incomplete Jobs**:
+2. **Orphaned Incomplete Jobs**:
    - Identifies jobs in `scheduled`/`error`/`ignore` state that are no longer in the `key_source`
    - These typically occur when upstream table entries are deleted

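The change to reserved-job handling at the top of this hunk (remove the job on success, mark it `error` on failure) can be illustrated with a small in-memory model. This is only a sketch under stated assumptions, not DataJoint's implementation: `run_reserved_job`, the `jobs` dict standing in for the jobs table, and the `make` callable are all hypothetical names chosen for illustration.

```python
import time
from typing import Callable, Dict, Hashable


def run_reserved_job(jobs: Dict[Hashable, dict], key: Hashable,
                     make: Callable[[Hashable], None]) -> None:
    """Run make(key) while tracking the job in an in-memory jobs dict.

    Hypothetical helper: `jobs` stands in for the jobs table.
    """
    # Mark the job as reserved while it is being processed.
    jobs[key] = {"status": "reserved"}
    start = time.monotonic()
    try:
        make(key)
    except Exception as exc:
        # On error: keep the entry, set status to `error`,
        # and record the message and the elapsed time.
        jobs[key] = {
            "status": "error",
            "error_message": str(exc),
            "duration": time.monotonic() - start,
        }
    else:
        # On successful completion: remove the job from the jobs table.
        del jobs[key]


def succeed(key: Hashable) -> None:
    """Stand-in computation that completes normally."""


def fail(key: Hashable) -> None:
    """Stand-in computation that always raises."""
    raise ValueError(f"cannot process {key!r}")


jobs: Dict[Hashable, dict] = {}
run_reserved_job(jobs, "a", succeed)
run_reserved_job(jobs, "b", fail)
```

After both calls, only the failed key remains in `jobs`, with status `error`; the successful key leaves no entry behind, which is the behavior the added lines describe.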
@@ -106,16 +107,16 @@ The "freshness" and consistency of the jobs table depends on regular maintenance
    - Example: Run every few minutes in a cron job for active pipelines
    - Event-driven approach: `inserts` in upstream tables auto trigger this step

-2. **Cleanup** (`purge_invalid_jobs`):
-   - Removes invalid or outdated jobs
+2. **Cleanup** (`purge_jobs`):
+   - Removes orphaned or outdated jobs
    - Should be run periodically to maintain consistency
    - More resource-intensive than scheduling
    - Example: Run daily during low-activity periods
    - Event-driven approach: `deletes` in upstream or target tables auto trigger this step

 The balance between these operations affects:
 - How quickly new jobs are discovered and scheduled
-- How long invalid jobs remain in the table
+- How long orphaned jobs remain in the table
 - Database size and query performance
 - Overall system responsiveness

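The orphan check that the cleanup step performs, as described in the hunks above, amounts to two membership tests. The following is a minimal sketch in plain Python, not the actual `purge_jobs` code: `find_orphaned_jobs`, the `(key, status)` tuples, and the `target_keys` and `key_source_keys` sets are hypothetical stand-ins for the jobs table, the target table, and `key_source`.

```python
from typing import Hashable, Iterable, List, Set, Tuple

# Statuses the documentation treats as "incomplete".
INCOMPLETE_STATUSES = {"scheduled", "error", "ignore"}


def find_orphaned_jobs(
    jobs: Iterable[Tuple[Hashable, str]],
    target_keys: Set[Hashable],
    key_source_keys: Set[Hashable],
) -> List[Hashable]:
    """Return the keys of jobs that a purge would remove (hypothetical helper)."""
    orphaned = []
    for key, status in jobs:
        if status == "success" and key not in target_keys:
            # Orphaned success job: its target table entry no longer exists.
            orphaned.append(key)
        elif status in INCOMPLETE_STATUSES and key not in key_source_keys:
            # Orphaned incomplete job: its upstream entry no longer exists.
            orphaned.append(key)
    return orphaned


example_jobs = [
    (1, "success"),    # still in the target table: kept
    (2, "success"),    # missing from the target table: orphaned
    (3, "scheduled"),  # still in key_source: kept
    (4, "error"),      # missing from key_source: orphaned
    (5, "reserved"),   # in progress: not examined by either rule
]
orphaned = find_orphaned_jobs(example_jobs, target_keys={1}, key_source_keys={1, 2, 3})
```

With this example data, `orphaned` contains keys 2 and 4. A real table would express the same checks as queries rather than Python loops, which is one reason the cleanup step is described as more resource-intensive than scheduling.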
@@ -128,5 +129,5 @@ dj.config["min_scheduling_interval"] = 300 # 5 minutes
 # (implement as a cron job or scheduled task)
 def daily_cleanup():
     for table in your_pipeline_tables:
-        table.purge_invalid_jobs()
+        table.purge_jobs()
 ```
