Problem
When the server crashes, any in-progress jobs are left in the Running state even though they are no longer executing. Nothing updates their status afterwards, so they remain stuck indefinitely (see the examples in the screenshot: statuses 3/20 and 3/16).

Expected Behavior
Jobs should not stay in a perpetual Running state when execution has stopped.

Proposed Solution
Add a recovery / termination mechanism (similar to a max-job timeout or watchdog) that:
- Detects stale or orphaned running jobs after a crash or restart
- Automatically marks them as Failed or re-queues them
- Prevents jobs from remaining stuck in the Running state indefinitely
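
A minimal sketch of what such a watchdog could look like, to make the proposal concrete. It is not based on the project's actual job model: the `Job` class, the `Status` enum, the `last_heartbeat` field, and the `recover_stale_jobs` function are all hypothetical names, and it assumes each worker periodically records a heartbeat timestamp for the job it is running. A job counts as stale when it is still Running but its last heartbeat is older than a configurable timeout.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from enum import Enum


class Status(Enum):
    # Hypothetical status names; the real job model may differ.
    QUEUED = "Queued"
    RUNNING = "Running"
    FAILED = "Failed"


@dataclass
class Job:
    id: str
    status: Status
    # Assumption: workers update this timestamp while the job is executing.
    last_heartbeat: datetime


def recover_stale_jobs(jobs, now, timeout=timedelta(minutes=5), requeue=False):
    """Move Running jobs with a heartbeat older than `timeout` to Failed or Queued.

    Returns the ids of the jobs that were recovered.
    """
    new_status = Status.QUEUED if requeue else Status.FAILED
    recovered = []
    for job in jobs:
        is_stale = now - job.last_heartbeat > timeout
        if job.status is Status.RUNNING and is_stale:
            job.status = new_status
            recovered.append(job.id)
    return recovered


if __name__ == "__main__":
    now = datetime.now(timezone.utc)
    jobs = [
        Job("orphaned", Status.RUNNING, now - timedelta(hours=2)),
        Job("healthy", Status.RUNNING, now - timedelta(seconds=30)),
    ]
    for job_id in recover_stale_jobs(jobs, now):
        print(f"recovered stale job: {job_id}")
```

Under these assumptions the check could run once at server startup and then on a periodic timer, so that jobs orphaned by a crash are cleaned up after restart and jobs orphaned while the server stays up are still caught. Whether to fail or re-queue is left as a flag here, since re-queuing is only safe for jobs that can be retried without side effects.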