fix: reconcile orphaned parsing tasks after system crash/reboot #13176

Open
timon0305 wants to merge 1 commit into infiniflow:main from timon0305:fix-orphaned-parsing-documents

@timon0305

Summary

Fixes #13171

When the system crashes or reboots during document parsing, Redis loses the queued tasks while MySQL retains the "Parsing" state (run='1', progress > 0). Affected documents then appear stuck in "Parsing" indefinitely, with no backend processing behind them.

This PR adds startup reconciliation logic to detect and re-queue orphaned tasks:

  • DocumentService.get_orphaned_parsing_docs() - Queries documents stuck in Parsing state (run=RUNNING, 0 < progress < 1)
  • TaskService.get_incomplete_tasks_by_doc_ids() - Retrieves incomplete tasks for orphaned documents
  • requeue_orphaned_tasks() - Reconciliation function that re-queues incomplete tasks into Redis, or marks documents as failed if no tasks remain
  • Startup hook in task_executor.py - Uses a Redis distributed lock so only one worker runs reconciliation, preventing duplicate re-queuing

Documents with no remaining incomplete tasks are marked as FAIL with an explanatory message rather than left in a permanently stuck state.
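The reconciliation flow described above can be sketched roughly as follows. This is an illustration only, not the code in this PR: the `Doc` and `Task` dataclasses, the in-memory lists, the `enqueue` callback, the `FAIL` placeholder value, and the rule that a task is "incomplete" when its progress is below 1 are all assumptions standing in for the real MySQL models and Redis queue. Only the function names and the `run='1'`, `0 < progress < 1` filter come from the description.

```python
from dataclasses import dataclass
from typing import Callable, Iterable

RUNNING = "1"    # run='1' per the PR description
FAIL = "FAIL"    # placeholder value, assumed for illustration


@dataclass
class Doc:  # hypothetical stand-in for the document row in MySQL
    id: str
    run: str
    progress: float
    progress_msg: str = ""


@dataclass
class Task:  # hypothetical stand-in for a parsing task row
    id: str
    doc_id: str
    progress: float  # assumed: 1.0 means the task finished


def get_orphaned_parsing_docs(docs: Iterable[Doc]) -> list[Doc]:
    # Documents still marked as running with partial progress.
    return [d for d in docs if d.run == RUNNING and 0 < d.progress < 1]


def get_incomplete_tasks_by_doc_ids(tasks: Iterable[Task], doc_ids: Iterable[str]) -> list[Task]:
    wanted = set(doc_ids)
    return [t for t in tasks if t.doc_id in wanted and t.progress < 1]


def requeue_orphaned_tasks(docs: list[Doc], tasks: list[Task],
                           enqueue: Callable[[Task], None]) -> tuple[int, int]:
    """Re-queue incomplete tasks; fail documents that have none left.

    Returns (number of tasks re-queued, number of documents marked failed).
    """
    orphaned = get_orphaned_parsing_docs(docs)
    pending_by_doc: dict[str, list[Task]] = {}
    for task in get_incomplete_tasks_by_doc_ids(tasks, [d.id for d in orphaned]):
        pending_by_doc.setdefault(task.doc_id, []).append(task)

    requeued = failed = 0
    for doc in orphaned:
        pending = pending_by_doc.get(doc.id, [])
        if pending:
            for task in pending:
                enqueue(task)  # in the real system this would push to Redis
                requeued += 1
        else:
            # Nothing left to resume: surface the failure instead of hanging.
            doc.run = FAIL
            doc.progress_msg = "Parsing was interrupted and no tasks remain to resume."
            failed += 1
    return requeued, failed
```

With a plain list as the queue, `requeue_orphaned_tasks(docs, tasks, queue.append)` leaves finished and idle documents untouched and reports how many tasks were re-queued and how many documents were failed.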

Test plan

  • Deploy with documents in parsing state, restart services, verify orphaned documents resume parsing
  • Verify only one worker runs reconciliation via distributed lock
  • Verify documents with no incomplete tasks are marked as FAIL
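The "only one worker runs reconciliation" check could be exercised along these lines. The sketch assumes the lock behaves like an atomic set-if-absent (the semantics of Redis `SET key value NX`); `InMemoryLockStore`, `run_once_at_startup`, and the key name are hypothetical stand-ins, and threads play the role of worker processes. The PR's actual lock helper is not shown in the description, so none of this reflects its real interface.

```python
import threading
import uuid


class InMemoryLockStore:
    """Hypothetical stand-in for Redis; set_if_absent mimics SET key value NX."""

    def __init__(self) -> None:
        self._data: dict[str, str] = {}
        self._mutex = threading.Lock()

    def set_if_absent(self, key: str, value: str) -> bool:
        # Atomic check-and-set: only the first caller for a key wins.
        with self._mutex:
            if key in self._data:
                return False
            self._data[key] = value
            return True


def run_once_at_startup(store: InMemoryLockStore, key: str, job) -> bool:
    """Run job only if this worker acquires the lock; return True if it ran."""
    token = uuid.uuid4().hex  # identifies the lock holder
    if not store.set_if_absent(key, token):
        return False  # another worker is already reconciling
    job()
    return True


def simulate_workers(n: int) -> tuple[int, list[bool]]:
    """Start n 'workers' concurrently and count how many ran the job."""
    store = InMemoryLockStore()
    runs: list[int] = []
    results: list[bool] = []

    def worker() -> None:
        ran = run_once_at_startup(store, "reconcile_lock", lambda: runs.append(1))
        results.append(ran)

    threads = [threading.Thread(target=worker) for _ in range(n)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return len(runs), results
```

The sketch never releases the lock, so a worker that starts late also skips reconciliation. A real Redis lock would normally carry an expiry so that a crashed holder does not block reconciliation on the next restart; how the PR handles that is not stated here.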

@dosubot dosubot bot added the size:M label (This PR changes 30-99 lines, ignoring generated files.) Feb 22, 2026
@dosubot

dosubot bot commented Feb 22, 2026

Related Documentation

Checked 0 published document(s) in 1 knowledge base(s). No updates required.

@dosubot dosubot bot added the 🐞 bug label (Something isn't working, pull request that fix bug.) Feb 22, 2026

Labels

  • 🐞 bug: Something isn't working, pull request that fix bug.
  • size:M: This PR changes 30-99 lines, ignoring generated files.

Development

Successfully merging this pull request may close these issues.

[Bug]: Critical State Inconsistency: Documents stuck in "Parsing" after system reboot due to MySQL/Redis desync

1 participant