Feature: Support incremental output and task resumption for long documents

## Summary

For long documents (e.g., 270 pages) processed with the hybrid engine in sliding window mode, intermediate results are not persisted to disk until all windows complete. If the process is interrupted (timeout, crash, etc.), all progress is lost and processing must restart from scratch.

## Problem

When processing a 270-page PDF textbook with hybrid-auto-engine, the CLI client times out while the server-side processing is still running. This results in:

1. **No incremental output**: Results from completed windows are kept in memory only. If the process fails at window 3/5, results from windows 1-2 are lost.
2. **Client timeout kills the task**: The CLI polling timeout not only marks the task as failed on the client side, but also terminates the server-side processing, wasting GPU compute time.

## Suggested Improvements

1. **Incremental disk output**: After each sliding window completes (e.g., every 64 pages), write the intermediate results to disk. This enables:
   - Recovery from failures without recomputing completed windows
   - Partial results available even if the full document fails

2. **Decouple client timeout from server processing**: Client polling timeout should only affect the client, not terminate the server-side task. The server should continue processing until completion or a separate server-side timeout.

3. **Resume support**: Add a `--resume` flag or API parameter to continue processing from the last completed window, skipping already-processed pages.
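
To make suggestions 1 and 3 concrete, here is a minimal sketch of per-window checkpointing with resume. It is not based on MinerU's internals: `process_document`, `process_window`, `checkpoint_dir`, and the `window_NNNN.json` file layout are all hypothetical names, and the 64-page window size is simply taken from the example above.

```python
from __future__ import annotations

import json
from pathlib import Path
from typing import Callable, Iterator

WINDOW_SIZE = 64  # pages per window (assumed, from the example above)


def iter_windows(total_pages: int, window_size: int = WINDOW_SIZE) -> Iterator[tuple[int, int]]:
    """Yield half-open (start, end) page ranges covering the document."""
    for start in range(0, total_pages, window_size):
        yield start, min(start + window_size, total_pages)


def process_document(
    total_pages: int,
    checkpoint_dir: Path,
    process_window: Callable[[int, int], dict],  # hypothetical per-window worker
    resume: bool = False,
) -> list[dict]:
    """Process windows in order, persisting each result as soon as it is ready."""
    checkpoint_dir.mkdir(parents=True, exist_ok=True)
    results = []
    for index, (start, end) in enumerate(iter_windows(total_pages)):
        path = checkpoint_dir / f"window_{index:04d}.json"
        if resume and path.exists():
            # Completed in an earlier run: load it instead of recomputing.
            results.append(json.loads(path.read_text(encoding="utf-8")))
            continue
        result = process_window(start, end)
        # Write to a temp file, then rename, so a crash mid-write
        # never leaves a truncated checkpoint behind.
        tmp = path.with_name(path.name + ".tmp")
        tmp.write_text(json.dumps(result, ensure_ascii=False), encoding="utf-8")
        tmp.replace(path)
        results.append(result)
    return results
```

With this layout, a 270-page document splits into five windows. If a run fails in window 3, the files for windows 1-2 remain on disk, and a second call with `resume=True` only recomputes windows 3-5.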

## Environment

- MinerU 3.0.8
- Windows 11, RTX 4060 Ti 8GB
- Python 3.13, PyTorch 2.11.0+cu126
- Processing a 270-page Chinese math textbook PDF

## Workaround

I am currently working around this by calling the `mineru-api` server directly with manual HTTP polling, which avoids the CLI client timeout.
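
For reference, the polling side of this workaround looks roughly like the sketch below. It deliberately does not show any real `mineru-api` endpoint: `fetch_status` is a hypothetical callable that wraps whatever HTTP request returns the task state, and the `"done"` / `"failed"` status strings are assumptions. The point is that giving up on the client side only raises locally and never sends a cancel request to the server.

```python
import time
from typing import Callable, Optional


def poll_until_done(
    fetch_status: Callable[[], str],  # hypothetical: wraps the HTTP status request
    interval_s: float = 5.0,
    max_wait_s: Optional[float] = None,  # None means wait indefinitely
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> str:
    """Poll until the task reaches a terminal state or the client gives up."""
    deadline = None if max_wait_s is None else clock() + max_wait_s
    while True:
        status = fetch_status()
        if status in ("done", "failed"):  # assumed terminal states
            return status
        if deadline is not None and clock() >= deadline:
            # Client-side timeout only: nothing is sent to the server,
            # so the task can keep running and be polled again later.
            raise TimeoutError("gave up waiting; the server task was not cancelled")
        sleep(interval_s)
```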

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>

Feature: Support incremental output and task resumption for long documents #4736

Summary

Problem

Suggested Improvements

Environment

Workaround

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Feature: Support incremental output and task resumption for long documents #4736

Description

Summary

Problem

Suggested Improvements

Environment

Workaround

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions