baseURL=http://minerudeployment-production.up.railway.app/
Method: POST
Purpose: Submit a PDF for asynchronous processing
Request:
curl -X POST "{baseURL}/parse/" \
-F "file=@document.pdf" \
-F "parse_formula=true" \
-F "parse_table=true" \
-F "parse_ocr=true" \
-F "extract_images=true" \
-F "output_format=both" \
-F "dpi=200"Parameters:
file(required): PDF file (max 100MB)parse_formula(optional, default: true): Enable formula recognitionparse_table(optional, default: true): Enable table extractionparse_ocr(optional, default: true): Enable OCR for scanned documentsextract_images(optional, default: true): Extract embedded imagesoutput_format(optional, default: "both"): "markdown", "json", or "both"dpi(optional, default: 200): Image extraction DPIforce_gpu(optional, default: false): Force GPU processing
Response:
{
"task_id": "task_1732345678900",
"status": "pending",
"message": "PDF parsing started using modal_gpu",
"backend": "modal_gpu",
"estimated_time": 60
}Status Codes:
200 OK: Task created successfully400 Bad Request: Invalid file or parameters413 Payload Too Large: File exceeds 100MB500 Internal Server Error: Server error
Method: GET
Purpose: Check processing status and retrieve results
Request:
curl "{baseURL}/status/task_1732345678900"Response (Pending):
{
"task_info": {
"task_id": "task_1732345678900",
"status": "pending",
"backend": "modal_gpu",
"created_at": 1732345678.9,
"started_at": null,
"completed_at": null,
"error_message": null,
"file_info": {
"filename": "document.pdf",
"size": 1024000,
"md5": "abc123..."
}
},
"content": null,
"download_urls": null
}Response (Processing):
{
"task_info": {
"task_id": "task_1732345678900",
"status": "processing",
"backend": "modal_gpu",
"created_at": 1732345678.9,
"started_at": 1732345680.5,
"completed_at": null,
"error_message": null,
"file_info": {...}
},
"content": null,
"download_urls": null
}Response (Completed):
{
"task_info": {
"task_id": "task_1732345678900",
"status": "completed",
"backend": "modal_gpu",
"created_at": 1732345678.9,
"started_at": 1732345680.5,
"completed_at": 1732345720.3,
"error_message": null,
"file_info": {
"filename": "document.pdf",
"size": 1024000,
"md5": "abc123..."
}
},
"content": {
"markdown_content": "# Document Title\n\nContent...",
"json_content": "# Metadata returned from MinerU, need to be stored into a json file",
"images": ["base64_image_1", "base64_image_2"],
"page_count": 10,
"processing_time": 39.8,
"metadata": {
"backend": "modal_gpu",
"status": "completed"
}
},
"download_urls": {
"zip": "https://storage.example.com/abc123.../results.zip"
}
}Task Status Values:
pending: Task created, waiting to startprocessing: Currently being processedcompleted: Processing finished successfullyfailed: Processing failed (check error_message)cached: Result retrieved from cache (shown as completed to clients)
Status Codes:
200 OK: Task found404 Not Found: Task ID doesn't exist