Description
1) When using:
"image_export_mode": "referenced", "do_picture_description": true
the response correctly includes the image references/paths, but the actual exported image files are not present in the output. This makes it impossible to retrieve or post-process the images afterward.
Could you clarify whether the images are supposed to be included in the response payload, or if there is an endpoint/mechanism to fetch the extracted images separately? If not, please consider adding support for exporting or retrieving the referenced images.
Syntax :(e.g., )
2) I've noticed that when picture descriptions are enabled, the generated image descriptions (produced via OpenAI on my side) are not inserted into the final document output. The API returns the descriptions correctly, but the processed Markdown/PDF does not contain them.
In most cases the image nodes remain unchanged and the description is never added to them. This also wastes tokens on the LLM side, because the generated descriptions are never reflected in the final result.
Could you clarify how the picture description insertion is handled, and whether any additional configuration is required to ensure the descriptions are merged into the response?
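Until this is clarified, one possible client-side workaround is to merge the descriptions into the Markdown manually. The following is only a minimal sketch: it assumes the image nodes use standard Markdown image syntax and that the descriptions have already been collected from the response in document order. The class `PictureDescriptionMerger` and its `merge` method are hypothetical names, not part of any library.

```java
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

public class PictureDescriptionMerger {

    // Matches Markdown image nodes such as ![Image](some/path.png).
    private static final Pattern IMAGE_NODE =
            Pattern.compile("!\\[[^\\]]*\\]\\([^)]*\\)");

    // Appends the i-th description after the i-th image node.
    // Assumption: descriptions are supplied in the same order as the images appear.
    static String merge(String markdown, List<String> descriptions) {
        Matcher matcher = IMAGE_NODE.matcher(markdown);
        StringBuilder out = new StringBuilder();
        int index = 0;
        int last = 0;
        while (matcher.find()) {
            // Copy everything up to and including the image node unchanged.
            out.append(markdown, last, matcher.end());
            if (index < descriptions.size()) {
                String description = descriptions.get(index);
                if (description != null && !description.isBlank()) {
                    out.append("\n\n*Image description:* ").append(description.strip());
                }
            }
            index++;
            last = matcher.end();
        }
        // Copy whatever follows the last image node.
        out.append(markdown, last, markdown.length());
        return out.toString();
    }

    public static void main(String[] args) {
        String md = "Intro\n\n![Image](artifacts/example.png)\n\nOutro";
        System.out.println(merge(md, List.of("A bar chart comparing conversion speed.")));
    }
}
```

Images without a matching description are left untouched, so the sketch is safe to run even when only some pictures were described.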
Could you include some Java samples covering this scenario, or share the relevant details here? I will then prepare some examples for community use.
Below is the sample request I prepared (extracted from my Java code):
{
"options": {
"to_formats": ["md"],
"do_picture_description": true,
"image_export_mode": "referenced",
"include_images": "true",
"picture_description_api": {
"url": "https://api.openai.com/v1/chat/completions",
"headers": {"Authorization": "Bearer {{OPEN_AI_KEY}}"},
"params": {"model": "gpt-5", "max_completion_tokens": 1000},
"prompt": "Provide a comprehensive and detailed description of the image, including all visible objects, text, layout, context, and any relevant visual details.",
"timeout": "90"
},
"picture_description_area_threshold": 0.0
},
"sources": [{"kind": "http", "url": "https://arxiv.org/pdf/2408.09869"}]
}
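For reference, the request above could be built in Java roughly as follows, using the JDK's `java.net.http` API. This is a sketch under assumptions: the endpoint `http://localhost:5001/v1/convert/source` is assumed for a local deployment and may differ by server version, `ConvertRequestSketch` and its helpers are hypothetical names, the prompt is shortened, and `include_images` and `timeout` are sent as a JSON boolean and number on the assumption that the server expects those types.

```java
import java.net.URI;
import java.net.http.HttpRequest;
import java.time.Duration;

public class ConvertRequestSketch {

    // Escapes backslashes and double quotes so a value can sit inside a JSON string.
    static String esc(String value) {
        return value.replace("\\", "\\\\").replace("\"", "\\\"");
    }

    // Builds the same options as the sample payload; the API key is injected at runtime.
    static String buildBody(String sourceUrl, String openAiKey) {
        return String.join("\n",
            "{",
            "  \"options\": {",
            "    \"to_formats\": [\"md\"],",
            "    \"do_picture_description\": true,",
            "    \"image_export_mode\": \"referenced\",",
            "    \"include_images\": true,",
            "    \"picture_description_api\": {",
            "      \"url\": \"https://api.openai.com/v1/chat/completions\",",
            "      \"headers\": {\"Authorization\": \"Bearer " + esc(openAiKey) + "\"},",
            "      \"params\": {\"model\": \"gpt-5\", \"max_completion_tokens\": 1000},",
            "      \"prompt\": \"Describe the image in detail.\",",
            "      \"timeout\": 90",
            "    },",
            "    \"picture_description_area_threshold\": 0.0",
            "  },",
            "  \"sources\": [{\"kind\": \"http\", \"url\": \"" + esc(sourceUrl) + "\"}]",
            "}");
    }

    // Wraps the JSON body in a POST request; the endpoint is an assumption (see above).
    static HttpRequest buildRequest(String endpoint, String body) {
        return HttpRequest.newBuilder(URI.create(endpoint))
            .header("Content-Type", "application/json")
            .timeout(Duration.ofMinutes(5))
            .POST(HttpRequest.BodyPublishers.ofString(body))
            .build();
    }

    public static void main(String[] args) {
        String key = System.getenv().getOrDefault("OPEN_AI_KEY", "<missing>");
        String body = buildBody("https://arxiv.org/pdf/2408.09869", key);
        HttpRequest request = buildRequest("http://localhost:5001/v1/convert/source", body);
        // Only the method and URI are printed, so the key never reaches the console.
        System.out.println(request.method() + " " + request.uri());
        // To send: HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString())
    }
}
```

The sketch deliberately stops short of sending the request, so it runs without a server; the commented `HttpClient` call shows where the actual network call would go.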