Skip to content

fix(cli): avoid generating images for non-image exports#3127

Merged
dolfim-ibm merged 4 commits intodocling-project:mainfrom
M-Hassan-Raza:fix/cli-image-export-gating
Mar 23, 2026
Merged

fix(cli): avoid generating images for non-image exports#3127
dolfim-ibm merged 4 commits intodocling-project:mainfrom
M-Hassan-Raza:fix/cli-image-export-gating

Conversation

@M-Hassan-Raza
Copy link
Contributor

CLI currently turns on page and picture image generation for any non-placeholder image export mode, even for outputs like text, doctags, and vtt that never use images.

This keeps the existing behavior for image-capable exports like markdown, HTML, JSON, and YAML, but skips the extra image work for text-only exports. I also added CLI coverage around both paths.

Signed-off-by: Hassan Raza <raihassanraza10@gmail.com>
@github-actions
Copy link
Contributor

github-actions bot commented Mar 13, 2026

DCO Check Passed

Thanks @M-Hassan-Raza, all your commits are properly signed off. 🎉

@mergify
Copy link

mergify bot commented Mar 13, 2026

Merge Protections

Your pull request matches the following merge protections and will not be merged until they are valid.

🟢 Enforce conventional commit

Wonderful, this rule succeeded.

Make sure that we follow https://www.conventionalcommits.org/en/v1.0.0/

  • title ~= ^(fix|feat|docs|style|refactor|perf|test|build|ci|chore|revert)(?:\(.+\))?(!)?:

Signed-off-by: Hassan Raza <raihassanraza10@gmail.com>
@dosubot
Copy link

dosubot bot commented Mar 13, 2026

Related Documentation

1 document(s) may need updating based on files changed in this PR:

Docling

How can I improve the resolution or quality of images extracted from a PDF using docling?
View Suggested Changes
@@ -9,7 +9,7 @@
 )
 ```
 
-If you use the CLI, docling sets `images_scale=2` by default when exporting images, but there is no direct CLI flag to set it higher; for more control, you need to customize the pipeline in code ([example](https://github.com/docling-project/docling/blob/aab3ff5d82fc54864657c0c2ff8e0aa21461f23f/docling/cli/main.py#L379-L965)).
+When using the CLI with image-capable export formats (JSON, YAML, HTML, Markdown), docling sets `images_scale=2` by default when image export mode is not set to placeholder. For text-only export formats (text, doctags, vtt), images are not generated regardless of the image export mode setting. There is no direct CLI flag to set `images_scale` higher than 2; for more control, you need to customize the pipeline in code ([example](https://github.com/docling-project/docling/blob/aab3ff5d82fc54864657c0c2ff8e0aa21461f23f/docling/cli/main.py#L379-L965)).
 
 **Note:** There are known issues in some recent versions where setting `images_scale` above 1.0 can cause bugs with image cropping or bounding box scaling, resulting in incorrectly framed images ([details](https://github.com/docling-project/docling/issues/2416)). If you encounter this, a temporary workaround is to extract at `scale=1.0` and manually upscale images, though this will not improve actual detail or framing.
 

[Accept] [Decline]

Note: You must be authenticated to accept/decline updates.

How did I do? Any feedback?  Join Discord

@PeterStaar-IBM PeterStaar-IBM requested a review from cau-git March 14, 2026 12:59
@PeterStaar-IBM
Copy link
Member

@M-Hassan-Raza I am not really sure this is strctly necessary. @cau-git what do you think?

@codecov
Copy link

codecov bot commented Mar 14, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.

📢 Thoughts on this report? Let us know!

@M-Hassan-Raza
Copy link
Contributor Author

@M-Hassan-Raza I am not really sure this is strctly necessary. @cau-git what do you think?

Hey! Yeah this isn't fixing broken output, everything works as-is. It's more of an optimization since when exporting to text-only formats like text, doctags, or vtt, the CLI still generates page and picture images that never get referenced in the output. For larger PDFs that means rendering every page at 2x scale, cropping picture elements, and keeping all those PIL images in memory for nothing. This just skips that work when the output format won't use it. Totally open to keeping things as they are if there's a reason to always generate them though!

Again i haven't traversed the entire codebase so my analysis and understanding might be a bit off 😁

@dolfim-ibm
Copy link
Member

I'm also not sure this is needed, and some of the formats in the disable-list could actually work with images.

@M-Hassan-Raza
Copy link
Contributor Author

Thanks for taking a closer look here, that’s fair.

I went back through the current CLI export paths again to sanity check the assumption. From what I can see, JSON/YAML/HTML/Markdown are the outputs that actually consume image_mode; text explicitly forces placeholder, and doctags / vtt don’t take image_mode in the CLI export path at all.

So the goal here was only to skip the extra image generation work for outputs that currently can’t reference those images anyway. That said, if this feels too marginal or just not worth the added policy in the CLI, I’m completely fine dropping it rather than pushing an optimization that isn’t considered valuable enough.

@cau-git
Copy link
Member

cau-git commented Mar 16, 2026

@M-Hassan-Raza I think your PR is addressing a valid point, it could save resources to not generate page images / element images at the output scale if not desired from the CLI args. The implementation looks correct, since it is only addressing output from the StandardPdfPipeline, not from other input backends.

Here are my two cents:

  • _OUTPUT_FORMATS_SUPPORTING_IMAGE_EXPORT_MODE - This is a whitelist. I would rather turn it around and make it _OUTPUT_FORMATS_NOT_SUPPORTING_IMAGE_EMBEDDING and create the blacklist for it (which is fewer: TEXT, VTT and DOCTAGS), and easier to maintain when we add new serializers.
  • I would recommend to remove the added test methods that use a monkeypatch. I think that is not necessary or useful to test in isolation.

Copy link
Member

@cau-git cau-git left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

see comment above.

Signed-off-by: Hassan Raza <raihassanraza10@gmail.com>
@M-Hassan-Raza
Copy link
Contributor Author

@cau-git addressed your review feedback on this branch:

  • switched the image export policy helper to a denylist for TEXT, DOCTAGS, and VTT
  • removed the monkeypatch-based CLI tests and kept direct helper coverage instead, including the mixed-format case
  • focused checks passed: uv tool run --from ruff==0.11.5 ruff check --config=pyproject.toml docling/cli/main.py tests/test_cli.py and uv run pytest tests/test_cli.py

Could you take another look?

Signed-off-by: Hassan Raza <raihassanraza10@gmail.com>
@dolfim-ibm dolfim-ibm merged commit 5473e07 into docling-project:main Mar 23, 2026
25 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants