fix: handle external image references in docx files #3148
joaquinhuigomez wants to merge 2 commits into docling-project:main from
DCO Check Passed. Thanks @joaquinhuigomez, all your commits are properly signed off.
Merge Protections: Your pull request matches the following merge protections and will not be merged until they are valid.
Enforce conventional commit: this rule succeeded. Make sure that we follow https://www.conventionalcommits.org/en/v1.0.0/
When a .docx file contains image relationships with TargetMode="External" (e.g. URLs pointing to external images), accessing rel.target_part raises an error because external relationships don't have a target_part. This adds a guard to check rel.is_external before accessing target_part, returning None so the image is gracefully skipped.

Fixes docling-project#3113

Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Joaquin Hui Gomez <joaquinhui1995@gmail.com>
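The guard described in the commit message can be sketched in isolation. This is a minimal illustration, not the code from this PR: `StubPart`, `StubRelationship`, and `get_image_blob` are hypothetical stand-ins, and only the attribute names `is_external`, `target_part`, and `target_ref` are taken from python-docx relationship objects.

```python
from dataclasses import dataclass
from typing import Dict, Optional


@dataclass
class StubPart:
    """Hypothetical stand-in for a packaged image part."""

    blob: bytes


@dataclass
class StubRelationship:
    """Hypothetical stand-in for a python-docx relationship."""

    is_external: bool
    target_ref: str
    part: Optional[StubPart] = None

    @property
    def target_part(self) -> StubPart:
        # Mirrors the failure mode: external relationships have no part.
        if self.is_external or self.part is None:
            raise ValueError("target_part is undefined for external relationships")
        return self.part


def get_image_blob(rels: Dict[str, StubRelationship], r_id: str) -> Optional[bytes]:
    """Return embedded image bytes, or None if the image cannot be read locally."""
    rel = rels.get(r_id)
    if rel is None:
        return None
    if rel.is_external:
        # Guard: check before touching target_part, then skip the image.
        return None
    return rel.target_part.blob


rels = {
    "rId1": StubRelationship(False, "media/image1.png", StubPart(b"\x89PNG")),
    "rId2": StubRelationship(True, "https://example.com/pic.png"),
}
embedded = get_image_blob(rels, "rId1")
external = get_image_blob(rels, "rId2")
```

Returning `None` keeps the caller's existing "no image" path in charge, at the cost of dropping the external reference entirely.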
Force-pushed from 657cfa0 to aa2ca5a
)
if rId in self.docx_obj.part.rels:
    rel = self.docx_obj.part.rels[rId]
    # External relationships (e.g. URLs) don't have a
I think we support URI in the image part, maybe we should propagate that?
Good point. I'll look into propagating the URI through to the image part instead of just downloading and embedding it. Let me update the PR.
Thanks @joaquinhuigomez for your interest in Docling and your contributions! However, as @PeterStaar-IBM pointed out, we could (optionally) fetch external images like we do in the
Thanks again for your contributions!
Closing this per maintainer guidance. The requested next step is to open a feature request for optionally fetching external images in DOCX (similar to HTMLDocumentBackend), then link a future implementation PR to that issue. I’ll follow that route instead of keeping this PR open.
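For the follow-up feature request, one possible shape of "optionally fetch, otherwise propagate the URI" is sketched below. Everything here is an assumption for illustration: `ImageRef`, `resolve_external_image`, the `fetch_external` flag, and the injectable `fetcher` are hypothetical names, not Docling APIs, and the real design may differ.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class ImageRef:
    """Hypothetical result type: always a URI, plus bytes when fetched."""

    uri: str
    data: Optional[bytes] = None


def resolve_external_image(
    target_ref: str,
    fetch_external: bool = False,
    fetcher: Optional[Callable[[str], bytes]] = None,
) -> ImageRef:
    """Keep the URI by default; download only when explicitly enabled."""
    if not fetch_external or fetcher is None:
        # Default: no network access, just carry the reference forward.
        return ImageRef(uri=target_ref)
    try:
        return ImageRef(uri=target_ref, data=fetcher(target_ref))
    except OSError:
        # Network or I/O failure: fall back to the bare URI.
        return ImageRef(uri=target_ref)


def fake_fetcher(url: str) -> bytes:
    # Test double standing in for a real HTTP download.
    return b"img-bytes"


def failing_fetcher(url: str) -> bytes:
    raise OSError("network unreachable")


url = "https://example.com/pic.png"
uri_only = resolve_external_image(url)
fetched = resolve_external_image(url, fetch_external=True, fetcher=fake_fetcher)
fallback = resolve_external_image(url, fetch_external=True, fetcher=failing_fetcher)
```

Making the download opt-in means converting an untrusted document does not trigger network requests unless the caller asks for them.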