Feature Request: Support table image output in Markdown export
Description
When exporting a document to Markdown using docling, TableItem elements are currently only exported as text-based Markdown tables (or HTML). Even when image_mode is set to ImageRefMode.REFERENCED or ImageRefMode.EMBEDDED and generate_table_images=True is used, the table images are not included in the Markdown output.
This contrasts with PictureItem elements, which correctly respect the image_mode setting and output image links.
This issue was originally reported in docling (docling-project/docling#2820), but the root cause and fix lie within docling-core's serialization logic.
Current Behavior
- MarkdownTableSerializer (in docling_core) strictly exports tables as text/Markdown grids.
- It ignores the image_mode parameter for tables.
- It ignores the presence of item.image (if populated).
Expected Behavior
If image_mode is set to REFERENCED or EMBEDDED, and the TableItem has an associated image (item.image), the serializer should:
- Preferentially output the image (or include it alongside the text, depending on the desired behavior, but likely behaving similarly to PictureItem).
- Generate the appropriate Markdown image syntax (e.g., `![Table](path/to/image.png)` or embedded base64).
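The two output forms could look roughly like the following. This is a minimal, self-contained sketch using only the standard library: `table_image_markdown`, its parameters, the `Table` alt text, and the PNG MIME type are all assumptions made for illustration, not part of docling-core.

```python
import base64
from typing import Optional


def table_image_markdown(
    png_bytes: bytes,
    path: Optional[str] = None,
    alt: str = "Table",
) -> str:
    """Return Markdown image syntax for a table image (hypothetical helper).

    If `path` is given, emit a referenced link to that file; otherwise
    embed the image bytes as a base64 data URI.
    """
    if path is not None:
        # REFERENCED-style output: link to an image file on disk.
        return f"![{alt}]({path})"
    # EMBEDDED-style output: inline the bytes as a data URI.
    encoded = base64.b64encode(png_bytes).decode("ascii")
    return f"![{alt}](data:image/png;base64,{encoded})"
```

Calling `table_image_markdown(data, path="artifacts/table_0.png")` would yield a plain image link, while omitting `path` inlines the image data.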
Proposed Solution
Modify MarkdownTableItemSerializer (or MarkdownTableSerializer) in docling_core/transforms/serializer/markdown.py.
The serialize method should be updated to check self.params.image_mode:
# Pseudo-code logic to be added to serialize()
if self.params.image_mode in (ImageRefMode.REFERENCED, ImageRefMode.EMBEDDED) and item.image:
    # Logic similar to MarkdownPictureSerializer
    image_link = self._get_image_link(item)  # Hypothetical helper to generate path/base64
    return create_ser_result(text=f"![Table]({image_link})", span_source=item)
# Fallback to existing text table generation
...
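To make the intended control flow concrete, here is a runnable sketch of the same decision using stand-in types rather than docling-core's real classes. `FakeTable`, `grid_to_markdown`, and `serialize_table` are hypothetical names, and the local `ImageRefMode` enum only mirrors the mode names used in this issue; the actual serializer would work with TableItem and its params object instead.

```python
from dataclasses import dataclass
from enum import Enum
from typing import List, Optional


class ImageRefMode(Enum):
    # Stand-in enum mirroring the mode names used in this issue.
    PLACEHOLDER = "placeholder"
    EMBEDDED = "embedded"
    REFERENCED = "referenced"


@dataclass
class FakeTable:
    # Stand-in for TableItem: a text grid plus an optional image URI.
    rows: List[List[str]]
    image_uri: Optional[str] = None


def grid_to_markdown(rows: List[List[str]]) -> str:
    """Render rows as a Markdown table; the first row is the header."""
    if not rows:
        return ""
    header, *body = rows
    lines = [
        "| " + " | ".join(header) + " |",
        "| " + " | ".join("---" for _ in header) + " |",
    ]
    lines += ["| " + " | ".join(row) + " |" for row in body]
    return "\n".join(lines)


def serialize_table(item: FakeTable, image_mode: ImageRefMode) -> str:
    """Prefer the table image when the mode asks for images and one exists."""
    wants_image = image_mode in (ImageRefMode.REFERENCED, ImageRefMode.EMBEDDED)
    if wants_image and item.image_uri:
        return f"![Table]({item.image_uri})"
    # Fallback: the existing text-based table output.
    return grid_to_markdown(item.rows)
```

Note that a table without an image still falls back to the text grid even in REFERENCED or EMBEDDED mode, so existing output would be unchanged for documents converted without generate_table_images=True.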