Open
Labels
enhancement (New feature or request)
Description
Currently, `save_as_json(filename, image_mode=ImageRefMode.REFERENCED)` behaves as follows:
- If `filename` is a relative path:
  - images are saved into the `[filename.stem]_artifacts` folder next to the JSON file
  - the references to images are updated in the docling document and point to the relative path
- If `filename` is an absolute path:
  - images are saved into the `[filename.stem]_artifacts` folder next to the JSON file, as before
  - the references to images are updated in the docling document and point to the absolute path
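The rule described above can be sketched in a few lines of plain Python. This is only an illustration of the behavior as stated in this issue, not docling code: `describe_referenced_save` is a hypothetical helper name, and POSIX-style paths are assumed.

```python
from pathlib import PurePosixPath
from typing import Tuple


def describe_referenced_save(filename: str) -> Tuple[PurePosixPath, str]:
    """Hypothetical helper mirroring the behavior described in this issue.

    Returns the artifacts folder and the kind of image reference
    ("relative" or "absolute") that ends up in the document.
    """
    target = PurePosixPath(filename)
    # Images go into "<stem>_artifacts" next to the JSON file in both cases.
    artifacts_dir = target.parent / f"{target.stem}_artifacts"
    # The kind of reference simply follows the kind of path passed as filename.
    kind = "absolute" if target.is_absolute() else "relative"
    return artifacts_dir, kind
```

With this sketch, `"out/doc.json"` maps to the folder `out/doc_artifacts` with relative references, and `"/data/out/doc.json"` maps to `/data/out/doc_artifacts` with absolute references.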
This behavior is consistent but not desirable in several use-cases:
- When running docling inside a container, or a similar environment where we have to control exactly where temporary files go, it is very likely that `filename` has to be an absolute path while the references to the images should still stay relative. This is currently not possible.
- The way referenced images are saved and named might require an additional level of customization from the user. For example, when storing conversion results on S3, one might prefer to save all the images under one single prefix, because the document is converted into multiple formats and those formats are also stored under different prefixes, so storing images under `..._artifacts` is not an efficient option. Also, images might have to be renamed when saved, following a different naming schema, but as `save_as_json` now also updates the references itself, we can't change image names in the references manually inside the docling document object before serializing it.
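To make the request more concrete, here is a minimal sketch of the kind of control that would help. It is not an existing docling API: `plan_image_refs`, its parameters, and the default naming scheme are all hypothetical, and POSIX-style paths are assumed. The idea is that the caller chooses the image folder and the naming function, and the references are computed relative to the JSON file even when the JSON path is absolute.

```python
import posixpath
from typing import Callable, List, Tuple


def plan_image_refs(
    json_path: str,
    images_dir: str,
    count: int,
    name_for: Callable[[int], str] = lambda i: f"image_{i:06d}.png",
) -> List[Tuple[str, str]]:
    """Hypothetical planner: returns (save_location, reference) per image.

    save_location is where the image file would be written;
    reference is the path stored in the document, relative to the JSON file.
    """
    base = posixpath.dirname(json_path)
    plans = []
    for i in range(count):
        # The caller decides both the folder and the file name.
        save_location = posixpath.join(images_dir, name_for(i))
        # Keep the reference relative even if json_path is absolute.
        reference = posixpath.relpath(save_location, start=base)
        plans.append((save_location, reference))
    return plans
```

For example, with the JSON file at `/tmp/job/json/doc.json`, a shared image folder `/tmp/job/images`, and a custom naming function, the stored references would take the form `../images/<name>` while the files themselves are written under the absolute folder.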