-
Notifications
You must be signed in to change notification settings - Fork 3.4k
Open
Labels
questionFurther information is requestedFurther information is requested
Description
Question
在图片描述中使用api方法,PictureDescriptionApiOptions,拿不到返回结果,模型api的token是消耗了的
调用脚本:
from pathlib import Path
from docling.datamodel.base_models import InputFormat
from docling.datamodel.pipeline_options import PdfPipelineOptions
from docling.document_converter import DocumentConverter, PdfFormatOption, ImageFormatOption
from docling.datamodel.pipeline_options import PictureDescriptionVlmOptions, granite_picture_description, PictureDescriptionApiOptions
from docling_core.types.doc.base import ImageRefMode
from docling_core.types.doc.document import PictureDescriptionData
from IPython import display
DOC_SOURCE = 图片
pipeline_options = PdfPipelineOptions(artifacts_path="/a/domains/docling/docling_models")
pipeline_options.do_picture_description = True
""" 本地加载vlm模型 """
# temp_picture_description = PictureDescriptionVlmOptions(
# repo_id="Qwen/Qwen3-VL-2B-Instruct",
# prompt="详细描述一下这张图片",
# generation_config = dict(max_new_tokens=500, do_sample=False)
# )
#
# pipeline_options.picture_description_options = (
# temp_picture_description # <-- the model choice
# )
""" 使用vlm模型api """
pipeline_options.enable_remote_services=True # 运行远程服务
pipeline_options.picture_description_options = PictureDescriptionApiOptions(
url="https://ark.cn-beijing.volces.com/api/v3/chat/completions",
headers={
"Content-Type": "application/json",
"Authorization": "Bearer xxx"
},
params=dict(
model="doubao-seed-1-6-flash-250828",
seed=42,
max_completion_tokens=500,
),
prompt="详细描述一下这张图片.",
timeout=60,
)
pipeline_options.images_scale = 2.0
pipeline_options.generate_picture_images = True
converter = DocumentConverter(
format_options={
InputFormat.PDF: PdfFormatOption(
pipeline_options=pipeline_options,
),
InputFormat.IMAGE: ImageFormatOption(
pipeline_options=pipeline_options
)
}
)
extract_result = converter.convert(DOC_SOURCE)
markdown_output = extract_result.document.export_to_markdown()
print("--- markdown result ---")
print(markdown_output)
annotation = extract_result.document.pictures[0].annotations
print("--- annotation result ---")
print(annotation)
output_dir = Path("save_files")
output_dir.mkdir(parents=True, exist_ok=True)
doc_filename = extract_result.input.file.stem
html_filename = output_dir / f"{doc_filename}-with-images.md"
extract_result.document.save_as_markdown(html_filename, image_mode=ImageRefMode.REFERENCED)输出结果:
--- markdown result ---
员工发起用户来电申信取消,员工审核为盛假,录入异常系统但
不扣工程师推荐分并也不向工程师发送虚假取消消息
<!-- image -->
/a/domains/docling/docling_master/picture_description.py:62: DeprecationWarning: Field `annotations` is deprecated; use `meta` instead.
annotation = extract_result.document.pictures[0].annotations
--- annotation result ---
[DescriptionAnnotation(kind='description', text='', provenance='not-implemented')]
Metadata
Metadata
Assignees
Labels
questionFurther information is requestedFurther information is requested