feat(layout): better layout model #369

@Mickls

Before you submit

  • I have searched existing issues
  • I spent at least 5 minutes investigating and preparing this report
  • I confirmed this is not caused by a network issue
  • I have fully read and understood the README
  • I am certain that this issue is with BabelDOC itself and can be reproduced through the BabelDOC cli

Environment

- OS: windows 11 24H2 26100.3775
- Python: 3.13.3
- BabelDOC: 0.3.49

Describe the bug

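I have an English PDF that I cannot translate into Chinese, either from the command line or through the Python SDK. In the translated output, both the dual (side-by-side) file and the mono file still contain the original text unchanged.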

Steps to Reproduce

My code is as follows:

import asyncio

from babeldoc.document_il.translator.translator import OpenAITranslator
from babeldoc.docvision.doclayout import DocLayoutModel
from babeldoc.docvision.table_detection.rapidocr import RapidOCRModel
from babeldoc.translation_config import WatermarkOutputMode, TranslationConfig, TranslateResult
from babeldoc.main import create_progress_handler
from babeldoc.high_level import async_translate

doc_layout_model = DocLayoutModel.load_onnx()
table_model = RapidOCRModel()
watermark_output_mode = WatermarkOutputMode.NoWatermark

lang_in = "en"
lang_out = "zh"
translator = OpenAITranslator(
	lang_in=lang_in,
	lang_out=lang_out,
	model="gpt-4o-mini-2024-07-18",
	base_url="https://openai-xxxx",
	api_key="sk-xxxx",
	ignore_cache=True,
)

tmpdir = "."
config = TranslationConfig(
	input_file=r"xxxx.pdf",
	font=None,
	pages=None,
	output_dir=tmpdir,
	translator=translator,
	debug=False,
	lang_in=lang_in,
	lang_out=lang_out,
	no_dual=False,
	no_mono=True,
	qps=5,
	doc_layout_model=doc_layout_model,
	skip_clean=False,
	dual_translate_first=False,
	disable_rich_text_translate=False,
	enhance_compatibility=True,
	report_interval=0.5,
	min_text_length=5,
	watermark_output_mode=watermark_output_mode,
	split_strategy=None,
	table_model=table_model,
)


async def main():
	progress_context, progress_handler = create_progress_handler(config)
	overall_progress = 0
	with progress_context:
		async for event in async_translate(config):
			progress_handler(event)
			if event["type"] == "progress_update":
				new_overall_progress = event["overall_progress"]
				# Update first, then print, so the log shows the current value
				if new_overall_progress > overall_progress:
					overall_progress = new_overall_progress
				print("overall_progress:", overall_progress)
			if event["type"] == "finish":
				result: TranslateResult = event.get("translate_result")
				print(str(result))
				break


if __name__ == '__main__':
	asyncio.run(main())
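
To rule out the event loop itself as the cause, the progress and finish handling can be exercised without BabelDOC. This is only a minimal sketch: `track_progress` and `fake_events` are hypothetical names introduced here, and the event dicts are hand-written to mirror the keys used in the script above (`type`, `overall_progress`, `translate_result`), not captured from a real run.

```python
from typing import Any, Iterable


def track_progress(events: Iterable[dict[str, Any]]) -> tuple[float, Any]:
    """Return (highest overall_progress seen, translate_result of the finish event)."""
    overall_progress = 0.0
    result = None
    for event in events:
        if event["type"] == "progress_update":
            # Keep the maximum so the reported value never moves backwards.
            overall_progress = max(overall_progress, event["overall_progress"])
        elif event["type"] == "finish":
            result = event.get("translate_result")
            break  # Stop consuming events once the run has finished.
    return overall_progress, result


# Hypothetical events, hand-written for illustration only.
fake_events = [
    {"type": "progress_update", "overall_progress": 10.0},
    {"type": "progress_update", "overall_progress": 5.0},
    {"type": "progress_update", "overall_progress": 80.0},
    {"type": "finish", "translate_result": "done"},
    {"type": "progress_update", "overall_progress": 100.0},
]

progress, result = track_progress(fake_events)
print(progress, result)
```

The loop stops at the `finish` event, so anything emitted after it is ignored, just as in the script above.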

My CLI command is as follows:
uv run babeldoc --files "E:\...\xxx.pdf" --openai --openai-model "gpt-4o-mini" --openai-base-url "https://openai-xxx" --openai-api-key "sk-xxxx"

Expected Behavior

No response

Relevant Log Output or Screenshots


Original PDF File

The PDF file that fails to translate is attached below:
TB2-SDC.VP124-00HSJ-M-M1A-PFD-0011 Rev1 Worst Coal - BMCR.pdf

Additional Context

No response
