release v1.0.0

Latest

@pekopoke pekopoke released this 24 Oct 07:27
· 18 commits to main since this release
898ed30

Includes 4 extractors and a benchmark covering 545 data items

What's Changed

  • fix bug: duplicated tables by @pekopoke in #42
  • Optimize table edit distance calculation by using normalization by @pekopoke in #43
  • add extractor version in results by @pekopoke in #44
  • fix: revert to the old formula matching by @pekopoke in #45
  • feat: add language and style classification by @e06084 in #46
  • Use an LLM to correct predicted formulas by @1041206149 in #47
  • feat: refactor _extract_from_markdown with LLM-enhanced table/formula/code extraction by @1041206149 in #48
  • Dev: add a method for trafilatura to output txt by @pekopoke in #50
  • Move the LLM API configuration into config.py by @1041206149 in #51
  • fix: do not extract tables and formulas inside inline or block code by @pekopoke in #52

New Contributors

Full Changelog: v0.2.0...v1.0.0