2 files changed: +4 -4 lines changed
@@ -95,15 +95,15 @@ Evaluation datasets should contain the following fields:

 - **trafilatura**: trafilatura extractor
 - **resiliparse**: resiliparse extractor
-- **llm-webkit**: llm-webkit extractor
+- **mineru-html**: mineru-html extractor
 - **magic-html**: magic-html extractor
 - **Custom extractors**: Implement by inheriting from `BaseExtractor`

 ## Evaluation Leaderboard

 | extractor | extractor_version | dataset | total_samples | overall (macro avg) | code_edit | formula_edit | table_TEDS | table_edit | text_edit |
 |-----------|-------------------|---------|---------------|---------------------|-----------|--------------|------------|-----------|-----------|
-| llm-webkit | 4.1.1 | WebMainBench1.0 | 545 | 0.8256 | 0.9093 | 0.9399 | 0.7388 | 0.678 | 0.8621 |
+| mineru-html | 4.1.1 | WebMainBench1.0 | 545 | 0.8256 | 0.9093 | 0.9399 | 0.7388 | 0.678 | 0.8621 |
 | magic-html | 0.1.5 | WebMainBench1.0 | 545 | 0.5141 | 0.4117 | 0.7204 | 0.3984 | 0.2611 | 0.7791 |
 | trafilatura_md | 2.0.0 | WebMainBench1.0 | 545 | 0.3858 | 0.1305 | 0.6242 | 0.3203 | 0.1653 | 0.6887 |
 | trafilatura_txt | 2.0.0 | WebMainBench1.0 | 545 | 0.2657 | 0 | 0.6162 | 0 | 0 | 0.7126 |
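
The `overall (macro avg)` column appears to be the unweighted mean of the five per-category columns. That is an assumption inferred from the numbers in the table, not something the diff states. A minimal sketch that recomputes it from the leaderboard rows (the names `ROWS`, `CATEGORY_COLUMNS`, and `macro_average` are illustrative and not part of the project):

```python
from statistics import mean

# Column names taken from the leaderboard header.
CATEGORY_COLUMNS = ("code_edit", "formula_edit", "table_TEDS", "table_edit", "text_edit")

# Scores copied verbatim from the leaderboard rows in the hunk above.
ROWS = {
    "mineru-html": {"overall": 0.8256, "code_edit": 0.9093, "formula_edit": 0.9399,
                    "table_TEDS": 0.7388, "table_edit": 0.678, "text_edit": 0.8621},
    "magic-html": {"overall": 0.5141, "code_edit": 0.4117, "formula_edit": 0.7204,
                   "table_TEDS": 0.3984, "table_edit": 0.2611, "text_edit": 0.7791},
    "trafilatura_md": {"overall": 0.3858, "code_edit": 0.1305, "formula_edit": 0.6242,
                       "table_TEDS": 0.3203, "table_edit": 0.1653, "text_edit": 0.6887},
    "trafilatura_txt": {"overall": 0.2657, "code_edit": 0.0, "formula_edit": 0.6162,
                        "table_TEDS": 0.0, "table_edit": 0.0, "text_edit": 0.7126},
}


def macro_average(row):
    """Unweighted mean over the per-category columns (assumed definition)."""
    return mean(row[column] for column in CATEGORY_COLUMNS)


if __name__ == "__main__":
    for name, row in ROWS.items():
        print(f"{name}: reported={row['overall']:.4f} recomputed={macro_average(row):.4f}")
```

Because the published per-category scores are already rounded, the recomputed value can differ from the reported one in the last decimal place.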
@@ -93,15 +93,15 @@ print(f"Overall Score: {result.overall_metrics['overall']:.4f}")

 - **trafilatura**: trafilatura extractor
 - **resiliparse**: resiliparse extractor
-- **llm-webkit**: llm-webkit extractor
+- **mineru-html**: mineru-html extractor
 - **magic-html**: magic-html extractor
 - **Custom extractors**: Implement by inheriting from `BaseExtractor`

 ## Evaluation Leaderboard

 | extractor | extractor_version | dataset | total_samples | overall(macro avg) | code_edit | formula_edit | table_TEDS | table_edit | text_edit |
 |-----------|-------------------|---------|---------------|---------------------|-----------|--------------|------------|-----------|-----------|
-| llm-webkit | 4.1.1 | WebMainBench1.0 | 545 | 0.8256 | 0.9093 | 0.9399 | 0.7388 | 0.678 | 0.8621 |
+| mineru-html | 4.1.1 | WebMainBench1.0 | 545 | 0.8256 | 0.9093 | 0.9399 | 0.7388 | 0.678 | 0.8621 |
 | magic-html | 0.1.5 | WebMainBench1.0 | 545 | 0.5141 | 0.4117 | 0.7204 | 0.3984 | 0.2611 | 0.7791 |
 | trafilatura_md | 2.0.0 | WebMainBench1.0 | 545 | 0.3858 | 0.1305 | 0.6242 | 0.3203 | 0.1653 | 0.6887 |
 | trafilatura_txt | 2.0.0 | WebMainBench1.0 | 545 | 0.2657 | 0 | 0.6162 | 0 | 0 | 0.7126 |
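
Both READMEs say custom extractors are implemented by inheriting from `BaseExtractor`, but the diff does not show that class. The sketch below therefore defines a hypothetical stand-in `BaseExtractor` with a single `extract` method, purely to illustrate the inheritance pattern. The project's real base class, its method names, and its return types may differ. `TagStripExtractor` is also invented for illustration.

```python
import re
from abc import ABC, abstractmethod
from html import unescape


class BaseExtractor(ABC):
    """Hypothetical stand-in for the project's base class (interface assumed)."""

    def __init__(self, name: str) -> None:
        self.name = name

    @abstractmethod
    def extract(self, html: str) -> str:
        """Return the main content of an HTML page as plain text."""


class TagStripExtractor(BaseExtractor):
    """Naive illustrative extractor: drops scripts and styles, then all tags."""

    def extract(self, html: str) -> str:
        # Remove <script> and <style> elements together with their contents.
        without_code = re.sub(r"(?is)<(script|style)\b.*?</\1>", " ", html)
        # Replace every remaining tag with a space.
        text = re.sub(r"(?s)<[^>]+>", " ", without_code)
        # Decode HTML entities and collapse runs of whitespace.
        return " ".join(unescape(text).split())


if __name__ == "__main__":
    extractor = TagStripExtractor("tag-strip")
    print(extractor.extract("<html><style>p{}</style><p>Hello &amp; bye</p></html>"))
```

A real extractor would replace the regex logic with a call into the underlying library and would follow whatever abstract methods the project's `BaseExtractor` actually declares.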