Commit ebc5631

update readme
1 parent ef3321d commit ebc5631

2 files changed: 4 additions and 4 deletions


README.md

Lines changed: 2 additions & 2 deletions
@@ -95,15 +95,15 @@ Evaluation datasets should contain the following fields:
 
 - **trafilatura**: trafilatura extractor
 - **resiliparse**: resiliparse extractor
-- **llm-webkit**: llm-webkit extractor
+- **mineru-html**: mineru-html extractor
 - **magic-html**: magic-html extractor
 - **Custom extractors**: Implement by inheriting from `BaseExtractor`
 
 ## Evaluation Leaderboard
 
 | extractor | extractor_version | dataset | total_samples | overall (macro avg) | code_edit | formula_edit | table_TEDS | table_edit | text_edit |
 |-----------|-------------------|---------|---------------|---------------------|-----------|--------------|------------|-----------|-----------|
-| llm-webkit | 4.1.1 | WebMainBench1.0 | 545 | 0.8256 | 0.9093 | 0.9399 | 0.7388 | 0.678 | 0.8621 |
+| mineru-html | 4.1.1 | WebMainBench1.0 | 545 | 0.8256 | 0.9093 | 0.9399 | 0.7388 | 0.678 | 0.8621 |
 | magic-html | 0.1.5 | WebMainBench1.0 | 545 | 0.5141 | 0.4117 | 0.7204 | 0.3984 | 0.2611 | 0.7791 |
 | trafilatura_md | 2.0.0 | WebMainBench1.0 | 545 | 0.3858 | 0.1305 | 0.6242 | 0.3203 | 0.1653 | 0.6887 |
 | trafilatura_txt | 2.0.0 | WebMainBench1.0 | 545 | 0.2657 | 0 | 0.6162 | 0 | 0 | 0.7126 |

README_zh.md

Lines changed: 2 additions & 2 deletions
@@ -93,15 +93,15 @@ print(f"Overall Score: {result.overall_metrics['overall']:.4f}")
 
 - **trafilatura**: trafilatura extractor
 - **resiliparse**: resiliparse extractor
-- **llm-webkit**: llm-webkit extractor
+- **mineru-html**: mineru-html extractor
 - **magic-html**: magic-html extractor
 - **Custom extractors**: Implement by inheriting from `BaseExtractor`
 
 ## Evaluation Leaderboard
 
 | extractor | extractor_version | dataset | total_samples | overall(macro avg) | code_edit | formula_edit | table_TEDS | table_edit | text_edit |
 |-----------|-------------------|---------|---------------|---------------------|-----------|--------------|------------|-----------|-----------|
-| llm-webkit | 4.1.1 | WebMainBench1.0 | 545 | 0.8256 | 0.9093 | 0.9399 | 0.7388 | 0.678 | 0.8621 |
+| mineru-html | 4.1.1 | WebMainBench1.0 | 545 | 0.8256 | 0.9093 | 0.9399 | 0.7388 | 0.678 | 0.8621 |
 | magic-html | 0.1.5 | WebMainBench1.0 | 545 | 0.5141 | 0.4117 | 0.7204 | 0.3984 | 0.2611 | 0.7791 |
 | trafilatura_md | 2.0.0 | WebMainBench1.0 | 545 | 0.3858 | 0.1305 | 0.6242 | 0.3203 | 0.1653 | 0.6887 |
 | trafilatura_txt | 2.0.0 | WebMainBench1.0 | 545 | 0.2657 | 0 | 0.6162 | 0 | 0 | 0.7126 |
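
Because this commit only renames the `llm-webkit` row to `mineru-html` and leaves every score untouched, a quick sanity check of the leaderboard can be useful when reviewing it. The sketch below assumes that `overall (macro avg)` is the unweighted mean of the five metric columns; that is inferred from the numbers in the table, not stated in the diff. The names `LEADERBOARD`, `macro_average`, and `check_leaderboard` are hypothetical and not part of the project.

```python
# Metric columns, in the order they appear in the leaderboard header.
METRICS = ["code_edit", "formula_edit", "table_TEDS", "table_edit", "text_edit"]

# Rows copied from the leaderboard in the diff:
# extractor -> (reported overall, per-metric scores in METRICS order).
LEADERBOARD = {
    "mineru-html": (0.8256, [0.9093, 0.9399, 0.7388, 0.678, 0.8621]),
    "magic-html": (0.5141, [0.4117, 0.7204, 0.3984, 0.2611, 0.7791]),
    "trafilatura_md": (0.3858, [0.1305, 0.6242, 0.3203, 0.1653, 0.6887]),
    "trafilatura_txt": (0.2657, [0, 0.6162, 0, 0, 0.7126]),
}


def macro_average(scores):
    """Unweighted mean of the per-metric scores (assumed definition)."""
    return sum(scores) / len(scores)


def check_leaderboard(rows, tolerance=2e-4):
    """Recompute each row's overall score and compare it with the reported one.

    The tolerance allows for the per-metric values in the table already
    being rounded to four decimal places.
    Returns {extractor: (reported, recomputed, within_tolerance)}.
    """
    results = {}
    for name, (reported, scores) in rows.items():
        recomputed = macro_average(scores)
        results[name] = (reported, recomputed, abs(recomputed - reported) <= tolerance)
    return results


if __name__ == "__main__":
    for name, (reported, recomputed, ok) in check_leaderboard(LEADERBOARD).items():
        status = "ok" if ok else "MISMATCH"
        print(f"{name:16s} reported={reported:.4f} recomputed={recomputed:.5f} {status}")
```

Under this assumption all four rows agree with the reported overall score to within rounding. The `trafilatura_txt` row recomputes to about 0.26576 against a reported 0.2657, which is why the comparison uses a small tolerance rather than exact equality.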
