Commit 80db4a6

fix latex-ocr (#2510)
1 parent 6b30748 commit 80db4a6

1 file changed: +2 -3 lines changed

swift/llm/utils/dataset.py

Lines changed: 2 additions & 3 deletions
@@ -2257,11 +2257,10 @@ def _process(d):
 
 register_dataset(
     DatasetName.latex_ocr_print,
-    'AI-ModelScope/LaTeX_OCR',
-    ['full'],
+    'AI-ModelScope/LaTeX_OCR', ['default'],
     _preprocess_latex_ocr_dataset,
     get_dataset_from_repo,
-    split=['validation', 'test'], # There are some problems in the training dataset.
+    split=['train', 'validation', 'test'],
     hf_dataset_id='linxy/LaTeX_OCR',
     tags=['chat', 'ocr', 'multi-modal', 'vision'])
 
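
The effect of the change can be illustrated with a small sketch. It assumes the loader requests every combination of the registered subsets and splits. That behavior is not shown in this diff. `expand_requests` and the dictionary layout are hypothetical names used only for illustration and are not part of swift. Only the subset and split values are copied from the diff.

```python
from itertools import product

# Values copied from the diff; the dictionary layout is an illustrative assumption.
BEFORE = {'subsets': ['full'], 'split': ['validation', 'test']}
AFTER = {'subsets': ['default'], 'split': ['train', 'validation', 'test']}


def expand_requests(registration):
    """Return every (subset, split) pair that would be requested.

    Hypothetical helper: assumes each subset is combined with each split.
    """
    return list(product(registration['subsets'], registration['split']))


print(expand_requests(BEFORE))
# [('full', 'validation'), ('full', 'test')]
print(expand_requests(AFTER))
# [('default', 'train'), ('default', 'validation'), ('default', 'test')]
```

Under that assumption, the commit changes the subset name from `full` to `default` and adds the `train` split, which the old registration skipped.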
