docs/eval/prompt/multi_language_data_evaluated_by_prompt.md
Lines changed: 12 additions & 12 deletions
@@ -1,7 +1,7 @@
# Multi_Lan Dataset
## Dataset Introduction
-Multi_Lan Dataset aims to evaluate the ability of Dingo's built-in prompt to mine low-quality data in multi-language pre-training datasets. We extracted a portion of data from the Common Crawl (CC) dataset, which was then annotated by experts in these languages based on seven quality dimensions([quality_metrics](../metrics.md)). If any dimension has problems, the data will be marked as low-quality data.
+Multi_Lan Dataset aims to evaluate the ability of Dingo's built-in prompt to mine low-quality data in multi-language pre-training datasets. We extracted a portion of data from the Common Crawl (CC) dataset, which was then annotated by experts in these languages based on seven quality dimensions([quality_metrics](../../metrics.md)). If any dimension has problems, the data will be marked as low-quality data.
| Field Name | Description |
|--------------|------------------------------|
@@ -16,25 +16,25 @@ Multi_Lan Dataset aims to evaluate the ability of Dingo's built-in prompt to min
### Dataset Link
The dataset is available for different languages through the following links:
-The built-in **PromptTextQualityV2** is used as the prompt for this test. Specific content can be referred to: [Introduction to PromptTextQualityV2](../../dingo/model/prompt/prompt_text_quality_v2.py)<br>
-The built-in prompt collection can be referred to: [Prompt Collection](../../dingo/model/prompt)
+The built-in **PromptTextQualityV2** is used as the prompt for this test.<br>
+Specific content can be referred to: [Introduction to PromptTextQualityV2](../../../dingo/model/prompt/prompt_text_quality_v2.py)<br>
+The built-in prompt collection can be referred to: [Prompt Collection](../../../dingo/model/prompt)
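
The link changes in this diff each add one `../` segment. The sketch below shows where the old and new links would point, assuming relative Markdown links resolve against the directory of the file that contains them (as GitHub's renderer does). `resolve_link` and `DOC_PATH` are hypothetical names introduced for illustration only. The sketch does not check whether the targets exist in the repository.

```python
import posixpath

# Path of the edited file, taken from the diff header.
DOC_PATH = "docs/eval/prompt/multi_language_data_evaluated_by_prompt.md"


def resolve_link(doc_path: str, link: str) -> str:
    """Resolve a relative link against the directory that contains doc_path."""
    base_dir = posixpath.dirname(doc_path)
    return posixpath.normpath(posixpath.join(base_dir, link))


if __name__ == "__main__":
    # (old link, new link) pairs copied from the removed and added lines.
    changes = [
        ("../metrics.md", "../../metrics.md"),
        ("../../dingo/model/prompt", "../../../dingo/model/prompt"),
    ]
    for old, new in changes:
        print(f"{old} -> {resolve_link(DOC_PATH, old)}")
        print(f"{new} -> {resolve_link(DOC_PATH, new)}")
```

Under that assumption, the old `../metrics.md` resolves to `docs/eval/metrics.md` and the new `../../metrics.md` to `docs/metrics.md`. Likewise, the old `../../dingo/model/prompt` resolves to `docs/dingo/model/prompt`, while the new `../../../dingo/model/prompt` reaches `dingo/model/prompt` at the repository root.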