MigoXLab
diff --git a/docs/eval/dataset_multi_lan.md b/docs/eval/prompt/multi_language_data_evaluated_by_prompt.md
docs/eval/dataset_multi_lan.md renamed to docs/eval/prompt/multi_language_data_evaluated_by_prompt.md
Lines changed: 12 additions & 12 deletions
diff --git a/docs/eval/evaluation_3h.md b/docs/eval/prompt/qa_data_evaluated_by_3h.md
docs/eval/evaluation_3h.md renamed to docs/eval/prompt/qa_data_evaluated_by_3h.md
diff --git a/docs/eval/dataset_redpajama.md b/docs/eval/prompt/redpajama_data_evaluated_by_prompt.md
docs/eval/dataset_redpajama.md renamed to docs/eval/prompt/redpajama_data_evaluated_by_prompt.md
Lines changed: 3 additions & 2 deletions
diff --git a/docs/eval/topic_classification.md b/docs/eval/prompt/text_data_classified_by_topic.md
docs/eval/topic_classification.md renamed to docs/eval/prompt/text_data_classified_by_topic.md
diff --git a/docs/eval/dataset_slimpajama.md b/docs/eval/rule/slimpajama_data_evaluated_by_rule.md
docs/eval/dataset_slimpajama.md renamed to docs/eval/rule/slimpajama_data_evaluated_by_rule.md
Lines changed: 2 additions & 2 deletions
@@ -1,7 +1,7 @@
 # Multi_Lan Dataset
 
 ## Dataset Introduction
-Multi_Lan Dataset aims to evaluate the ability of Dingo's built-in prompt to mine low-quality data in multi-language pre-training datasets. We extracted a portion of data from the Common Crawl (CC) dataset, which was then annotated by experts in these languages based on seven quality dimensions（[quality_metrics](../metrics.md)）. If any dimension has problems, the data will be marked as low-quality data.
+Multi_Lan Dataset aims to evaluate the ability of Dingo's built-in prompt to mine low-quality data in multi-language pre-training datasets. We extracted a portion of data from the Common Crawl (CC) dataset, which was then annotated by experts in these languages based on seven quality dimensions（[quality_metrics](../../metrics.md)）. If any dimension has problems, the data will be marked as low-quality data.
 
 | Field Name          | Description                           |
 |--------------|------------------------------|
@@ -16,25 +16,25 @@ Multi_Lan Dataset aims to evaluate the ability of Dingo's built-in prompt to min
 ### Dataset Link
 The dataset is available for different languages through the following links:
 
-| Language | Dataset Link                                        |
-|----------|----------------------------------------------|
-| Russian       | https://huggingface.co/datasets/chupei/cc_ru |
+| Language   | Dataset Link                                 |
+|------------|----------------------------------------------|
+| Russian    | https://huggingface.co/datasets/chupei/cc_ru |
 | Thai       | https://huggingface.co/datasets/chupei/cc_th |
-| Vietnamese      | https://huggingface.co/datasets/chupei/cc_vi |
-| Hungarian     | https://huggingface.co/datasets/chupei/cc_hu |
+| Vietnamese | https://huggingface.co/datasets/chupei/cc_vi |
+| Hungarian  | https://huggingface.co/datasets/chupei/cc_hu |
 | Serbian    | https://huggingface.co/datasets/chupei/cc_sr |
 
 
 ### Dataset Composition
 The dataset includes five languages: Russian, Thai, Vietnamese, Hungarian, and Serbian. Below is a summary of each language's data:
 
 | Language   | Number of dataset | Number of High-Quality Data | Number of Low-Quality Data |
-|------|-------------------|-----------------------------|----------------------------|
-| Russian   | 154               | 71                          | 83                         |
-| Thai   | 267               | 128                         | 139                        |
-| Vietnamese  | 214               | 101                         | 113                        |
-| Hungarian | 225               | 99                          | 126                        |
-| Serbian | 144               | 38                          | 76                         |
+|------------|-------------------|-----------------------------|----------------------------|
+| Russian    | 154               | 71                          | 83                         |
+| Thai       | 267               | 128                         | 139                        |
+| Vietnamese | 214               | 101                         | 113                        |
+| Hungarian  | 225               | 99                          | 126                        |
+| Serbian    | 144               | 38                          | 76                         |
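
Review note: the composition table can be sanity-checked by comparing each total with the sum of its high-quality and low-quality counts. The sketch below assumes the "Number of dataset" column is meant to equal that sum, which the diff does not state explicitly; the helper name `inconsistent_rows` is hypothetical and the rows are copied from the table as shown.

```python
# Rows copied from the "Dataset Composition" table: (language, total, high, low).
rows = [
    ("Russian", 154, 71, 83),
    ("Thai", 267, 128, 139),
    ("Vietnamese", 214, 101, 113),
    ("Hungarian", 225, 99, 126),
    ("Serbian", 144, 38, 76),
]


def inconsistent_rows(rows):
    """Return (language, total, high + low) for rows whose counts do not add up.

    Assumption: total is expected to equal high + low.
    """
    return [
        (lang, total, high + low)
        for lang, total, high, low in rows
        if high + low != total
    ]


mismatches = inconsistent_rows(rows)
for lang, total, summed in mismatches:
    print(f"{lang}: total={total}, high+low={summed}")
```

Under that assumption only the Serbian row is flagged (144 versus 38 + 76 = 114). The diff alone does not show which of the three counts is off, so the row is left unchanged here.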
 
 
 
 
@@ -29,8 +29,9 @@ https://huggingface.co/datasets/chupei/redpajama_bad_model
 | Negative Examples: irrelevance          | 49    |
 
 ## Prompt Introduction
-The built-in **PromptTextQualityV2** is used as the prompt for this test. Specific content can be referred to: [Introduction to PromptTextQualityV2](../../dingo/model/prompt/prompt_text_quality_v2.py)<br>
-The built-in prompt collection can be referred to: [Prompt Collection](../../dingo/model/prompt)
+The built-in **PromptTextQualityV2** is used as the prompt for this test.<br>
+Specific content can be referred to: [Introduction to PromptTextQualityV2](../../../dingo/model/prompt/prompt_text_quality_v2.py)<br>
+The built-in prompt collection can be referred to: [Prompt Collection](../../../dingo/model/prompt)
 
 ## Evaluation Results
 ### Concept Introduction
 
@@ -40,8 +40,8 @@ https://huggingface.co/datasets/chupei/slimpajama_goodcase_rule
 | Negative examples: RuleWordNumber               | 7     |
 
 ## Rules Introduction
-This test uses the built-in **pretrain** as the eval_group. For specific rules included, please refer to: [Group Introduction](../groups.md).<br>
-For rules within the group, please refer to: [Rules Introduction](../rules.md).
+This test uses the built-in **pretrain** as the eval_group. For specific rules included, please refer to: [Group Introduction](../../groups.md).<br>
+For rules within the group, please refer to: [Rules Introduction](../../rules.md).
 
 ## Evaluation Results
 ### Definitions
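
Review note: every link edit in these hunks adds one `../` because the files moved one directory deeper (into `docs/eval/prompt/` or `docs/eval/rule/`). A minimal sketch of how that could be checked, using only the standard library: resolve the old link against the old path and the new link against the new path, and confirm both reach the same target. Pairing each hunk with its file by content is an assumption, and `resolve_link` is a hypothetical helper, not part of Dingo.

```python
import posixpath


def resolve_link(doc_path, link):
    """Resolve a relative Markdown link against the directory of its file."""
    return posixpath.normpath(posixpath.join(posixpath.dirname(doc_path), link))


# (old path, old link, new path, new link), copied from the hunks in this diff.
moves = [
    ("docs/eval/dataset_multi_lan.md", "../metrics.md",
     "docs/eval/prompt/multi_language_data_evaluated_by_prompt.md",
     "../../metrics.md"),
    ("docs/eval/dataset_redpajama.md", "../../dingo/model/prompt",
     "docs/eval/prompt/redpajama_data_evaluated_by_prompt.md",
     "../../../dingo/model/prompt"),
    ("docs/eval/dataset_slimpajama.md", "../groups.md",
     "docs/eval/rule/slimpajama_data_evaluated_by_rule.md",
     "../../groups.md"),
]

# Each entry is (target before the move, target after the move).
targets = [
    (resolve_link(old_path, old_link), resolve_link(new_path, new_link))
    for old_path, old_link, new_path, new_link in moves
]

for before, after in targets:
    print(before, "->", after, "OK" if before == after else "BROKEN")
```

This only checks that the path arithmetic is consistent; it does not confirm that the target files exist in the repository.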