Commit c13c5c1

add the kaoti prompt v3 base common prompt v4 (#35)
1 parent c8af4a8 commit c13c5c1

2 files changed: +202 -0 lines changed

Lines changed: 124 additions & 0 deletions
@@ -0,0 +1,124 @@
from dingo.model.model import Model
from dingo.model.prompt.base import BasePrompt


@Model.prompt_register("TEXT_QUALITY_KAOTI", [])
class PromptTextQualityV3Kaoti(BasePrompt):
    content = """
# Role
You are an expert in language models and data quality assessment.

# Background
The dataset is compiled from diverse sources, including social media platforms, news outlets, academic journals, and online forums. Some datasets contain image links, which may appear in the question stem or answer. If an image link is present, it is always considered valid, correct, and reasonable.

# Goals
Your primary task is to detect formulas, tables, and other content in the text. The text consists of five parts:
1. **Question type information string**: `q_type`
2. **Question information string**: `q_main`
3. **Options information string**: `options`
4. **Answers information string**: `std_ans`
5. **Answer explanations string**: `answer_details`

**Note**:
- If the question type is a multiple-choice question (including single-choice, multiple-choice, and true/false questions), the `options` field must contain content and cannot be left blank.
- For non-multiple-choice question types, the `options` field is allowed to be empty.
- If the text meets any of the following negative descriptions, it will be judged as low-quality data.

# Criteria
## 1. Completeness
### 1.1 Error_Formula
Determine whether the formulas in the text can be correctly rendered by Markdown and adhere to the rendering style of MathJax or HTML, while maintaining consistency with the question and answers. Formula errors include, but are not limited to:
- LaTeX syntax errors
- Missing formula markers (`$`)
- Mathematical symbol errors
- Missing or excessive backslashes (`\`)
- Incorrect formula answers

### 1.2 Error_Table
Check whether the table in the text is correct. Table errors include, but are not limited to:
- Inconsistent formatting within the table
- Unreasonable typesetting
- LaTeX or Markdown syntax errors
- Mathematical symbol errors
- Missing or excessive vertical bar symbols (`|`)
- Chaotic row and column structure
- Incorrect table content

## 2. Effectiveness
### 2.1 Error_Split_Paragraph
Identify and mark any parts in the text that may affect coherence and readability due to unreasonable line breaks (`\n`). Key considerations:
- **Sentence integrity**: Check if sentences are unnecessarily broken into multiple lines. If a sentence should logically be a single unit but is broken by a line break (`\n`), pay attention to the lack of punctuation before and after the `\n` symbol, which is usually unreasonable.
- **Examples of incorrect usage**:
  - "综上所述,我们可以确定选项\nB\"城乡社区治理\"最符合题目的要求"
  - "所以,\n答案是C"
  - "5.**开源工具\n**:包括各种开源的大数据工具,如Hadoop、Spark、Kafka等。"
  - "其他选项\nA、C、D都与集成学习的基本原理不符。"
  - "以上推理过程是根据试题集\n《22-23年理论》中的内容得出的。"
  - "但对20世纪\n70年代以后的浮动汇率制时期的验证却显示出对购买力平价理论不利的结果。"
  - "-C选项\n(一个U盘):U盘是存储信息的物理媒介,"

**Note**: Since the data text is a test question, the `q_main` field is allowed to contain normal sentences separated by empty brackets `()` or underscores `__`. Pay special attention to unreasonable segmentation caused by the `\n` character.

### 2.2 Error_Ans_Format
Ensure the quality of the answer analysis (`ans_detail`) by checking whether it is detailed, accurate, and in the expected format. Guidelines:
1. **Sensitive information**: Check if the analysis contains information about the source of the exam questions, the year, or other information that should not be disclosed. If present, mark it as low-quality.
2. **Conciseness**: Assess the level of detail in the analysis. If the analysis is too concise and lacks sufficient explanation, mark it as low-quality.

### 2.3 Error_List_Number
Analyze the content in the `q_main` and `ans_detail` fields. If a list number appears, determine whether the numbers or letters are in the correct order. If the numbers are discontinuous, missing, or in the wrong format, indicate the specific location and provide modification suggestions.

**Note**: You do not need to check the content itself, only the correctness of the numbers or letters.

### 2.4 Error_Content_Position
Check the following fields for positional disorder (`q_type`, `q_main`, `options`, `std_ans`, `ans_detail`):
1. **Question type (`q_type`)**: Ensure it only describes the question type (e.g., "multiple choice", "fill in the blank") and does not include the question stem, options, answers, or answer analysis.
2. **Question stem (`q_main`)**: Ensure it only contains the main content of the question and does not include options, answers, or answer analysis.
3. **Options (`options`)**: Ensure it only contains the content of the question options (e.g., "A. Option one", "B. Option two") and does not include the question stem, answers, or answer analysis.
4. **Standard answer (`std_ans`)**: Ensure it only contains the identifier of the correct answer (e.g., "A", "B") and does not include the question stem, options, or answer analysis.

**Rules for judgment**:
1. If the `q_main` field contains text in the format of options (e.g., "A. Option one"), it is considered mixed with options.
2. If the `options` field contains the question stem or answer content, it is considered mixed with the question stem or answer.
3. If the `std_ans` field is empty or contains question stem content, it is considered mixed with the question stem.

### 2.5 Error_Options_Format_Content
Ensure the format and content of the `options` field are correct. Guidelines:
**Option format check**:
1. Mark options with redundant serial numbers as format errors.
2. Ensure there are no duplicate options.
3. Check for extra option punctuation (e.g., incorrect: "A. .张三"; correct: "B. 李四").

**Option content check**:
1. Ensure each option is independent and not combined with other options.
2. Mark options with incomplete or similar content as incorrectly formatted.

## 3. Similarity
### 3.1 Error_Duplicate_Content
Identify consecutive repeated text or multiple occurrences of characters in the text.


# Workflow
1. **Evaluate the text**: Carefully read and understand the provided text. Assess its quality based on the negative criteria.
2. **Assign a type**:
   - If the text does not violate any negative criteria, the type must be `Good`.
   - If the text violates any negative criteria, the type must be one of: `Completeness`, `Effectiveness`, or `Similarity`.
3. **Assign a name**:
   - If the type is `Good`, the name must be `None`.
   - If the type is `Completeness`, the name must be one of: `Error_Formula` or `Error_Table`.
   - If the type is `Effectiveness`, the name must be one of: `Error_Split_Paragraph`, `Error_Ans_Format`, `Error_List_Number`, `Error_Content_Position`, or `Error_Options_Format_Content`.
   - If the type is `Similarity`, the name must be `Error_Duplicate_Content`.
4. **Assign a score**:
   - If the type is `Good`, the score is `1`.
   - If the type is not `Good`, the score is `0`.
5. **Provide a reason**: Clearly explain the evaluation result.
6. **Return the results**: Output the results in JSON format:
```json
{"score": 0/1, "type": "", "name": "", "reason": ""}
```

# Warning
Only output JSON format data, without any extraneous content.

# Input content
(Text to be evaluated goes here)
"""
Lines changed: 78 additions & 0 deletions
@@ -0,0 +1,78 @@
# Dataset Kaoti

## Dataset Introduction
This dataset evaluates the accuracy of dingo's built-in kaoti prompt, so the test set is built from test-question data.

| Field Name  | Description                                                      |
|-------------|------------------------------------------------------------------|
| id          | Record ID with no special meaning; users can change it as needed |
| grade_class | The student's academic grade level                               |
| major       | Main area of knowledge and skills                                |
| content     | Data to be tested                                                |
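
As a rough illustration of the fields above, the sketch below parses one JSONL record and reports which of the four documented fields are missing. The helper name `missing_fields` and the sample values are hypothetical placeholders, not rows from the dataset.

```python
import json

# Field names taken from the table above.
REQUIRED_FIELDS = ("id", "grade_class", "major", "content")


def missing_fields(line: str) -> list[str]:
    """Return the documented fields that are absent from one JSONL line."""
    record = json.loads(line)
    return [name for name in REQUIRED_FIELDS if name not in record]


# Hypothetical record; every value is a placeholder.
sample = json.dumps(
    {"id": "0", "grade_class": "placeholder", "major": "placeholder", "content": "placeholder"}
)
print(missing_fields(sample))
```

Running a check like this over the file before evaluation is one way to catch records that lack the `content` column the evaluation reads.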

### Dataset Composition
| Type                                                                                  | Count |
|---------------------------------------------------------------------------------------|-------|
| Positive Examples                                                                     | 100   |
| Negative Examples: <br/>1. ineffectiveness<br/>2. dissimilarity<br/>3. incompleteness | 100   |

## Prompt Introduction
The built-in **PromptTextQualityV3Kaoti** is used as the prompt for this test.<br>
For the full prompt, see [Introduction to PromptTextQualityV3Kaoti](../../../dingo/model/prompt/prompt_text_quality_kaoti.py).<br>
For the built-in prompt collection, see [Prompt Collection](../../../dingo/model/prompt).
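
The prompt asks the model to return a single JSON object with `score`, `type`, `name`, and `reason`, and the allowed `name` values depend on `type`. The sketch below checks a response against those rules. The function name `validate_response` is hypothetical and is not part of dingo; the type and name values are copied from the prompt's Workflow section.

```python
import json

# Allowed names per type, as listed in the prompt's Workflow section.
ALLOWED_NAMES = {
    "Good": {"None"},
    "Completeness": {"Error_Formula", "Error_Table"},
    "Effectiveness": {
        "Error_Split_Paragraph",
        "Error_Ans_Format",
        "Error_List_Number",
        "Error_Content_Position",
        "Error_Options_Format_Content",
    },
    "Similarity": {"Error_Duplicate_Content"},
}


def validate_response(raw: str) -> list[str]:
    """Return a list of rule violations; an empty list means the response is consistent."""
    data = json.loads(raw)
    problems = []
    type_ = data.get("type")
    if type_ not in ALLOWED_NAMES:
        return [f"unknown type: {type_!r}"]
    if data.get("name") not in ALLOWED_NAMES[type_]:
        problems.append(f"name {data.get('name')!r} not allowed for type {type_!r}")
    # The prompt fixes the score: 1 for Good, 0 for every other type.
    expected_score = 1 if type_ == "Good" else 0
    if data.get("score") != expected_score:
        problems.append(f"score should be {expected_score} for type {type_!r}")
    if not data.get("reason"):
        problems.append("missing reason")
    return problems


print(validate_response('{"score": 1, "type": "Good", "name": "None", "reason": "No issues found."}'))
```

A check of this kind can help separate genuine quality verdicts from malformed model output when reviewing the summary files.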

## Evaluation Results
### Concept Introduction
Evaluating the positive and the negative examples each produces a summary file. The terms used to report the results are defined below.

| Name      | Description                                                                         |
|-----------|-------------------------------------------------------------------------------------|
| TP        | True Positive: number of positive examples evaluated as positive                    |
| FP        | False Positive: number of negative examples evaluated as positive                   |
| TN        | True Negative: number of negative examples evaluated as negative                    |
| FN        | False Negative: number of positive examples evaluated as negative                   |
| Precision | TP / (TP + FP): share of examples evaluated as positive that are actually positive  |
| Recall    | TP / (TP + FN): share of positive examples that are evaluated as positive           |
| F1        | 2 * Precision * Recall / (Precision + Recall)                                       |

### Result Display
| Dataset Name | TP  | FP  | TN  | FN  | Precision% | Recall% | F1    |
|--------------|-----|-----|-----|-----|------------|---------|-------|
| redpajama    | 86  | 15  | 85  | 14  | 85         | 86      | 0.856 |
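
The reported row can be recomputed from its counts. The sketch below uses the standard definitions of precision, recall, and F1 (F1 as the harmonic mean of precision and recall); the function name `prf1` is illustrative, and the function does not guard against zero denominators.

```python
def prf1(tp: int, fp: int, fn: int) -> tuple[float, float, float]:
    """Return (precision, recall, F1) from confusion-matrix counts."""
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Counts from the result row: TP=86, FP=15, FN=14.
precision, recall, f1 = prf1(tp=86, fp=15, fn=14)
print(f"precision={precision:.1%} recall={recall:.1%} f1={f1:.3f}")
# prints: precision=85.1% recall=86.0% f1=0.856
```

The recomputed values agree with the table once precision is rounded down to 85%.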

## Evaluation Method

```python
from dingo.io import InputArgs
from dingo.exec import Executor

input_data = {
    "eval_group": "kaoti",
    "input_path": "/your/dataset/path",  # s3 path: qa-huawei
    "save_data": True,
    "save_correct": True,
    "save_raw": True,
    "max_workers": 10,
    "batch_size": 10,
    "data_format": "jsonl",
    "column_content": "content",
    "custom_config": {
        "prompt_list": ["PromptTextQualityV3Kaoti"],
        "llm_config": {
            "detect_text_quality_detail": {
                "key": "Your Key",
                "api_url": "Your Url",
            }
        },
    },
}
input_args = InputArgs(**input_data)
executor = Executor.exec_map["local"](input_args)
result = executor.execute()
print(result)
```
