
Commit 8cb3a61

Reasoning ops API doc, en & zh version (#123)
* reasoning op API, zh version
* reasoning op API, en version
* remove useless op doc
1 parent b28458d commit 8cb3a61

35 files changed: +1316 -385 lines changed

docs/en/notes/api/operators/reasoning/eval/ReasoningCategoryDatasetEvaluator.md

Lines changed: 46 additions & 19 deletions
@@ -30,32 +30,59 @@ Executes the main logic of the operator. It reads a DataFrame from storage, calc
## 🧠 Example Usage

```python
from dataflow.operators.reasoning import ReasoningCategoryDatasetEvaluator
from dataflow.utils.storage import FileStorage
from dataflow.core import LLMServingABC


class ReasoningCategoryDatasetEvaluatorTest():
    def __init__(self, llm_serving: LLMServingABC = None):

        self.storage = FileStorage(
            first_entry_file_name="example.json",
            cache_path="./cache_local",
            file_name_prefix="dataflow_cache_step",
            cache_type="jsonl",
        )

        self.evaluator = ReasoningCategoryDatasetEvaluator()

    def forward(self):
        self.evaluator.run(
            storage=self.storage.step(),
            input_primary_category_key="primary_category",
            input_secondary_category_key="secondary_category",
        )


if __name__ == "__main__":
    pl = ReasoningCategoryDatasetEvaluatorTest()
    pl.forward()
```

#### 🧾 Default Output Format

| Field | Type | Description |
| :---- | :--- | :---------- |
| key   | str  | Primary category name. |
| value | dict | Dictionary containing the total number of samples for this primary category (`primary_num`) and the number of samples for each secondary category. |

Example input (dataframe rows stored in `storage`):

```json
{ "primary_category": "Science", "secondary_category": "Physics" }
{ "primary_category": "Science", "secondary_category": "Chemistry" }
{ "primary_category": "Science", "secondary_category": "Physics" }
{ "primary_category": "Humanities", "secondary_category": "History" }
```

Example output:

```json
{
  "Science": {
    "primary_num": 3,
    "Physics": 2,
    "Chemistry": 1
  },
  "Humanities": {
    "primary_num": 1,
    "History": 1
  }
}
```
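
The nested counts above can be reproduced outside the operator with a few lines of pandas. The snippet below is a minimal sketch assuming the default column names; it illustrates the counting logic only and is not the operator's actual implementation.

```python
import pandas as pd

# Minimal sketch of the category statistics, assuming the default column names.
# Illustration only; this is not the operator's implementation.
df = pd.DataFrame([
    {"primary_category": "Science", "secondary_category": "Physics"},
    {"primary_category": "Science", "secondary_category": "Chemistry"},
    {"primary_category": "Science", "secondary_category": "Physics"},
    {"primary_category": "Humanities", "secondary_category": "History"},
])

stats = {}
for primary, group in df.groupby("primary_category"):
    entry = {"primary_num": len(group)}  # total samples in this primary category
    entry.update(group["secondary_category"].value_counts().to_dict())  # per-secondary counts
    stats[primary] = entry

print(stats)
# {'Humanities': {'primary_num': 1, 'History': 1},
#  'Science': {'primary_num': 3, 'Physics': 2, 'Chemistry': 1}}
```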

docs/en/notes/api/operators/reasoning/eval/ReasoningDifficultyDatasetEvaluator.md

Lines changed: 30 additions & 17 deletions
@@ -33,29 +33,42 @@ def run(self, storage: DataFlowStorage, input_diffulty_key: str = "difficulty_sc
## 🧠 Example Usage

```python
from dataflow.operators.reasoning import ReasoningDifficultyDatasetEvaluator
from dataflow.utils.storage import FileStorage
from dataflow.core import LLMServingABC


class ReasoningDifficultyDatasetEvaluatorTest():
    def __init__(self, llm_serving: LLMServingABC = None):

        self.storage = FileStorage(
            first_entry_file_name="example.json",
            cache_path="./cache_local",
            file_name_prefix="dataflow_cache_step",
            cache_type="jsonl",
        )

        self.evaluator = ReasoningDifficultyDatasetEvaluator()

    def forward(self):
        self.evaluator.run(
            storage=self.storage.step(),
            input_diffulty_key="difficulty_score",
        )


if __name__ == "__main__":
    pl = ReasoningDifficultyDatasetEvaluatorTest()
    pl.forward()
```

#### 🧾 Return Value

This operator returns a dictionary where the keys are the difficulty levels found in the dataset and the values are the corresponding sample counts for each difficulty level.

Example return value:

```json
{
  "Easy": 150,
  "Medium": 200,
  "Hard": 80
}
```
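
The return value is simply a frequency count over the difficulty column. As an illustration (assuming pandas and the default `difficulty_score` column; this is not the operator's actual code), the same distribution can be computed as follows:

```python
import pandas as pd

# Illustrative frequency count over the difficulty column; not the operator's actual code.
df = pd.DataFrame({"difficulty_score": ["Easy", "Medium", "Easy", "Hard", "Medium"]})

distribution = df["difficulty_score"].value_counts().to_dict()
print(distribution)  # e.g. {'Easy': 2, 'Medium': 2, 'Hard': 1}
```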

docs/en/notes/api/operators/reasoning/eval/ReasoningQuestionCategorySampleEvaluator.md

Lines changed: 35 additions & 2 deletions
@@ -22,7 +22,7 @@ def __init__(self, llm_serving: LLMServingABC = None)
| Prompt Template Name | Main Purpose | Applicable Scenarios | Feature Description |
| :--- | :--- | :--- | :--- |
| MathQuestionCategoryPrompt | Multi-level question classification | Classifying user questions into primary and secondary categories | Takes input questions and outputs primary and secondary classifications |

## `run` function

@@ -39,7 +39,40 @@ def run(self, storage: DataFlowStorage, input_key:str = "instruction", output_ke
## 🧠 Example Usage

```python
from dataflow.operators.reasoning import ReasoningQuestionCategorySampleEvaluator
from dataflow.utils.storage import FileStorage
from dataflow.core import LLMServingABC
from dataflow.serving import APILLMServing_request

class ReasoningQuestionCategorySampleEvaluatorTest():
    def __init__(self, llm_serving: LLMServingABC = None):

        self.storage = FileStorage(
            first_entry_file_name="example.json",
            cache_path="./cache_local",
            file_name_prefix="dataflow_cache_step",
            cache_type="jsonl",
        )

        # use API server as LLM serving
        self.llm_serving = APILLMServing_request(
            api_url="",
            model_name="gpt-4o",
            max_workers=30
        )

        self.evaluator = ReasoningQuestionCategorySampleEvaluator(llm_serving=self.llm_serving)

    def forward(self):
        self.evaluator.run(
            storage=self.storage.step(),
            input_key="instruction",
            output_key="category",
        )


if __name__ == "__main__":
    pl = ReasoningQuestionCategorySampleEvaluatorTest()
    pl.forward()
```

#### 🧾 Default Output Format

docs/en/notes/api/operators/reasoning/eval/ReasoningQuestionDifficultySampleEvaluator.md

Lines changed: 87 additions & 17 deletions
@@ -5,35 +5,105 @@ permalink: /en/api/operators/reasoning/eval/reasoningquestiondifficultysampleeva
---

## 📘 Overview

[ReasoningQuestionDifficultySampleEvaluator](https://github.com/OpenDCAI/DataFlow/blob/main/dataflow/operators/reasoning/evaluate/reasoning_question_difficulty_sample_evaluator.py)
is a question difficulty evaluation operator. It analyzes the complexity of questions by calling a Large Language Model (LLM) and generates a difficulty score from 1 to 10 for each question.

## `__init__` function

```python
@prompt_restrict(
    MathQuestionDifficultyPrompt
)
@OPERATOR_REGISTRY.register()
class ReasoningQuestionDifficultySampleEvaluator(OperatorABC):
    def __init__(self, llm_serving: LLMServingABC = None):
```

### init Parameter Description

| Parameter Name | Type | Default | Description |
| :-------------- | :------------ | :------- | :----------------------------- |
| **llm_serving** | LLMServingABC | Required | Large language model service instance for executing inference and generation. |

### Prompt Template Descriptions

| Prompt Template Name | Primary Use | Applicable Scenarios | Features |
| :--- | :--- | :--- | :--- |
| MathQuestionDifficultyPrompt | Question difficulty evaluation | Evaluating the difficulty of user questions | Takes an input question and outputs a difficulty score from 1 to 10 |

## `run` function

```python
def run(self, storage: DataFlowStorage, input_key: str, output_key: str = "difficulty_score")
```

#### Parameters

| Name | Type | Default | Description |
| :------------- | :-------------- | :----------------- | :----------------------------- |
| **storage**    | DataFlowStorage | Required           | DataFlow storage instance for reading and writing data. |
| **input_key**  | str             | Required           | Input column name corresponding to the question field. |
| **output_key** | str             | "difficulty_score" | Output column name corresponding to the generated difficulty score field. |

## 🧠 Example Usage

```python
from dataflow.operators.reasoning import ReasoningQuestionDifficultySampleEvaluator
from dataflow.utils.storage import FileStorage
from dataflow.core import LLMServingABC
from dataflow.serving import APILLMServing_request

class ReasoningQuestionDifficultySampleEvaluatorTest():
    def __init__(self, llm_serving: LLMServingABC = None):

        self.storage = FileStorage(
            first_entry_file_name="example.json",
            cache_path="./cache_local",
            file_name_prefix="dataflow_cache_step",
            cache_type="jsonl",
        )

        # use API server as LLM serving
        self.llm_serving = APILLMServing_request(
            api_url="",
            model_name="gpt-4o",
            max_workers=30
        )

        self.evaluator = ReasoningQuestionDifficultySampleEvaluator(llm_serving=self.llm_serving)

    def forward(self):
        self.evaluator.run(
            storage=self.storage.step(),
            input_key="instruction",
            output_key="difficulty_score",
        )


if __name__ == "__main__":
    pl = ReasoningQuestionDifficultySampleEvaluatorTest()
    pl.forward()
```

#### 🧾 Default Output Format

| Field | Type | Description |
| :--- | :--- | :--- |
| **difficulty_score** | int | The difficulty score of the question, from 1 to 10. |

Example input:

```json
{
  "instruction": "Calculate 2 to the power of 5."
}
```

Example output:

```json
{
  "difficulty_score": 3
}
```
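
The score is extracted from the model's free-form reply. As a rough sketch of that kind of post-processing, one might write something like the snippet below; the helper name, regex, and -1 fallback are illustrative assumptions, not the operator's actual parsing code.

```python
import re

def parse_difficulty(reply: str) -> int:
    """Illustrative only: pull the first integer out of an LLM reply and clamp it
    to the documented 1-10 range, falling back to -1 (assumed) if no number is found."""
    match = re.search(r"\d+", reply)
    if match is None:
        return -1
    return max(1, min(10, int(match.group())))

print(parse_difficulty("Difficulty score: 3/10"))  # 3
print(parse_difficulty("I cannot rate this."))     # -1
```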
