Commit 841c2e7

[docs] update custom_dataset_docs (#4792)
1 parent 987dbd7 commit 841c2e7

2 files changed: +20 -8 lines changed

docs/source/Customization/自定义数据集.md

Lines changed: 10 additions & 4 deletions
@@ -25,15 +25,21 @@ ShareGPT format:
 {"system": "<system>", "conversation": [{"human": "<query1>", "assistant": "<response1>"}, {"human": "<query2>", "assistant": "<response2>"}]}
 ```
 
-Alpaca format:
+Query-Response format:
 ```jsonl
-{"system": "<system>", "instruction": "<query-inst>", "input": "<query-input>", "output": "<response>"}
+{"system": "<system>", "query": "<query2>", "response": "<response2>", "history": [["<query1>", "<response1>"]]}
 ```
+Note: The following fields will be automatically converted to the corresponding system, query, and response fields.
+- system: 'system', 'system_prompt'.
+- query: 'query', 'prompt', 'input', 'instruction', 'question', 'problem'.
+- response: 'response', 'answer', 'output', 'targets', 'target', 'answer_key', 'answers', 'solution', 'text', 'completion', 'content'.
 
-Query-Response format:
+Alpaca format:
 ```jsonl
-{"system": "<system>", "query": "<query2>", "response": "<response2>", "history": [["<query1>", "<response1>"]]}
+{"system": "<system>", "instruction": "<query-inst>", "input": "<query-input>", "output": "<response>"}
 ```
+- Note: The instruction and input fields will be combined into the query field. If instruction and input are not empty strings, then `query = f'{instruction}\n{input}'`.
+
 
 ## Standard Dataset Format

docs/source_en/Customization/Custom-dataset.md

Lines changed: 10 additions & 4 deletions
@@ -26,15 +26,21 @@ ShareGPT format:
 {"system": "<system>", "conversation": [{"human": "<query1>", "assistant": "<response1>"}, {"human": "<query2>", "assistant": "<response2>"}]}
 ```
 
-Alpaca format:
+Query-Response format:
 ```jsonl
-{"system": "<system>", "instruction": "<query-inst>", "input": "<query-input>", "output": "<response>"}
+{"system": "<system>", "query": "<query2>", "response": "<response2>", "history": [["<query1>", "<response1>"]]}
 ```
+Note: The following fields will be automatically converted to the corresponding system, query, and response fields.
+- system: 'system', 'system_prompt'.
+- query: 'query', 'prompt', 'input', 'instruction', 'question', 'problem'.
+- response: 'response', 'answer', 'output', 'targets', 'target', 'answer_key', 'answers', 'solution', 'text', 'completion', 'content'.
 
-Query-Response format:
+Alpaca format:
 ```jsonl
-{"system": "<system>", "query": "<query2>", "response": "<response2>", "history": [["<query1>", "<response1>"]]}
+{"system": "<system>", "instruction": "<query-inst>", "input": "<query-input>", "output": "<response>"}
 ```
+- Note: The instruction and input fields will be combined into the query field. If instruction and input are not empty strings, then `query = f'{instruction}\n{input}'`.
+
 
 ## Standard Dataset Format
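
The Alpaca note in this hunk says that `instruction` and `input` are combined into `query`, using `f'{instruction}\n{input}'` when both are non-empty. A minimal sketch of that rule follows. `alpaca_to_query_response` is a hypothetical name, and the fallback when one of the two fields is empty (use the other on its own) is an assumption, since the diff only specifies the case where both are set.

```python
def alpaca_to_query_response(record: dict) -> dict:
    """Hypothetical helper: convert an Alpaca-style record to query/response fields."""
    instruction = record.get("instruction") or ""
    input_text = record.get("input") or ""

    if instruction and input_text:
        # Rule stated in the docs: join with a newline when both are non-empty.
        query = f"{instruction}\n{input_text}"
    else:
        # Assumption: if one field is empty, use the other one unchanged.
        query = instruction or input_text

    # 'output' is listed in the docs as an alias of the response field.
    result = {"query": query, "response": record.get("output", "")}
    if record.get("system"):
        result["system"] = record["system"]
    return result


row = {"system": "<system>", "instruction": "<query-inst>",
       "input": "<query-input>", "output": "<response>"}
print(alpaca_to_query_response(row))
```

Joining with a single newline keeps the instruction and its input distinguishable inside one `query` string without introducing an extra field.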
