Commit 4ea46fb

update docs (#3961)
1 parent b911092 commit 4ea46fb

9 files changed: +84 -32 lines changed

docs/source/Instruction/Agent支持.md

Lines changed: 22 additions & 9 deletions
@@ -16,7 +16,7 @@

 The following are the input_ids and labels after the two data samples above are encoded with the qwen2_5 and qwen2_5_vl templates; the selected agent_template is **hermes**:

-Sample One:
+Sample One (parallel tool calls)
 ```text
 [INPUT_IDS] <|im_start|>system
 You are Qwen, created by Alibaba Cloud. You are a helpful assistant.
@@ -61,7 +61,7 @@ For each function call, return a json object with function name and arguments wi
 </tool_call><|im_end|>[-100 * 67]According to the weather forecast tool, the air quality index (AQI) in Beijing is 10, which indicates good air quality; whereas in Shanghai, the AQI is 72, indicating mild pollution.<|im_end|>
 ```

-Sample Two:
+Sample Two (multimodal, mixed assistant and tool_call)
 ```text
 [INPUT_IDS] <|im_start|>system
 You are a helpful assistant.
@@ -103,7 +103,7 @@ For each function call, return a json object with function name and arguments wi
 </tool_call><|im_end|>[-100 * 759]Successfully opened the calendar app. The current time is 11 o'clock in the morning.<|im_end|>
 ```

-**react_en** is also the most commonly used agent template format. The following are the input_ids and labels after Sample One is encoded by qwen2_5 with `agent_template='react_en'`:
+**react_en** is one of the commonly used agent template formats. The following are the input_ids and labels after Sample One is encoded by qwen2_5 with `agent_template='react_en'`:

 ```text
 [INPUT_IDS] <|im_start|>system
@@ -142,7 +142,18 @@ Action Input: {'city': 'Shanghai'}
 Observation:[-100 * 45]According to the weather forecast tool, the air quality index (AQI) in Beijing is 10, which indicates good air quality; whereas in Shanghai, the AQI is 72, indicating mild pollution.<|im_end|>
 ```

-For more agent template options, refer to [here](https://github.com/modelscope/swift/blob/main/swift/plugin/agent_template/__init__.py).
+The following code can be used to try more models and agent_template values. For more agent template options, refer to [here](https://github.com/modelscope/swift/blob/main/swift/plugin/agent_template/__init__.py)
+```python
+from swift.llm import get_model_tokenizer, get_template
+
+_, tokenizer = get_model_tokenizer('ZhipuAI/GLM-4-9B-0414', load_model=False)
+template = get_template(tokenizer.model_meta.template, tokenizer, agent_template='hermes')
+data = {...}
+template.set_mode('train')
+encoded = template.encode(data)
+print(f'[INPUT_IDS] {template.safe_decode(encoded["input_ids"])}\n')
+print(f'[LABELS] {template.safe_decode(encoded["labels"])}')
+```


 ## Tools Format
@@ -174,21 +185,23 @@ tools = [{

 ## Using loss_scale

-loss_scale can adjust the training weight of the model output. For example, in the ReACT format, you can set `--loss_scale react` (the loss_scale configuration file is written [here](https://github.com/modelscope/swift/blob/main/swift/plugin/loss_scale/config/default_loss_scale_config.json)). This parameter has the following effect:
+loss_scale can adjust the training loss weight of the model output. For example, in the ReACT format, you can set `--loss_scale react` (the loss_scale configuration file is written [here](https://github.com/modelscope/swift/blob/main/swift/plugin/loss_scale/config/default_loss_scale_config.json)). This parameter has the following effect:

 The 'Thought:' and 'Final Answer:' sections have a weight of 1, the 'Action:' and 'Action Input:' sections have a weight of 2, the 'Observation:' field itself has a weight of 2, and the tool call result following 'Observation:' has a weight of 0.

 For the detailed design of the loss_scale plugin, refer to the [Pluginization](../Customization/插件化.md) document.


 ## Training
-Refer to [here](https://github.com/modelscope/ms-swift/tree/main/examples/train/agent); smooth switching between different models is supported.
+- To train the Agent capabilities of Base models, switch between models by modifying `--model`; refer to [here](https://github.com/modelscope/ms-swift/blob/main/examples/train/agent/qwen2_5.sh)
+- The agent_template for training GLM4 is hermes; refer to [here](https://github.com/modelscope/ms-swift/blob/main/examples/train/agent/glm4.sh)
+- Use `--loss_scale` to adjust the loss weight of the model output; refer to [here](https://github.com/modelscope/ms-swift/tree/main/examples/train/agent/loss_scale)

 ## Inference

-- For the original model or full-parameter training, refer to [here](https://github.com/modelscope/ms-swift/blob/main/examples/infer/demo_agent.py)
-- For LoRA training, refer to [here](https://github.com/modelscope/ms-swift/tree/main/examples/train/agent/loss_scale/infer.md)
+- For inference with the original model or a model after full-parameter training, refer to [here](https://github.com/modelscope/ms-swift/blob/main/examples/infer/demo_agent.py)
+- For inference after LoRA training, refer to [here](https://github.com/modelscope/ms-swift/tree/main/examples/train/agent/loss_scale/infer.md)

 ## Deployment

-Refer to [here](https://github.com/modelscope/ms-swift/blob/main/examples/deploy/agent)
+For server and client code, refer to [here](https://github.com/modelscope/ms-swift/blob/main/examples/deploy/agent)
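
The `--loss_scale react` weighting described in this file's diff can be illustrated with a small standalone sketch. It does not use the ms-swift loss_scale plugin and does not read its configuration file. The `react_loss_weights` helper, the `FIELD_WEIGHTS` table, and the default weight of 1 for text before the first keyword are assumptions made for illustration; only the per-field weights come from the documentation text.

```python
import re
from typing import List, Tuple

# Weights taken from the documentation text; this table is a hypothetical
# stand-in for the real configuration file, which is not read here.
FIELD_WEIGHTS = {
    'Thought:': 1.0,
    'Action:': 2.0,
    'Action Input:': 2.0,
    'Final Answer:': 1.0,
}
OBSERVATION = 'Observation:'
OBSERVATION_FIELD_WEIGHT = 2.0  # the 'Observation:' keyword itself
TOOL_RESULT_WEIGHT = 0.0        # the tool output that follows the keyword
DEFAULT_WEIGHT = 1.0            # assumption: text before any keyword

_KEYWORDS = re.compile(
    '(' + '|'.join(re.escape(k) for k in [*FIELD_WEIGHTS, OBSERVATION]) + ')')


def react_loss_weights(text: str) -> List[Tuple[str, float]]:
    """Split a ReACT-style response into (segment, weight) pairs."""
    parts = _KEYWORDS.split(text)
    segments: List[Tuple[str, float]] = []
    if parts[0]:
        segments.append((parts[0], DEFAULT_WEIGHT))
    # With a capturing group, re.split alternates keyword, following text.
    for keyword, body in zip(parts[1::2], parts[2::2]):
        if keyword == OBSERVATION:
            segments.append((keyword, OBSERVATION_FIELD_WEIGHT))
            if body:
                segments.append((body, TOOL_RESULT_WEIGHT))
        else:
            segments.append((keyword + body, FIELD_WEIGHTS[keyword]))
    return segments


if __name__ == '__main__':
    sample = ("Thought: I should look up the weather.\n"
              "Action: get_weather\n"
              "Action Input: {'city': 'Beijing'}\n"
              "Observation: sunny\n")
    for segment, weight in react_loss_weights(sample):
        print(weight, repr(segment))
```

In an actual training run these weights would apply per token, not per character span; the sketch only shows which parts of the response are emphasized and which are masked out.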

docs/source/index.rst

Lines changed: 1 addition & 1 deletion
@@ -27,9 +27,9 @@ Swift DOCUMENTATION
    Instruction/导出与推送.md
    Instruction/强化微调.md
    Instruction/GRPO.md
+   Instruction/Agent支持.md
    Instruction/支持的模型和数据集.md
    Instruction/使用tuners.md
-   Instruction/Agent支持.md
    Instruction/常见问题整理.md

 .. toctree::

docs/source_en/Customization/Pluginization.md

Lines changed: 3 additions & 3 deletions
@@ -95,7 +95,7 @@ In the above definition, we added a new `custom` metric. Its value consists of t
 ## Customizing Optimizers

 An example can be found [here](https://github.com/modelscope/swift/blob/main/swift/plugin/optimizer.py).
-- Apply different learning rates to different parts of the model. For example, use separate learning rates for ViT and LLM, as referenced [here ](https://github.com/modelscope/ms-swift/blob/main/examples/train/multimodal/lora_llm_full_vit/custom_plugin.py).
+- Apply different learning rates to different parts of the model. For example, use separate learning rates for ViT and LLM, as referenced [here](https://github.com/modelscope/ms-swift/blob/main/examples/train/multimodal/lora_llm_full_vit/custom_plugin.py).

 Users can add their own optimizers and learning rate schedulers here:

@@ -126,8 +126,8 @@ The example is [here](https://github.com/modelscope/swift/blob/main/swift/plugin
 ## Customizing Tuners

 An example can be found [here](https://github.com/modelscope/swift/blob/main/swift/plugin/tuner.py).
-- For the multimodal model, full-parameter training is applied to the ViT part, while LoRA training is used for the LLM part. Refer to [here ](https://github.com/modelscope/ms-swift/tree/main/examples/train/multimodal/lora_llm_full_vit).
-- For Phi4-multimodal, train its existing LoRA directly without adding extra LoRA. Refer to [here ](https://github.com/modelscope/ms-swift/blob/main/examples/train/plugins/tuner_phi4_mm.sh).
+- For the multimodal model, full-parameter training is applied to the ViT part, while LoRA training is used for the LLM part. Refer to [here](https://github.com/modelscope/ms-swift/tree/main/examples/train/multimodal/lora_llm_full_vit).
+- For Phi4-multimodal, train its existing LoRA directly without adding extra LoRA. Refer to [here](https://github.com/modelscope/ms-swift/blob/main/examples/train/plugins/tuner_phi4_mm.sh).

 Tuner customization is another unique feature of SWIFT. Developers can bypass the complex tuner initialization process and code integration costs by registering new tuners here:

docs/source_en/Instruction/Agent-support.md

Lines changed: 23 additions & 9 deletions
@@ -18,7 +18,7 @@ Example data samples for the pure text Agent and multimodal Agent are as follows

 The following are the `input_ids` and `labels` after encoding the two data samples mentioned above using the templates for **qwen2_5** and **qwen2_5_vl**, with the selected `agent_template` being **hermes**:

-Sample One:
+Sample One (Parallel Tool Calls):

 ```text
 [INPUT_IDS] <|im_start|>system
@@ -64,7 +64,7 @@ According to the weather forecast tool, the air quality index (AQI) in Beijing i
 </tool_call><|im_end|>[-100 * 67]According to the weather forecast tool, the air quality index (AQI) in Beijing is 10, which indicates good air quality; whereas in Shanghai, the AQI is 72, indicating mild pollution.<|im_end|>
 ```

-Sample Two:
+Sample Two (Multimodal, Mixed Assistant and Tool Call):

 ```text
 [INPUT_IDS] <|im_start|>system
@@ -107,7 +107,7 @@ I can check the current time by opening the calendar app.
 </tool_call><|im_end|>[-100 * 759]Successfully opened the calendar app. The current time is 11 o'clock in the morning.<|im_end|>
 ```

-**react_en** is also the most commonly used agent template format. The following are the `input_ids` and `labels` after encoding Sample One using qwen2_5 with `agent_template='react_en'`:
+**react_en** is one of the commonly used agent template formats. Below is an example of the `input_ids` and `labels` after encoding by qwen2_5 using `agent_template='react_en'`:

 ```text
 [INPUT_IDS] <|im_start|>system
@@ -146,7 +146,19 @@ Action Input: {'city': 'Shanghai'}
 Observation:[-100 * 45]According to the weather forecast tool, the air quality index (AQI) in Beijing is 10, which indicates good air quality; whereas in Shanghai, the AQI is 72, indicating mild pollution.<|im_end|>
 ```

-For more optional values of the agent template, refer to [here](https://github.com/modelscope/swift/blob/main/swift/plugin/agent_template/__init__.py).
+The following code can be used to experiment with more models and `agent_template` options. For more selectable values of `agent_template`, refer to [here](https://github.com/modelscope/swift/blob/main/swift/plugin/agent_template/__init__.py).
+
+```python
+from swift.llm import get_model_tokenizer, get_template
+
+_, tokenizer = get_model_tokenizer('ZhipuAI/GLM-4-9B-0414', load_model=False)
+template = get_template(tokenizer.model_meta.template, tokenizer, agent_template='hermes')
+data = {...}
+template.set_mode('train')
+encoded = template.encode(data)
+print(f'[INPUT_IDS] {template.safe_decode(encoded["input_ids"])}\n')
+print(f'[LABELS] {template.safe_decode(encoded["labels"])}')
+```

 ## Tools Format

@@ -178,7 +190,7 @@ tools = [{

 ## Usage of loss_scale

-`loss_scale` can be used to adjust the training weight of specific parts in the model's output. For example, in the ReACT format, you can set `--loss_scale react` (the `loss_scale` configuration file can be found [here ](https://github.com/modelscope/swift/blob/main/swift/plugin/loss_scale/config/default_loss_scale_config.json)). The role of this parameter is as follows:
+`loss_scale` can be used to adjust the training loss weight for the model's output section. For example, in the ReACT format, you can set `--loss_scale react` (the loss_scale configuration file is written [here](https://github.com/modelscope/swift/blob/main/swift/plugin/loss_scale/config/default_loss_scale_config.json)). The role of this parameter is as follows:

 - The weight for the 'Thought:' and 'Final Answer:' sections is 1.
 - The weight for the 'Action:' and 'Action Input:' sections is 2.
@@ -189,13 +201,15 @@ For the detailed design of the `loss_scale` plugin, please refer to the [Plugin-

 ## Training

-Refer to [here](https://github.com/modelscope/ms-swift/tree/main/examples/train/agent)for smooth switching between different models.
+- Train the Agent capabilities of Base models by switching different models through modifying `--model`. Refer to [here](https://github.com/modelscope/ms-swift/blob/main/examples/train/agent/qwen2_5.sh).
+- The agent_template for training GLM4 is hermes. Refer to [here](https://github.com/modelscope/ms-swift/blob/main/examples/train/agent/glm4.sh).
+- Use `--loss_scale` to adjust the loss weight of the model output section. Refer to [here](https://github.com/modelscope/ms-swift/tree/main/examples/train/agent/loss_scale).

 ## Inference

-- For the original model or full-parameter training, refer to [here](https://github.com/modelscope/ms-swift/blob/main/examples/infer/demo_agent.py).
-- For LoRA training, refer to [here](https://github.com/modelscope/ms-swift/tree/main/examples/train/agent/loss_scale/infer.md).
+- For inference of the original model or fully trained model, refer to [here](https://github.com/modelscope/ms-swift/blob/main/examples/infer/demo_agent.py).
+- For inference after LoRA training, refer to [here](https://github.com/modelscope/ms-swift/tree/main/examples/train/agent/loss_scale/infer.md).

 ## Deployment

-Refer to [here](https://github.com/modelscope/ms-swift/blob/main/examples/deploy/agent).
+For server and client code, refer to [here](https://github.com/modelscope/ms-swift/blob/main/examples/deploy/agent).

docs/source_en/index.rst

Lines changed: 1 addition & 1 deletion
@@ -27,9 +27,9 @@ Swift DOCUMENTATION
    Instruction/Export-and-push.md
    Instruction/Reinforced-Fine-tuning.md
    Instruction/GRPO.md
+   Instruction/Agent-support.md
    Instruction/Supported-models-and-datasets.md
    Instruction/Use-tuners.md
-   Instruction/Agent-support.md
    Instruction/Frequently-asked-questions.md

examples/deploy/agent/client.py

Lines changed: 4 additions & 2 deletions
@@ -36,9 +36,10 @@ def infer(client, model: str, messages, tools):
     response = resp.choices[0].message.content
     print(f'query: {query}')
     print(f'response: {response}')
+    print(f'tool_calls: {resp.choices[0].message.tool_calls}')

     tool = '{"temperature": 32, "condition": "Sunny", "humidity": 50}'
-    print(f'tool: {tool}')
+    print(f'tool_response: {tool}')
     messages += [{'role': 'assistant', 'content': response}, {'role': 'tool', 'content': tool}]
     resp = client.chat.completions.create(model=model, messages=messages, tools=tools, max_tokens=512, temperature=0)
     response2 = resp.choices[0].message.content
@@ -58,9 +59,10 @@ def infer_stream(client, model: str, messages, tools):
         response += delta
         print(delta, end='', flush=True)
     print()
+    print(f'tool_calls: {chunk.choices[0].delta.tool_calls}')

     tool = '{"temperature": 32, "condition": "Sunny", "humidity": 50}'
-    print(f'tool: {tool}')
+    print(f'tool_response: {tool}')
     messages += [{'role': 'assistant', 'content': response}, {'role': 'tool', 'content': tool}]
     gen = client.chat.completions.create(
         model=model, messages=messages, tools=tools, max_tokens=512, temperature=0, stream=True)

examples/infer/demo_agent.py

Lines changed: 24 additions & 2 deletions
@@ -12,9 +12,10 @@ def infer(engine: 'InferEngine', infer_request: 'InferRequest'):
     response = resp_list[0].choices[0].message.content
     print(f'query: {query}')
     print(f'response: {response}')
+    print(f'tool_calls: {resp_list[0].choices[0].message.tool_calls}')

     tool = '{"temperature": 32, "condition": "Sunny", "humidity": 50}'
-    print(f'tool: {tool}')
+    print(f'tool_response: {tool}')
     infer_request.messages += [{'role': 'assistant', 'content': response}, {'role': 'tool', 'content': tool}]
     resp_list = engine.infer([infer_request], request_config)
     response2 = resp_list[0].choices[0].message.content
@@ -35,9 +36,10 @@ def infer_stream(engine: 'InferEngine', infer_request: 'InferRequest'):
         response += delta
         print(delta, end='', flush=True)
     print()
+    print(f'tool_calls: {resp.choices[0].delta.tool_calls}')

     tool = '{"temperature": 32, "condition": "Sunny", "humidity": 50}'
-    print(f'tool: {tool}\nresponse2: ', end='')
+    print(f'tool_response: {tool}\nresponse2: ', end='')
     infer_request.messages += [{'role': 'assistant', 'content': response}, {'role': 'tool', 'content': tool}]
     gen_list = engine.infer([infer_request], request_config)
     for resp in gen_list[0]:
@@ -73,6 +75,24 @@ def get_infer_request():
         }])


+def infer_continue_generate(engine):
+    # Continue generating after the assistant message.
+    infer_request = InferRequest(messages=[{
+        'role': 'user',
+        'content': 'How is the weather today?'
+    }, {
+        'role': 'assistant',
+        'content': 'It is sunny today, '
+    }, {
+        'role': 'assistant',
+        'content': None
+    }])
+    request_config = RequestConfig(max_tokens=512, temperature=0)
+    resp_list = engine.infer([infer_request], request_config)
+    response = resp_list[0].choices[0].message.content
+    print(f'response: {response}')
+
+
 if __name__ == '__main__':
     from swift.llm import InferEngine, InferRequest, PtEngine, RequestConfig
     model = 'Qwen/Qwen2.5-1.5B-Instruct'
@@ -89,3 +109,5 @@ def get_infer_request():

     infer(engine, get_infer_request())
     infer_stream(engine, get_infer_request())
+
+    infer_continue_generate(engine)

swift/llm/dataset/dataset/llm.py

Lines changed: 0 additions & 5 deletions
@@ -744,11 +744,6 @@ def preprocess(self, row: Dict[str, Any]) -> Optional[Dict[str, Any]]:
         messages = res['messages']
         if messages[0]['role'] == 'system':
             messages.pop(0)
-        for message in messages:
-            if message['role'] == 'function-call':
-                message['role'] = 'tool_call'
-            elif message['role'] == 'function-response':
-                message['role'] = 'tool_response'
         return res

swift/llm/dataset/preprocessor/core.py

Lines changed: 6 additions & 0 deletions
@@ -401,6 +401,8 @@ def __init__(
         self.content_keys = ['content', 'value'] if content_key is None else [content_key]
         self.user_roles = ['user', 'human'] if user_role is None else [user_role]
         self.assistant_roles = ['assistant', 'gpt', 'bot'] if assistant_role is None else [assistant_role]
+        self.tool_call_roles = ['function_call']
+        self.tool_response_roles = ['function_response', 'observation', 'observations']

         self.system_role = system_role
         self.repair_messages = repair_messages
@@ -446,6 +448,10 @@ def to_std_messages(self, messages: List[Dict[str, str]], system: Optional[str])
                 message['role'] = 'user'
             elif role in self.assistant_roles:
                 message['role'] = 'assistant'
+            elif role.replace('-', '_') in self.tool_call_roles:
+                message['role'] = 'tool_call'
+            elif role.replace('-', '_') in self.tool_response_roles:
+                message['role'] = 'tool_response'

     @staticmethod
     def _to_std_key(messages: List[Dict[str, str]], std_key: str, optional_keys: List[str]) -> None:
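
The role mapping added to `to_std_messages` can be sketched in isolation. The role lists below are copied from the diff; the module-level constants and the `normalize_roles` function are hypothetical stand-ins for the preprocessor's attributes and method, not the library's API, and the sketch copies each message instead of mutating it in place.

```python
from typing import Dict, List

# Role aliases copied from the diff above.
USER_ROLES = ['user', 'human']
ASSISTANT_ROLES = ['assistant', 'gpt', 'bot']
TOOL_CALL_ROLES = ['function_call']
TOOL_RESPONSE_ROLES = ['function_response', 'observation', 'observations']


def normalize_roles(messages: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Hypothetical helper: return copies of the messages with standardized roles."""
    normalized = []
    for message in messages:
        message = dict(message)  # copy so the caller's data is left untouched
        role = message['role']
        # Hyphenated spellings such as 'function-call' are folded to underscores.
        folded = role.replace('-', '_')
        if role in USER_ROLES:
            message['role'] = 'user'
        elif role in ASSISTANT_ROLES:
            message['role'] = 'assistant'
        elif folded in TOOL_CALL_ROLES:
            message['role'] = 'tool_call'
        elif folded in TOOL_RESPONSE_ROLES:
            message['role'] = 'tool_response'
        normalized.append(message)
    return normalized


if __name__ == '__main__':
    sample = [
        {'role': 'human', 'content': 'How is the weather?'},
        {'role': 'function-call', 'content': '{"name": "get_weather"}'},
        {'role': 'observation', 'content': '{"condition": "Sunny"}'},
    ]
    for item in normalize_roles(sample):
        print(item['role'])
```

Because both `function-call` and `function_call` resolve to `tool_call` here, the dataset-specific renaming loop removed from `swift/llm/dataset/dataset/llm.py` in this commit appears to be covered by the shared preprocessor.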
