Commit d586583

[docs] update loss_scale docs (#5516)
1 parent f9a925a commit d586583

3 files changed: +45 -13 lines changed

docs/source/Instruction/Agent支持.md

Lines changed: 15 additions & 3 deletions
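@@ -186,11 +186,23 @@ tools = [{

 ## Usage of loss_scale

-loss_scale can adjust the training loss weight of the model's output portion. For example, in the ReACT format, you can set `--loss_scale react` (the loss_scale configuration file is written [here](https://github.com/modelscope/ms-swift/blob/main/swift/plugin/loss_scale/config/react.json)). The effect of this parameter is:
+The loss_scale parameter can be used to adjust the loss weight of the model's output portion during training. Two configuration methods are currently supported: exact string matching and regular expression matching.

-The 'Thought:' and 'Final Answer:' parts have a weight of 1, the 'Action:' and 'Action Input:' parts have a weight of 2, the 'Observation:' field itself has a weight of 2, and the tool call result following 'Observation:' has a weight of 0.
+1. String matching example: ReACT format

-For the specific design of the loss_scale plugin, please refer to the [Pluginization](../Customization/插件化.md) documentation.
+Taking the ReACT format as an example, you can enable the corresponding loss_scale configuration with `--loss_scale react` (see [react.json](https://github.com/modelscope/ms-swift/blob/main/swift/plugin/loss_scale/config/react.json) for the configuration file). This method is based on exact string matching. Each dictionary entry in the configuration must provide a list of two elements, which represent: the loss weight of the matched string itself,
+and the loss weight of the content from after that string up to the next specified string. The specific effects of this setting are as follows:
+- The 'Action:' and 'Action Input:' fields and the content that follows them both have a loss weight of 2;
+- The 'Thought:' and 'Final Answer:' fields and the content that follows them both have a loss weight of 1;
+- The 'Observation:' field itself has a weight of 2, but the tool call result that follows it has a loss weight of 0.
+
+2. Regex matching example: ignoring empty think blocks
+
+When training reasoning models, we may need to skip loss computation for empty think markers in the dataset of the form `<think>\n\n</think>\n\n`. In this case, use `--loss_scale ignore_empty_think` (see [ignore_empty_think.json](https://github.com/modelscope/ms-swift/blob/main/swift/plugin/loss_scale/config/ignore_empty_think.json) for the configuration file). This configuration uses regular expression matching, and the list in the dictionary mapping only needs to specify one value: the loss weight of the matched content. The specific effects of this setting are as follows:
+
+- All strings matching the regular expression `<think>\\s*</think>\\s*` get a loss_scale of 0, i.e. no loss is computed for them.
+
+For more loss_scale plugin designs, please refer to the [Pluginization](../Customization/插件化.md) documentation.


 ## Training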
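
Reviewer note: the exact-string matching rule documented in this hunk (a two-element list giving the weight of the keyword itself, then the weight of the text up to the next keyword) can be sketched in a few lines of Python. This is only an illustration of the documented semantics, not the ms-swift implementation: `REACT_SCALE`, `split_by_keywords`, and the default weight of 1.0 for text before the first keyword are all assumptions made here, and the real `react.json` may be laid out differently.

```python
import re
from typing import Dict, List, Tuple

# Hypothetical mapping that mirrors the weights listed in the docs:
# keyword -> [weight of the keyword itself, weight of the content after it].
REACT_SCALE: Dict[str, List[float]] = {
    'Action:': [2.0, 2.0],
    'Action Input:': [2.0, 2.0],
    'Thought:': [1.0, 1.0],
    'Final Answer:': [1.0, 1.0],
    'Observation:': [2.0, 0.0],
}


def split_by_keywords(text: str,
                      scale_map: Dict[str, List[float]],
                      default: float = 1.0) -> List[Tuple[str, float]]:
    """Split text into (segment, loss weight) pairs using exact keyword matches."""
    # Try longer keywords first so a longer keyword wins over a shorter prefix.
    keys = sorted(scale_map, key=len, reverse=True)
    pattern = re.compile('|'.join(re.escape(k) for k in keys))
    segments: List[Tuple[str, float]] = []
    pos = 0
    following = default  # weight for text before the first keyword (assumed)
    for match in pattern.finditer(text):
        if match.start() > pos:
            segments.append((text[pos:match.start()], following))
        keyword_weight, following = scale_map[match.group()]
        segments.append((match.group(), keyword_weight))
        pos = match.end()
    if pos < len(text):
        segments.append((text[pos:], following))
    return segments


text = ('Thought: check the weather\n'
        'Action: get_weather\n'
        'Action Input: {"city": "Paris"}\n'
        'Observation: sunny\n'
        'Final Answer: It is sunny.')
segments = split_by_keywords(text, REACT_SCALE)
for segment, weight in segments:
    print(repr(segment), weight)
```

In this sketch the tool output after `Observation:` ends up with weight 0, so it would not contribute to the loss, while the `Observation:` keyword itself keeps weight 2.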

docs/source_en/Instruction/Agent-support.md

Lines changed: 24 additions & 6 deletions
@@ -190,14 +190,32 @@ tools = [{

 ## Usage of loss_scale

-`loss_scale` can be used to adjust the training loss weight for the model's output section. For example, in the ReACT format, you can set `--loss_scale react` (the loss_scale configuration file is written [here](https://github.com/modelscope/ms-swift/blob/main/swift/plugin/loss_scale/config/react.json)). The role of this parameter is as follows:
+The `loss_scale` parameter can be used to adjust the loss weights for different parts of the model output during training. Currently, two configuration methods are supported: exact string matching and regular expression (regex) matching.

-- The weight for the 'Thought:' and 'Final Answer:' sections is 1.
-- The weight for the 'Action:' and 'Action Input:' sections is 2.
-- The weight for the 'Observation:' field itself is 2.
-- The weight for the tool invocation results following the 'Observation:' field is 0.
+1. String Matching Example: ReACT Format

-For the detailed design of the `loss_scale` plugin, please refer to the [Plugin-based Architecture](../Customization/Pluginization.md)documentation.
+Take the ReACT format as an example. You can enable the corresponding `loss_scale` configuration via `--loss_scale react` (see configuration file [react.json](https://github.com/modelscope/ms-swift/blob/main/swift/plugin/loss_scale/config/react.json)). This method relies on exact string matching. The dictionary mapping in the configuration must provide a list of two elements, representing:
+
+- The loss weight for the matched string itself,
+- The loss weight for content following the matched string, up to (but not including) the next specified string.
+
+The specific effects of this configuration are as follows:
+
+- The `'Action:'` and `'Action Input:'` keywords and their subsequent content both have a loss weight of 2;
+- The `'Thought:'` and `'Final Answer:'` keywords and their subsequent content both have a loss weight of 1;
+- The `'Observation:'` field itself has a loss weight of 2, but the subsequent tool call result content has a loss weight of 0.
+
+
+2. Regular Expression Matching Example: Ignoring Empty Thought Blocks
+
+When training reasoning models, it may be necessary to exclude loss computation for empty thought blocks in the dataset, such as sequences like `<think>\n\n</think>\n\n`.
+
+In such cases, use `--loss_scale ignore_empty_think` (see configuration file [ignore_empty_think.json](https://github.com/modelscope/ms-swift/blob/main/swift/plugin/loss_scale/config/ignore_empty_think.json)). This configuration uses regular expression matching, where the dictionary mapping only needs to specify a single value—the loss weight for the matched content.
+
+The specific effect of this setting is:
+- Any string matching the regular expression `<think>\\s*</think>\\s*` is assigned a `loss_scale` of 0, meaning no loss is computed for these segments.
+
+For more `loss_scale` plugin designs, please refer to the [Pluginization](../Customization/Pluginization.md) documentation.

 ## Training

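Reviewer note: the regex variant documented in this hunk can be illustrated the same way. The sketch below assumes that the doubled backslashes in `<think>\\s*</think>\\s*` are JSON escaping, so the effective pattern is `<think>\s*</think>\s*`. The helper `scale_by_regex` and the default weight of 1.0 for unmatched text are hypothetical and only demonstrate the described behaviour, not how ms-swift implements it.

```python
import re
from typing import List, Pattern, Tuple

# Effective pattern after JSON unescaping (assumption, see note above).
EMPTY_THINK: Pattern[str] = re.compile(r'<think>\s*</think>\s*')


def scale_by_regex(text: str,
                   pattern: Pattern[str],
                   weight: float,
                   default: float = 1.0) -> List[Tuple[str, float]]:
    """Give every regex match `weight`; all other text keeps `default`."""
    segments: List[Tuple[str, float]] = []
    pos = 0
    for match in pattern.finditer(text):
        if match.start() > pos:
            segments.append((text[pos:match.start()], default))
        segments.append((match.group(), weight))
        pos = match.end()
    if pos < len(text):
        segments.append((text[pos:], default))
    return segments


# An empty think block is matched as a whole, including trailing whitespace.
empty = scale_by_regex('<think>\n\n</think>\n\nThe answer is 4.', EMPTY_THINK, 0.0)

# A think block with real content does not match and keeps the default weight.
non_empty = scale_by_regex('<think>\n2 + 2 = 4\n</think>\n\nThe answer is 4.',
                           EMPTY_THINK, 0.0)
```

Only the empty block is zeroed out. A think block that actually contains reasoning is left untouched, which matches the stated goal of ignoring empty thought markers only.
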
swift/plugin/agent_template/glm4.py

Lines changed: 6 additions & 4 deletions
@@ -109,13 +109,15 @@ def get_toolcall(self, response: str) -> List['Function']:

     def _format_tools(self, tools: List[Union[str, dict]], system: str, user_message=None) -> str:
         tool_descs = [
-            '# Tools\n\nYou may call one or more functions to assist with the user query.\n\nYou are provided with function signatures within <tools></tools> XML tags:\n<tools>'
+            '# Tools\n\nYou may call one or more functions to assist with the user query.\n\n'
+            'You are provided with function signatures within <tools></tools> XML tags:\n<tools>'
         ]
         for tool in tools:
             tool_descs.append(f'{json.dumps(tool, ensure_ascii=False)}')
-        tool_descs.append(
-            '</tools>\n\nFor each function call, output the function name and arguments within the following XML format:\n<tool_call>{function-name}\n<arg_key>{arg-key-1}</arg_key>\n<arg_value>{arg-value-1}</arg_value>\n<arg_key>{arg-key-2}</arg_key>\n<arg_value>{arg-value-2}</arg_value>\n...\n</tool_call>'
-        )
+        tool_descs.append('</tools>\n\nFor each function call, output the function name and arguments within '
+                          'the following XML format:\n<tool_call>{function-name}\n<arg_key>{arg-key-1}</arg_key>\n'
+                          '<arg_value>{arg-value-1}</arg_value>\n<arg_key>{arg-key-2}</arg_key>\n'
+                          '<arg_value>{arg-value-2}</arg_value>\n...\n</tool_call>')
         tool_descs = '\n'.join(tool_descs)
         if system.strip():
             tool_descs += '<|system|>\n' + system.strip()
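
Reviewer note: the `glm4.py` change appears to be a pure line-length refactor. Python joins adjacent string literals at compile time, so the two literals inside the list still form a single list element and `'\n'.join(...)` does not gain an extra newline. A minimal check of that claim, using the strings from the diff (the variable names are made up for this sketch):

```python
# Header prompt: old single literal vs. new adjacent literals inside a list.
old_header = [
    '# Tools\n\nYou may call one or more functions to assist with the user query.\n\nYou are provided with function signatures within <tools></tools> XML tags:\n<tools>'
]
new_header = [
    '# Tools\n\nYou may call one or more functions to assist with the user query.\n\n'
    'You are provided with function signatures within <tools></tools> XML tags:\n<tools>'
]

# Footer prompt: old single literal vs. new literals split across four lines.
old_footer = '</tools>\n\nFor each function call, output the function name and arguments within the following XML format:\n<tool_call>{function-name}\n<arg_key>{arg-key-1}</arg_key>\n<arg_value>{arg-value-1}</arg_value>\n<arg_key>{arg-key-2}</arg_key>\n<arg_value>{arg-value-2}</arg_value>\n...\n</tool_call>'
new_footer = ('</tools>\n\nFor each function call, output the function name and arguments within '
              'the following XML format:\n<tool_call>{function-name}\n<arg_key>{arg-key-1}</arg_key>\n'
              '<arg_value>{arg-value-1}</arg_value>\n<arg_key>{arg-key-2}</arg_key>\n'
              '<arg_value>{arg-value-2}</arg_value>\n...\n</tool_call>')

# No comma separates the adjacent literals, so the list still has one element.
same_header = old_header == new_header and len(new_header) == 1
same_footer = old_footer == new_footer
print(same_header, same_footer)
```

If a comma were accidentally added between the two header literals, the list would have two elements and the joined prompt would contain an extra `\n`, so this is worth a quick check whenever such literals are re-wrapped.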
