Commit b5f1789
support alpha-umi algo and dataset (#752)
1 parent 37f27e8 commit b5f1789

6 files changed: +74 −2 lines

docs/source/LLM/Agent微调最佳实践.md

Lines changed: 1 addition & 1 deletion

@@ -440,7 +440,7 @@ print()
 ### Fine-tuning
-Change `dataset` to `ms-agent-for-agentfabric` `ms-agent-for-agentfabric-default`
+Change `dataset` to `ms-agent-for-agentfabric-default` `ms-agent-for-agentfabric-addition`
 ```shell
 # Experimental environment: 8GPU
 nproc_per_node=8

docs/source/LLM/支持的模型和数据集.md

Lines changed: 4 additions & 0 deletions

@@ -277,6 +277,10 @@
 |damo-agent-zh|[damo/MSAgent-Bench](https://modelscope.cn/datasets/damo/MSAgent-Bench/summary)|422115|161|965.7±440.9, min=321, max=31535|chat, agent, multi-round|-|
 |damo-agent-mini-zh|[damo/MSAgent-Bench](https://modelscope.cn/datasets/damo/MSAgent-Bench/summary)|39964|152|1230.9±350.1, min=558, max=4982|chat, agent, multi-round|-|
 |agent-instruct-all-en|[huangjintao/AgentInstruct_copy](https://modelscope.cn/datasets/huangjintao/AgentInstruct_copy/summary)|1866|0|1144.3±635.5, min=206, max=6412|chat, agent, multi-round|-|
+|toolbench-for-alpha-umi-backbone|[shenweizhou/alpha-umi-toolbench-processed-v2](https://modelscope.cn/datasets/shenweizhou/alpha-umi-toolbench-processed-v2/summary)|500951|0|1479.3±885.4, min=221, max=18467|chat, agent|-|
+|toolbench-for-alpha-umi-caller|[shenweizhou/alpha-umi-toolbench-processed-v2](https://modelscope.cn/datasets/shenweizhou/alpha-umi-toolbench-processed-v2/summary)|369279|0|1405.7±784.8, min=214, max=17912|chat, agent|-|
+|toolbench-for-alpha-umi-planner|[shenweizhou/alpha-umi-toolbench-processed-v2](https://modelscope.cn/datasets/shenweizhou/alpha-umi-toolbench-processed-v2/summary)|494975|0|1391.1±866.5, min=173, max=18451|chat, agent|-|
+|toolbench-for-alpha-umi-summarizer|[shenweizhou/alpha-umi-toolbench-processed-v2](https://modelscope.cn/datasets/shenweizhou/alpha-umi-toolbench-processed-v2/summary)|83132|0|1641.5±839.0, min=123, max=9852|chat, agent|-|
 |code-alpaca-en|[wyj123456/code_alpaca_en](https://modelscope.cn/datasets/wyj123456/code_alpaca_en/summary)|20016|0|100.1±60.1, min=29, max=1776|chat, coding|[sahil2801/CodeAlpaca-20k](https://huggingface.co/datasets/sahil2801/CodeAlpaca-20k)|
 |🔥leetcode-python-en|[AI-ModelScope/leetcode-solutions-python](https://modelscope.cn/datasets/AI-ModelScope/leetcode-solutions-python/summary)|2359|0|723.8±233.5, min=259, max=2117|chat, coding|-|
 |🔥codefuse-python-en|[codefuse-ai/CodeExercise-Python-27k](https://modelscope.cn/datasets/codefuse-ai/CodeExercise-Python-27k/summary)|27224|0|483.6±193.9, min=45, max=3082|chat, coding|-|

docs/source_en/LLM/Agent-best-practice.md

Lines changed: 1 addition & 1 deletion

@@ -430,7 +430,7 @@ This section focuses on the interactive framework AgentFabric within Modelscope-
 Due to the mismatch between the system prompt in ms-agent and that in Modelscope-Agent, direct training yields suboptimal results. To address this, we have created a new dataset [ms_agent_for_agentfabric](https://modelscope.cn/datasets/AI-ModelScope/ms_agent_for_agentfabric/summary) by converting the format from ms-agent, which is now integrated into SWIFT. The `ms-agent-for-agentfabric-default` includes 30,000 entries converted from ms-agent data, while `ms-agent-for-agentfabric-additional` contains 488 entries filtered from actual function call access data by the open-source AgentFabric framework.

 ### Fine-tuning
-Replace `dataset` with `ms-agent-for-agentfabric` and `ms-agent-for-agentfabric-default`:
+Replace `dataset` with `ms-agent-for-agentfabric-default` and `ms-agent-for-agentfabric-addition`:
 ```shell
 # Experimental environment: 8GPU
 nproc_per_node=8

docs/source_en/LLM/Supported-models-datasets.md

Lines changed: 4 additions & 0 deletions

@@ -277,6 +277,10 @@ The table below introduces the datasets supported by SWIFT:
 |damo-agent-zh|[damo/MSAgent-Bench](https://modelscope.cn/datasets/damo/MSAgent-Bench/summary)|422115|161|965.7±440.9, min=321, max=31535|chat, agent, multi-round|-|
 |damo-agent-mini-zh|[damo/MSAgent-Bench](https://modelscope.cn/datasets/damo/MSAgent-Bench/summary)|39964|152|1230.9±350.1, min=558, max=4982|chat, agent, multi-round|-|
 |agent-instruct-all-en|[huangjintao/AgentInstruct_copy](https://modelscope.cn/datasets/huangjintao/AgentInstruct_copy/summary)|1866|0|1144.3±635.5, min=206, max=6412|chat, agent, multi-round|-|
+|toolbench-for-alpha-umi-backbone|[shenweizhou/alpha-umi-toolbench-processed-v2](https://modelscope.cn/datasets/shenweizhou/alpha-umi-toolbench-processed-v2/summary)|500951|0|1479.3±885.4, min=221, max=18467|chat, agent|-|
+|toolbench-for-alpha-umi-caller|[shenweizhou/alpha-umi-toolbench-processed-v2](https://modelscope.cn/datasets/shenweizhou/alpha-umi-toolbench-processed-v2/summary)|369279|0|1405.7±784.8, min=214, max=17912|chat, agent|-|
+|toolbench-for-alpha-umi-planner|[shenweizhou/alpha-umi-toolbench-processed-v2](https://modelscope.cn/datasets/shenweizhou/alpha-umi-toolbench-processed-v2/summary)|494975|0|1391.1±866.5, min=173, max=18451|chat, agent|-|
+|toolbench-for-alpha-umi-summarizer|[shenweizhou/alpha-umi-toolbench-processed-v2](https://modelscope.cn/datasets/shenweizhou/alpha-umi-toolbench-processed-v2/summary)|83132|0|1641.5±839.0, min=123, max=9852|chat, agent|-|
 |code-alpaca-en|[wyj123456/code_alpaca_en](https://modelscope.cn/datasets/wyj123456/code_alpaca_en/summary)|20016|0|100.1±60.1, min=29, max=1776|chat, coding|[sahil2801/CodeAlpaca-20k](https://huggingface.co/datasets/sahil2801/CodeAlpaca-20k)|
 |🔥leetcode-python-en|[AI-ModelScope/leetcode-solutions-python](https://modelscope.cn/datasets/AI-ModelScope/leetcode-solutions-python/summary)|2359|0|723.8±233.5, min=259, max=2117|chat, coding|-|
 |🔥codefuse-python-en|[codefuse-ai/CodeExercise-Python-27k](https://modelscope.cn/datasets/codefuse-ai/CodeExercise-Python-27k/summary)|27224|0|483.6±193.9, min=45, max=3082|chat, coding|-|
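Once these entries are registered (see the changes to `swift/llm/utils/dataset.py` below), the new subsets should be loadable by the names listed in the table. A minimal sketch, assuming `swift.llm` exposes `get_dataset(dataset_name_list, dataset_test_ratio)` as in other SWIFT examples — verify against the installed version:

```python
# Minimal sketch, not taken from this commit: assumes swift.llm exposes
# get_dataset(dataset_name_list, dataset_test_ratio) as in other SWIFT examples.
from swift.llm import get_dataset

# Any name from the table above should work, e.g. the planner subset;
# 0.01 holds out 1% of the data as a validation split.
train_dataset, val_dataset = get_dataset(['toolbench-for-alpha-umi-planner'], 0.01)
print(train_dataset[0])
```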

swift/llm/agent/utils.py

Lines changed: 19 additions & 0 deletions

@@ -55,5 +55,24 @@ def calculate_loss_scale(response: str,
             agent_content.append(c['key'])
             agent_content.append(c['content'])
         return agent_content, weights
+    elif ('Action:' in response
+          or 'Next:' in response) and use_loss_scale:  # alpha-umi
+        agent_keyword = ['Next:', 'Action:', 'Action Input:']
+        agent_parts = split_str_parts_by(response, agent_keyword)
+        weights = []
+        agent_content = []
+        for c in agent_parts:
+            if c['key'] in ('Action:', 'Action Input:', 'Next:'):
+                weights += [2.0]
+                weights += [2.0]
+            elif c['key'] in ('Thought:', 'Final Answer:', ''):
+                weights += [1.0]
+                weights += [1.0]
+            elif c['key'] in ('Observation:', ):
+                weights += [2.0]
+                weights += [0.0]
+            agent_content.append(c['key'])
+            agent_content.append(c['content'])
+        return agent_content, weights
     else:
         return [response], [1.0]
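For readers without the surrounding module, the weighting scheme added above can be reproduced in isolation. The sketch below is a standalone re-implementation for illustration only: `split_by_keywords` is a simplified stand-in for SWIFT's `split_str_parts_by` (assumed behavior: split the response into key/content parts at each keyword), and `alpha_umi_loss_scale` mirrors the branch in the diff.

```python
import re
from typing import Dict, List, Tuple


def split_by_keywords(text: str, keywords: List[str]) -> List[Dict[str, str]]:
    # Simplified stand-in (assumption) for swift's split_str_parts_by:
    # split `text` into {'key': ..., 'content': ...} parts at each keyword.
    pattern = '(' + '|'.join(re.escape(k) for k in keywords) + ')'
    parts, key = [], ''
    for piece in re.split(pattern, text):
        if piece in keywords:
            key = piece
        elif piece:
            parts.append({'key': key, 'content': piece})
            key = ''
    return parts


def alpha_umi_loss_scale(response: str) -> Tuple[List[str], List[float]]:
    # Mirrors the weighting of the new elif branch: routing/action keywords and
    # the text that follows them get weight 2.0, plain text keeps 1.0, and the
    # content after 'Observation:' is masked out (0.0) while its keyword stays 2.0.
    agent_keyword = ['Next:', 'Action:', 'Action Input:']
    weights, content = [], []
    for c in split_by_keywords(response, agent_keyword):
        if c['key'] in ('Action:', 'Action Input:', 'Next:'):
            weights += [2.0, 2.0]
        elif c['key'] in ('Thought:', 'Final Answer:', ''):
            weights += [1.0, 1.0]
        elif c['key'] == 'Observation:':
            weights += [2.0, 0.0]
        content += [c['key'], c['content']]
    return content, weights


if __name__ == '__main__':
    demo = 'Next: caller. Action: get_weather Action Input: {"city": "Beijing"}'
    for part, w in zip(*alpha_umi_loss_scale(demo)):
        print(repr(part), w)
```

The net effect is that the tokens steering the alpha-umi agent (which sub-model to call next, which tool to invoke and with what arguments) are up-weighted in the loss, while tool observations do not contribute to it.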

swift/llm/utils/dataset.py

Lines changed: 45 additions & 0 deletions

@@ -75,6 +75,10 @@ class DatasetName:
     damo_agent_zh = 'damo-agent-zh'
     damo_agent_mini_zh = 'damo-agent-mini-zh'
     agent_instruct_all_en = 'agent-instruct-all-en'
+    toolbench_for_alpha_umi_backbone = 'toolbench-for-alpha-umi-backbone'
+    toolbench_for_alpha_umi_caller = 'toolbench-for-alpha-umi-caller'
+    toolbench_for_alpha_umi_planner = 'toolbench-for-alpha-umi-planner'
+    toolbench_for_alpha_umi_summarizer = 'toolbench-for-alpha-umi-summarizer'
     # coding
     code_alpaca_en = 'code-alpaca-en'
     leetcode_python_en = 'leetcode-python-en'

@@ -1115,6 +1119,14 @@ def _preprocess_capcha_images(dataset: HfDataset) -> HfDataset:
     return dataset


+def _repair_planner(conversations: list) -> list:
+    if isinstance(conversations, str):
+        conversations = ast.literal_eval(conversations)
+    if len(conversations) == 2 and conversations[0]['from'] != 'user':
+        conversations[0]['from'] = 'user'
+    return conversations
+
+
 register_dataset(
     DatasetName.capcha_images,
     'AI-ModelScope/captcha-images', [('default', 'train')],

@@ -1158,6 +1170,39 @@ def _preprocess_capcha_images(dataset: HfDataset) -> HfDataset:
     get_dataset_from_repo,
     tags=['chat', 'coding', '🔥'])

+register_dataset(
+    DatasetName.toolbench_for_alpha_umi_backbone,
+    'shenweizhou/alpha-umi-toolbench-processed-v2', [('backbone', 'train')],
+    None,
+    ConversationsPreprocessor('system', system_role=None),
+    get_dataset_from_repo,
+    tags=['chat', 'agent'])
+
+register_dataset(
+    DatasetName.toolbench_for_alpha_umi_caller,
+    'shenweizhou/alpha-umi-toolbench-processed-v2', [('caller', 'train')],
+    None,
+    ConversationsPreprocessor('system', 'caller', None),
+    get_dataset_from_repo,
+    tags=['chat', 'agent'])
+
+register_dataset(
+    DatasetName.toolbench_for_alpha_umi_planner,
+    'shenweizhou/alpha-umi-toolbench-processed-v2', [('planner', 'train')],
+    None,
+    ConversationsPreprocessor(
+        repair_conversations=_repair_planner, error_strategy='delete'),
+    get_dataset_from_repo,
+    tags=['chat', 'agent'])
+
+register_dataset(
+    DatasetName.toolbench_for_alpha_umi_summarizer,
+    'shenweizhou/alpha-umi-toolbench-processed-v2', [('summarizer', 'train')],
+    None,
+    ConversationsPreprocessor('system', 'conclusion', None),
+    get_dataset_from_repo,
+    tags=['chat', 'agent'])
+

 def _preprocess_blossom_math(dataset: HfDataset) -> HfDataset:
     response = []
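`_repair_planner` handles planner records that arrive as stringified lists whose first turn is not tagged `user`. The quick check below shows the repair in isolation; the sample record is invented for illustration and is not taken from the dataset.

```python
import ast


def _repair_planner(conversations: list) -> list:
    # Copied from the diff above: parse stringified records, then force the
    # first turn of a two-turn conversation to come from 'user'.
    if isinstance(conversations, str):
        conversations = ast.literal_eval(conversations)
    if len(conversations) == 2 and conversations[0]['from'] != 'user':
        conversations[0]['from'] = 'user'
    return conversations


# Hypothetical planner record, serialized as a string as it might appear in the raw data.
raw = ("[{'from': 'system', 'value': 'plan the next step'},"
       " {'from': 'assistant', 'value': 'Next: caller.'}]")
print(_repair_planner(raw))  # first turn is relabelled to 'from': 'user'
```

Records that still fail preprocessing are dropped rather than raised, per `error_strategy='delete'` in the planner registration above.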
